Fast Facts#
System Summary#
- Hardware: Dell R720xd
- CPUs: 2× Intel Xeon E5-2650 v2 (32 threads)
- RAM: 168GB DDR3
- Storage:
  - Boot: 2×250GB SSD (RAID1)
  - Data: 4×3TB HDD (RAID5 + hot spare)
- GPUs:
  - NVIDIA Quadro P400
  - 2× NVIDIA Tesla K80 (Kepler GK210, compute capability 3.7; host-bound, currently disabled)
- Networking:
  - 4×1GbE (bonded as bond1→vmbr0)
  - 2×10GbE (bonded as bond0→vmbr1)
  - Management: iDRAC at 192.168.0.21
The host config lives at hosts/x86_64-nixos/ereshkigal/default.nix.
GPUs#
The NVIDIA stack is driven on-host (no VFIO passthrough; there are no microvm
guests defined — microvm.autostart = []). NVIDIA support is enabled with the
legacy_470 driver (tsunaminoai.nvidia.package =
config.boot.kernelPackages.nvidiaPackages.legacy_470).
The Tesla K80 is Kepler-generation (GK210, compute capability 3.7). CUDA 12
dropped Kepler support, so the K80s are kept disabled for CUDA work. The
container toolkit is likewise off (hardware.nvidia-container-toolkit.enable =
false). The K80 does not support MIG/SR-IOV partitioning (those are Ampere-era
features).
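The compatibility rule above can be sketched as a small shell check. This is an illustration only: it assumes CUDA 12 requires compute capability 5.0 or newer (Maxwell onward), and the function name cuda12_supported is hypothetical rather than part of the host config.

```shell
#!/bin/sh
# Sketch: decide whether a compute capability is usable with CUDA 12.
# Assumption: CUDA 12 requires compute capability >= 5.0 (Maxwell or newer).
cuda12_supported() {
  # $1 is a compute capability such as "3.7" or "6.1"
  cc_major=${1%%.*}
  case "$cc_major" in
    ''|*[!0-9]*) echo "invalid compute capability: $1" >&2; return 2 ;;
  esac
  [ "$cc_major" -ge 5 ]
}

# Example values: Tesla K80 (GK210) is 3.7, Quadro P400 (GP107) is 6.1.
for cc in 3.7 6.1; do
  if cuda12_supported "$cc"; then
    echo "$cc: supported by CUDA 12"
  else
    echo "$cc: not supported by CUDA 12"
  fi
done
```

Only the major version is compared, which is enough to separate Kepler (3.x) from Pascal (6.x) on this host.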
Verification Steps#
- GPU Functionality Test
    nvidia-smi  # Should show the installed GPUs
- Network Bonding Verification
    cat /proc/net/bonding/bond0  # Check 10GbE bond (enp70s0f0/f1 → vmbr1)
    cat /proc/net/bonding/bond1  # Check 1GbE bond (eno1..eno4 → vmbr0)
- Network Throughput Test
    iperf3 -c 10.0.0.2 -P 8  # Test 10GbE backbone to the voile peer
- Storage Health Check
    /opt/MegaRAID/perccli/perccli64 /c0 show all  # Verify RAID status
    btrfs scrub start /nix  # Check the btrfs filesystem
- Boot Time Benchmark
    systemd-analyze blame  # Identify slow services

Note: After changes, rebuild with nixos-rebuild test before applying permanently. Monitor dmesg for hardware errors during boot.
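Reading the bonding status files by eye is error-prone, so the bond check above could be automated along these lines. This is a sketch: the function name bond_down_slaves is hypothetical, and it relies only on the "Slave Interface" and "MII Status" lines of the kernel's bonding status format.

```shell
#!/bin/sh
# Sketch: list bond members whose link is not up.
# Reads the kernel bonding status format found in /proc/net/bonding/<bond>.
bond_down_slaves() {
  # $1: path to a bonding status file
  awk '
    /^Slave Interface:/ { slave = $3; next }
    # The first "MII Status" line after a slave header belongs to that slave;
    # the bond-level "MII Status" line appears before any slave and is skipped.
    /^MII Status:/ && slave != "" {
      if ($3 != "up") print slave
      slave = ""
    }
  ' "$1"
}

# Check both bonds when their status files are present.
for bond in bond0 bond1; do
  status_file="/proc/net/bonding/$bond"
  [ -r "$status_file" ] || continue
  down=$(bond_down_slaves "$status_file")
  if [ -n "$down" ]; then
    echo "$bond: members down: $down"
  else
    echo "$bond: all members up"
  fi
done
```

On a machine without these bonds the loop prints nothing, so the script is safe to run anywhere.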
Services#
Ereshkigal is the primary services host on the gensokyo LAN. The roles enabled
in its config (tsunaminoai.* unless otherwise noted) are:
| Service | Role | Endpoint / Notes |
|---|---|---|
| step-ca | Internal PKI / ACME CA (pki.acme.enable = true) | Issues certs to LAN hosts via ACME |
| Paperless-NGX | Document pipeline (docPipeline.enable = true) | Web UI on :8011 (HTTPS :8012); OCR + auto-tagging, inference via Ollama on mokou over Tailscale |
| Open-WebUI | LLM chat + RAG over Paperless docs (openWebui.enable = true) | HTTP :3000, HTTPS :3001; talks to Ollama on mokou |
| Media / servarr | media.server.video/audio + servarr.enable = true | Jellyfin, Sonarr, Radarr, Prowlarr, Lidarr, Readarr, Whisparr, qBittorrent, Stash; Tdarr server |
| Homer | Dashboard (homer.enable = true) | :88 |
| nix-serve | Binary cache (nix.isCache = true) | :11111 (also a deploy node, nix.isDeployNode = true) |
| Ollama | Local inference (services.ollama) | :11434, CUDA-accelerated; loads nomic-embed-text for Open-WebUI RAG |
| ESPHome | IoT device firmware/dashboard (esphome.enable = true) | IoT VLAN gateway 192.168.0.1 |
| Deploy dashboard | Fleet health (deploy.enableDashboard = true) | :8420, healthCheckInterval = "5min" |
| dell-idrac-fan-controller | Managed fan curve (services.dell-idrac-fan-controller) | fanSpeed = 10, iDRAC local, password from sops dell-fan |
Fleet rebuild monitoring is handled by the deploy dashboard above
(tsunaminoai.deploy.enableDashboard = true, dashboardPort = 8420,
healthCheckInterval = "5min"), which watches the monitoredHosts list
(ereshkigal, mokou, razer, shinobu, bedford-drdillo-mbair,
work-laptop).
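For a quick manual check of the HTTP endpoints listed in the services table, a probe along these lines may help. It is a sketch, not part of the deploy dashboard: the short service names, the functions service_port and probe_service, and the PROBE_HOST variable are all hypothetical, and it assumes the host name ereshkigal resolves on the LAN.

```shell
#!/bin/sh
# Sketch: map service names to the HTTP ports listed in the services table.
service_port() {
  case "$1" in
    paperless)  echo 8011 ;;
    open-webui) echo 3000 ;;
    homer)      echo 88 ;;
    nix-serve)  echo 11111 ;;
    ollama)     echo 11434 ;;
    dashboard)  echo 8420 ;;
    *)          return 1 ;;
  esac
}

# Probe one service; succeeds if any HTTP response arrives within 3 seconds.
# Assumption: the host name "ereshkigal" resolves on the LAN
# (override with the hypothetical PROBE_HOST variable).
probe_service() {
  port=$(service_port "$1") || { echo "unknown service: $1" >&2; return 2; }
  curl -sS -o /dev/null --max-time 3 "http://${PROBE_HOST:-ereshkigal}:${port}/"
}

# Usage (not run here):
#   for s in paperless open-webui homer nix-serve ollama dashboard; do
#     probe_service "$s" && echo "$s ok" || echo "$s unreachable"
#   done
```

The probe treats any HTTP response as success, so it confirms that a listener is present without judging application health.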
Note: Fan management is already handled by
services.dell-idrac-fan-controller; do not also set manual GPU fan curves in iDRAC.
Description#
Ereshkigal is a Dell R720xd with the following specs:
| Component | Spec | Notes |
|---|---|---|
| CPU | 2x Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60 GHz (3.40 GHz max turbo) | 32 threads |
| RAM | 168 GB | |
| Storage | 2x 250GB SSD RAID 1 | RootFS |
| Storage | 4x 3TB RAID 5 (1 hot spare) | Kur storage pool |
Networking#
Ereshkigal has two network cards: an integrated 4×1GbE card and a 2×10GBASE-T card in riser 1.
iDRAC#
The iDRAC is Ereshkigal's out-of-band hardware management controller. Power, the remote console, and RAID can be configured through it regardless of whether the main server is powered on.
iDRAC access (use Safari): 192.168.0.21
Storage#
Storage can be configured initially through the iDRAC under Storage. While the system is running, however, any RAID management must be done from the host OS. If a disk in the RootFS array (RAID-1) or the proxmox storage pool (RAID-5) must be replaced, do the following:
- Identify the failing drive using syslog, SMART, the iDRAC identify “blink”, etc.
- Procure a new disk that is the same size as the other member disks
- Remove the identified disk from the machine
- Insert the replacement disk
- Verify the disk shows in the iDRAC under Storage → Physical Disks
- Log into ereshkigal using ssh
- Run one of:
    /opt/MegaRAID/perccli/perccli64 /c0/sX add hotsparedrive
  where X is the disk slot number, to add a global hot spare, or
    /opt/MegaRAID/perccli/perccli64 /c0/sX add hotsparedrive dgs=Y
  where Y is the disk group (VD) number, to add a dedicated hot spare
- Check that the rebuild is running using
    /opt/MegaRAID/perccli/perccli64 /c0/sX show rebuild
- When the rebuild completes, verify that the RAID is back in healthy status
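Because the slot and disk group numbers are easy to mistype, the perccli steps above can be wrapped in a helper that prints the command for review before anything is run. This is a sketch: the function names hotspare_cmd and rebuild_status_cmd are hypothetical, the slot and group values are examples, and it uses only the perccli invocations already shown in the procedure.

```shell
#!/bin/sh
# Sketch: build the perccli hot-spare command from the procedure above.
# Prints the command instead of running it, so it can be reviewed first.
PERCCLI=/opt/MegaRAID/perccli/perccli64

hotspare_cmd() {
  # $1: disk slot number (X); $2: optional disk group number (Y)
  hs_slot=$1
  hs_dg=${2:-}
  case "$hs_slot" in
    ''|*[!0-9]*) echo "slot must be a number" >&2; return 2 ;;
  esac
  case "$hs_dg" in
    *[!0-9]*) echo "disk group must be a number" >&2; return 2 ;;
  esac
  if [ -n "$hs_dg" ]; then
    echo "$PERCCLI /c0/s$hs_slot add hotsparedrive dgs=$hs_dg"
  else
    echo "$PERCCLI /c0/s$hs_slot add hotsparedrive"
  fi
}

rebuild_status_cmd() {
  # $1: disk slot number (X)
  echo "$PERCCLI /c0/s$1 show rebuild"
}

hotspare_cmd 3        # global hot spare in slot 3 (example slot)
hotspare_cmd 3 1      # dedicated hot spare for disk group 1 (example group)
rebuild_status_cmd 3
```

Once the printed line looks right, it can be pasted into the ssh session on ereshkigal.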
GPU Inventory#
The host carries an NVIDIA Quadro P400 plus two Tesla K80s. As of the current
config these are driven on-host with the legacy_470 driver and are not
passed through to any VM (no microvm guests are defined). The K80s remain
disabled for CUDA work — see the GPUs note above.
Relevant lspci lines
04:00.0 VGA compatible controller: NVIDIA Corporation GP107GL [Quadro P400] (rev a1)
04:00.1 Audio device: NVIDIA Corporation GP107GL High Definition Audio Controller (rev a1)
44:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
45:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
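To confirm that the inventory above still matches the hardware, the lspci output can be filtered and counted. This is a sketch: the function name count_nvidia_gpus is hypothetical, and it assumes lspci (from pciutils) is available on the host.

```shell
#!/bin/sh
# Sketch: count NVIDIA GPU functions in lspci output read from stdin.
# Matches the "VGA compatible controller" and "3D controller" classes and
# ignores the audio function that shares a slot with the Quadro.
count_nvidia_gpus() {
  grep -E '(VGA compatible controller|3D controller): NVIDIA' | wc -l | tr -d ' '
}

# On the host this reads live data; elsewhere it is skipped.
if command -v lspci >/dev/null 2>&1; then
  echo "NVIDIA GPUs found: $(lspci | count_nvidia_gpus)"
fi
```

Fed the four lspci lines listed above, the function reports 3: the Quadro P400 and the two GK210 devices.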
Dell iDRAC SSH Shell Command Cheatsheet#
This cheatsheet summarizes the core commands available in the iDRAC shell (CLP/SMCLP) when connected via SSH on a Dell PowerEdge R720xd. These commands are distinct from racadm but provide similar management functionality directly in the shell.
Core Shell Commands#
| Command | Description & Usage Example |
|---|---|
| show | Display information about system objects or properties. Usage: show [<target>] |
| set | Set a property value. Usage: set [<target>] <property>=<value> |
| cd | Change current directory/object context. Usage: cd [<target>] |
| create | Create a new object. Usage: create <target> [<property>=<value>] ... |
| delete | Delete an object. Usage: delete <target> |
| exit | Exit the shell session. Usage: exit |
| reset | Reset a target (e.g., power cycle server or controller). Usage: reset [<target>] |
| start | Start a target (e.g., power on server/component). Usage: start [<target>] |
| stop | Stop a target (e.g., power off server/component). Usage: stop [<target>] |
| version | Show shell and firmware version info. Usage: version |
| help | Show help for commands or topics. Usage: help [<command>] |
| load | Load configuration from a URI. Usage: load -source <URI> [<target>] |
| dump | Dump configuration to a URI. Usage: dump -destination <URI> [<target>] |
Usage Examples#
- Show all system info: show /
- Show a specific target: show /system1
- Set a property: set /system1 enabled=true
- Change directory/context: cd /system1
- Reset the server: reset /system1
- Power on the server: start /system1
- Power off the server: stop /system1
- Create a new user (example, if supported): create /user1 username=admin password=secret
- Delete a user (example): delete /user1
- Load config from a file: load -source tftp://192.168.1.10/config.xml /system1
- Dump config to a file: dump -destination tftp://192.168.1.10/backup.xml /system1
- Get shell version: version
- Get help on a command: help set
Command Structure#
- Targets are objects in the system hierarchy (e.g., /system1, /chassis1, /user1).
- Properties are attributes of those targets (e.g., enabled, username).
- Options may modify command behavior (e.g., -source, -destination).
Notes#
- The shell is case-sensitive.
- Use cd / to return to the root context.
- Use show without arguments for a list of objects in the current context.
- For full command syntax and target/property names, use help or show at each level.
This cheatsheet covers the main command set available in the iDRAC shell via SSH on Dell PowerEdge servers like the R720xd, based on the /admin1-> help output and standard SMCLP conventions. For advanced features, consult the iDRAC CLI Reference Guide or use help within the shell for context-sensitive assistance.