Ereshkigal
Fast Facts#
Overview of the Dell R720xd running NixOS, with performance optimization tasks and peripheral configuration steps:
System Summary#
- Hardware: Dell R720xd
- CPUs: 2× Intel Xeon E5-2650 v2 (32 threads)
- RAM: 168GB DDR3
- Storage:
  - Boot: 2×250GB SSD (RAID1)
  - Data: 4×3TB HDD (RAID5 + hot spare)
- GPUs:
  - NVIDIA Quadro P400 (Passthrough)
  - 2× NVIDIA Tesla K80 (Passthrough)
- Networking:
  - 4×1GbE (Bonded as bond1→vmbr0)
  - 2×10GbE (Bonded as bond0→vmbr1)
  - Management: iDRAC at 192.168.0.21
Performance Optimization Tasks#
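- GPU Partitioning (Tesla K80)
The Tesla K80 is a Kepler-generation card and supports neither SR-IOV nor MIG (MIG requires Ampere or newer data-center GPUs), so the commands below will not work on this hardware. Each K80 board already exposes two GK210 GPUs as separate PCI devices, so the practical way to split it is to pass each GK210 through to a different VM. For reference, on a MIG-capable GPU the partitioning commands look like this (the profile name must match one listed by nvidia-smi mig -lgip):
nvidia-smi -i 0 -mig 1 # Enable MIG mode
nvidia-smi mig -cgi 19g.40gb -C # Create compute instances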
Peripheral Configuration Tasks#
- Network Bonding Verification
Confirm bond status:
cat /proc/net/bonding/bond0 # Check 10GbE bond
cat /proc/net/bonding/bond1 # Check 1GbE bond
- RAID Health Monitoring
Add PERC controller checks:
services.smartd = {
  enable = true;
  devices = [{ device = "/dev/sda"; options = "-a -d megaraid,0"; }];
};
- iDRAC Alert Integration
Configure SNMP traps for hardware events:
racadm set iDRAC.SNMP.Alert 1
racadm set iDRAC.SNMP.AgentEnable 1
- USB Device Passthrough
For peripherals (e.g., security dongles). Here 0x1234:0x5678 is a placeholder vendor:product ID. The kernel expects the third field to be a string of quirk flag letters rather than a hex value, so replace 0x044 with the letters for the quirks the device needs:
boot.kernelParams = [ "usbcore.quirks=0x1234:0x5678:0x044" ];
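The bond status check above can be automated by parsing the text of /proc/net/bonding/<bond>. The following is a minimal sketch, not a definitive implementation: the names BondStatus, parse_bonding, and degraded_slaves are hypothetical, and the sample text (including the interface names and driver version) is illustrative rather than captured from Ereshkigal.

```python
from dataclasses import dataclass, field


@dataclass
class BondStatus:
    mode: str = ""
    mii_status: str = ""
    # Maps slave interface name -> its MII status string.
    slaves: dict = field(default_factory=dict)


def parse_bonding(text: str) -> BondStatus:
    """Parse the contents of /proc/net/bonding/<bond> into a BondStatus."""
    status = BondStatus()
    current_slave = None
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if not sep:
            continue
        key, value = key.strip(), value.strip()
        if key == "Bonding Mode":
            status.mode = value
        elif key == "Slave Interface":
            current_slave = value
            status.slaves[current_slave] = "unknown"
        elif key == "MII Status":
            # The first MII Status line belongs to the bond itself,
            # later ones to the most recently seen slave.
            if current_slave is None:
                status.mii_status = value
            else:
                status.slaves[current_slave] = value
    return status


def degraded_slaves(status: BondStatus) -> list:
    """Return the names of slaves whose MII status is not 'up'."""
    return sorted(name for name, mii in status.slaves.items() if mii != "up")


# Illustrative sample only; interface names are assumptions.
SAMPLE = """\
Ethernet Channel Bonding Driver: v6.1.0

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
MII Status: up
MII Polling Interval (ms): 100

Slave Interface: enp65s0f0
MII Status: up
Speed: 10000 Mbps

Slave Interface: enp65s0f1
MII Status: down
Speed: Unknown
"""
```

On the host, the same function could be fed the real file, for example parse_bonding(open("/proc/net/bonding/bond0").read()), and an empty result from degraded_slaves would indicate that every member link is up.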
Configuration Cleanup#
- Remove Unused Bonds
Delete unused bond declarations:
networking.bonds.bond1 = lib.mkForce {}; # If not used
- Simplify Network Setup
Replace manual IP with networkd:
systemd.network.enable = true;
networking.useNetworkd = true;
- Fix VLAN Configuration
Uncomment and repair VLAN setup:
networking.vlans = {
  vlan10 = { id = 10; interface = "vmbr0"; };
};
- MicroVM Optimization
Enable virtiofs for faster VM storage:
microvm.shares = [{
  source = "/nix/store";
  mountPoint = "/nix/.ro-store";
  tag = "ro-store";
  proto = "virtiofs";
}];
Verification Steps#
- GPU Functionality Test
nvidia-smi # Should show all GPUs (run inside the guest for GPUs that are passed through)
lspci -vnn -d 10de: # Check passthrough devices
- Network Throughput Test
iperf3 -c 192.168.0.1 -P 8 # Test 10GbE bond
- Storage Health Check
perccli64 /c0 show all # Verify RAID status
btrfs scrub start /nix # Check filesystem
- Boot Time Benchmark
systemd-analyze blame # Identify slow services
Note: After changes, rebuild with nixos-rebuild test before applying permanently. Monitor dmesg for hardware errors during boot.
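The boot time benchmark above produces one line per unit, with a duration followed by the unit name. As a rough sketch of how that output could be filtered for slow services, the snippet below parses durations written with the h, min, s, ms, and us suffixes. The function names, the 5-second threshold, and the unit names in the sample are assumptions for illustration, not output recorded from this machine.

```python
import re

# Seconds per duration suffix as printed by systemd-analyze.
_UNIT_SECONDS = {"h": 3600.0, "min": 60.0, "s": 1.0, "ms": 1e-3, "us": 1e-6}
_TOKEN = re.compile(r"^(\d+(?:\.\d+)?)(h|min|ms|us|s)$")


def parse_blame(text):
    """Return [(seconds, unit_name), ...] from `systemd-analyze blame` output."""
    results = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) < 2:
            continue
        total = 0.0
        valid = True
        # Everything before the last field is a duration token, e.g. "1min" "2.500s".
        for token in parts[:-1]:
            match = _TOKEN.match(token)
            if match is None:
                valid = False
                break
            total += float(match.group(1)) * _UNIT_SECONDS[match.group(2)]
        if valid:
            results.append((total, parts[-1]))
    return results


def slow_units(text, threshold_s=5.0):
    """Return the entries that took at least threshold_s seconds."""
    return [(secs, unit) for secs, unit in parse_blame(text) if secs >= threshold_s]


# Illustrative sample; unit names are hypothetical.
SAMPLE_BLAME = """\
1min 2.500s microvm@build.service
     8.250s systemd-networkd-wait-online.service
      750ms smartd.service
"""
```

Lines that do not parse as durations are skipped rather than raising, which keeps the helper tolerant of header or warning lines mixed into the output.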
Recommended Next Steps#
- Update iDRAC firmware via racadm update -f idrac.fwimg
- Configure GPU fan curves in iDRAC to prevent thermal throttling
- Set up NixOS rebuild monitoring with services.healthcheck.enable = true
Description#
Ereshkigal is a Dell R720xd with the following specs:
| Component | Spec | Notes |
|---|---|---|
| CPU | 2x Intel(R) Xeon(R) CPU E5-2650 v2 @ 2.60 GHz (3.40 GHz max turbo) | 32 threads |
| RAM | 168 GB | |
| Storage | 2x 250GB SSD RAID 1 | RootFS |
| Storage | 4x 3TB RAID 5 (1 hot spare) | Kur storage pool |
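The storage rows can be turned into usable-capacity figures with simple RAID arithmetic. The sketch below is illustrative only: usable_tb is a hypothetical helper, capacities are raw decimal terabytes, and whether the hot spare is one of the four 3 TB disks or an additional disk is an assumption the table does not settle, so both cases are shown.

```python
def usable_tb(level, disks, size_tb, hot_spares=0):
    """Usable capacity in TB for a RAID 1 or RAID 5 array.

    hot_spares is the number of the listed disks held back as spares.
    """
    active = disks - hot_spares
    if level == 1:
        if active < 2:
            raise ValueError("RAID 1 needs at least 2 active disks")
        # Mirroring: capacity of a single member.
        return size_tb
    if level == 5:
        if active < 3:
            raise ValueError("RAID 5 needs at least 3 active disks")
        # One disk's worth of capacity is consumed by parity.
        return (active - 1) * size_tb
    raise ValueError("only RAID 1 and RAID 5 are handled in this sketch")


# Root filesystem: 2x 250 GB in RAID 1.
root_tb = usable_tb(1, 2, 0.25)
# Data pool, assuming the hot spare is one of the four disks: 3 active disks.
pool_spare_included_tb = usable_tb(5, 4, 3.0, hot_spares=1)
# Data pool, assuming all four disks are active and the spare is extra.
pool_spare_extra_tb = usable_tb(5, 4, 3.0)
```

Under the first assumption the pool offers 6 TB, under the second 9 TB; the controller's virtual disk size reported by perccli settles which one applies.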
Networking#
There are two network devices available on Ereshkigal: a 4x 1 Gbps integrated card and a 2x 10GBASE-T card in riser 1.
iDRAC#
The iDRAC is Ereshkigal's hardware management controller. Power, remote console, and RAID can be configured through it regardless of the power state of the main server. iDRAC Access (use Safari)
Storage#
Storage can be managed initially through iDRAC under Storage. While the system is running, however, storage must be managed from the host OS. If a disk in the root filesystem array (RAID 1) or the Proxmox storage pool (RAID 5) needs to be replaced, do the following:
- Identify the failing drive using syslog, SMART, iDrac identify “blink”, etc
- Procure a new disk that is the same size as other member disks
- Remove the identified disk from the machine
- Insert the replacement disk
- Verify the disk shows in iDrac under storage → Physical Disks
- Log into ereshkigal using ssh
- Run one of:
/opt/MegaRAID/perccli/perccli64 /c0/sX add hotsparedrive
where X is the disk slot number, to add a global hot spare, or
/opt/MegaRAID/perccli/perccli64 /c0/sX add hotsparedrive dgs=Y
where Y is the disk group (VD) number, to add a dedicated hot spare
- Check that the rebuild is running using
/opt/MegaRAID/perccli/perccli64 /c0/sX show rebuild
- When the rebuild completes, verify that the RAID is back in healthy status
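To avoid typos in the slot and disk group numbers, the perccli invocations from the steps above can be assembled programmatically. This is a minimal sketch that only builds the argument lists and does not run anything; hotspare_command and rebuild_status_command are hypothetical helper names, and the controller index defaults to 0 as in the commands above.

```python
import shlex

PERCCLI = "/opt/MegaRAID/perccli/perccli64"


def hotspare_command(slot, disk_group=None, controller=0):
    """Build the argv for adding a global or dedicated hot spare.

    With disk_group=None the spare is global; otherwise it is
    dedicated to that disk group via dgs=<N>.
    """
    if slot < 0 or controller < 0:
        raise ValueError("slot and controller must be non-negative")
    cmd = [PERCCLI, f"/c{controller}/s{slot}", "add", "hotsparedrive"]
    if disk_group is not None:
        cmd.append(f"dgs={disk_group}")
    return cmd


def rebuild_status_command(slot, controller=0):
    """Build the argv for checking rebuild progress on a slot."""
    if slot < 0 or controller < 0:
        raise ValueError("slot and controller must be non-negative")
    return [PERCCLI, f"/c{controller}/s{slot}", "show", "rebuild"]


# Example: dedicated hot spare in slot 3 for disk group 1 (values are illustrative).
example = shlex.join(hotspare_command(3, disk_group=1))
```

The returned lists can be passed directly to subprocess.run on the host once the slot number has been confirmed in iDRAC, and shlex.join gives a copy-pasteable string for logging.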
VMs Hosted on Ereshkigal#
PCI Passthrough#
An Nvidia Quadro P400 is available via
hostpci0: 04:00,pcie=1,rombar=0,driver=vfio
An Nvidia Tesla K80 is available via
hostpci0: 44:00,pcie=1
hostpci1: 45:00,pcie=1
Relevant lspci lines
04:00.0 VGA compatible controller: NVIDIA Corporation GP107GL [Quadro P400] (rev a1)
04:00.1 Audio device: NVIDIA Corporation GP107GL High Definition Audio Controller (rev a1)
44:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
45:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
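The mapping from lspci output to hostpci entries can be sketched in a few lines. The helper below is an illustration under stated assumptions rather than the tooling used on this host: hostpci_lines is a hypothetical name, devices are selected by a plain substring match on the lspci description, and functions of the same bus:device slot are collapsed into one entry, mirroring the entries shown above.

```python
import re

# Matches lines such as "44:00.0 3D controller: NVIDIA Corporation ...".
_LSPCI = re.compile(r"^([0-9a-f]{2}):([0-9a-f]{2})\.([0-7])\s+(.*)$", re.IGNORECASE)

LSPCI_TEXT = """\
04:00.0 VGA compatible controller: NVIDIA Corporation GP107GL [Quadro P400] (rev a1)
04:00.1 Audio device: NVIDIA Corporation GP107GL High Definition Audio Controller (rev a1)
44:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
45:00.0 3D controller: NVIDIA Corporation GK210GL [Tesla K80] (rev a1)
"""


def hostpci_lines(lspci_text, keyword, options="pcie=1"):
    """Build hostpciN lines for devices whose description contains keyword."""
    slots = []
    for line in lspci_text.splitlines():
        match = _LSPCI.match(line.strip())
        if match is None or keyword not in match.group(4):
            continue
        # Drop the function number so the whole device is passed through.
        slot = f"{match.group(1)}:{match.group(2)}"
        if slot not in slots:
            slots.append(slot)
    return [f"hostpci{i}: {slot},{options}" for i, slot in enumerate(slots)]


k80_lines = hostpci_lines(LSPCI_TEXT, "Tesla K80")
p400_lines = hostpci_lines(LSPCI_TEXT, "Quadro P400", "pcie=1,rombar=0,driver=vfio")
```

With the lspci lines above, k80_lines reproduces the two Tesla K80 entries and p400_lines the single Quadro P400 entry.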
Read about passthrough here
Dell iDRAC SSH Shell Command Cheatsheet#
This cheatsheet summarizes the core commands available in the iDRAC shell (CLP/SMCLP) when connected via SSH on a Dell PowerEdge R720xd. These commands are distinct from racadm but provide similar management functionality directly in the shell.
Core Shell Commands#
| Command | Description & Usage Example |
|---|---|
| show | Display information about system objects or properties. Usage: show [] |
| set | Set a property value. Usage: set [] = |
| cd | Change current directory/object context. Usage: cd [] |
| create | Create a new object. Usage: create [=] ... |
| delete | Delete an object. Usage: delete |
| exit | Exit the shell session. Usage: exit |
| reset | Reset a target (e.g., power cycle server or controller). Usage: reset [] |
| start | Start a target (e.g., power on server/component). Usage: start [] |
| stop | Stop a target (e.g., power off server/component). Usage: stop [] |
| version | Show shell and firmware version info. Usage: version |
| help | Show help for commands or topics. Usage: help [] |
| load | Load configuration from a URI. Usage: load -source [] |
| dump | Dump configuration to a URI. Usage: dump -destination [] |
Usage Examples#
- Show all system info: show /
- Show a specific property: show /system1
- Set a property: set /system1 enabled=true
- Change directory/context: cd /system1
- Reset the server: reset /system1
- Power on the server: start /system1
- Power off the server: stop /system1
- Create a new user (example, if supported): create /user1 username=admin password=secret
- Delete a user (example): delete /user1
- Load config from a file: load -source tftp://192.168.1.10/config.xml /system1
- Dump config to a file: dump -destination tftp://192.168.1.10/backup.xml /system1
- Get shell version: version
- Get help on a command: help set
Command Structure#
- Targets are objects in the system hierarchy (e.g., /system1, /chassis1, /user1).
- Properties are attributes of those targets (e.g., enabled, username).
- Options may modify command behavior (e.g., -source, -destination).
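The verb, option, target, and property structure described above can be captured in a small string builder, which is handy when scripting commands to send over SSH. This is a sketch only: clp_command is a hypothetical helper, the verb list is taken from the table in this cheatsheet, and the ordering (options before the target, properties after it) follows the usage examples here rather than a formal grammar.

```python
# Verbs listed in the Core Shell Commands table.
VERBS = {
    "show", "set", "cd", "create", "delete", "exit", "reset",
    "start", "stop", "version", "help", "load", "dump",
}


def clp_command(verb, target=None, options=None, **properties):
    """Compose a command line as: verb [-option value ...] [target] [prop=value ...]."""
    if verb not in VERBS:
        raise ValueError(f"unknown verb: {verb}")
    parts = [verb]
    for name, value in (options or {}).items():
        parts.extend([f"-{name}", str(value)])
    if target is not None:
        parts.append(target)
    for name, value in properties.items():
        parts.append(f"{name}={value}")
    return " ".join(parts)


set_cmd = clp_command("set", "/system1", enabled="true")
load_cmd = clp_command(
    "load", "/system1", options={"source": "tftp://192.168.1.10/config.xml"}
)
create_cmd = clp_command("create", "/user1", username="admin", password="secret")
```

Rejecting unknown verbs up front catches typos before anything is sent to the management controller; the builder does no quoting, so values containing spaces would need extra handling.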
Notes#
- The shell is case-sensitive.
- Use cd / to return to the root context.
- Use show without arguments for a list of objects in the current context.
- For full command syntax and target/property names, use help or show at each level.
This cheatsheet covers the main command set available in the iDRAC shell via SSH on Dell PowerEdge servers like the R720xd, based on the /admin1-> help output and standard SMCLP conventions. For advanced features, consult the iDRAC CLI Reference Guide or use help within the shell for context-sensitive assistance.