# Drive Atlas

A powerful server drive mapping tool that generates visual ASCII representations of server layouts and provides comprehensive drive information. Maps physical drive bays to logical Linux device names using PCI bus paths for reliable, persistent identification.

## Features

- πŸ—ΊοΈ **Visual ASCII art maps** showing physical drive bay layouts
- πŸ”— **Persistent drive identification** using PCI paths (not device letters)
- 🌑️ **SMART health monitoring** with temperature and status
- πŸ’Ύ **Multi-drive support** for SATA, NVMe, SAS, and USB drives
- 🏷️ **Serial number tracking** for physical verification
- πŸ“Š **Bay-sorted output** matching the physical layout
- πŸ”΅ **Ceph integration** showing OSD IDs and up/in status
- πŸ₯Ύ **Boot drive detection** identifying system drives
- πŸ–₯️ **Per-server configuration** for accurate physical-to-logical mapping

## Quick Start

Execute remotely using curl:

```bash
bash <(curl -s http://10.10.10.63:3000/LotusGuild/driveAtlas/raw/branch/main/driveAtlas.sh)
```

Or using wget:

```bash
bash <(wget -qO- http://10.10.10.63:3000/LotusGuild/driveAtlas/raw/branch/main/driveAtlas.sh)
```

## Requirements

- Linux environment with bash
- `sudo` privileges for SMART operations
- `smartctl` (from the smartmontools package)
- `lsblk` and `lspci` (typically pre-installed)
- Optional: `nvme-cli` for NVMe drives
- Optional: `ceph-volume` and `ceph` for Ceph OSD tracking

## Server Configurations

### Chassis Types

| Chassis Type | Description | Servers Using It |
|--------------|-------------|------------------|
| **10-Bay Hot-swap** | Sliger CX471225 4U 10x 3.5" NAS (with unused 2x 5.25" bays) | compute-storage-01, compute-storage-gpu-01, storage-01 |
| **Large1 Grid** | Unique 3x5 grid layout (1/1 configuration) | large1 |
| **Micro** | Compact 2-drive layout | micro1, monitor-02 |

### Server Details

#### compute-storage-01 (formerly medium2)

- **Chassis:** Sliger CX471225 4U (10-Bay Hot-swap)
- **Motherboard:** B650D4U3-2Q/BCM
- **Controllers:**
  - 01:00.0 - LSI SAS3008 HBA (bays 5-10 via 2x mini-SAS HD)
  - 0d:00.0 - AMD SATA controller (bays 1-4)
  - 0e:00.0 - M.2 NVMe slot
- **Status:** βœ… Fully mapped and verified

#### storage-01

- **Chassis:** Sliger CX471225 4U (10-Bay Hot-swap)
- **Motherboard:** Different from compute-storage-01
- **Controllers:** Motherboard SATA only (no HBA currently)
- **Status:** ⚠️ Requires PCI path mapping

#### large1

- **Chassis:** Unique 3x5 grid (15 bays total)
- **Note:** 1/1 configuration; will not be replicated
- **Status:** ⚠️ Requires PCI path mapping

#### compute-storage-gpu-01

- **Chassis:** Sliger CX471225 4U (10-Bay Hot-swap)
- **Motherboard:** Same as compute-storage-01
- **Status:** ⚠️ Requires PCI path mapping

## Output Example

```
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚ compute-storage-01 - 10-Bay Hot-swap Chassis                                                                                     β”‚
β”‚                                                                                                                                  β”‚
β”‚ M.2 NVMe: nvme0n1                                                                                                                β”‚
β”‚                                                                                                                                  β”‚
β”‚ Front Hot-swap Bays:                                                                                                             β”‚
β”‚                                                                                                                                  β”‚
β”‚ β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β” β”‚
β”‚ β”‚1 :sdh    β”‚ β”‚2 :sdg    β”‚ β”‚3 :sdi    β”‚ β”‚4 :sdj    β”‚ β”‚5 :sde    β”‚ β”‚6 :sdf    β”‚ β”‚7 :sdd    β”‚ β”‚8 :sda    β”‚ β”‚9 :sdc    β”‚ β”‚10:sdb    β”‚ β”‚
β”‚ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜ β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

=== Drive Details with SMART Status (by Bay Position) ===

BAY  DEVICE     SIZE    TYPE  TEMP  HEALTH  MODEL                  SERIAL        CEPH OSD  STATUS  USAGE
--------------------------------------------------------------------------------------------------------
1    /dev/sdh   223.6G  SSD   27Β°C  βœ“       Crucial_CT240M500SSD1  14130C0E06DD  -         -       /boot/efi
2    /dev/sdg   1.8T    HDD   26Β°C  βœ“       ST2000DM001-1ER164     Z4ZC4B6R      osd.25    up/in   -
3    /dev/sdi   12.7T   HDD   29Β°C  βœ“       OOS14000G              000DXND6      osd.9     up/in   -
...
```

## How It Works

### PCI Path-Based Mapping

Drive Atlas uses `/dev/disk/by-path/` to create persistent mappings between physical drive bays and Linux device names. This is more reliable than using device letters (sda, sdb, etc.), which can change between boots.
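To see what by-path names a machine actually exposes, you can walk the directory and resolve each symlink. The snippet below is an illustrative sketch, not code from driveAtlas.sh; `parse_by_path` is a hypothetical helper name showing how such a name decomposes:

```bash
#!/usr/bin/env bash
# Sketch: enumerate persistent by-path names and resolve them to device nodes.

# Hypothetical helper (not part of driveAtlas.sh): split a by-path name into
# the controller's PCI address and the port portion used in mapping keys.
parse_by_path() {
    local name="$1"               # e.g. pci-0000:01:00.0-sas-phy6-lun-0
    local rest="${name#pci-}"     # drop the "pci-" prefix
    local addr="${rest%%-*}"      # PCI address up to the first "-"
    local port="${rest#*-}"       # remainder: ata-2, sas-phy6-lun-0, nvme-1 ...
    port="${port%-lun-*}"         # strip a trailing "-lun-N", if present
    echo "$addr $port"
}

# Show every persistent path alongside the (boot-order-dependent) device name.
for link in /dev/disk/by-path/pci-*; do
    [ -e "$link" ] || continue    # skip the unexpanded glob if the dir is empty
    printf '%-45s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
done
```

For example, `parse_by_path pci-0000:01:00.0-sas-phy6-lun-0` prints `0000:01:00.0 sas-phy6` — the HBA's bus address and the PHY that identify one bay.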
**Example PCI path:**

```
pci-0000:01:00.0-sas-phy6-lun-0 β†’ /dev/sde β†’ Bay 5
```

This tells us:

- `0000:01:00.0` - PCI bus address of the LSI SAS3008 HBA
- `sas-phy6` - SAS PHY 6 on that controller
- `lun-0` - Logical Unit Number
- Maps to physical bay 5 on compute-storage-01

### Configuration

Server mappings are defined in the `SERVER_MAPPINGS` associative array in [driveAtlas.sh](driveAtlas.sh):

```bash
declare -A SERVER_MAPPINGS=(
    ["compute-storage-01"]="
        pci-0000:0d:00.0-ata-2 1
        pci-0000:0d:00.0-ata-1 2
        pci-0000:01:00.0-sas-phy6-lun-0 5
        pci-0000:0e:00.0-nvme-1 m2-1
    "
)
```

## Setting Up a New Server

### Step 1: Run the Diagnostic Script

First, gather PCI path information:

```bash
bash diagnose-drives.sh > server-diagnostic.txt
```

This shows all available PCI paths and their associated drives.

### Step 2: Physical Bay Identification

For each populated drive bay:

1. Note the physical bay number (labeled on the chassis)
2. Run the main script to see serial numbers
3. Match visible serial numbers on drives to the output
4. Map PCI paths to bay numbers

**Pro tip:** The script shows serial numbers - compare them to the visible labels on drive trays to verify physical locations.

### Step 3: Create Mapping

Add a new entry to `SERVER_MAPPINGS` in [driveAtlas.sh](driveAtlas.sh):

```bash
["your-hostname"]="
    pci-0000:XX:XX.X-ata-1 1
    pci-0000:XX:XX.X-ata-2 2
    # ... etc
"
```

Also add the chassis type to `CHASSIS_TYPES`:

```bash
["your-hostname"]="10bay"
```

### Step 4: Test

Run the main script and verify the layout matches your physical configuration:

```bash
bash driveAtlas.sh
```

Use debug mode to see the mappings:

```bash
DEBUG=1 bash driveAtlas.sh
```

## Output Columns Explained

| Column | Description |
|--------|-------------|
| **BAY** | Physical bay number (1-10, m2-1, etc.) |
| **DEVICE** | Linux device name (/dev/sdX, /dev/nvmeXnY) |
| **SIZE** | Drive capacity |
| **TYPE** | SSD or HDD (detected via SMART) |
| **TEMP** | Current temperature from SMART |
| **HEALTH** | SMART health status (βœ“ = passed, βœ— = failed) |
| **MODEL** | Drive model number |
| **SERIAL** | Drive serial number (for physical verification) |
| **CEPH OSD** | Ceph OSD ID if the drive hosts an OSD |
| **STATUS** | Ceph OSD status (up/in, down/out, etc.) |
| **USAGE** | Mount point, or "BOOT" for the system drive |

## Troubleshooting

### Drive shows as EMPTY but is physically present

- Check if the drive is detected: `ls -la /dev/disk/by-path/`
- Verify the PCI path in the mapping matches the actual path
- Ensure the drive has power and the SATA/power connections are secure

### PCI paths don't match between servers with "identical" hardware

- Even identical motherboards can have different PCI addressing
- BIOS settings can affect PCI enumeration
- Installing the HBA in a different PCIe slot changes its address
- Cable routing to different SATA ports changes the ata-N or phy-N number

### SMART data not showing

- Ensure `smartmontools` is installed: `sudo apt install smartmontools`
- Some drives don't report temperature
- USB-connected drives may not support SMART
- Run `sudo smartctl -i /dev/sdX` manually to check

### Ceph OSD status shows "unknown/out"

- Ensure the `ceph` and `ceph-volume` commands are available
- Check that the Ceph cluster is healthy: `ceph -s`
- Verify the OSD is actually up: `ceph osd tree`

### Serial numbers don't match visible labels

- Some manufacturers use different serials for SMART vs. physical labels
- Cross-reference by drive model and size
- Use the removal method: power down, remove the drive, and check which bay becomes EMPTY

## Files

- [driveAtlas.sh](driveAtlas.sh) - Main script
- [diagnose-drives.sh](diagnose-drives.sh) - PCI path diagnostic tool
- [README.md](README.md) - This file
- [CLAUDE.md](CLAUDE.md) - AI-assisted development notes
- [todo.txt](todo.txt) - Development notes and task tracking

## Contributing

When adding support for a new server:

1. Run `diagnose-drives.sh` and save the output
2. Physically label or identify drives by serial number
3. Create a mapping in `SERVER_MAPPINGS`
4. Test thoroughly
5. Document any unique hardware configurations
6. Update this README

## Technical Notes

### Why PCI Paths?

Linux device names (sda, sdb, etc.) are assigned in discovery order, which can change:

- Between kernel versions
- After BIOS updates
- When drives are added or removed
- Due to timing variations at boot

PCI paths are deterministic and based on physical hardware topology.

### Bay Numbering Conventions

- **10-bay chassis:** Bays numbered 1-10 (left to right, typically)
- **M.2 slots:** Labeled as `m2-1`, `m2-2`, etc.
- **USB drives:** Labeled as `usb1`, `usb2`, etc.
- **Large1:** Grid numbering 1-15 (documented in the mapping)

### Ceph Integration

The script automatically detects Ceph OSDs using:

1. `ceph-volume lvm list` to map devices to OSD IDs
2. `ceph osd tree` to get up/down and in/out status

Status format: `up/in` means the OSD is running and participating in the cluster.
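The second lookup can be sketched as follows. This is a minimal illustration, not the script's actual code: it assumes the common plain-text `ceph osd tree` row layout (`ID CLASS WEIGHT NAME STATUS REWEIGHT PRI-AFF`), which can vary between Ceph releases, and `osd_status_from_tree_line` is a hypothetical name:

```bash
#!/usr/bin/env bash
# Sketch only -- not the actual driveAtlas.sh implementation.
# Assumes an OSD row from plain "ceph osd tree" looks like:
#   25   hdd  1.81940  osd.25  up  1.00000  1.00000
# The STATUS column gives up/down; a REWEIGHT of 0 marks an OSD that is "out".
osd_status_from_tree_line() {
    set -- $1                         # word-split the row into fields
    local name="$4" updown="$5" reweight="$6"
    local inout="in"
    [ "$reweight" = "0" ] && inout="out"
    echo "$name $updown/$inout"
}
```

Feeding it the sample row above prints `osd.25 up/in`; in a real run the rows would come from filtering `ceph osd tree` output down to the `osd.*` lines.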