Jared Vititoe 1b35db6723 Fix PCI path mappings and line endings for compute-storage-01
Hardware discovered:
- LSI SAS3008 HBA at 01:00.0 (bays 5-10 via mini-SAS HD cables)
- AMD SATA controller at 0d:00.0 (bays 1-4)
- NVMe at 0e:00.0 (M.2 slot)

Changes:
- Updated SERVER_MAPPINGS with correct PCI paths based on actual hardware
- Fixed diagnose-drives.sh CRLF line endings (was causing script errors)
- Updated README with accurate controller information
- Mapped all 10 bays plus M.2 NVMe slot
- Added detailed cable mapping comments from user documentation

The old mapping referenced non-existent controller 0c:00.0. Now uses
actual SAS PHY paths and ATA port numbers that match physical bays.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-06 16:04:15 -05:00

# Drive Atlas
A server drive mapping tool that draws ASCII diagrams of each server's drive bay layout and reports the model, size, temperature, and SMART health of every drive. It maps physical drive bays to logical Linux device names using PCI bus paths, so a bay keeps the same identity across reboots.
## Features
- Visual ASCII art maps showing physical drive bay layouts
- Persistent drive identification using PCI paths (not device letters)
- SMART health status and temperature monitoring
- Support for SATA, NVMe, and USB drives
- Detailed drive information including model, size, and health status
- Per-server configuration for accurate physical-to-logical mapping
## Quick Start
Execute remotely using curl:
```bash
bash <(curl -s http://10.10.10.63:3000/LotusGuild/driveAtlas/raw/branch/main/driveAtlas.sh)
```
Or using wget:
```bash
bash <(wget -qO- http://10.10.10.63:3000/LotusGuild/driveAtlas/raw/branch/main/driveAtlas.sh)
```
## Requirements
- Linux environment with bash
- `sudo` privileges for SMART operations
- `smartctl` (from smartmontools package)
- `lsblk` and `lspci` (typically pre-installed)
- Optional: `nvme-cli` for NVMe drives
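
A quick pre-flight check for the tools listed above can be sketched as follows. `missing_tools` is a hypothetical helper written for this README, not a function from `driveAtlas.sh`; it relies only on the shell built-in `command -v`.

```bash
# Hypothetical helper: print the name of each required tool that is not on PATH.
# Arguments: tool names to check (defaults to the required tools listed above).
missing_tools() {
    [ "$#" -gt 0 ] || set -- smartctl lsblk lspci
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}
```

Running `missing_tools` with no arguments prints nothing when all three required tools are installed.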
## Server Configurations
### Chassis Types
| Chassis Type | Description | Servers Using It |
|-------------|-------------|------------------|
| **10-Bay Hot-swap** | Sliger CX471225 4U 10x 3.5" NAS (with unused 2x 5.25" bays) | compute-storage-01, compute-storage-gpu-01, storage-01 |
| **Large1 Grid** | Unique 3x5 grid layout (one-of-a-kind build) | large1 |
| **Micro** | Compact 2-drive layout | micro1, monitor-02 |
### Server Details
#### compute-storage-01 (formerly medium2)
- **Chassis:** Sliger CX471225 4U (10-Bay Hot-swap)
- **Motherboard:** B650D4U3-2Q/BCM
- **Controllers:**
  - 01:00.0 - LSI SAS3008 HBA (bays 5-10 via 2x mini-SAS HD)
  - 0d:00.0 - AMD SATA controller (bays 1-4)
  - 0e:00.0 - M.2 NVMe slot
- **Status:** Fully mapped
#### storage-01
- **Chassis:** Sliger CX471225 4U (10-Bay Hot-swap)
- **Motherboard:** Different from compute-storage-01
- **Controllers:** Motherboard SATA only (no HBA currently)
- **Status:** Requires PCI path mapping
#### large1
- **Chassis:** Unique 3x5 grid (15 bays total)
- **Note:** One-of-a-kind build; will not be replicated
- **Status:** Requires PCI path mapping
#### compute-storage-gpu-01
- **Chassis:** Sliger CX471225 4U (10-Bay Hot-swap)
- **Motherboard:** Same as compute-storage-01
- **Status:** Requires PCI path mapping
## How It Works
### PCI Path-Based Mapping
Drive Atlas uses `/dev/disk/by-path/` to create persistent mappings between physical drive bays and Linux device names. This is more reliable than device letters (sda, sdb, etc.), which can change between boots.
**Example PCI path:**
```
pci-0000:0d:00.0-ata-1 → /dev/sda
```
This tells us:
- `0000:0d:00.0` - PCI bus address of the storage controller (the AMD SATA controller on compute-storage-01)
- `ata-1` - Port 1 on that controller
- Maps to physical bay 3 on compute-storage-01
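
The breakdown above can be reproduced with plain shell parameter expansion. This is a minimal sketch: `split_by_path` is a hypothetical helper, and it assumes the name starts with `pci-` and carries no `-partN` partition suffix. The sample name in the usage note uses the SATA controller address from the Server Details section.

```bash
# Hypothetical helper: split a by-path name into controller address and port.
# $1 = by-path name (the file name only, not the full path).
split_by_path() {
    name=${1#pci-}      # drop the leading "pci-"
    addr=${name%%-*}    # text before the first "-", the PCI bus address
    port=${name#*-}     # text after the first "-", the port on that controller
    printf '%s %s\n' "$addr" "$port"
}
```

For example, `split_by_path pci-0000:0d:00.0-ata-1` prints `0000:0d:00.0 ata-1`.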
### Configuration
Server mappings are defined in the `SERVER_MAPPINGS` associative array in [driveAtlas.sh](driveAtlas.sh):
```bash
declare -A SERVER_MAPPINGS=(
["compute-storage-01"]="
pci-0000:0d:00.0-ata-1 3
pci-0000:0d:00.0-ata-2 4
pci-0000:0e:00.0-nvme-1 m2-1
"
)
```
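
To illustrate how a mapping in this format can be read, the sketch below looks up the bay label for one by-path name. `lookup_bay` is a hypothetical helper and may differ from how `driveAtlas.sh` parses the array; it assumes one whitespace-separated `path bay` pair per line.

```bash
# Hypothetical helper: print the bay label for one by-path name.
# $1 = mapping text (one "path bay" pair per line), $2 = by-path name.
# Exits non-zero when the name is not in the mapping.
lookup_bay() {
    printf '%s\n' "$1" | awk -v want="$2" '
        $1 == want { print $2; found = 1; exit }
        END { exit !found }'
}
```

Inside a script that declares `SERVER_MAPPINGS`, a call could look like `lookup_bay "${SERVER_MAPPINGS[$(hostname)]}" "$name"`, where `$name` holds a by-path file name.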
## Setting Up a New Server
### Step 1: Run Diagnostic Script
First, gather PCI path information:
```bash
bash diagnose-drives.sh > server-diagnostic.txt
```
This writes all available PCI paths and their associated drives to `server-diagnostic.txt`.
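
For orientation, the core of such a listing can be sketched in a few lines. This is an illustration only, not the contents of `diagnose-drives.sh`; `list_by_path` is a hypothetical helper, and it assumes GNU `readlink -f` is available.

```bash
# Hypothetical helper: list each by-path symlink with the device node it resolves to.
# $1 = directory to scan (defaults to /dev/disk/by-path).
list_by_path() {
    dir=${1:-/dev/disk/by-path}
    for link in "$dir"/*; do
        [ -L "$link" ] || continue    # skip anything that is not a symlink
        printf '%s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
    done
}
```

On a real system the listing also includes partition entries (names ending in `-partN`), which can be ignored when mapping bays.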
### Step 2: Physical Bay Identification
For each populated drive bay:
1. Note the physical bay number (labeled on chassis)
2. Identify a unique characteristic (size, model, or serial number)
3. Match it to the PCI path from the diagnostic output
**Pro tip:** If uncertain, remove one drive at a time and re-run the diagnostic to see which PCI path disappears.
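
The remove-and-compare approach can be scripted by saving a listing before and after pulling the drive. A minimal sketch, assuming both listings hold one by-path name per line; `vanished_paths` is a hypothetical helper.

```bash
# Hypothetical helper: print names present in the first listing but absent from the second.
# $1 = listing saved before pulling the drive, $2 = listing saved afterwards.
vanished_paths() {
    before_sorted=$(mktemp) && after_sorted=$(mktemp) || return 1
    sort "$1" > "$before_sorted"
    sort "$2" > "$after_sorted"
    comm -23 "$before_sorted" "$after_sorted"   # keep lines unique to the "before" listing
    rm -f "$before_sorted" "$after_sorted"
}
```

Typical use: run `ls /dev/disk/by-path > before.txt`, pull the drive, run `ls /dev/disk/by-path > after.txt`, then `vanished_paths before.txt after.txt`.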
### Step 3: Create Mapping
Add a new entry to `SERVER_MAPPINGS` in [driveAtlas.sh](driveAtlas.sh):
```bash
["your-hostname"]="
pci-0000:XX:XX.X-ata-1 1
pci-0000:XX:XX.X-ata-2 2
# ... etc
"
```
Also add the chassis type to `CHASSIS_TYPES`:
```bash
["your-hostname"]="10bay"
```
### Step 4: Test
Run the main script and verify the layout matches your physical configuration:
```bash
bash driveAtlas.sh
```
Use debug mode to see the mappings:
```bash
DEBUG=1 bash driveAtlas.sh
```
## Output Example
```
┌──────────────────────────────────────────────────────────────┐
│ compute-storage-01 │
│ 10-Bay Hot-swap Chassis │
│ │
│ M.2 NVMe Slot │
│ ┌──────────┐ │
│ │ nvme0n1 │ │
│ └──────────┘ │
│ │
│ Front Hot-swap Bays │
│ ┌──────────┐┌──────────┐┌──────────┐┌──────────┐... │
│ │1: EMPTY ││2: EMPTY ││3: sda ││4: sdb │... │
│ └──────────┘└──────────┘└──────────┘└──────────┘... │
└──────────────────────────────────────────────────────────────┘
=== Drive Details with SMART Status ===
DEVICE SIZE TYPE TEMP HEALTH MODEL
--------------------------------------------------------------------------------
/dev/sda 2TB HDD 35°C ✓ WD20EFRX-68EUZN0
/dev/nvme0n1 1TB SSD 42°C ✓ Samsung 980 PRO
```
## Troubleshooting
### Drive shows as EMPTY but is physically present
- Check if the drive is detected: `ls -la /dev/disk/by-path/`
- Verify the PCI path in the mapping matches the actual path
- Ensure the drive's SATA data and power connections are secure
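
The second check can be automated by testing every mapped path against the by-path directory. A sketch under the same mapping format as `SERVER_MAPPINGS` (one `path bay` pair per line); `unmatched_paths` is a hypothetical helper, not part of `driveAtlas.sh`.

```bash
# Hypothetical helper: print mapped paths that have no symlink in the by-path directory.
# $1 = mapping text (one "path bay" pair per line)
# $2 = directory to check (defaults to /dev/disk/by-path)
unmatched_paths() {
    dir=${2:-/dev/disk/by-path}
    printf '%s\n' "$1" | while read -r path bay; do
        [ -n "$path" ] || continue    # skip blank lines
        [ -L "$dir/$path" ] || printf '%s (bay %s)\n' "$path" "$bay"
    done
}
```

A path reported here either belongs to an empty bay or does not match what the kernel actually exposes for that bay.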
### PCI paths don't match between servers with "identical" hardware
- Even identical motherboards can have different PCI addressing
- BIOS settings can affect PCI enumeration
- HBA installation in different PCIe slots changes addresses
- Cable routing to different SATA ports changes the ata-N number
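
To compare controller addresses between two servers, it helps to reduce `lspci` output to storage controllers only. A small sketch: `storage_controllers` is a hypothetical filter, and the class names it matches are the ones `lspci` usually prints for SATA, SAS, RAID, and NVMe devices.

```bash
# Hypothetical filter: keep only storage controller lines from lspci output on stdin.
storage_controllers() {
    grep -Ei 'SATA controller|Serial Attached SCSI controller|RAID bus controller|Non-Volatile memory controller'
}
```

Run `lspci | storage_controllers` on each server and compare the addresses in the first column.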
### SMART data not showing
- Ensure `smartmontools` is installed: `sudo apt install smartmontools`
- Some drives don't report temperature
- USB-connected drives may not support SMART
- Run `sudo smartctl -i /dev/sdX` manually to check
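
When checking by hand, the overall verdict can be pulled out of `smartctl -H` output with a short filter. This sketch assumes the two result lines smartctl commonly prints (`SMART overall-health self-assessment test result:` and, for SCSI/SAS drives, `SMART Health Status:`); `smart_health` is a hypothetical helper.

```bash
# Hypothetical filter: print the health verdict from smartctl -H output on stdin.
# Prints UNKNOWN when neither known result line is present.
smart_health() {
    awk -F': *' '
        /SMART overall-health self-assessment test result/ { print $2; found = 1; exit }
        /SMART Health Status/                               { print $2; found = 1; exit }
        END { if (!found) print "UNKNOWN" }
    '
}
```

Usage: `sudo smartctl -H /dev/sdX | smart_health`.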
## Files
- [driveAtlas.sh](driveAtlas.sh) - Main script
- [diagnose-drives.sh](diagnose-drives.sh) - PCI path diagnostic tool
- [README.md](README.md) - This file
- [todo.txt](todo.txt) - Development notes
## Contributing
When adding support for a new server:
1. Run `diagnose-drives.sh` and save output
2. Physically label or identify drives
3. Create mapping in `SERVER_MAPPINGS`
4. Test thoroughly
5. Document any unique hardware configurations
6. Update this README
## Technical Notes
### Why PCI Paths?
Linux device names (sda, sdb, etc.) are assigned in discovery order, which can change:
- Between kernel versions
- After BIOS updates
- When drives are added/removed
- Due to timing variations at boot
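PCI paths are derived from the physical hardware topology (controller address and port), so they stay the same as long as the controllers, slots, and cabling do not change.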
### Bay Numbering Conventions
- **10-bay chassis:** Bays numbered 1-10 (left to right, top to bottom)
- **M.2 slots:** Labeled as `m2-1`, `m2-2`, etc.
- **USB drives:** Labeled as `usb1`, `usb2`, etc.
- **Large1:** Grid numbering 1-9 (3x3 displayed, additional bays documented in mapping)
## License
Internal tool for LotusGuild infrastructure.