Add --diagnose flag, remove obsolete helper scripts, fix docs

- Add --diagnose option that shows all PCI paths, storage controllers,
  block devices, and validates current mappings. Replaces the separate
  diagnose-drives.sh script.
- Remove diagnose-drives.sh (incorporated into --diagnose).
- Remove get-serials.sh (redundant with SMART data in main table).
- Remove test-paths.sh (referenced non-existent 0c:00.0 controller).
- Remove todo.md (massively outdated).
- Fix storage controller text overflowing box borders in large1 and
  micro layouts by adding truncation (%-69.69s, %-57.57s).
- Fix chassis name to CX4712 in README.
- Update server mapping statuses from "Requires mapping" to actual
  partially-mapped states.
- Add ⚠ health indicator to README output column docs.
- Update Claude.md metrics to match current state.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
This commit is contained in:
2026-02-06 18:50:37 -05:00
parent 555ecd54b2
commit c6ea28c5d6
6 changed files with 98 additions and 107 deletions

View File

@@ -41,14 +41,14 @@ bash <(wget -qO- http://10.10.10.63:3000/LotusGuild/driveAtlas/raw/branch/main/d
| Chassis Type | Description | Servers Using It |
|-------------|-------------|------------------|
| **10-Bay Hot-swap** | Sliger CX471225 4U 10x 3.5" NAS (with unused 2x 5.25" bays) | compute-storage-01, compute-storage-gpu-01, storage-01 |
| **10-Bay Hot-swap** | Sliger CX4712 4U 10x 3.5" NAS (with unused 2x 5.25" bays) | compute-storage-01, compute-storage-gpu-01, storage-01 |
| **Large1 Grid** | Unique 3x5 grid layout (1/1 configuration) | large1 |
| **Micro** | Compact 2-drive layout | micro1, monitor-02 |
### Server Details
#### compute-storage-01 (formerly medium2)
- **Chassis:** Sliger CX471225 4U (10-Bay Hot-swap)
- **Chassis:** Sliger CX4712 4U (10-Bay Hot-swap)
- **Motherboard:** B650D4U3-2Q/BCM
- **Controllers:**
- 01:00.0 - LSI SAS3008 HBA (bays 5-10 via 2x mini-SAS HD)
@@ -57,20 +57,20 @@ bash <(wget -qO- http://10.10.10.63:3000/LotusGuild/driveAtlas/raw/branch/main/d
- **Status:** ✅ Fully mapped and verified
#### storage-01
- **Chassis:** Sliger CX471225 4U (10-Bay Hot-swap)
- **Motherboard:** Different from compute-storage-01
- **Controllers:** Motherboard SATA only (no HBA currently)
- **Status:** ⚠️ Requires PCI path mapping
- **Chassis:** Sliger CX4712 4U (10-Bay Hot-swap)
- **Motherboard:** ASRock A320M-HDV R4.0
- **Controllers:** AMD SATA (bays 1-4), LSI SAS3416 HBA (bays 5+, U.2 NVMe)
- **Status:** ⚠️ Partially mapped (5 of 10 bays)
#### large1
- **Chassis:** Unique 3x5 grid (15 bays total)
- **Note:** 1/1 configuration, will not be replicated
- **Status:** ⚠️ Requires PCI path mapping
- **Status:** ⚠️ Partially mapped (14 bays + 2 M.2)
#### compute-storage-gpu-01
- **Chassis:** Sliger CX471225 4U (10-Bay Hot-swap)
- **Motherboard:** Same as compute-storage-01
- **Status:** ⚠️ Requires PCI path mapping
- **Chassis:** Sliger CX4712 4U (10-Bay Hot-swap)
- **Motherboard:** ASUS PRIME B550-PLUS
- **Status:** ⚠️ Partially mapped (5 SATA + 1 M.2)
## Output Example
@@ -130,15 +130,15 @@ declare -A SERVER_MAPPINGS=(
## Setting Up a New Server
### Step 1: Run Diagnostic Script
### Step 1: Run Diagnostic Mode
First, gather PCI path information:
```bash
bash diagnose-drives.sh > server-diagnostic.txt
bash driveAtlas.sh --diagnose
```
This will show all available PCI paths and their associated drives.
This will show all available PCI paths, storage controllers, and their associated drives.
### Step 2: Physical Bay Identification
@@ -192,7 +192,7 @@ DEBUG=1 bash driveAtlas.sh
| **SIZE** | Drive capacity |
| **TYPE** | SSD or HDD (detected via SMART) |
| **TEMP** | Current temperature from SMART |
| **HEALTH** | SMART health status (✓ = passed, ✗ = failed) |
| **HEALTH** | SMART health status (✓ = passed, ⚠ = passed with warnings, ✗ = failed) |
| **MODEL** | Drive model number |
| **SERIAL** | Drive serial number (for physical verification) |
| **CEPH OSD** | Ceph OSD ID if drive hosts an OSD |
@@ -235,17 +235,15 @@ DEBUG=1 bash driveAtlas.sh
## Files
- [driveAtlas.sh](driveAtlas.sh) - Main script
- [diagnose-drives.sh](diagnose-drives.sh) - PCI path diagnostic tool
- [driveAtlas.sh](driveAtlas.sh) - Main script (includes `--diagnose` mode for PCI path discovery)
- [README.md](README.md) - This file
- [CLAUDE.md](CLAUDE.md) - AI-assisted development notes
- [todo.txt](todo.txt) - Development notes and task tracking
## Contributing
When adding support for a new server:
1. Run `diagnose-drives.sh` and save output
1. Run `driveAtlas.sh --diagnose` and save output
2. Physically label or identify drives by serial number
3. Create mapping in `SERVER_MAPPINGS`
4. Test thoroughly