Added comprehensive command-line interface with:
- -h, --help: Show usage information
- -v, --version: Show version
- -d, --debug: Enable debug output
- -s, --skip-smart: Skip SMART data collection (faster)
- --no-ceph: Skip Ceph OSD information
- --show-pci: Display PCI paths for debugging
The script now properly respects these flags throughout execution.
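
A minimal sketch of how flags like these are commonly parsed in shell.
The function name parse_args and the variables ACTION, DEBUG,
SKIP_SMART, SKIP_CEPH and SHOW_PCI are assumptions for illustration,
not necessarily the names used in the script:

```shell
# Hypothetical flag parser; all variable names are illustrative only.
parse_args() {
    ACTION=run
    DEBUG=0
    SKIP_SMART=0
    SKIP_CEPH=0
    SHOW_PCI=0
    while [ "$#" -gt 0 ]; do
        case "$1" in
            -h|--help)       ACTION=help ;;
            -v|--version)    ACTION=version ;;
            -d|--debug)      DEBUG=1 ;;
            -s|--skip-smart) SKIP_SMART=1 ;;
            --no-ceph)       SKIP_CEPH=1 ;;
            --show-pci)      SHOW_PCI=1 ;;
            *)
                echo "Unknown option: $1" >&2
                return 1
                ;;
        esac
        shift
    done
}
```

Later code can then test the variables, for example skipping SMART
collection when SKIP_SMART is 1.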
Fixes: #10
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace fragile per-device ceph-volume parsing (grep -B 20) with a
single upfront query that builds lookup tables.
New build_ceph_cache function:
- Parses ceph-volume lvm list output using proper block detection
- Extracts OSD IDs by matching "====== osd.X =======" headers
- Maps block devices to their corresponding OSDs
- Queries ceph osd tree once for all status info
- Creates CEPH_DEVICE_TO_OSD, CEPH_OSD_STATUS, CEPH_OSD_IN arrays
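This is more reliable (no dependence on a fixed 20-line grep context)
and more efficient (one ceph-volume query and one ceph osd tree query
instead of one per device).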
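
The block detection can be sketched roughly as below. This is an
illustration, not the actual build_ceph_cache: parse_ceph_volume is a
hypothetical helper, and it assumes each OSD block carries a "devices"
line listing the backing device(s), comma-separated. It reads
ceph-volume lvm list output on stdin and prints "device osd_id" pairs:

```shell
# Hypothetical helper: remember the current OSD ID from each
# "====== osd.X =======" header, then emit one "device osd_id" pair
# per entry on the block's "devices" line (assumed field name).
parse_ceph_volume() {
    awk '
        /^=+ osd\.[0-9]+ =+$/ {
            osd = $2
            sub(/^osd\./, "", osd)
            next
        }
        $1 == "devices" && osd != "" {
            n = split($2, devs, ",")
            for (i = 1; i <= n; i++) print devs[i], osd
        }
    '
}

# Usage (requires ceph-volume, typically as root):
#   ceph-volume lvm list | parse_ceph_volume
```

The resulting pairs could then be loaded into a lookup table such as
CEPH_DEVICE_TO_OSD in a single pass.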
Fixes: #9
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use lsblk instead of mount command to detect mount points. This
properly detects mounts on partitions (e.g., /dev/sda1) rather
than only whole-device mounts.
- Shows multiple mount points (up to 3) comma-separated
- Correctly identifies BOOT drives with root partition
- Handles NVMe partition naming (nvme0n1p1, etc.)
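
A rough sketch of the idea, assuming mount points are read with
"lsblk -n -r -o MOUNTPOINT <device>", which prints one line per device
and partition. The helper name format_mounts and the "-" placeholder
for unmounted drives are assumptions:

```shell
# Hypothetical helper: reads mount points (one per line, possibly
# empty) and prints "BOOT" if "/" is among them, otherwise up to
# three mount points joined by commas, or "-" if there are none.
format_mounts() {
    awk '
        NF == 0   { next }
        $0 == "/" { boot = 1 }
        count < 3 { out = out (count ? "," : "") $0; count++ }
        END {
            if (boot)       print "BOOT"
            else if (count) print out
            else            print "-"
        }
    '
}

# Usage:
#   lsblk -n -r -o MOUNTPOINT /dev/sda | format_mounts
```

Because lsblk lists partitions under the parent device, the same call
covers both sda1-style and nvme0n1p1-style partition names.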
Fixes: #8
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The device type detection was updated in commit 90055be to:
- Check for NVMe devices by name prefix first
- Handle "Solid State" and "0 rpm" in Rotation Rate field
- Fall back to checking for SSD/Solid State in SMART output
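
The order of checks might look like the sketch below. The function
name detect_device_type, its arguments, and the final "HDD" default
are assumptions. The rotation rate is compared exactly after trimming,
because a substring match on "0 rpm" would also match "7200 rpm":

```shell
# Hypothetical sketch. $1 = device name (e.g. sda, nvme0n1),
# $2 = text of the SMART info output for that device.
detect_device_type() {
    dev_name=$1
    smart_output=$2

    # 1. NVMe devices are recognised by name prefix.
    case "$dev_name" in
        nvme*) echo "NVMe"; return ;;
    esac

    # 2. Rotation Rate field, trimmed and lower-cased.
    rotation=$(printf '%s\n' "$smart_output" |
        grep -i '^Rotation Rate:' | cut -d: -f2 | xargs | tr 'A-Z' 'a-z')
    case "$rotation" in
        "solid state"*|"0 rpm") echo "SSD"; return ;;
        *rpm)                   echo "HDD"; return ;;
    esac

    # 3. Fallback: look for SSD / Solid State anywhere in the output.
    if printf '%s\n' "$smart_output" | grep -qiE 'SSD|Solid State'; then
        echo "SSD"
    else
        echo "HDD"   # assumed default when nothing else matches
    fi
}
```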
Fixes: #7
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The serial number parsing was updated in commit 90055be to use
'cut -d: -f2 | xargs' which captures the full serial including
spaces, instead of 'awk {print $3}' which only got the first word.
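
To illustrate the difference, here is a small sketch with both
pipelines wrapped in hypothetical helper functions. The sample serial
in the usage comment is made up:

```shell
# Old behaviour: third whitespace-separated field only.
parse_serial_first_word() {
    grep -i '^Serial Number:' | awk '{print $3}'
}

# New behaviour: everything after the first colon, with xargs
# trimming the surrounding whitespace.
parse_serial() {
    grep -i '^Serial Number:' | head -n 1 | cut -d: -f2 | xargs
}

# Usage with a made-up serial:
#   printf 'Serial Number:    WD-WX 12A3 4567\n' | parse_serial
```

Note that cut -d: -f2 keeps only the text up to the next colon, and
xargs interprets quotes and backslashes, so a serial containing those
characters would need different handling.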
Fixes: #6
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
NVMe drives mapped to m2-1, m2-2 slots now appear in the main drive
table with their bay position, instead of in a separate unmapped
section.
- Extended bay loop to include m2-* slots after numeric bays
- NVMe section now only shows truly unmapped NVMe drives
- Mapped NVMe drives show full SMART data like other drives
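
The extended iteration order can be sketched as follows. The helper
name list_bay_slots and its two count arguments are assumptions; the
m2-N slot naming follows the message above:

```shell
# Hypothetical helper: print numeric bays first, then m2-* slots.
# $1 = number of numeric bays, $2 = number of M.2 slots.
list_bay_slots() {
    num_bays=$1
    num_m2=$2
    i=1
    while [ "$i" -le "$num_bays" ]; do
        echo "$i"
        i=$((i + 1))
    done
    i=1
    while [ "$i" -le "$num_m2" ]; do
        echo "m2-$i"
        i=$((i + 1))
    done
}

# Usage: one table row per slot, in physical order.
#   for slot in $(list_bay_slots 10 2); do ...; done
```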
Fixes: #5
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Use an awk BEGIN block to compare Ceph OSD reweight values instead
of bc. Awk is more universally available, and the previous fallback
to "echo 0" could incorrectly evaluate to true.
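
A minimal sketch of the technique. The function name reweight_is_in is
an assumption. The comparison runs entirely in the BEGIN block, and
the result is passed back through awk's exit status:

```shell
# Hypothetical helper: succeeds (exit 0) when the reweight value in
# $1 is greater than zero. An empty or non-numeric value counts as 0.
reweight_is_in() {
    awk -v rw="$1" 'BEGIN { exit !(rw + 0 > 0) }'
}

# Usage:
#   if reweight_is_in "$reweight"; then state=in; else state=out; fi
```

Using the exit status means there is no printed value to fall back on
or misread, and floating-point input such as 1.00000 needs no extra
tooling.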
Fixes: #4
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Script now verifies required (lsblk, lspci, etc.) and optional
(smartctl, ceph, bc, nvme) dependencies at startup.
- Exits with clear error if required dependencies are missing
- Warns when optional dependencies are missing and continues with
  reduced functionality
- Directs users to freshStartScript for easy installation
- Checks for sudo access needed for SMART operations
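
One way to structure such a check, shown as a sketch. The helper name
check_dependencies is an assumption; the command lists in the usage
comment are the ones named above:

```shell
# Hypothetical helper: prints the names of any commands from "$@"
# that are not found on PATH, separated by spaces.
check_dependencies() {
    missing=""
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || missing="$missing $cmd"
    done
    printf '%s\n' "${missing# }"
}

# Usage:
#   missing=$(check_dependencies lsblk lspci)
#   [ -z "$missing" ] || { echo "Missing required: $missing" >&2; exit 1; }
#   missing=$(check_dependencies smartctl ceph bc nvme)
#   [ -z "$missing" ] || echo "Warning: missing optional: $missing" >&2
```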
Fixes: #3
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Temperature parsing now correctly handles:
- SATA Temperature_Celsius attribute (extracts last numeric value)
- Simple "Temperature: XX Celsius" format
- "Current Temperature: XX Celsius" format
- NVMe temperature reporting
Also improved device type detection for NVMe and SSD (including
0 RPM), and fixed serial number parsing to capture the full serial,
including spaces.
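
A sketch of how the formats above could be handled in one pass. The
helper name parse_temperature and the "-" placeholder are assumptions.
For the Temperature_Celsius attribute this sketch reads the RAW_VALUE
column (field 10) rather than the last number on the line, so that a
trailing "(Min/Max 20/53)" annotation is not picked up by mistake:

```shell
# Hypothetical helper: reads SMART output on stdin and prints the
# first temperature it finds, or "-" if none is present.
parse_temperature() {
    awk '
        /Temperature_Celsius/ { print $10; found = 1; exit }
        /^(Current )?Temperature:/ {
            for (i = 1; i <= NF; i++)
                if ($i ~ /^[0-9]+$/) { print $i; found = 1; exit }
        }
        END { if (!found) print "-" }
    '
}

# Usage (requires smartctl, typically via sudo):
#   smartctl -A /dev/sda | parse_temperature
```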
Fixes: #2
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Declare DRIVE_MAP as global at function start and populate directly,
instead of creating a local array and copying to global. Also added
proper variable quoting and function documentation.
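
The pattern looks roughly like this. The function name build_drive_map
and the "bay path" input format are assumptions, and "declare -g"
needs bash 4.2 or later:

```shell
#!/usr/bin/env bash
# Hypothetical sketch: declare the global associative array inside
# the function and fill it directly, with no local copy.
# Reads "bay pci_path" pairs from stdin.
build_drive_map() {
    declare -gA DRIVE_MAP=()
    local bay path
    while read -r bay path; do
        if [ -n "$bay" ]; then
            DRIVE_MAP["$bay"]="$path"
        fi
    done
}

# Usage (redirect, not a pipe, so the function runs in this shell):
#   build_drive_map < mappings.txt
#   echo "${DRIVE_MAP[1]}"
```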
Fixes: #1
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Added get_storage_controllers() function that detects SAS, SATA, RAID,
and NVMe controllers via lspci. Updated all layout functions (10bay,
large1, micro) to display detected storage controllers with their
PCI address and model info.
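
As an illustration only (not the actual get_storage_controllers), the
lspci output can be filtered on the device class names that lspci
prints. The helper name filter_storage_controllers is an assumption:

```shell
# Hypothetical helper: keep only lspci lines whose device class is a
# SATA, SAS, RAID or NVMe controller. Each kept line starts with the
# PCI address followed by the class and model text.
filter_storage_controllers() {
    grep -iE 'SATA controller|Serial Attached SCSI controller|RAID bus controller|Non-Volatile memory controller'
}

# Usage:
#   lspci | filter_storage_controllers
```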
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Reflects actual physical layout: 3 stacks x 5 rows (15 bays)
- Added note that physical bay mapping is TBD
- Shows Stack A, B, C columns
- Keeps 2x M.2 NVMe slots
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add PCI path mappings for large1 (12 SATA/SAS + 2 NVMe drives)
- Update large1 layout to show actual drive assignments
- Controllers: LSI SAS2008 (7), AMD SATA (3), ASMedia SATA (2), NVMe (2)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add PCI path mappings for storage-01 (4 SATA drives on AMD controller)
- Fix NVMe serial: use smartctl instead of nvme list for accurate serial numbers
- NVMe now shows actual serial number instead of /dev/ng device path
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Updated README.md:
- Added feature list with emojis for visual clarity
- Documented all output columns with descriptions
- Added Ceph integration details
- Included troubleshooting for common issues
- Updated example output with current format
- Added status indicators (✅⚠️) for server mapping status
Created CLAUDE.md:
- Documented AI-assisted development process
- Chronicled evolution from basic script to comprehensive tool
- Detailed technical challenges and solutions
- Listed all phases of development
- Provided metrics and future enhancement ideas
- Lessons learned for future AI collaboration
This documents the complete journey from broken PCI paths to a
production-ready storage infrastructure management tool.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fixed parsing of ceph osd tree output:
- STATUS (up/down) is column 5, not column 6
- REWEIGHT is column 6 (1.0 = in, 0 = out)
- Now correctly shows up/in for active OSDs
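
A sketch of the corrected column handling. The helper name
parse_osd_tree is an assumption, and the sketch assumes OSD rows have
the form "ID CLASS WEIGHT osd.N STATUS REWEIGHT PRI-AFF", so the name
is field 4:

```shell
# Hypothetical helper: reads `ceph osd tree` output on stdin and
# prints "osd.N status/in-or-out" for each OSD row.
parse_osd_tree() {
    awk '$4 ~ /^osd\.[0-9]+$/ {
        state = ($6 + 0 > 0) ? "in" : "out"
        print $4, $5 "/" state
    }'
}

# Usage (requires a reachable Ceph cluster):
#   ceph osd tree | parse_osd_tree
```

Rows for the root and host buckets are skipped because their fourth
field is a bucket name rather than an osd.N name.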
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
New features:
- Added STATUS column showing Ceph OSD up/down and in/out status
  Format: "up/in", "up/out", "down/in", etc.
- Added USAGE column to identify boot drives and mount points
  Shows "BOOT" for root filesystem, mount point for others, "-" for OSDs
- Improved table layout with all relevant drive information
Now you can see at a glance:
- Which drives are boot drives
- Which OSDs are up and in the cluster
- Any problematic OSDs that are down or out
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Major enhancements:
- Drive details now sorted by physical bay position (1-10) instead of alphabetically
- Added BAY column to show physical location
- Added CEPH OSD column to show which OSD each drive hosts
- Fixed ASCII art right border alignment (final fix)
- Drives now display in logical order matching physical layout
This makes it much easier to correlate physical drives with their Ceph OSDs
and understand the layout at a glance.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Changes:
- Added SERIAL column to SATA/SAS drive details table
- Added SERIAL column to NVMe drive details table
- Updated get_drive_smart_info() to extract and return serial numbers
- Widened output format to accommodate serial numbers
- NVMe serials now display correctly from nvme list output
This makes it much easier to match drives to their physical locations
by comparing visible serial numbers on drive labels with the output.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Fixed issues:
- ASCII art boxes now render correctly with fixed-width layout
- Corrected bay 1 mapping: ata-2 -> bay 1 (sdh SSD confirmed)
- Adjusted mobo SATA port mappings based on physical verification
- Simplified layout to use consistent 10-character wide bay boxes
Bay 1 is confirmed to contain sdh (Crucial SSD boot drive) which maps
to pci-0000:0d:00.0-ata-2, so the mapping has been corrected.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Hardware discovered:
- LSI SAS3008 HBA at 01:00.0 (bays 5-10 via mini-SAS HD cables)
- AMD SATA controller at 0d:00.0 (bays 1-4)
- NVMe at 0e:00.0 (M.2 slot)
Changes:
- Updated SERVER_MAPPINGS with correct PCI paths based on actual hardware
- Fixed diagnose-drives.sh CRLF line endings (was causing script errors)
- Updated README with accurate controller information
- Mapped all 10 bays plus M.2 NVMe slot
- Added detailed cable mapping comments from user documentation
The old mapping referenced non-existent controller 0c:00.0. Now uses
actual SAS PHY paths and ATA port numbers that match physical bays.
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Major improvements:
- Separated chassis types from server hostnames for better reusability
- Implemented template-based layout system (10bay, large1)
- Renamed medium2 to compute-storage-01 for clarity
- All three Sliger CX471225 servers now use unified 10bay layout
- Added comprehensive PCI path-based drive mapping system
- Created diagnose-drives.sh helper script for mapping new servers
- Added DEBUG mode for troubleshooting drive mappings
Technical changes:
- Replaced DRIVE_MAPPINGS with separate SERVER_MAPPINGS and CHASSIS_TYPES
- Removed spare-10bay layout (all Sliger chassis use same template)
- Improved drive detection and SMART data collection
- Better error handling for missing drives and unmapped servers
- Cleaner code structure with sectioned comments
Documentation:
- Complete rewrite of README with setup guide and troubleshooting
- Added detailed todo.txt with action plan and technical notes
- Documented Sliger CX471225 4U chassis specifications
- Included step-by-step instructions for mapping new servers
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Major improvements:
- Separated chassis types from server hostnames for better reusability
- Implemented template-based layout system (10bay, large1, spare-10bay)
- Renamed medium2 to compute-storage-01 for clarity
- Added comprehensive PCI path-based drive mapping system
- Created diagnose-drives.sh helper script for mapping new servers
- Added DEBUG mode for troubleshooting drive mappings
- Documented Sliger CX471225 4U chassis model
Technical changes:
- Replaced DRIVE_MAPPINGS with separate SERVER_MAPPINGS and CHASSIS_TYPES
- Improved drive detection and SMART data collection
- Better error handling for missing drives and unmapped servers
- Cleaner code structure with sectioned comments
Documentation:
- Complete rewrite of README with setup guide and troubleshooting
- Added detailed todo.txt with action plan and technical notes
- Included step-by-step instructions for mapping new servers
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>