Compare commits
18 Commits
38c3dc910e...main

| SHA1 |
|---|
| f5638cad84 |
| 07f7a1d0af |
| 01f8d3e692 |
| f159b10de1 |
| 766d92251e |
| 93aeb84c65 |
| d5dbdd7869 |
| 982d3f5c05 |
| 7e1a88ad41 |
| 40ab528f40 |
| 418d4d4170 |
| 1800b59a25 |
| 5430a9242f |
| fd587eca64 |
| 03cb9e3ea8 |
| d5c784033e |
| be541cba97 |
| 1b35db6723 |
209
Claude.md
Normal file
@@ -0,0 +1,209 @@
# AI-Assisted Development Notes

This document chronicles the development of Drive Atlas with assistance from Claude (Anthropic's AI assistant).

## Project Overview

Drive Atlas started as a simple bash script with hardcoded drive mappings and evolved into a comprehensive storage infrastructure management tool through iterative development and user feedback.

## Development Session

**Date:** January 6, 2026
**AI Model:** Claude Sonnet 4.5
**Developer:** LotusGuild
**Session Duration:** ~2 hours

## Initial State

The project began with:
- Basic ASCII art layouts for different server chassis
- Hardcoded drive mappings for "medium2" server
- Simple SMART data display
- Broken PCI path mappings (referenced non-existent hardware)
- Windows line endings causing script execution failures

## Evolution Through Collaboration

### Phase 1: Architecture Refactoring
**Problem:** Chassis layouts were tied to hostnames, making it hard to reuse templates.

**Solution:**
- Separated chassis types from server hostnames
- Created reusable layout generator functions
- Introduced `CHASSIS_TYPES` and `SERVER_MAPPINGS` arrays
- Renamed "medium2" → "compute-storage-01" for clarity

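
The separation described in Phase 1 can be sketched roughly as follows. This is a minimal illustration, not the script's actual code: the contents of `CHASSIS_TYPES`, the helper `render_layout`, and the one-line stand-in bodies of the layout functions are assumptions made for the example.

```bash
#!/usr/bin/env bash
# Sketch: chassis type and hostname are looked up separately, so one layout
# function can serve several servers. All values here are illustrative.

# hostname -> chassis type (assumed contents)
declare -A CHASSIS_TYPES=(
    ["compute-storage-01"]="10bay"
    ["compute-storage-gpu-01"]="10bay"
    ["large1"]="large1"
)

# Stand-ins for the real layout generators, which draw the ASCII chassis.
generate_10bay_layout()  { echo "10-bay layout for $1"; }
generate_large1_layout() { echo "large1 layout for $1"; }

# Pick the layout function from the chassis type, not from the hostname.
render_layout() {
    local host=$1
    local chassis=${CHASSIS_TYPES[$host]:-}
    case "$chassis" in
        10bay)  generate_10bay_layout "$host" ;;
        large1) generate_large1_layout "$host" ;;
        *)      echo "No chassis type configured for $host" >&2; return 1 ;;
    esac
}

render_layout "compute-storage-01"
render_layout "compute-storage-gpu-01"
```

With this shape, adding a second server that shares a chassis only needs a new entry in the lookup table, not a new layout function.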
### Phase 2: Hardware Discovery
**Problem:** The script referenced PCI controller `0c:00.0`, which did not exist on this server.

**Approach:**
1. Created diagnostic script to probe actual hardware
2. Discovered real configuration:
   - LSI SAS3008 HBA at `01:00.0` (bays 5-10)
   - AMD SATA controller at `0d:00.0` (bays 1-4)
   - NVMe at `0e:00.0` (M.2 slot)
3. User provided physical bay labels and visible serial numbers
4. Iteratively refined the PHY-to-bay mappings

**Key Insight:** User confirmed bay 1 contained the SSD boot drive, which helped establish the correct mapping starting point.

### Phase 3: Physical Verification
**Problem:** Needed to verify drive-to-bay mappings without powering down production server.

**Solution:**
1. Added serial number display to script output
2. User physically inspected visible serial numbers on drive bays
3. Cross-referenced SMART serials with visible labels
4. Corrected HBA PHY mappings:
   - Bay 5: phy6 (not phy2)
   - Bay 6: phy7 (not phy3)
   - Bay 7: phy5 (not phy4)
   - Bay 8: phy2 (not phy5)
   - Bay 9: phy4 (not phy6)
   - Bay 10: phy3 (not phy7)

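
To make the corrected PHY mapping concrete, here is a small lookup sketch. The mapping lines are the HBA entries for compute-storage-01 as they appear in `SERVER_MAPPINGS`; the variable `HBA_MAPPING` and the helper `bay_for_path` are hypothetical names used only for this example.

```bash
#!/usr/bin/env bash
# Sketch: look up the bay for a /dev/disk/by-path name.
# The mapping lines are the corrected HBA entries for compute-storage-01.

HBA_MAPPING="
pci-0000:01:00.0-sas-phy6-lun-0 5
pci-0000:01:00.0-sas-phy7-lun-0 6
pci-0000:01:00.0-sas-phy5-lun-0 7
pci-0000:01:00.0-sas-phy2-lun-0 8
pci-0000:01:00.0-sas-phy4-lun-0 9
pci-0000:01:00.0-sas-phy3-lun-0 10
"

# bay_for_path is a hypothetical helper: prints the bay for an exact path
# match and returns non-zero when the path is not in the mapping.
bay_for_path() {
    local wanted=$1 path bay
    while read -r path bay; do
        [[ -z "$path" ]] && continue      # skip blank lines in the mapping
        if [[ "$path" == "$wanted" ]]; then
            echo "$bay"
            return 0
        fi
    done <<< "$HBA_MAPPING"
    return 1
}

bay_for_path "pci-0000:01:00.0-sas-phy6-lun-0"
bay_for_path "pci-0000:01:00.0-sas-phy3-lun-0"
```

In practice the argument would be the basename of a symlink under `/dev/disk/by-path/`, so the lookup stays stable even when device letters change between boots.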
### Phase 4: User Experience Improvements

**ASCII Art Rendering:**
- Initial version had variable-width boxes that broke alignment
- Fixed by using consistent 10-character wide bay boxes
- Multiple iterations to perfect right border alignment

**Drive Table Enhancements:**
- Original: Alphabetical by device name
- Improved: Sorted by physical bay position (1-10)
- Added BAY column to show physical location
- Wider columns to prevent text wrapping

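
The fixed-width fix can be illustrated with a short sketch. The `%-2d:%-7s` cell format matches the one used in the updated `generate_10bay_layout`; the helpers `bay_cell` and `draw_row` and the sample `DRIVE_MAP` contents are assumptions for the example.

```bash
#!/usr/bin/env bash
# Sketch: fixed-width bay cells keep the right border aligned no matter
# which device name (or EMPTY) a bay holds.

# Illustrative contents; the real script fills DRIVE_MAP from PCI paths.
declare -A DRIVE_MAP=( [1]="sdh" [2]="sdg" [4]="nvme0n1" )

# bay_cell is a hypothetical helper. 2 (bay) + 1 (colon) + 7 (name) = 10
# characters, matching the ten dashes in each box border.
bay_cell() {
    local bay=$1
    printf '%-2d:%-7s' "$bay" "${DRIVE_MAP[$bay]:-EMPTY}"
}

# Draw one row of boxes for the given bays.
draw_row() {
    local bay
    for bay in "$@"; do printf '┌──────────┐ '; done; printf '\n'
    for bay in "$@"; do printf '│%s│ ' "$(bay_cell "$bay")"; done; printf '\n'
    for bay in "$@"; do printf '└──────────┘ '; done; printf '\n'
}

draw_row 1 2 3 4
```

A name longer than seven characters would still push the border out with this format; using `%-7.7s` instead would truncate it, at the cost of hiding part of the name.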
### Phase 5: Ceph Integration
**User Request:** "Can we show ceph in/up out/down status in the table?"

**Implementation:**
1. Added CEPH OSD column using `ceph-volume lvm list`
2. Added STATUS column parsing `ceph osd tree`
3. Initial bug: the first version read the wrong columns of the `ceph osd tree` output
4. Fixed by matching the whitespace-separated fields of an OSD row in `ceph osd tree`:
   - Column 5: STATUS (up/down)
   - Column 6: REWEIGHT (greater than 0 = in, 0 = out)

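
The field positions above can be checked against a sample. The following sketch is illustrative only: `SAMPLE_TREE` is made-up data shaped like `ceph osd tree` output (not a captured run), `osd_status` is a hypothetical helper, and OSD rows without a device class would have different field positions.

```bash
#!/usr/bin/env bash
# Sketch: derive "up/in" style status for one OSD from `ceph osd tree` text.
# SAMPLE_TREE is made-up illustrative data, not captured output.

SAMPLE_TREE='ID  CLASS  WEIGHT    TYPE NAME        STATUS  REWEIGHT  PRI-AFF
-1         29.10   root default
-3         14.55       host node-a
 9    hdd  12.73           osd.9        up   1.00000  1.00000
25    hdd   1.82           osd.25     down         0  1.00000'

# osd_status is a hypothetical helper. It reads tree text on stdin and
# prints the status of the OSD whose ID is given as the first argument.
osd_status() {
    awk -v id="$1" '
        $1 == id && $4 == ("osd." id) {
            state = ($6 + 0 > 0) ? "in" : "out"   # REWEIGHT > 0 means "in"
            print $5 "/" state                   # STATUS is field 5
            found = 1
            exit
        }
        END { if (!found) print "unknown/out" }
    '
}

osd_status 9  <<< "$SAMPLE_TREE"
osd_status 25 <<< "$SAMPLE_TREE"
```

Matching on both the ID and the `osd.<id>` name avoids picking up bucket rows (root, host), which also start with a number.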
**User Request:** "Show which is the boot drive somehow?"

**Solution:**
- Added USAGE column
- Checks mount points
- Shows "BOOT" for root filesystem
- Shows mount point for other mounts
- Shows "-" for Ceph OSDs (using LVM)

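
The USAGE rules listed above can be sketched as a small filter. `usage_label` is a hypothetical helper, and feeding it one mount point per line (as `lsblk -n -o MOUNTPOINT /dev/sdX` would print) is an assumption about how the real script gathers its input.

```bash
#!/usr/bin/env bash
# Sketch: turn a drive's mount points into the USAGE column value.

# usage_label is a hypothetical helper. It reads one mount point per line
# on stdin (blank lines mean "not mounted") and prints:
#   BOOT          if any partition is mounted at /
#   <mount point> for the first other mount found
#   -             if nothing is mounted (e.g. an LVM-backed Ceph OSD)
usage_label() {
    local mp first=""
    while IFS= read -r mp; do
        [[ -z "$mp" ]] && continue
        if [[ "$mp" == "/" ]]; then
            echo "BOOT"
            return 0
        fi
        [[ -z "$first" ]] && first=$mp
    done
    echo "${first:--}"
}

printf '/boot/efi\n/\n' | usage_label
printf '\n/mnt/backup\n' | usage_label
printf '\n'             | usage_label
```

Checking every partition before deciding matters because the root filesystem is rarely on the first partition of the boot drive.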
## Technical Challenges Solved

### 1. Line Ending Issues
- **Problem:** `diagnose-drives.sh` had CRLF endings → script failures
- **Solution:** `sed -i 's/\r$//' diagnose-drives.sh` to convert to LF

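
A quick way to confirm and repair this kind of problem is sketched below. It works on a throwaway file rather than the real script, `has_crlf` is a hypothetical helper, and the `sed -i` form assumes GNU sed.

```bash
#!/usr/bin/env bash
# Sketch: detect and strip CRLF line endings, using a throwaway file.

tmp=$(mktemp)
printf '#!/bin/bash\r\necho "hello"\r\n' > "$tmp"   # simulate a CRLF script

# has_crlf is a hypothetical helper: succeeds if any line ends in CR.
has_crlf() {
    grep -q $'\r$' "$1"
}

if has_crlf "$tmp"; then
    echo "CRLF detected, converting"
    sed -i 's/\r$//' "$tmp"       # GNU sed; same command as in the notes
fi

has_crlf "$tmp" || echo "file now uses LF endings"
bash "$tmp"                       # runs cleanly after conversion
rm -f "$tmp"
```

A stray carriage return typically shows up as a `^M` in "bad interpreter" or "command not found" errors, which is a useful hint that line endings are the cause.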
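### 2. PCI Path Pattern Matching
- **Problem:** Escaping regular expressions correctly when building grep patterns in Bash
- **Solution:** `grep -E "^\s*${osd_num}\s+"`, which anchors at the start of the line and requires whitespace after the number, so a lookup for OSD 2 does not also match OSD 25
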
### 3. Floating Point Comparison in Bash
- **Problem:** Bash doesn't natively support decimal comparisons
- **Solution:** Used `bc -l` with error handling: `$(echo "$reweight > 0" | bc -l 2>/dev/null || echo 0)`

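
One caveat with the `bc` one-liner is that the `|| echo 0` fallback only runs when `bc` itself exits with an error (for example when it is not installed); an empty or malformed `$reweight` may still yield empty output, depending on how `bc` reports the problem. The sketch below shows one possible alternative that validates the input first. It swaps `bc` for `awk`, and `is_positive` is a hypothetical helper, not part of the script.

```bash
#!/usr/bin/env bash
# Sketch: decimal comparison with input validation. Uses awk instead of bc
# so it has no extra dependency; is_positive is a hypothetical helper.

# Prints 1 if the argument is a decimal number greater than 0, else 0.
is_positive() {
    local value=$1
    # Accept plain decimals only, e.g. 1, 0.5, 1.00000
    if [[ ! "$value" =~ ^[0-9]+(\.[0-9]+)?$ ]]; then
        echo 0
        return
    fi
    awk -v v="$value" 'BEGIN { print ((v + 0 > 0) ? 1 : 0) }'
}

for reweight in 1.00000 0.85 0 "" abc; do
    printf 'reweight=%-10s -> %s\n' "'$reweight'" "$(is_positive "$reweight")"
done
```

Treating unparseable input as 0 keeps the caller's logic simple: anything that is not clearly "in" is reported as "out".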
### 4. Associative Array Sorting
- **Problem:** Bash associative arrays don't maintain insertion order
- **Solution:** Extract keys, filter numeric ones, pipe to `sort -n`

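
The key-sorting approach can be sketched as follows. The sample `DRIVE_MAP` contents and the helper name `sorted_bays` are assumptions for illustration; only the extract, filter, `sort -n` pipeline comes from the notes above.

```bash
#!/usr/bin/env bash
# Sketch: iterate an associative array in bay order.

# Illustrative contents; keys mix numeric bays and named slots.
declare -A DRIVE_MAP=( [10]="sdb" [2]="sdg" [m2-1]="nvme0n1" [1]="sdh" )

# sorted_bays is a hypothetical helper: prints numeric keys, one per line,
# in ascending numeric order. Non-numeric keys such as m2-1 are skipped.
sorted_bays() {
    printf '%s\n' "${!DRIVE_MAP[@]}" | grep -E '^[0-9]+$' | sort -n
}

while read -r bay; do
    printf 'bay %-2s -> %s\n' "$bay" "${DRIVE_MAP[$bay]}"
done < <(sorted_bays)
```

The numeric sort matters because a plain lexical sort would place bay 10 before bay 2.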
## Key Learning Moments

1. **Hardware Reality vs. Assumptions:** The original script assumed controller addresses that didn't exist. Always probe actual hardware.

2. **Physical Verification is Essential:** Serial numbers visible on drive trays were crucial for verifying correct mappings.

3. **Iterative Refinement:** The script went through 15+ commits, each improving a specific aspect based on user testing and feedback.

4. **User-Driven Feature Evolution:** Features like Ceph integration and boot drive detection emerged organically from user needs.

## Commits Timeline

1. Initial refactoring and architecture improvements
2. Fixed PCI path mappings based on discovered hardware
3. Added serial numbers for physical verification
4. Fixed ASCII art rendering issues
5. Corrected bay mappings based on user verification
6. Added bay-sorted output
7. Implemented Ceph OSD tracking
8. Added Ceph up/in status
9. Added boot drive detection
10. Fixed Ceph status parsing
11. Documentation updates

## Collaborative Techniques Used

### Information Gathering
- Asked clarifying questions about hardware configuration
- Requested diagnostic command output
- Had user physically verify drive locations

### Iterative Development
- Made small, testable changes
- User tested after each significant change
- Incorporated feedback immediately

### Problem-Solving Approach
1. Understand current state
2. Identify specific issues
3. Propose solution
4. Implement incrementally
5. Test and verify
6. Refine based on feedback

## Metrics

- **Lines of Code:** ~330 (main script)
- **Supported Chassis Types:** 4 (10-bay, large1, micro, spare)
- **Mapped Servers:** 1 fully (compute-storage-01), 3 pending
- **Features Added:** 10+
- **Bugs Fixed:** 6 major, multiple minor
- **Documentation:** Comprehensive README + this file

## Future Enhancements

Potential improvements identified during development:

1. **Auto-detection:** Attempt to auto-map bays by generating drive activity (for example with `hdparm` read tests) and watching which bay's activity LED responds
2. **Color Output:** Use terminal colors for health status (green/red)
3. **Historical Tracking:** Log temperature trends over time
4. **Alert Integration:** Notify when drive health deteriorates
5. **Web Interface:** Display chassis map in a web dashboard
6. **Multi-server View:** Show all servers in one consolidated view

## Lessons for Future AI-Assisted Development

### What Worked Well
- Breaking complex problems into small, testable pieces
- Using diagnostic scripts to understand actual vs. assumed state
- Physical verification before trusting software output
- Comprehensive documentation alongside code
- Git commits with detailed messages for traceability

### What Could Be Improved
- Earlier physical verification would have saved iteration
- More upfront hardware documentation would help
- Automated testing for bay mappings (if possible)

## Conclusion

This project demonstrates effective human-AI collaboration where:
- The AI provided technical implementation and problem-solving
- The human provided domain knowledge, testing, and verification
- Iterative feedback loops led to a polished, production-ready tool

The result is a robust infrastructure management tool that provides instant visibility into complex storage configurations across multiple servers.

---

**Development Credits:**
- **Human Developer:** LotusGuild
- **AI Assistant:** Claude Sonnet 4.5 (Anthropic)
- **Development Date:** January 6, 2026
- **Project:** Drive Atlas v1.0

145
README.md
@@ -4,12 +4,15 @@ A powerful server drive mapping tool that generates visual ASCII representations
|
||||
|
||||
## Features
|
||||
|
||||
- Visual ASCII art maps showing physical drive bay layouts
|
||||
- Persistent drive identification using PCI paths (not device letters)
|
||||
- SMART health status and temperature monitoring
|
||||
- Support for SATA, NVMe, and USB drives
|
||||
- Detailed drive information including model, size, and health status
|
||||
- Per-server configuration for accurate physical-to-logical mapping
|
||||
- 🗺️ **Visual ASCII art maps** showing physical drive bay layouts
|
||||
- 🔗 **Persistent drive identification** using PCI paths (not device letters)
|
||||
- 🌡️ **SMART health monitoring** with temperature and status
|
||||
- 💾 **Multi-drive support** for SATA, NVMe, SAS, and USB drives
|
||||
- 🏷️ **Serial number tracking** for physical verification
|
||||
- 📊 **Bay-sorted output** matching physical layout
|
||||
- 🔵 **Ceph integration** showing OSD IDs and up/in status
|
||||
- 🥾 **Boot drive detection** identifying system drives
|
||||
- 🖥️ **Per-server configuration** for accurate physical-to-logical mapping
|
||||
|
||||
## Quick Start
|
||||
|
||||
@@ -30,6 +33,7 @@ bash <(wget -qO- http://10.10.10.63:3000/LotusGuild/driveAtlas/raw/branch/main/d
|
||||
- `smartctl` (from smartmontools package)
|
||||
- `lsblk` and `lspci` (typically pre-installed)
|
||||
- Optional: `nvme-cli` for NVMe drives
|
||||
- Optional: `ceph-volume` and `ceph` for Ceph OSD tracking
|
||||
|
||||
## Server Configurations
|
||||
|
||||
@@ -47,26 +51,50 @@ bash <(wget -qO- http://10.10.10.63:3000/LotusGuild/driveAtlas/raw/branch/main/d
|
||||
- **Chassis:** Sliger CX471225 4U (10-Bay Hot-swap)
|
||||
- **Motherboard:** B650D4U3-2Q/BCM
|
||||
- **Controllers:**
|
||||
- 0c:00.0 - Front hot-swap bays
|
||||
- 0d:00.0 - M.2 NVMe slot
|
||||
- 0b:00.0 - USB controller
|
||||
- **Status:** Partially mapped (bays 3-6 only)
|
||||
- 01:00.0 - LSI SAS3008 HBA (bays 5-10 via 2x mini-SAS HD)
|
||||
- 0d:00.0 - AMD SATA controller (bays 1-4)
|
||||
- 0e:00.0 - M.2 NVMe slot
|
||||
- **Status:** ✅ Fully mapped and verified
|
||||
|
||||
#### storage-01
|
||||
- **Chassis:** Sliger CX471225 4U (10-Bay Hot-swap)
|
||||
- **Motherboard:** Different from compute-storage-01
|
||||
- **Controllers:** Motherboard SATA only (no HBA currently)
|
||||
- **Status:** Requires PCI path mapping
|
||||
- **Status:** ⚠️ Requires PCI path mapping
|
||||
|
||||
#### large1
|
||||
- **Chassis:** Unique 3x5 grid (15 bays total)
|
||||
- **Note:** 1/1 configuration, will not be replicated
|
||||
- **Status:** Requires PCI path mapping
|
||||
- **Status:** ⚠️ Requires PCI path mapping
|
||||
|
||||
#### compute-storage-gpu-01
|
||||
- **Chassis:** Sliger CX471225 4U (10-Bay Hot-swap)
|
||||
- **Motherboard:** Same as compute-storage-01
|
||||
- **Status:** Requires PCI path mapping
|
||||
- **Status:** ⚠️ Requires PCI path mapping
|
||||
|
||||
## Output Example
|
||||
|
||||
```
|
||||
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
|
||||
│ compute-storage-01 - 10-Bay Hot-swap Chassis │
|
||||
│ │
|
||||
│ M.2 NVMe: nvme0n1 │
|
||||
│ │
|
||||
│ Front Hot-swap Bays: │
|
||||
│ │
|
||||
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
|
||||
│ │1 :sdh │ │2 :sdg │ │3 :sdi │ │4 :sdj │ │5 :sde │ │6 :sdf │ │7 :sdd │ │8 :sda │ │9 :sdc │ │10:sdb │ │
|
||||
│ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘ └──────────┘ │
|
||||
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
|
||||
|
||||
=== Drive Details with SMART Status (by Bay Position) ===
|
||||
BAY DEVICE SIZE TYPE TEMP HEALTH MODEL SERIAL CEPH OSD STATUS USAGE
|
||||
----------------------------------------------------------------------------------------------------------------------------------------------------
|
||||
1 /dev/sdh 223.6G SSD 27°C ✓ Crucial_CT240M500SSD1 14130C0E06DD - - /boot/efi
|
||||
2 /dev/sdg 1.8T HDD 26°C ✓ ST2000DM001-1ER164 Z4ZC4B6R osd.25 up/in -
|
||||
3 /dev/sdi 12.7T HDD 29°C ✓ OOS14000G 000DXND6 osd.9 up/in -
|
||||
...
|
||||
```
|
||||
|
||||
## How It Works
|
||||
|
||||
@@ -76,13 +104,14 @@ Drive Atlas uses `/dev/disk/by-path/` to create persistent mappings between phys
|
||||
|
||||
**Example PCI path:**
|
||||
```
|
||||
pci-0000:0c:00.0-ata-1 → /dev/sda
|
||||
pci-0000:01:00.0-sas-phy6-lun-0 → /dev/sde → Bay 5
|
||||
```
|
||||
|
||||
This tells us:
|
||||
- `0000:0c:00.0` - PCI bus address of the storage controller
|
||||
- `ata-1` - Port 1 on that controller
|
||||
- Maps to physical bay 3 on compute-storage-01
|
||||
- `0000:01:00.0` - PCI bus address of the LSI SAS3008 HBA
|
||||
- `sas-phy6` - SAS PHY 6 on that controller
|
||||
- `lun-0` - Logical Unit Number
|
||||
- Maps to physical bay 5 on compute-storage-01
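
As a rough illustration of how such a name breaks down, the following sketch splits a by-path name into its controller address and port description. `parse_by_path` is a hypothetical helper written for this example and is not part of `driveAtlas.sh`.

```bash
#!/usr/bin/env bash
# Sketch: split a /dev/disk/by-path name into controller and port parts.
# parse_by_path is a hypothetical helper, not part of driveAtlas.sh.

parse_by_path() {
    local name=$1
    # Expect "pci-<domain>:<bus>:<dev>.<fn>-<port description>"
    if [[ "$name" =~ ^pci-([0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-9a-f])-(.+)$ ]]; then
        echo "controller=${BASH_REMATCH[1]} port=${BASH_REMATCH[2]}"
    else
        echo "unrecognized path: $name" >&2
        return 1
    fi
}

parse_by_path "pci-0000:01:00.0-sas-phy6-lun-0"
parse_by_path "pci-0000:0d:00.0-ata-2"
parse_by_path "pci-0000:0e:00.0-nvme-1"
```

The controller part identifies which card or onboard chip the drive hangs off, and the port part is what gets matched to a physical bay.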
|
||||
|
||||
### Configuration
|
||||
|
||||
@@ -91,9 +120,10 @@ Server mappings are defined in the `SERVER_MAPPINGS` associative array in [drive
|
||||
```bash
|
||||
declare -A SERVER_MAPPINGS=(
|
||||
["compute-storage-01"]="
|
||||
pci-0000:0c:00.0-ata-1 3
|
||||
pci-0000:0c:00.0-ata-2 4
|
||||
pci-0000:0d:00.0-nvme-1 m2-1
|
||||
pci-0000:0d:00.0-ata-2 1
|
||||
pci-0000:0d:00.0-ata-1 2
|
||||
pci-0000:01:00.0-sas-phy6-lun-0 5
|
||||
pci-0000:0e:00.0-nvme-1 m2-1
|
||||
"
|
||||
)
|
||||
```
|
||||
@@ -115,10 +145,11 @@ This will show all available PCI paths and their associated drives.
|
||||
For each populated drive bay:
|
||||
|
||||
1. Note the physical bay number (labeled on chassis)
|
||||
2. Identify a unique characteristic (size, model, or serial number)
|
||||
3. Match it to the PCI path from the diagnostic output
|
||||
2. Run the main script to see serial numbers
|
||||
3. Match visible serial numbers on drives to the output
|
||||
4. Map PCI paths to bay numbers
|
||||
|
||||
**Pro tip:** If uncertain, remove one drive at a time and re-run the diagnostic to see which PCI path disappears.
|
||||
**Pro tip:** The script shows serial numbers - compare them to visible labels on drive trays to verify physical locations.
|
||||
|
||||
### Step 3: Create Mapping
|
||||
|
||||
@@ -152,30 +183,21 @@ Use debug mode to see the mappings:
|
||||
DEBUG=1 bash driveAtlas.sh
|
||||
```
|
||||
|
||||
## Output Example
|
||||
## Output Columns Explained
|
||||
|
||||
```
|
||||
┌──────────────────────────────────────────────────────────────┐
|
||||
│ compute-storage-01 │
|
||||
│ 10-Bay Hot-swap Chassis │
|
||||
│ │
|
||||
│ M.2 NVMe Slot │
|
||||
│ ┌──────────┐ │
|
||||
│ │ nvme0n1 │ │
|
||||
│ └──────────┘ │
|
||||
│ │
|
||||
│ Front Hot-swap Bays │
|
||||
│ ┌──────────┐┌──────────┐┌──────────┐┌──────────┐... │
|
||||
│ │1: EMPTY ││2: EMPTY ││3: sda ││4: sdb │... │
|
||||
│ └──────────┘└──────────┘└──────────┘└──────────┘... │
|
||||
└──────────────────────────────────────────────────────────────┘
|
||||
|
||||
=== Drive Details with SMART Status ===
|
||||
DEVICE SIZE TYPE TEMP HEALTH MODEL
|
||||
--------------------------------------------------------------------------------
|
||||
/dev/sda 2TB HDD 35°C ✓ WD20EFRX-68EUZN0
|
||||
/dev/nvme0n1 1TB SSD 42°C ✓ Samsung 980 PRO
|
||||
```
|
||||
| Column | Description |
|--------|-------------|
| **BAY** | Physical bay number (1-10, m2-1, etc.) |
| **DEVICE** | Linux device name (/dev/sdX, /dev/nvmeXnY) |
| **SIZE** | Drive capacity |
| **TYPE** | SSD or HDD (detected via SMART) |
| **TEMP** | Current temperature from SMART |
| **HEALTH** | SMART health status (✓ = passed, ✗ = failed) |
| **MODEL** | Drive model number |
| **SERIAL** | Drive serial number (for physical verification) |
| **CEPH OSD** | Ceph OSD ID if drive hosts an OSD |
| **STATUS** | Ceph OSD status (up/in, down/out, etc.) |
| **USAGE** | Mount point or "BOOT" for system drive |

## Troubleshooting
|
||||
|
||||
@@ -190,7 +212,7 @@ DEVICE SIZE TYPE TEMP HEALTH MODEL
|
||||
- Even identical motherboards can have different PCI addressing
|
||||
- BIOS settings can affect PCI enumeration
|
||||
- HBA installation in different PCIe slots changes addresses
|
||||
- Cable routing to different SATA ports changes the ata-N number
|
||||
- Cable routing to different SATA or HBA ports changes the `ata-N` or `sas-phyN` number
|
||||
|
||||
### SMART data not showing
|
||||
|
||||
@@ -199,19 +221,32 @@ DEVICE SIZE TYPE TEMP HEALTH MODEL
|
||||
- USB-connected drives may not support SMART
|
||||
- Run `sudo smartctl -i /dev/sdX` manually to check
|
||||
|
||||
### Ceph OSD status shows "unknown/out"
|
||||
|
||||
- Ensure `ceph` and `ceph-volume` commands are available
|
||||
- Check if the Ceph cluster is healthy: `ceph -s`
|
||||
- Verify OSD is actually up: `ceph osd tree`
|
||||
|
||||
### Serial numbers don't match visible labels
|
||||
|
||||
- Some manufacturers use different serials for SMART vs. physical labels
|
||||
- Cross-reference by drive model and size
|
||||
- Use the removal method: power down, remove drive, check which bay becomes EMPTY
|
||||
|
||||
## Files
|
||||
|
||||
- [driveAtlas.sh](driveAtlas.sh) - Main script
|
||||
- [diagnose-drives.sh](diagnose-drives.sh) - PCI path diagnostic tool
|
||||
- [README.md](README.md) - This file
|
||||
- [todo.txt](todo.txt) - Development notes
|
||||
- [CLAUDE.md](CLAUDE.md) - AI-assisted development notes
|
||||
- [todo.txt](todo.txt) - Development notes and task tracking
|
||||
|
||||
## Contributing
|
||||
|
||||
When adding support for a new server:
|
||||
|
||||
1. Run `diagnose-drives.sh` and save output
|
||||
2. Physically label or identify drives
|
||||
2. Physically label or identify drives by serial number
|
||||
3. Create mapping in `SERVER_MAPPINGS`
|
||||
4. Test thoroughly
|
||||
5. Document any unique hardware configurations
|
||||
@@ -231,11 +266,15 @@ PCI paths are deterministic and based on physical hardware topology.
|
||||
|
||||
### Bay Numbering Conventions
|
||||
|
||||
- **10-bay chassis:** Bays numbered 1-10 (left to right, top to bottom)
|
||||
- **10-bay chassis:** Bays numbered 1-10 (left to right, typically)
|
||||
- **M.2 slots:** Labeled as `m2-1`, `m2-2`, etc.
|
||||
- **USB drives:** Labeled as `usb1`, `usb2`, etc.
|
||||
- **Large1:** Grid numbering 1-9 (3x3 displayed, additional bays documented in mapping)
|
||||
- **Large1:** Grid numbering 1-15 (documented in mapping)
|
||||
|
||||
## License
|
||||
### Ceph Integration
|
||||
|
||||
Internal tool for LotusGuild infrastructure.
|
||||
The script automatically detects Ceph OSDs using:
|
||||
1. `ceph-volume lvm list` to map devices to OSD IDs
|
||||
2. `ceph osd tree` to get up/down and in/out status
|
||||
|
||||
Status format: `up/in` means OSD is running and participating in the cluster.
|
||||
@@ -1,59 +1,59 @@
#!/bin/bash

# Drive Atlas Diagnostic Script
# Run this on each server to gather PCI path information

echo "=== Server Information ==="
echo "Hostname: $(hostname)"
echo "Date: $(date)"
echo ""

echo "=== All /dev/disk/by-path/ entries ==="
ls -la /dev/disk/by-path/ | grep -v "part" | sort
echo ""

echo "=== Organized by PCI Address ==="
for path in /dev/disk/by-path/*; do
    if [ -L "$path" ]; then
        # Skip partitions
        if [[ "$path" =~ -part[0-9]+$ ]]; then
            continue
        fi

        basename_path=$(basename "$path")
        target=$(readlink -f "$path")
        device=$(basename "$target")

        echo "Path: $basename_path"
        echo " -> Device: $device"

        # Try to get size
        if [ -b "$target" ]; then
            size=$(lsblk -d -n -o SIZE "$target" 2>/dev/null)
            echo " -> Size: $size"
        fi

        # Try to get SMART info for model
        if command -v smartctl >/dev/null 2>&1; then
            model=$(sudo smartctl -i "$target" 2>/dev/null | grep "Device Model\|Model Number" | cut -d: -f2 | xargs)
            if [ -n "$model" ]; then
                echo " -> Model: $model"
            fi
        fi
        echo ""
    fi
done

echo "=== PCI Devices with Storage Controllers ==="
lspci | grep -i "storage\|raid\|sata\|sas\|nvme"
echo ""

echo "=== Current Block Devices ==="
lsblk -d -o NAME,SIZE,TYPE,TRAN | grep -v "rbd\|loop"
echo ""

echo "=== Recommendations ==="
echo "1. Note the PCI addresses (e.g., 0c:00.0) of your storage controllers"
echo "2. For each bay, physically identify which drive is in it"
echo "3. Match the PCI path pattern to the bay number"
echo "4. Example: pci-0000:0c:00.0-ata-1 might be bay 1 on controller 0c:00.0"
370
driveAtlas.sh
@@ -14,98 +14,119 @@ generate_10bay_layout() {
|
||||
local hostname=$1
|
||||
build_drive_map
|
||||
|
||||
# Calculate max width needed for drive names
|
||||
max_width=0
|
||||
for bay in {1..10} "m2-1" "usb1" "usb2"; do
|
||||
drive_text="${DRIVE_MAP[$bay]:-EMPTY}"
|
||||
text_len=$((${#bay} + 1 + ${#drive_text}))
|
||||
[[ $text_len -gt $max_width ]] && max_width=$text_len
|
||||
done
|
||||
|
||||
# Add padding for box borders
|
||||
box_width=$((max_width + 4))
|
||||
|
||||
# Create box drawing elements
|
||||
h_line=$(printf '%*s' "$box_width" '' | tr ' ' '─')
|
||||
|
||||
# USB Section (if applicable)
|
||||
if [[ -n "${DRIVE_MAP[usb1]}" || -n "${DRIVE_MAP[usb2]}" ]]; then
|
||||
printf "\n External USB\n"
|
||||
printf " ┌%s┐ ┌%s┐\n" "$h_line" "$h_line"
|
||||
printf " │ %-${max_width}s │ │ %-${max_width}s │\n" "${DRIVE_MAP[usb1]:-EMPTY}" "${DRIVE_MAP[usb2]:-EMPTY}"
|
||||
printf " └%s┘ └%s┘\n\n" "$h_line" "$h_line"
|
||||
fi
|
||||
# Fixed width for consistent box drawing (fits device names like "nvme0n1")
|
||||
local drive_width=10
|
||||
|
||||
# Main chassis section
|
||||
printf "┌──────────────────────────────────────────────────────────────┐\n"
|
||||
printf "│ %-58s │\n" "$hostname"
|
||||
printf "│ %-58s │\n" "10-Bay Hot-swap Chassis"
|
||||
printf "│ │\n"
|
||||
printf "┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐\n"
|
||||
printf "│ %-126s │\n" "$hostname - Sliger CX4712 (10x 3.5\" Hot-swap)"
|
||||
printf "│ │\n"
|
||||
|
||||
# Show storage controllers
|
||||
printf "│ Storage Controllers: │\n"
|
||||
while IFS= read -r ctrl; do
|
||||
[[ -n "$ctrl" ]] && printf "│ %-126s│\n" "$ctrl"
|
||||
done < <(get_storage_controllers)
|
||||
printf "│ │\n"
|
||||
|
||||
# M.2 NVMe slot if present
|
||||
if [[ -n "${DRIVE_MAP[m2-1]}" ]]; then
|
||||
printf "│ M.2 NVMe Slot │\n"
|
||||
printf "│ ┌%s┐ │\n" "$h_line"
|
||||
printf "│ │ %-${max_width}s │ │\n" "${DRIVE_MAP[m2-1]:-EMPTY}"
|
||||
printf "│ └%s┘ │\n" "$h_line"
|
||||
printf "│ │\n"
|
||||
printf "│ M.2 NVMe: %-10s │\n" "${DRIVE_MAP[m2-1]}"
|
||||
printf "│ │\n"
|
||||
fi
|
||||
|
||||
printf "│ Front Hot-swap Bays │\n"
|
||||
printf "│ Front Hot-swap Bays: │\n"
|
||||
printf "│ │\n"
|
||||
|
||||
# Create bay rows
|
||||
printf "│ "
|
||||
# Bay top borders
|
||||
printf "│ "
|
||||
for bay in {1..10}; do
|
||||
printf "┌%s┐" "$h_line"
|
||||
printf "┌──────────┐ "
|
||||
done
|
||||
printf " │\n│ "
|
||||
printf " │\n"
|
||||
|
||||
# Bay contents
|
||||
printf "│ "
|
||||
for bay in {1..10}; do
|
||||
printf "│%-2d:%-${max_width}s │" "$bay" "${DRIVE_MAP[$bay]:-EMPTY}"
|
||||
printf "│%-2d:%-7s│ " "$bay" "${DRIVE_MAP[$bay]:-EMPTY}"
|
||||
done
|
||||
printf " │\n│ "
|
||||
printf " │\n"
|
||||
|
||||
# Bay bottom borders
|
||||
printf "│ "
|
||||
for bay in {1..10}; do
|
||||
printf "└%s┘" "$h_line"
|
||||
printf "└──────────┘ "
|
||||
done
|
||||
printf " │\n"
|
||||
printf " │\n"
|
||||
|
||||
printf "└──────────────────────────────────────────────────────────────┘\n"
|
||||
printf "└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘\n"
|
||||
}
|
||||
|
||||
generate_micro_layout() {
|
||||
local hostname=$1
|
||||
build_drive_map
|
||||
|
||||
# Check for eMMC storage
|
||||
local emmc_device=""
|
||||
if [[ -b /dev/mmcblk0 ]]; then
|
||||
emmc_device="mmcblk0"
|
||||
fi
|
||||
|
||||
printf "┌─────────────────────────────────────────────────────────────┐\n"
|
||||
printf "│ %-57s │\n" "$hostname - Micro SBC"
|
||||
printf "│ │\n"
|
||||
printf "│ Storage Controllers: │\n"
|
||||
while IFS= read -r ctrl; do
|
||||
[[ -n "$ctrl" ]] && printf "│ %-57s│\n" "$ctrl"
|
||||
done < <(get_storage_controllers)
|
||||
printf "│ │\n"
|
||||
|
||||
# Show eMMC if present
|
||||
if [[ -n "$emmc_device" ]]; then
|
||||
local emmc_size
emmc_size=$(lsblk -d -n -o SIZE "/dev/$emmc_device" 2>/dev/null | xargs)
|
||||
printf "│ ┌─────────────────────────────────────────────────────┐ │\n"
|
||||
printf "│ │ Onboard eMMC: %-10s (%s) │ │\n" "$emmc_device" "$emmc_size"
|
||||
printf "│ └─────────────────────────────────────────────────────┘ │\n"
|
||||
printf "│ │\n"
|
||||
fi
|
||||
|
||||
printf "│ SATA Ports (rear): │\n"
|
||||
printf "│ ┌──────────────┐ ┌──────────────┐ │\n"
|
||||
printf "│ │ 1: %-9s │ │ 2: %-9s │ │\n" "${DRIVE_MAP[1]:-EMPTY}" "${DRIVE_MAP[2]:-EMPTY}"
|
||||
printf "│ └──────────────┘ └──────────────┘ │\n"
|
||||
printf "└─────────────────────────────────────────────────────────────┘\n"
|
||||
}
|
||||
|
||||
generate_large1_layout() {
|
||||
local hostname=$1
|
||||
build_drive_map
|
||||
|
||||
cat << 'EOF'
|
||||
┌─────────────────────────────────────────────────────────────┐
|
||||
│ large1 │
|
||||
│ Unique 3x5 Grid Chassis │
|
||||
│ │
|
||||
│ ┌──────────────────────────────────────────────┐ │
|
||||
│ │ Motherboard │ │
|
||||
│ │ │ │
|
||||
│ │ ┌──┐┌──┐ │ │
|
||||
│ │ │M1││M2│ │ │
|
||||
│ │ └──┘└──┘ │ │
|
||||
│ └──────────────────────────────────────────────┘ │
|
||||
│ │
|
||||
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ 1 │ │ 2 │ │ 3 │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
|
||||
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ │ 4 │ │ 5 │ │ 6 │ │
|
||||
│ │ │ │ │ │ │ │
|
||||
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
|
||||
│ ┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐ │
|
||||
│ │ │ │ │ │ │ │
│ │ 7 │ │ 8 │ │ 9 │ │
│ │ │ │ │ │ │ │
│ └─────────────────┘ └─────────────────┘ └─────────────────┘ │
└─────────────────────────────────────────────────────────────┘
EOF
    # large1 has 3 stacks of 5 bays at front (15 total) + 2 M.2 slots
    # Physical bay mapping TBD - current mapping is by controller order
    printf "┌─────────────────────────────────────────────────────────────────────────┐\n"
    printf "│ %-69s │\n" "$hostname - Rosewill RSV-L4500U (15x 3.5\" Bays)"
    printf "│ │\n"
    printf "│ Storage Controllers: │\n"
    while IFS= read -r ctrl; do
        [[ -n "$ctrl" ]] && printf "│ %-69s│\n" "$ctrl"
    done < <(get_storage_controllers)
    printf "│ │\n"
    printf "│ M.2 NVMe: M1: %-10s M2: %-10s │\n" "${DRIVE_MAP[m2-1]:-EMPTY}" "${DRIVE_MAP[m2-2]:-EMPTY}"
    printf "│ │\n"
    printf "│ Front Bays (3 stacks x 5 rows): [Bay mapping TBD] │\n"
    printf "│ Stack A Stack B Stack C │\n"
    printf "│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │\n"
    printf "│ │1:%-8s│ │2:%-8s│ │3:%-8s│ │\n" "${DRIVE_MAP[1]:-EMPTY}" "${DRIVE_MAP[2]:-EMPTY}" "${DRIVE_MAP[3]:-EMPTY}"
    printf "│ ├──────────┤ ├──────────┤ ├──────────┤ │\n"
    printf "│ │4:%-8s│ │5:%-8s│ │6:%-8s│ │\n" "${DRIVE_MAP[4]:-EMPTY}" "${DRIVE_MAP[5]:-EMPTY}" "${DRIVE_MAP[6]:-EMPTY}"
    printf "│ ├──────────┤ ├──────────┤ ├──────────┤ │\n"
    printf "│ │7:%-8s│ │8:%-8s│ │9:%-8s│ │\n" "${DRIVE_MAP[7]:-EMPTY}" "${DRIVE_MAP[8]:-EMPTY}" "${DRIVE_MAP[9]:-EMPTY}"
    printf "│ ├──────────┤ ├──────────┤ ├──────────┤ │\n"
    printf "│ │10:%-7s│ │11:%-7s│ │12:%-7s│ │\n" "${DRIVE_MAP[10]:-EMPTY}" "${DRIVE_MAP[11]:-EMPTY}" "${DRIVE_MAP[12]:-EMPTY}"
    printf "│ ├──────────┤ ├──────────┤ ├──────────┤ │\n"
    printf "│ │13:%-7s│ │14:%-7s│ │15:%-7s│ │\n" "${DRIVE_MAP[13]:-EMPTY}" "${DRIVE_MAP[14]:-EMPTY}" "${DRIVE_MAP[15]:-EMPTY}"
    printf "│ └──────────┘ └──────────┘ └──────────┘ │\n"
    printf "└─────────────────────────────────────────────────────────────────────────┘\n"
}

#------------------------------------------------------------------------------
@@ -116,29 +137,86 @@ EOF
declare -A SERVER_MAPPINGS=(
    # compute-storage-01 (formerly medium2)
    # Motherboard: B650D4U3-2Q/BCM
    # Controller at 0c:00.0 for hot-swap bays
    # Controller at 0d:00.0 for M.2 NVMe
    # Motherboard: B650D4U3-2Q/BCM with AMD SATA controller
    # HBA: LSI SAS3008 at 01:00.0 (mini-SAS HD ports)
    # Cable mapping from user notes:
    # - Mobo SATA: top-right=bay1, bottom-right=bay2, bottom-left=bay3, top-left=bay4
    # - HBA bottom mini-SAS: bays 5,6,7,8
    # - HBA top mini-SAS: bays 9,10
    ["compute-storage-01"]="
        pci-0000:0c:00.0-ata-3 5
        pci-0000:0c:00.0-ata-4 6
        pci-0000:0c:00.0-ata-1 3
        pci-0000:0c:00.0-ata-2 4
        pci-0000:0d:00.0-nvme-1 m2-1
        pci-0000:0b:00.0-usb-0:3:1.0-scsi-0:0:0:0 usb1
        pci-0000:0b:00.0-usb-0:4:1.0-scsi-0:0:0:0 usb2
        pci-0000:0d:00.0-ata-2 1
        pci-0000:0d:00.0-ata-1 2
        pci-0000:0d:00.0-ata-3 3
        pci-0000:0d:00.0-ata-4 4
        pci-0000:01:00.0-sas-phy6-lun-0 5
        pci-0000:01:00.0-sas-phy7-lun-0 6
        pci-0000:01:00.0-sas-phy5-lun-0 7
        pci-0000:01:00.0-sas-phy2-lun-0 8
        pci-0000:01:00.0-sas-phy4-lun-0 9
        pci-0000:01:00.0-sas-phy3-lun-0 10
        pci-0000:0e:00.0-nvme-1 m2-1
    "

    # compute-storage-gpu-01
    # Motherboard: ASUS PRIME B550-PLUS with AMD SATA controller at 02:00.1
    # 5 SATA ports + 1 M.2 NVMe slot
    # sdf is USB/card reader - not mapped
    ["compute-storage-gpu-01"]="
        pci-0000:02:00.1-ata-1 1
        pci-0000:02:00.1-ata-2 2
        pci-0000:02:00.1-ata-3 3
        pci-0000:02:00.1-ata-4 4
        pci-0000:02:00.1-ata-5 5
        pci-0000:0c:00.0-nvme-1 m2-1
    "

    # storage-01
    # Different motherboard, no HBA currently
    # TODO: Map actual PCI paths after running diagnose-drives.sh
    # Motherboard: ASRock A320M-HDV R4.0 with AMD SATA controller at 02:00.1
    # 4 SATA ports used (ata-1, ata-2, ata-5, ata-6) - ata-3/4 empty
    ["storage-01"]="
        pci-0000:02:00.1-ata-1 1
        pci-0000:02:00.1-ata-2 2
        pci-0000:02:00.1-ata-5 3
        pci-0000:02:00.1-ata-6 4
    "

    # large1
    # Unique chassis - 1/1 configuration
    # TODO: Map actual PCI paths after running diagnose-drives.sh
    # Custom tower with multiple controllers:
    # - HBA: LSI SAS2008 at 10:00.0 (7 drives)
    # - AMD SATA at 16:00.1 (3 drives)
    # - ASMedia SATA at 25:00.0 (2 drives)
    # - 2x NVMe slots
    ["large1"]="
        pci-0000:10:00.0-sas-phy0-lun-0 1
        pci-0000:10:00.0-sas-phy1-lun-0 2
        pci-0000:10:00.0-sas-phy3-lun-0 3
        pci-0000:10:00.0-sas-phy4-lun-0 4
        pci-0000:10:00.0-sas-phy5-lun-0 5
        pci-0000:10:00.0-sas-phy6-lun-0 6
        pci-0000:10:00.0-sas-phy7-lun-0 7
        pci-0000:16:00.1-ata-3 8
        pci-0000:16:00.1-ata-7 9
        pci-0000:16:00.1-ata-8 10
        pci-0000:25:00.0-ata-1 11
        pci-0000:25:00.0-ata-2 12
        pci-0000:2a:00.0-nvme-1 m2-1
        pci-0000:26:00.0-nvme-1 m2-2
    "

    # micro1
    # ZimaBoard 832 - Single board computer
    # 2 SATA ports on rear (currently unused)
    # Boot from onboard eMMC (mmcblk0)
    # SATA controller at 00:12.0
    ["micro1"]="
    "

    # monitor-02
    # ZimaBoard 832 - Single board computer
    # 2 SATA ports on rear (currently unused)
    # Boot from onboard eMMC (mmcblk0)
    # SATA controller would be at a specific PCI address when drives connected
    ["monitor-02"]="
    "
)

@@ -147,14 +225,24 @@ declare -A CHASSIS_TYPES=(
    ["compute-storage-gpu-01"]="10bay"
    ["storage-01"]="10bay"
    ["large1"]="large1"
    ["micro1"]="micro"
    ["monitor-02"]="micro"
    ["micro1"]="micro" # ZimaBoard 832
    ["monitor-02"]="micro" # ZimaBoard 832
)

#------------------------------------------------------------------------------
# Core Functions
#------------------------------------------------------------------------------

get_storage_controllers() {
    # Returns a formatted list of storage controllers (HBAs, SATA, NVMe)
    lspci 2>/dev/null | grep -iE "SAS|SATA|RAID|Mass storage|NVMe" | while read -r line; do
        pci_addr=$(echo "$line" | awk '{print $1}')
        # Get short description (strip PCI address)
        desc=$(echo "$line" | sed 's/^[0-9a-f:.]\+ //')
        echo " $pci_addr: $desc"
    done
}

build_drive_map() {
    local host=$(hostname)
    declare -A drive_map
@@ -186,8 +274,9 @@ get_drive_smart_info() {
    local type=$(echo "$smart_info" | grep "Rotation Rate" | grep -q "Solid State" && echo "SSD" || echo "HDD")
    local health=$(echo "$smart_info" | grep "SMART overall-health" | grep -q "PASSED" && echo "✓" || echo "✗")
    local model=$(echo "$smart_info" | grep "Device Model\|Model Number" | cut -d: -f2 | xargs)
    local serial=$(echo "$smart_info" | grep "Serial Number" | awk '{print $3}')

    echo "$type|$temp°C|$health|$model"
    echo "$type|$temp°C|$health|$model|$serial"
}

#------------------------------------------------------------------------------
@@ -203,10 +292,10 @@ case "$CHASSIS_TYPE" in
        generate_10bay_layout "$HOSTNAME"
        ;;
    "large1")
        generate_large1_layout
        generate_large1_layout "$HOSTNAME"
        ;;
    "micro")
        echo "Micro server layout not yet implemented"
        generate_micro_layout "$HOSTNAME"
        ;;
    *)
        echo "┌─────────────────────────────────────────────────────────┐"
@@ -221,29 +310,87 @@ esac
# Drive Details Section
#------------------------------------------------------------------------------

echo -e "\n=== Drive Details with SMART Status ==="
printf "%-15s %-10s %-8s %-8s %-8s %-30s\n" "DEVICE" "SIZE" "TYPE" "TEMP" "HEALTH" "MODEL"
echo "--------------------------------------------------------------------------------"
echo -e "\n=== Drive Details with SMART Status (by Bay Position) ==="
printf "%-5s %-15s %-10s %-8s %-8s %-8s %-30s %-20s %-12s %-10s %-10s\n" "BAY" "DEVICE" "SIZE" "TYPE" "TEMP" "HEALTH" "MODEL" "SERIAL" "CEPH OSD" "STATUS" "USAGE"
echo "----------------------------------------------------------------------------------------------------------------------------------------------------"

# SATA/SAS drives
lsblk -d -o NAME | grep -v "nvme" | grep -v "rbd" | grep -v "loop" | grep -v "NAME" | while read device; do
    if [ -b "/dev/$device" ]; then
# Build reverse map: device -> bay
declare -A DEVICE_TO_BAY
for bay in "${!DRIVE_MAP[@]}"; do
    device="${DRIVE_MAP[$bay]}"
    if [[ -n "$device" && "$device" != "EMPTY" ]]; then
        DEVICE_TO_BAY[$device]=$bay
    fi
done

# Sort drives by bay position
for bay in $(printf '%s\n' "${!DRIVE_MAP[@]}" | grep -E '^[0-9]+$' | sort -n); do
    device="${DRIVE_MAP[$bay]}"
    if [[ -n "$device" && "$device" != "EMPTY" && -b "/dev/$device" ]]; then
        size=$(lsblk -d -n -o SIZE "/dev/$device" 2>/dev/null)
        smart_info=$(get_drive_smart_info "$device")
        IFS='|' read -r type temp health model <<< "$smart_info"
        printf "%-15s %-10s %-8s %-8s %-8s %-30s\n" "/dev/$device" "$size" "$type" "$temp" "$health" "$model"
        IFS='|' read -r type temp health model serial <<< "$smart_info"

        # Check for Ceph OSD
        osd_id=$(ceph-volume lvm list 2>/dev/null | grep -B 20 "/dev/$device" | grep "osd id" | awk '{print "osd."$3}' | head -1)

        # Get Ceph status if OSD exists
        ceph_status="-"
        if [[ -n "$osd_id" ]]; then
            # Get in/out and up/down status from ceph osd tree
            osd_num=$(echo "$osd_id" | sed 's/osd\.//')
            # Parse ceph osd tree output - column 5 is STATUS (up/down), column 6 is REWEIGHT (1.0 = in, 0 = out)
            tree_line=$(ceph osd tree 2>/dev/null | grep -E "^\s*${osd_num}\s+" | grep "osd.${osd_num}")
            up_status=$(echo "$tree_line" | awk '{print $5}')
            reweight=$(echo "$tree_line" | awk '{print $6}')

            # Default to unknown if we can't parse
            [[ -z "$up_status" ]] && up_status="unknown"
            [[ -z "$reweight" ]] && reweight="0"

            # Determine in/out based on reweight (1.0 = in, 0 = out)
            if (( $(echo "$reweight > 0" | bc -l 2>/dev/null || echo 0) )); then
                in_status="in"
            else
                in_status="out"
            fi

            ceph_status="${up_status}/${in_status}"
        else
            osd_id="-"
        fi

        # Check if boot drive
        usage="-"
        if mount | grep -q "^/dev/${device}"; then
            mount_point=$(mount | grep "^/dev/${device}" | awk '{print $3}' | head -1)
            if [[ "$mount_point" == "/" ]]; then
                usage="BOOT"
            else
                usage="$mount_point"
            fi
        fi

        printf "%-5s %-15s %-10s %-8s %-8s %-8s %-30s %-20s %-12s %-10s %-10s\n" "$bay" "/dev/$device" "$size" "$type" "$temp" "$health" "$model" "$serial" "$osd_id" "$ceph_status" "$usage"
    fi
done

# NVMe drives
if command -v nvme >/dev/null 2>&1; then
    nvme_drives=$(sudo nvme list 2>/dev/null | grep "^/dev")
    if [ -n "$nvme_drives" ]; then
        echo -e "\n=== NVMe Drives ==="
        printf "%-15s %-10s %-10s %-40s\n" "DEVICE" "SIZE" "TYPE" "MODEL"
        echo "--------------------------------------------------------------------------------"
        echo "$nvme_drives" | awk '{printf "%-15s %-10s %-10s %-40s\n", $1, $6, "NVMe", $3}'
    fi
nvme_devices=$(lsblk -d -n -o NAME,SIZE 2>/dev/null | grep "^nvme")
if [ -n "$nvme_devices" ]; then
    echo -e "\n=== NVMe Drives ==="
    printf "%-15s %-10s %-10s %-40s %-25s\n" "DEVICE" "SIZE" "TYPE" "MODEL" "SERIAL"
    echo "------------------------------------------------------------------------------------------------------"
    echo "$nvme_devices" | while read -r name size; do
        device="/dev/$name"
        # Get model and serial from smartctl for accuracy
        smart_info=$(sudo smartctl -i "$device" 2>/dev/null)
        model=$(echo "$smart_info" | grep "Model Number" | cut -d: -f2 | xargs)
        serial=$(echo "$smart_info" | grep "Serial Number" | cut -d: -f2 | xargs)
        [[ -z "$model" ]] && model="-"
        [[ -z "$serial" ]] && serial="-"
        printf "%-15s %-10s %-10s %-40s %-25s\n" "$device" "$size" "NVMe" "$model" "$serial"
    done
fi

#------------------------------------------------------------------------------
@@ -251,12 +398,17 @@ fi
#------------------------------------------------------------------------------

# Ceph RBD Devices
rbd_output=$(lsblk -o NAME,SIZE,TYPE,MOUNTPOINT 2>/dev/null | grep "rbd" | sort -V)
if [ -n "$rbd_output" ]; then
rbd_devices=$(lsblk -d -n -o NAME,SIZE,TYPE 2>/dev/null | grep "rbd" | sort -V)
if [ -n "$rbd_devices" ]; then
    echo -e "\n=== Ceph RBD Devices ==="
    printf "%-15s %-10s %-10s %-20s\n" "DEVICE" "SIZE" "TYPE" "MOUNTPOINT"
    printf "%-15s %-10s %-10s %-30s\n" "DEVICE" "SIZE" "TYPE" "MOUNTPOINT"
    echo "------------------------------------------------------------"
    echo "$rbd_output"
    echo "$rbd_devices" | while read -r name size type; do
        # Get mountpoint if any
        mountpoint=$(lsblk -n -o MOUNTPOINT "/dev/$name" 2>/dev/null | head -1)
        [[ -z "$mountpoint" ]] && mountpoint="-"
        printf "%-15s %-10s %-10s %-30s\n" "/dev/$name" "$size" "$type" "$mountpoint"
    done
fi

# Show mapping diagnostic info if DEBUG is set

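The STATUS column in the drive table above comes from a single row of `ceph osd tree`. As a minimal sketch of that step, the parsing can be isolated so it can be tried on pasted text without a cluster. The helper name `osd_state_from_tree_line` and the sample rows are hypothetical, and the sketch assumes the row carries a CLASS column, so that STATUS and REWEIGHT land in fields 5 and 6 as the script's own comment describes. It lets awk do the numeric comparison instead of `bc`, which is a swapped-in choice rather than what the diff does.

```shell
# Hypothetical helper (not part of the diff above): turn one OSD row of
# `ceph osd tree` into "up/in", "down/out", and so on.
# Assumed row layout: ID CLASS WEIGHT NAME STATUS REWEIGHT PRI-AFF
osd_state_from_tree_line() {
    printf '%s\n' "$1" | awk '{
        status = ($5 == "") ? "unknown" : $5     # field 5: up or down
        state  = ($6 + 0 > 0) ? "in" : "out"     # field 6: reweight, 0 means out
        print status "/" state
    }'
}

# Sample rows written by hand for illustration, not captured from a cluster.
osd_state_from_tree_line " 3   hdd  3.63869      osd.3      up   1.00000  1.00000"
osd_state_from_tree_line " 4   hdd  3.63869      osd.4    down         0  1.00000"
```

An empty row falls back to `unknown/out`, which matches the defaults the script applies when the tree line cannot be parsed.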
11
get-serials.sh
Normal file
@@ -0,0 +1,11 @@
#!/bin/bash

echo "=== Drive Serial Numbers ==="
for dev in sd{a..j}; do
    if [ -b "/dev/$dev" ]; then
        serial=$(sudo smartctl -i /dev/$dev 2>/dev/null | grep "Serial Number" | awk '{print $3}')
        model=$(sudo smartctl -i /dev/$dev 2>/dev/null | grep "Device Model\|Model Number" | cut -d: -f2 | xargs)
        size=$(lsblk -d -n -o SIZE /dev/$dev 2>/dev/null)
        echo "/dev/$dev: $serial ($size - $model)"
    fi
done
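`get-serials.sh` takes the serial with `awk '{print $3}'`, which keeps only the first whitespace-separated word after the label, while the model is taken with `cut -d: -f2 | xargs`. A possible way to read both fields the same way is sketched below. The helper name `smart_field` and the sample text are hypothetical; the sketch only assumes the `Label:   value` line shape that the greps in the script already rely on.

```shell
# Hypothetical helper: print the value of one "Label:   value" line from
# `smartctl -i` style text supplied on stdin, with surrounding blanks trimmed.
smart_field() {
    awk -F: -v label="$1" '
        $1 == label {
            sub(/^[^:]*:[ \t]*/, "")   # drop the label and leading blanks
            sub(/[ \t]+$/, "")         # drop trailing blanks
            print
            exit                       # first match only
        }'
}

# Hand-written sample in the shape of `smartctl -i` output (not real data).
sample='Device Model:     EXAMPLE HDD 4TB
Serial Number:    ABC 123-XYZ
User Capacity:    4,000,787,030,016 bytes [4.00 TB]'

printf '%s\n' "$sample" | smart_field "Serial Number"
printf '%s\n' "$sample" | smart_field "Device Model"
```

With a helper like this, `smartctl -i` would only need to run once per device, with its output captured in a variable and piped in for each field.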
11
test-paths.sh
Normal file
@@ -0,0 +1,11 @@
#!/bin/bash

echo "=== Checking /dev/disk/by-path/ ==="
ls -la /dev/disk/by-path/ | grep -v "part" | grep "pci-0000:0c:00.0" | head -20
echo ""
echo "=== Checking if paths exist from mapping ==="
echo "pci-0000:0c:00.0-ata-3:"
ls -la /dev/disk/by-path/pci-0000:0c:00.0-ata-3 2>&1

echo "pci-0000:0c:00.0-ata-1:"
ls -la /dev/disk/by-path/pci-0000:0c:00.0-ata-1 2>&1
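`test-paths.sh` checks two hardcoded entries under controller `0c:00.0`. The same check can be sketched for any by-path entry as below. The helper name `resolve_by_path` is hypothetical, and the demo uses a throwaway directory with a fake symlink so it runs without real hardware; on a server the first argument would be `/dev/disk/by-path`. It relies on `readlink -f` from coreutils.

```shell
# Hypothetical helper: resolve one by-path entry to its kernel device name.
#   $1 = directory holding the symlinks (normally /dev/disk/by-path)
#   $2 = entry name, e.g. pci-0000:01:00.0-sas-phy6-lun-0
# Prints the device name (e.g. sdb), or EMPTY if the entry does not exist.
resolve_by_path() {
    if [ -L "$1/$2" ]; then
        basename "$(readlink -f "$1/$2")"
    else
        echo "EMPTY"
    fi
}

# Demo against a temporary directory standing in for /dev/disk/by-path.
demo_dir=$(mktemp -d)
touch "$demo_dir/sdb"
ln -s "$demo_dir/sdb" "$demo_dir/pci-0000:01:00.0-sas-phy6-lun-0"

resolve_by_path "$demo_dir" "pci-0000:01:00.0-sas-phy6-lun-0"
resolve_by_path "$demo_dir" "pci-0000:01:00.0-sas-phy7-lun-0"
rm -rf "$demo_dir"
```

Looping this over the path column of a `SERVER_MAPPINGS` entry would show which mapped bays resolve to a device on the current host.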