AI-Assisted Development Notes
This document chronicles the development of Drive Atlas with assistance from Claude (Anthropic's AI assistant).
Project Overview
Drive Atlas started as a simple bash script with hardcoded drive mappings and evolved into a comprehensive storage infrastructure management tool through iterative development and user feedback.
Development Session
Date: January 6, 2026
AI Model: Claude Sonnet 4.5
Developer: LotusGuild
Session Duration: ~2 hours
Initial State
The project began with:
- Basic ASCII art layouts for different server chassis
- Hardcoded drive mappings for "medium2" server
- Simple SMART data display
- Broken PCI path mappings (referenced non-existent hardware)
- Windows line endings causing script execution failures
Evolution Through Collaboration
Phase 1: Architecture Refactoring
Problem: Chassis layouts were tied to hostnames, making it hard to reuse templates.
Solution:
- Separated chassis types from server hostnames
- Created reusable layout generator functions
- Introduced `CHASSIS_TYPES` and `SERVER_MAPPINGS` arrays (sketched below)
- Renamed "medium2" → "compute-storage-01" for clarity
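A minimal sketch of what that separation could look like, assuming bash associative arrays. Only the two array names and the "compute-storage-01" → 10-bay pairing come from these notes; the contents and the layout function are hypothetical stand-ins:

```bash
#!/usr/bin/env bash
# Hypothetical shape of the refactored lookup tables.

layout_10bay() { echo "(draws the 10-bay chassis ASCII art)"; }

declare -A CHASSIS_TYPES=(
    ["10-bay"]="layout_10bay"          # chassis type -> layout generator
)

declare -A SERVER_MAPPINGS=(
    ["compute-storage-01"]="10-bay"    # hostname -> chassis type
)

host="compute-storage-01"
chassis="${SERVER_MAPPINGS[$host]}"
"${CHASSIS_TYPES[$chassis]}"           # render this host's layout
```

Decoupling hostnames from chassis types means a new server with an already-known chassis needs only one new `SERVER_MAPPINGS` entry.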
Phase 2: Hardware Discovery
Problem: The script referenced a PCI controller at `0c:00.0` that didn't exist.
Approach:
- Created diagnostic script to probe actual hardware
- Discovered the real configuration:
  - LSI SAS3008 HBA at `01:00.0` (bays 5-10)
  - AMD SATA controller at `0d:00.0` (bays 1-4)
  - NVMe at `0e:00.0` (M.2 slot)
- User provided physical bay labels and visible serial numbers
- Iteratively refined the PCI PHY-to-bay mappings
Key Insight: User confirmed bay 1 contained the SSD boot drive, which helped establish the correct mapping starting point.
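The diagnostic script itself is not reproduced in these notes, but a probe in its spirit might look like the following; the commands are standard Linux tooling, and the exact checks the real script performed are an assumption:

```bash
#!/usr/bin/env bash
# Probe actual hardware instead of assuming controller addresses.

echo "== Storage controllers =="
lspci | grep -Ei 'sata|sas|nvme|raid'

echo
echo "== Block devices by PCI path =="
for link in /dev/disk/by-path/pci-*; do
    [[ "$link" == *-part* ]] && continue   # skip partition entries
    printf '%-60s -> %s\n' "${link##*/}" "$(readlink -f "$link")"
done
```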
Phase 3: Physical Verification
Problem: Needed to verify drive-to-bay mappings without powering down production server.
Solution:
- Added serial number display to script output
- User physically inspected visible serial numbers on drive bays
- Cross-referenced SMART serials with visible labels
- Corrected HBA PHY mappings:
- Bay 5: phy6 (not phy2)
- Bay 6: phy7 (not phy3)
- Bay 7: phy5 (not phy4)
- Bay 8: phy2 (not phy5)
- Bay 9: phy4 (not phy6)
- Bay 10: phy3 (not phy7)
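A sketch of the serial display that made this verification possible, assuming `smartctl` as the SMART tool and root privileges; the loop and formatting are illustrative, not the script's actual code:

```bash
#!/usr/bin/env bash
# Print each drive's SMART serial so it can be matched against the
# label visible on the physical tray.

for dev in /dev/sd?; do
    serial=$(smartctl -i "$dev" | awk -F': *' '/^Serial Number/ {print $2}')
    printf '%-10s %s\n' "$dev" "${serial:-unknown}"
done
```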
Phase 4: User Experience Improvements
ASCII Art Rendering:
- Initial version had variable-width boxes that broke alignment
- Fixed by using consistent 10-character wide bay boxes
- Multiple iterations to perfect right border alignment
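The fix boils down to padding every box to the same width. A toy illustration (the real layout generators are more elaborate):

```bash
#!/usr/bin/env bash
# Fixed-width bay boxes: 8 columns of content + 2 border chars = 10.

print_bay() {
    printf '|%-8s|' " Bay $1"
}

for bay in 1 2 3 4 5; do print_bay "$bay"; done
echo   # -> | Bay 1  || Bay 2  || Bay 3  || Bay 4  || Bay 5  |
```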
Drive Table Enhancements:
- Original: Alphabetical by device name
- Improved: Sorted by physical bay position (1-10)
- Added BAY column to show physical location
- Wider columns to prevent text wrapping
Phase 5: Ceph Integration
User Request: "Can we show ceph in/up out/down status in the table?"
Implementation:
- Added a CEPH OSD column using `ceph-volume lvm list`
- Added a STATUS column by parsing `ceph osd tree`
- Initial bug: parsed the wrong columns of the output
- Fixed by understanding the `ceph osd tree` format (see the sketch below):
  - Column 5: STATUS (up/down)
  - Column 6: REWEIGHT (1.0 = in, 0 = out)
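Putting those pieces together, the lookup for a single OSD might look like this sketch; it reuses the grep pattern and `bc` fallback quoted under "Technical Challenges Solved" below, while the surrounding plumbing is an assumption:

```bash
#!/usr/bin/env bash
# Read STATUS (field 5) and REWEIGHT (field 6) for one OSD row.

osd_num=3   # hypothetical OSD id

read -r status reweight < <(
    ceph osd tree | grep -E "^\s*${osd_num}\s+" | awk '{print $5, $6}'
)

in_out="out"
if (( $(echo "$reweight > 0" | bc -l 2>/dev/null || echo 0) )); then
    in_out="in"
fi
echo "osd.${osd_num}: ${status:-unknown}/${in_out}"
```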
User Request: "Show which is the boot drive somehow?"
Solution:
- Added a USAGE column that checks mount points:
  - Shows "BOOT" for the root filesystem
  - Shows the mount point for other mounts
  - Shows "-" for Ceph OSDs (which use LVM)
Technical Challenges Solved
1. Line Ending Issues
- Problem: `diagnose-drives.sh` had CRLF endings → script failures
- Solution: `sed -i 's/\r$//'` to convert to LF
2. PCI Path Pattern Matching
- Problem: Bash regex escaping for grep patterns
- Solution: `grep -E "^\s*${osd_num}\s+"` for reliable matching
3. Floating Point Comparison in Bash
- Problem: Bash doesn't natively support decimal comparisons
- Solution: Used `bc -l` with error handling: `$(echo "$reweight > 0" | bc -l 2>/dev/null || echo 0)`
4. Associative Array Sorting
- Problem: Bash associative arrays don't maintain insertion order
- Solution: Extract the keys, filter the numeric ones, and pipe them to `sort -n` (sketched below)
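In bash that solution can be as small as this (the sample data is illustrative):

```bash
#!/usr/bin/env bash
# Associative-array keys come back in arbitrary order; sort them first.

declare -A BAY_TO_DEV=([10]="/dev/sdf" [2]="/dev/sdb" [1]="/dev/sda")

for bay in $(printf '%s\n' "${!BAY_TO_DEV[@]}" | grep -E '^[0-9]+$' | sort -n); do
    printf 'Bay %-3s %s\n' "$bay" "${BAY_TO_DEV[$bay]}"
done
```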
Key Learning Moments
- Hardware Reality vs. Assumptions: The original script assumed controller addresses that didn't exist. Always probe actual hardware.
- Physical Verification is Essential: Serial numbers visible on drive trays were crucial for verifying correct mappings.
- Iterative Refinement: The script went through 15+ commits, each improving a specific aspect based on user testing and feedback.
- User-Driven Feature Evolution: Features like Ceph integration and boot drive detection emerged organically from user needs.
Commits Timeline
- Initial refactoring and architecture improvements
- Fixed PCI path mappings based on discovered hardware
- Added serial numbers for physical verification
- Fixed ASCII art rendering issues
- Corrected bay mappings based on user verification
- Added bay-sorted output
- Implemented Ceph OSD tracking
- Added Ceph up/in status
- Added boot drive detection
- Fixed Ceph status parsing
- Documentation updates
Collaborative Techniques Used
Information Gathering
- Asked clarifying questions about hardware configuration
- Requested diagnostic command output
- Had user physically verify drive locations
Iterative Development
- Made small, testable changes
- User tested after each significant change
- Incorporated feedback immediately
Problem-Solving Approach
- Understand current state
- Identify specific issues
- Propose solution
- Implement incrementally
- Test and verify
- Refine based on feedback
Metrics
- Lines of Code: ~330 (main script)
- Supported Chassis Types: 4 (10-bay, large1, micro, spare)
- Mapped Servers: 1 fully (compute-storage-01), 3 pending
- Features Added: 10+
- Bugs Fixed: 6 major, multiple minor
- Documentation: Comprehensive README + this file
Future Enhancements
Potential improvements identified during development:
- Auto-detection: Attempt to auto-map bays by testing with `hdparm` LED control
- Color Output: Use terminal colors for health status (green/red)
- Historical Tracking: Log temperature trends over time
- Alert Integration: Notify when drive health deteriorates
- Web Interface: Display chassis map in a web dashboard
- Multi-server View: Show all servers in one consolidated view
Lessons for Future AI-Assisted Development
What Worked Well
- Breaking complex problems into small, testable pieces
- Using diagnostic scripts to understand actual vs. assumed state
- Physical verification before trusting software output
- Comprehensive documentation alongside code
- Git commits with detailed messages for traceability
What Could Be Improved
- Earlier physical verification would have saved iteration cycles
- More upfront hardware documentation would help
- Automated testing for bay mappings (if possible)
Conclusion
This project demonstrates effective human-AI collaboration where:
- The AI provided technical implementation and problem-solving
- The human provided domain knowledge, testing, and verification
- Iterative feedback loops led to a polished, production-ready tool
The result is a robust infrastructure management tool that provides instant visibility into complex storage configurations across multiple servers.
Development Credits:
- Human Developer: LotusGuild
- AI Assistant: Claude Sonnet 4.5 (Anthropic)
- Development Date: January 6, 2026
- Project: Drive Atlas v1.0