# System Health Monitoring Daemon

A robust system health monitoring daemon that tracks hardware status and automatically creates tickets for detected issues.

## Features

- Comprehensive system health monitoring:
  - Drive health (SMART status and disk usage)
  - Memory usage
  - CPU utilization
  - Network connectivity (Management and Ceph networks)
- Automatic ticket creation for detected issues
- Configurable thresholds and monitoring parameters
- Dry-run mode for testing
- Systemd integration for automated daily checks

## Installation

1. Copy the service and timer files to systemd:
   ```bash
   sudo cp hwmon.service /etc/systemd/system/
   sudo cp hwmon.timer /etc/systemd/system/
   ```

2. Reload systemd daemon:
   ```bash
   sudo systemctl daemon-reload
   ```

3. Enable and start the timer:
   ```bash
   sudo systemctl enable hwmon.timer
   sudo systemctl start hwmon.timer
   ```

## Manual Execution

1. Run the daemon with dry-run mode to test:
   ```bash
   python3 hwmonDaemon.py --dry-run
   ```

2. Run the daemon normally:
   ```bash
   python3 hwmonDaemon.py
   ```

## Configuration

The daemon monitors:

- Disk usage (warns at 80%, critical at 90%)
- Memory usage (warns at 80%)
- CPU usage (warns at 80%)
- Network connectivity to management (10.10.10.1) and Ceph (10.10.90.1) networks
- SMART status of physical drives

## Ticket Creation

The daemon automatically creates tickets with:

- Standardized titles including hostname, hardware type, and scope
- Detailed descriptions of detected issues
- Priority levels based on severity (P2-P4)
- Proper categorization and status tracking

## Dependencies

- Python 3
- Required Python packages:
  - psutil
  - requests
- smartmontools (for SMART disk monitoring)

## Service Configuration

The daemon runs:

- Daily via systemd timer
- As root user for hardware access
- With automatic restart on failure

## Security Note

Ensure proper network security measures are in place as the service downloads and executes code from a specified URL.
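
## Example Threshold Check

The thresholds listed under Configuration (disk: warn at 80%, critical at 90%; memory and CPU: warn at 80%) can be illustrated with a minimal sketch. This is not the code in `hwmonDaemon.py`: the names `classify_usage` and `disk_percent_used` are hypothetical, the sketch assumes a threshold is inclusive (exactly 80% already warns), and it uses only the standard library rather than psutil, so the disk percentage may differ slightly from what other tools report.

```python
import shutil

# Thresholds from the Configuration section, in percent used.
DISK_WARN = 80.0
DISK_CRITICAL = 90.0
MEMORY_WARN = 80.0  # memory and CPU only have a warning level


def classify_usage(percent_used, warn=DISK_WARN, critical=DISK_CRITICAL):
    """Map a usage percentage to 'ok', 'warning', or 'critical'.

    Assumption: thresholds are inclusive, so 80.0 is already a warning.
    """
    if percent_used >= critical:
        return "critical"
    if percent_used >= warn:
        return "warning"
    return "ok"


def disk_percent_used(path="/"):
    """Return the percentage of the filesystem at `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


if __name__ == "__main__":
    pct = disk_percent_used("/")
    print(f"/ is {pct:.1f}% full: {classify_usage(pct)}")
```

For a metric with only a warning level, passing `critical=float("inf")` keeps the same function usable, for example `classify_usage(85.0, warn=MEMORY_WARN, critical=float("inf"))`.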