diff --git a/README.md b/README.md
new file mode 100644
index 0000000..a2957ec
--- /dev/null
+++ b/README.md
@@ -0,0 +1,84 @@
+# System Health Monitoring Daemon
+
+A robust system health monitoring daemon that tracks hardware status and automatically creates tickets for detected issues.
+
+## Features
+
+- Comprehensive system health monitoring:
+  - Drive health (SMART status and disk usage)
+  - Memory usage
+  - CPU utilization
+  - Network connectivity (Management and Ceph networks)
+- Automatic ticket creation for detected issues
+- Configurable thresholds and monitoring parameters
+- Dry-run mode for testing
+- Systemd integration for automated daily checks
+
+## Installation
+
+1. Copy the service and timer files to systemd:
+```bash
+sudo cp hwmon.service /etc/systemd/system/
+sudo cp hwmon.timer /etc/systemd/system/
+```
+2. Reload systemd daemon:
+```bash
+sudo systemctl daemon-reload
+```
+3. Enable and start the timer:
+```bash
+sudo systemctl enable hwmon.timer
+sudo systemctl start hwmon.timer
+```
+
+
+## Manual Execution
+
+1. Run the daemon with dry-run mode to test:
+```bash
+python3 hwmonDaemon.py --dry-run
+```
+2. Run the daemon normally:
+```bash
+python3 hwmonDaemon.py
+```
+
+
+## Configuration
+
+The daemon monitors:
+
+- Disk usage (warns at 80%, critical at 90%)
+- Memory usage (warns at 80%)
+- CPU usage (warns at 80%)
+- Network connectivity to management (10.10.10.1) and Ceph (10.10.90.1) networks
+- SMART status of physical drives
+
+## Ticket Creation
+
+The daemon automatically creates tickets with:
+
+- Standardized titles including hostname, hardware type, and scope
+- Detailed descriptions of detected issues
+- Priority levels based on severity (P2-P4)
+- Proper categorization and status tracking
+
+## Dependencies
+
+- Python 3
+- Required Python packages:
+  - psutil
+  - requests
+  - smartmontools (for SMART disk monitoring)
+
+## Service Configuration
+
+The daemon runs:
+
+- Daily via systemd timer
+- As root user for hardware access
+- With automatic restart on failure
+
+## Security Note
+
+Ensure proper network security measures are in place as the service downloads and executes code from a specified URL.
diff --git a/hwmon.service b/hwmon.service
index 387c47e..5464545 100644
--- a/hwmon.service
+++ b/hwmon.service
@@ -10,4 +10,4 @@ User=root
 Group=root
 
 [Install]
-WantedBy=multi-user.target
+WantedBy=multi-user.target
\ No newline at end of file