hwmonDaemon

Author	SHA1	Message	Date
Jared Vititoe	63daa57d80	Fix missing drive capacity in ticket titles Problem: Drive capacity was being extracted but never inserted into ticket titles. The drive_size variable was calculated from drive details but omitted from the ticket_title string construction. Solution: Added drive_size to ticket title format between category and issue. Example ticket titles now show: - Before: "[hostname][auto][hardware]Drive /dev/sda has SMART issues..." - After: "[hostname][auto][hardware][16.0 TB] Drive /dev/sda has SMART issues..." This makes it easier to identify which drives need attention at a glance. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-06 17:15:02 -05:00
Jared Vititoe	72e61bd94e	Add manual execution instructions to README Added comprehensive manual execution section with both: - Direct execution from local file (python3 hwmonDaemon.py) - Remote execution matching systemd service (one-liner download+exec) Both modes include dry-run and normal execution examples for testing and production use. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-06 17:03:27 -05:00
Jared Vititoe	841db13459	Fix false positive ticket creation for manufacturer operation counters Problem: Seagate drives were triggering tickets for "Critical Seek_Error_Rate" and "Critical Command_Timeout" even though these are operation counters used by the manufacturer, not actual errors. Solution: Added filtering in _detect_issues() method to skip known manufacturer operation counters: - Seek_Error_Rate (Seagate/WD operation counter) - Command_Timeout (OOS/Seagate operation counter) - Raw_Read_Error_Rate (Seagate/WD operation counter) These attributes are already correctly excluded from monitoring in manufacturer profiles, but were still appearing in smart_issues list. This fix prevents them from creating tickets while still catching legitimate SMART errors. Changes: - hwmonDaemon.py:1351-1378 - Added operation counter filtering in _detect_issues() - Added debug logging when filtering manufacturer counters 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-06 17:00:32 -05:00
Jared Vititoe	10b548cd79	Update README with hourly execution schedule and recent improvements - Document hourly execution (changed from daily) - Add version 2.0 improvements section - Document 10MB storage limit and automatic cleanup - Clarify service configuration details 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-06 16:57:16 -05:00
Jared Vititoe	fe832c42f3	Fix critical reliability and security issues in hwmonDaemon Critical fixes implemented: - Add 10MB storage limit with automatic cleanup of old history files - Add file locking (fcntl) to prevent race conditions in history writes - Disable SMART monitoring for unreliable Ridata drives - Fix bare except clause in _read_ecc_count() to properly catch errors - Add timeouts to all network and subprocess calls (10s for API, 30s for subprocess) - Fix unchecked regex in ticket creation to prevent AttributeError - Add JSON decode error handling for ticket API responses Service configuration improvements: - hwmon.timer: Reduce jitter from 300s to 60s, add Persistent=true - hwmon.service: Add Restart=on-failure, TimeoutStartSec=300, logging to journal These changes improve reliability, prevent hung processes, eliminate race conditions, and add proper error handling throughout the daemon. 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-06 16:55:48 -05:00
Jared Vititoe	0577c7fc1b	add api key support	2026-01-01 16:01:55 -05:00
Jared Vititoe	cc62aabfe4	Merge branch 'main' of 10.10.10.63:LotusGuild/hwmonDaemon	2026-01-01 15:50:30 -05:00
Jared Vititoe	546ef066f8	API Key Auth	2026-01-01 15:45:29 -05:00
Jared Vititoe	9dc3b60a73	Update hwmon.service	2025-11-29 16:04:43 -05:00
Jared Vititoe	0239d64ec3	Update hwmon.service	2025-11-25 20:29:52 -05:00
Jared Vititoe	0326c5142e	Updated hdd temp thresholds	2025-09-03 21:06:12 -04:00
Jared Vititoe	0ab728da47	Better manufacturer detection and values	2025-09-03 13:14:43 -04:00
Jared Vititoe	4b68b0b525	Added custom config for OOS12000G	2025-09-03 13:02:32 -04:00
Jared Vititoe	2d6626cece	Fixed thresholds for thermals and smart	2025-09-03 12:58:30 -04:00
Jared Vititoe	bc73a691df	data retention and large refactor of codebase	2025-09-03 12:43:16 -04:00
Jared Vititoe	3d902620b0	Removed unnecessary logging	2025-09-02 17:50:05 -04:00
Jared Vititoe	cae4bf031b	Updated priority system	2025-08-17 09:48:25 -04:00
Jared Vititoe	fb1a9f67e1	Updated CPU threshold	2025-07-25 17:36:21 -04:00
Jared Vititoe	0faf7654d6	Huge update to vendor profiles	2025-07-24 19:15:21 -04:00
Jared Vititoe	a74c4c0309	Erase_Fail_Count matched two values	2025-06-24 15:14:35 -04:00
Jared Vititoe	9a700e9853	Attempted fix for lxc storage	2025-05-29 20:23:21 -04:00
Jared Vititoe	1371592b9e	Update LXC storage utilization function	2025-05-29 20:16:50 -04:00
Jared Vititoe	6907f71de1	Updated LXC storage checks	2025-05-29 19:50:17 -04:00
Jared Vititoe	20eb1f9a11	firmware pattern matching	2025-05-29 19:30:06 -04:00
Jared Vititoe	5ac12fd6b7	Correction of deleted code	2025-05-29 19:04:45 -04:00
Jared Vititoe	1e6260a899	Better identification of RiData drives	2025-05-29 19:02:27 -04:00
Jared Vititoe	95a5a8227a	NoneType fix?	2025-05-29 12:44:55 -04:00
Jared Vititoe	f8784eddd2	Added null safety checks	2025-05-29 11:44:07 -04:00
Jared Vititoe	147947b8ca	Testing manufacturer specific smart tests	2025-05-28 14:59:47 -04:00
Jared Vititoe	22bdaa9401	Updated ticket priorities for different drive failures	2025-05-14 21:22:44 -04:00
Jared Vititoe	40b7eb5641	Updated indices	2025-05-14 21:17:52 -04:00
Jared Vititoe	6fb0d89519	lxc storage indices increased by 1	2025-05-14 21:13:09 -04:00
Jared Vititoe	53b9169da2	test single node change	2025-05-14 21:07:59 -04:00
Jared Vititoe	a34b59ad36	Updated drive firmware checks	2025-05-14 21:01:40 -04:00
Jared Vititoe	0384270dfc	Software failure not hardware	2025-05-12 16:12:46 -04:00
Jared Vititoe	1f52a6b4f5	Full traceback to see where error is	2025-05-12 16:04:43 -04:00
Jared Vititoe	c807a6309a	Updated mountpoint catching	2025-05-12 16:00:24 -04:00
Jared Vititoe	3d2fdac3f3	Attempt fix 1	2025-05-12 15:53:32 -04:00
Jared Vititoe	af1121e3d9	Updated drive ticket creation	2025-05-12 15:47:14 -04:00
Jared Vititoe	20f51e0b25	Updated lxc file system matching	2025-05-12 15:41:01 -04:00
Jared Vititoe	4fe0a8dbfc	Different variable for issue type	2025-05-12 15:35:42 -04:00
Jared Vititoe	65ba24e46d	if drive not in issue	2025-05-12 15:32:54 -04:00
Jared Vititoe	e5175f53e5	debug description	2025-05-12 15:22:46 -04:00
Jared Vititoe	bd6d89c4e3	Update make_box	2025-05-12 15:14:06 -04:00
Jared Vititoe	2a6025f5f2	updated parsing	2025-03-09 22:25:29 -04:00
Jared Vititoe	adafa796f1	adjusted lxc storage	2025-03-09 22:17:10 -04:00
Jared Vititoe	f8ea49f099	updated parse size	2025-03-09 22:12:20 -04:00
Jared Vititoe	6a6a400320	adjust regex	2025-03-09 22:09:24 -04:00
Jared Vititoe	8f87403d48	idk	2025-03-09 22:00:23 -04:00
Jared Vititoe	519a30e11e	update parsing	2025-03-09 21:47:41 -04:00
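
Commit 841db13459 describes skipping known manufacturer operation counters so they no longer create tickets. The following is a minimal sketch of that kind of filter, not the code in hwmonDaemon.py: the function name `filter_operation_counters`, the constant `OPERATION_COUNTERS`, and the exact shape of the issue strings are assumptions made for illustration. Only the three attribute names come from the commit message.

```python
import logging

logger = logging.getLogger("hwmon_sketch")  # hypothetical logger name

# Attribute names listed in commit 841db13459 as operation counters.
OPERATION_COUNTERS = frozenset({
    "Seek_Error_Rate",
    "Command_Timeout",
    "Raw_Read_Error_Rate",
})


def filter_operation_counters(smart_issues):
    """Return the issues that do not mention a known operation counter.

    Assumes each issue is a plain string that contains the SMART
    attribute name, e.g. "Critical Seek_Error_Rate: 12345678".
    """
    kept = []
    for issue in smart_issues:
        if any(counter in issue for counter in OPERATION_COUNTERS):
            # Mirrors the debug logging mentioned in the commit message.
            logger.debug("Skipping operation counter issue: %s", issue)
            continue
        kept.append(issue)
    return kept


# Illustrative input; the values are made up for the example.
issues = [
    "Critical Seek_Error_Rate: 12345678",
    "Critical Reallocated_Sector_Ct: 8",
    "Critical Command_Timeout: 4",
]
remaining = filter_operation_counters(issues)
print(remaining)
```

With this input only the `Reallocated_Sector_Ct` entry survives, which matches the stated goal of suppressing counter noise while still catching legitimate SMART errors.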


165 Commits