Fix false positive ticket creation for manufacturer operation counters

Problem: Seagate drives were triggering tickets for "Critical Seek_Error_Rate" and "Critical Command_Timeout" even though these are operation counters used by the manufacturer, not actual errors.

Solution: Added filtering in _detect_issues() method to skip known manufacturer operation counters:
- Seek_Error_Rate (Seagate/WD operation counter)
- Command_Timeout (OOS/Seagate operation counter)
- Raw_Read_Error_Rate (Seagate/WD operation counter)

These attributes are already correctly excluded from monitoring in manufacturer profiles, but were still appearing in smart_issues list. This fix prevents them from creating tickets while still catching legitimate SMART errors.

Changes:
- hwmonDaemon.py:1351-1378 - Added operation counter filtering in _detect_issues()
- Added debug logging when filtering manufacturer counters

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
2026-01-06 17:00:32 -05:00
parent 10b548cd79
commit 841db13459
1 changed files with 17 additions and 4 deletions
--- a/hwmonDaemon.py
+++ b/hwmonDaemon.py
@@ -1350,17 +1350,30 @@ class SystemHealthMonitor:
            # Only report issues for drives with valid SMART status
            if drive.get('smart_issues') and drive.get('smart_status') in ['HEALTHY', 'UNHEALTHY', 'UNKNOWN']:
-                # Filter out generic error messages that don't indicate real hardware issues
+                # Filter out generic error messages and manufacturer-specific false positives
                filtered_issues = []
                for issue in drive['smart_issues']:
-                    if not any(skip_phrase in issue for skip_phrase in [
+                    # Skip generic errors
+                    if any(skip_phrase in issue for skip_phrase in [
                        "Error checking SMART:",
                        "Unable to read device information",
                        "SMART not supported",
                        "timed out"
                    ]):
-                        filtered_issues.append(issue)
+                        continue
-                
+
+                    # Skip manufacturer-specific operation counters (not actual errors)
+                    # These are monitored attributes that manufacturers use as counters
+                    if any(counter_name in issue for counter_name in [
+                        "Seek_Error_Rate",      # Seagate/WD use as operation counter
+                        "Command_Timeout",       # OOS/Seagate use as operation counter
+                        "Raw_Read_Error_Rate"   # Seagate/WD use as operation counter
+                    ]):
+                        logger.debug(f"Filtering manufacturer operation counter from issues: {issue}")
+                        continue
+                    filtered_issues.append(issue)
                if filtered_issues:
                    issues.append(f"Drive {drive['device']} has SMART issues: {', '.join(filtered_issues)}")
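
For reference, the filtering behaviour in this hunk can be tried in isolation with a minimal standalone sketch. The function name `filter_smart_issues`, the two module-level tuples, and the sample issue strings are hypothetical and exist only for illustration; in hwmonDaemon.py the logic is inline in `_detect_issues()`.

```python
import logging

logger = logging.getLogger(__name__)

# Phrases copied from the diff: generic errors that do not indicate real hardware issues.
GENERIC_ERROR_PHRASES = (
    "Error checking SMART:",
    "Unable to read device information",
    "SMART not supported",
    "timed out",
)

# Attribute names copied from the diff: manufacturer operation counters, not errors.
OPERATION_COUNTERS = (
    "Seek_Error_Rate",
    "Command_Timeout",
    "Raw_Read_Error_Rate",
)


def filter_smart_issues(smart_issues):
    """Return only the issues that should still be reported (hypothetical helper)."""
    filtered_issues = []
    for issue in smart_issues:
        # Skip generic errors.
        if any(phrase in issue for phrase in GENERIC_ERROR_PHRASES):
            continue
        # Skip manufacturer operation counters, logging each one at debug level.
        if any(counter in issue for counter in OPERATION_COUNTERS):
            logger.debug("Filtering manufacturer operation counter from issues: %s", issue)
            continue
        filtered_issues.append(issue)
    return filtered_issues


if __name__ == "__main__":
    # Sample strings are made up; real issue text comes from the daemon's SMART checks.
    sample = [
        "Critical Seek_Error_Rate: 12345",
        "SMART not supported",
        "Critical Reallocated_Sector_Ct: 8",
    ]
    print(filter_smart_issues(sample))
```

Because matching is by substring, any issue text that merely mentions one of these attribute names is dropped, regardless of drive manufacturer.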