Fix false positive ticket creation for manufacturer operation counters
Problem:
Seagate drives were triggering tickets for "Critical Seek_Error_Rate" and
"Critical Command_Timeout" even though these are operation counters used by
the manufacturer, not actual errors.

Solution:
Added filtering in _detect_issues() method to skip known manufacturer
operation counters:
- Seek_Error_Rate (Seagate/WD operation counter)
- Command_Timeout (OOS/Seagate operation counter)
- Raw_Read_Error_Rate (Seagate/WD operation counter)

These attributes are already correctly excluded from monitoring in
manufacturer profiles, but were still appearing in smart_issues list.
This fix prevents them from creating tickets while still catching
legitimate SMART errors.

Changes:
- hwmonDaemon.py:1351-1378 - Added operation counter filtering in _detect_issues()
- Added debug logging when filtering manufacturer counters

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
@@ -1350,17 +1350,30 @@ class SystemHealthMonitor:
 
             # Only report issues for drives with valid SMART status
             if drive.get('smart_issues') and drive.get('smart_status') in ['HEALTHY', 'UNHEALTHY', 'UNKNOWN']:
-                # Filter out generic error messages that don't indicate real hardware issues
+                # Filter out generic error messages and manufacturer-specific false positives
                 filtered_issues = []
                 for issue in drive['smart_issues']:
-                    if not any(skip_phrase in issue for skip_phrase in [
+                    # Skip generic errors
+                    if any(skip_phrase in issue for skip_phrase in [
                         "Error checking SMART:",
                         "Unable to read device information",
                         "SMART not supported",
                         "timed out"
                     ]):
-                        filtered_issues.append(issue)
+                        continue
+
+                    # Skip manufacturer-specific operation counters (not actual errors)
+                    # These are monitored attributes that manufacturers use as counters
+                    if any(counter_name in issue for counter_name in [
+                        "Seek_Error_Rate", # Seagate/WD use as operation counter
+                        "Command_Timeout", # OOS/Seagate use as operation counter
+                        "Raw_Read_Error_Rate" # Seagate/WD use as operation counter
+                    ]):
+                        logger.debug(f"Filtering manufacturer operation counter from issues: {issue}")
+                        continue
+
+                    filtered_issues.append(issue)
 
                 if filtered_issues:
                     issues.append(f"Drive {drive['device']} has SMART issues: {', '.join(filtered_issues)}")
 
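The filtering added in this hunk can be tried in isolation. The following is a minimal sketch, not the code in hwmonDaemon.py: the function name `filter_smart_issues` and the two module-level tuples are hypothetical names introduced here for illustration, while the phrase and counter lists are copied from the diff.

```python
import logging

logger = logging.getLogger(__name__)

# Phrases that mark generic tool failures rather than hardware faults
# (copied from the diff).
GENERIC_ERROR_PHRASES = (
    "Error checking SMART:",
    "Unable to read device information",
    "SMART not supported",
    "timed out",
)

# SMART attributes that some manufacturers use as operation counters
# (copied from the diff).
OPERATION_COUNTERS = (
    "Seek_Error_Rate",
    "Command_Timeout",
    "Raw_Read_Error_Rate",
)


def filter_smart_issues(smart_issues):
    """Return only the issue strings that should still be reported.

    Hypothetical standalone helper; in the commit this logic lives
    inline in _detect_issues().
    """
    filtered = []
    for issue in smart_issues:
        # Skip generic errors.
        if any(phrase in issue for phrase in GENERIC_ERROR_PHRASES):
            continue
        # Skip manufacturer operation counters and log the decision.
        if any(counter in issue for counter in OPERATION_COUNTERS):
            logger.debug(
                "Filtering manufacturer operation counter from issues: %s", issue
            )
            continue
        filtered.append(issue)
    return filtered
```

With made-up issue strings such as `"Critical Seek_Error_Rate: 100"` and `"Critical Reallocated_Sector_Ct: 8"`, only the second would be returned. As in the diff, matching is by substring and does not depend on the drive's manufacturer, so any issue string that mentions one of the three attribute names is dropped.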