Fix host filtering: only show/monitor configured hosts; add PBS

- _collect_snapshot() and _process_interfaces() now skip any Prometheus
  instance not explicitly listed in config.json hosts[]. LXC app servers
  (postgresql, matrix, etc.) report node_exporter metrics but are not
  infrastructure hosts Gandalf should display or alert on.
- Add PBS (10.10.10.3) to config hosts[] with prometheus_instance and
  remove it from ping_hosts (node_exporter was already running on PBS
  and is now added to the Prometheus scrape config as job pbs-node).
- The _instance_map membership check is now consistent across snapshot,
  alerting, and ethtool SSH collection.
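
A minimal sketch of the membership check described above, not the actual
Gandalf code. Only hosts[], prometheus_instance, and _instance_map come
from this commit; the "name" key, the :9100 port in the instance labels,
and the helper names build_instance_map / filter_configured are
assumptions made for illustration.

```python
from typing import Dict

# Hypothetical config fragment: the commit names hosts[] and
# prometheus_instance; the "name" key and the :9100 port are assumed.
CONFIG = {
    "hosts": [
        {"name": "pbs", "prometheus_instance": "10.10.10.3:9100"},
    ]
}


def build_instance_map(config: dict) -> Dict[str, str]:
    """Map each configured Prometheus instance label to a display name."""
    return {
        host["prometheus_instance"]: host["name"]
        for host in config.get("hosts", [])
        if host.get("prometheus_instance")
    }


def filter_configured(
    states: Dict[str, Dict[str, bool]], instance_map: Dict[str, str]
) -> Dict[str, Dict[str, bool]]:
    """Keep only the instances that appear in the configured map."""
    return {
        instance: ifaces
        for instance, ifaces in states.items()
        if instance in instance_map
    }


if __name__ == "__main__":
    # The second instance stands in for an LXC app server that exports
    # node_exporter metrics but is not listed in hosts[].
    scraped = {
        "10.10.10.3:9100": {"eth0": True},
        "10.10.10.50:9100": {"eth0": True},
    }
    print(filter_configured(scraped, build_instance_map(CONFIG)))
```

Building the map once and testing membership at every consumer keeps
snapshot, alerting, and SSH collection working from the same host list.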

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-17 17:17:40 -04:00
parent eb8c0ded5e
commit b80fda7cb2
2 changed files with 16 additions and 9 deletions

@@ -694,6 +694,8 @@ class NetworkMonitor:
         hosts_with_regression: List[str] = []
         for instance, ifaces in states.items():
+            if instance not in self._instance_map:
+                continue  # skip unconfigured Prometheus instances
             host = self._hostname(instance)
             new_baseline.setdefault(host, {})
             host_has_regression = False
@@ -877,6 +879,8 @@ class NetworkMonitor:
         hosts = {}
         for instance, ifaces in iface_states.items():
+            if instance not in self._instance_map:
+                continue  # skip Prometheus instances not in config (e.g. LXC app servers)
             host = self._hostname(instance)
             phys = {k: v for k, v in ifaces.items()}
             up_count = sum(1 for v in phys.values() if v)