Commit Graph

8 Commits

Author SHA1 Message Date
0c0150f698 Complete rewrite: full-featured network monitoring dashboard
- Two-service architecture: Flask web app (gandalf.service) + background
  polling daemon (gandalf-monitor.service)
- Monitor polls Prometheus node_network_up for physical NIC states on all
  6 hypervisors (added storage-01 at 10.10.10.11:9100)
- UniFi API monitoring for switches, APs, and gateway device status
- Ping reachability for hosts without node_exporter (pbs only now)
- Smart baseline: interfaces first seen as down are never alerted on;
  only UP→DOWN regressions trigger tickets
- Cluster-wide P1 ticket when 3+ hosts have genuine simultaneous
  interface regressions (guards against false positives on startup)
- Tinker Tickets integration with 24-hour hash-based deduplication
- Alert suppression: manual toggle or timed windows (30m/1h/4h/8h)
- Authelia SSO via forward-auth headers, admin group required
- Network topology: Internet → UDM-Pro → Agg Switch (10G DAC) →
  PoE Switch (10G DAC) → Hosts
- MariaDB schema, suppression management UI, host/interface cards

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 23:03:18 -05:00
004c97f492 interface update 2025-02-08 00:32:25 -05:00
4c90fbb168 interfaces update 2025-02-07 23:57:34 -05:00
4318dcd0d2 Added interface status 2025-02-07 21:28:54 -05:00
d791312579 Update file structure for Flask 2025-02-07 21:22:43 -05:00
21dfad35bf made everything static 2025-01-04 01:07:18 -05:00
81ba85845b test change 2025-01-04 00:57:42 -05:00
109dff1cd0 test 2025-01-04 00:33:04 -05:00
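
The alerting rules listed in commit 0c0150f698 (smart baseline, the 3+ host cluster check, and 24-hour hash-based deduplication) can be illustrated with a small sketch. This is not the repository's code: the names BaselineTracker, evaluate_poll, ticket_key and Deduplicator, the host and interface names, and the in-memory dictionaries are all hypothetical stand-ins, and the real service presumably persists state in MariaDB. The sketch also assumes that an interface first seen as down may alert later if it comes up and then drops again; the commit message does not say whether that is the intended behavior.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Dict, List, Tuple

CLUSTER_THRESHOLD = 3                 # "3+ hosts" from the commit message
DEDUP_WINDOW_SECONDS = 24 * 60 * 60   # "24-hour" deduplication window


@dataclass
class BaselineTracker:
    """Remembers the last observed state of each (host, interface) pair."""
    last_state: Dict[Tuple[str, str], bool] = field(default_factory=dict)

    def observe(self, host: str, iface: str, is_up: bool) -> bool:
        """Record a sample; return True only for an UP -> DOWN regression."""
        key = (host, iface)
        previous = self.last_state.get(key)  # None on first sighting
        self.last_state[key] = is_up
        # First sighting (None) or already-down (False) never counts.
        return previous is True and not is_up


def evaluate_poll(
    tracker: BaselineTracker,
    sample: Dict[str, Dict[str, bool]],
) -> Tuple[Dict[str, List[str]], bool]:
    """Return (regressions per host, cluster-wide flag) for one poll."""
    regressions: Dict[str, List[str]] = {}
    for host, ifaces in sample.items():
        for iface, is_up in ifaces.items():
            if tracker.observe(host, iface, is_up):
                regressions.setdefault(host, []).append(iface)
    # Cluster-wide only when enough hosts regress in the same poll.
    cluster_wide = len(regressions) >= CLUSTER_THRESHOLD
    return regressions, cluster_wide


def ticket_key(host: str, iface: str) -> str:
    """Hypothetical dedup key: a hash of the alert's identity."""
    return hashlib.sha256(f"{host}:{iface}".encode("utf-8")).hexdigest()


class Deduplicator:
    """Suppresses repeat tickets for the same key inside the window."""

    def __init__(self) -> None:
        self._last_opened: Dict[str, float] = {}

    def should_open(self, key: str, now: float) -> bool:
        last = self._last_opened.get(key)
        if last is not None and now - last < DEDUP_WINDOW_SECONDS:
            return False
        self._last_opened[key] = now
        return True
```

Because the first poll only establishes the baseline, a monitor restart while interfaces are already down produces no regressions, which is one way to get the startup false-positive guard the commit message mentions.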