Commit Graph

2 Commits

Author SHA1 Message Date
8f852ed830 Add compound DB indexes for hot query paths
network_events: idx_event_lookup (event_type, target_name, target_detail, resolved_at)
  - Covers the upsert_event SELECT which runs every cycle per monitored entity
  - Replaces three separate single-column index scans with one covering lookup

suppression_rules: idx_sup_lookup (active, target_type, target_name, target_detail)
  - Covers is_suppressed() queries. The runtime path now uses the in-memory
    check_suppressed, so this index mainly keeps the per-cycle
    get_active_suppressions() load fast.

Both indexes have been created on the live DB (MariaDB, LXC 149).
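
A minimal sketch of how the two compound indexes could be defined and checked. It uses an in-memory SQLite database as a stand-in for MariaDB so that it runs without a server. The index names and column lists come from this commit message; the column types, the `id` columns, the `interface_down` event type, and the exact shape of the lookup queries (including `resolved_at IS NULL`) are assumptions for illustration, not the project's actual schema or queries.

```python
import sqlite3

# Column types and the id columns are assumptions; the commit message
# only names the indexed columns and the index names.
SCHEMA = """
CREATE TABLE network_events (
    id INTEGER PRIMARY KEY,
    event_type TEXT, target_name TEXT, target_detail TEXT, resolved_at TEXT
);
CREATE TABLE suppression_rules (
    id INTEGER PRIMARY KEY,
    active INTEGER, target_type TEXT, target_name TEXT, target_detail TEXT
);
CREATE INDEX idx_event_lookup
    ON network_events (event_type, target_name, target_detail, resolved_at);
CREATE INDEX idx_sup_lookup
    ON suppression_rules (active, target_type, target_name, target_detail);
"""


def query_plan(conn, sql, params=()):
    """Return the detail text of SQLite's EXPLAIN QUERY PLAN rows as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql, params).fetchall()
    return " | ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Hypothetical form of the upsert_event lookup: find the open event for one target.
event_plan = query_plan(
    conn,
    "SELECT id FROM network_events "
    "WHERE event_type = ? AND target_name = ? AND target_detail = ? "
    "AND resolved_at IS NULL",
    ("interface_down", "host-a", "eth0"),
)

# Hypothetical form of the get_active_suppressions() load.
sup_plan = query_plan(conn, "SELECT id FROM suppression_rules WHERE active = 1")

print(event_plan)
print(sup_plan)
```

On MariaDB the equivalent check would be an `EXPLAIN` of the real queries, confirming that the `key` column shows the new index.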

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-14 14:24:40 -04:00
0c0150f698 Complete rewrite: full-featured network monitoring dashboard
- Two-service architecture: Flask web app (gandalf.service) + background
  polling daemon (gandalf-monitor.service)
- Monitor polls Prometheus node_network_up for physical NIC states on all
  6 hypervisors (added storage-01 at 10.10.10.11:9100)
- UniFi API monitoring for switches, APs, and gateway device status
- Ping reachability for hosts without node_exporter (currently only pbs)
- Smart baseline: interfaces first seen as down are never alerted on;
  only UP→DOWN regressions trigger tickets
- Cluster-wide P1 ticket when 3+ hosts have genuine simultaneous
  interface regressions (guards against false positives on startup)
- Tinker Tickets integration with 24-hour hash-based deduplication
- Alert suppression: manual toggle or timed windows (30m/1h/4h/8h)
- Authelia SSO via forward-auth headers, admin group required
- Network topology: Internet → UDM-Pro → Agg Switch (10G DAC) →
  PoE Switch (10G DAC) → Hosts
- MariaDB schema, suppression management UI, host/interface cards
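
The smart-baseline and cluster-wide rules above can be sketched roughly as follows. This is an illustration under stated assumptions, not the daemon's actual code: the function names, the `(host, interface)` key, and the choice to freeze the baseline at the first observation are all hypothetical. Only the "first seen as down never alerts", "UP→DOWN regression", and "3+ hosts" rules come from the commit message.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (host, interface); hypothetical key shape

CLUSTER_THRESHOLD = 3  # "3+ hosts" from the commit message


def find_regressions(baseline: Dict[Key, bool], current: Dict[Key, bool]) -> List[Key]:
    """Return interfaces whose baseline state is up but which are down now.

    An interface seen for the first time is recorded in the baseline and
    never reported, so anything first seen as down stays silent.
    """
    regressions = []
    for key, is_up in sorted(current.items()):
        # setdefault stores the first observation and returns the stored state.
        was_up = baseline.setdefault(key, is_up)
        if was_up and not is_up:
            regressions.append(key)
    return regressions


def is_cluster_event(regressions: List[Key]) -> bool:
    """True when regressions span at least CLUSTER_THRESHOLD distinct hosts."""
    hosts = {host for host, _ in regressions}
    return len(hosts) >= CLUSTER_THRESHOLD


baseline: Dict[Key, bool] = {}

# First poll after startup: everything is recorded, nothing is reported.
first_poll = {
    ("hv-1", "eth0"): True,
    ("hv-1", "eth1"): False,  # first seen down, so never alerted on
    ("hv-2", "eth0"): True,
    ("hv-3", "eth0"): True,
}
startup = find_regressions(baseline, first_poll)

# Second poll: every interface is down.
second_poll = {key: False for key in first_poll}
regressions = find_regressions(baseline, second_poll)

print(startup)
print(regressions)
print(is_cluster_event(regressions))
```

Because the first poll only seeds the baseline, a restart while links are already down produces no regressions, which is one way to get the startup guard the commit message describes.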

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 23:03:18 -05:00