Commit Graph

2 Commits

Author SHA1 Message Date
0278dad502 feat: inspector page, link debug enhancements, security hardening
- Add /inspector page: visual model-accurate switch chassis diagrams
  (USF5P, USL8A, US24PRO, USPPDUP, USMINI), clickable port blocks
  with color coding (green=up, amber=PoE, cyan=uplink, grey=down),
  detail panel with stats/PoE/LLDP, LLDP-based path debug side-by-side

- Link Debug: port number badges (#N), LLDP neighbor line, PoE class/max,
  collapsible host/switch panels with sessionStorage persistence

- monitor.py: collect LLDP neighbor map + PoE class/max/mode per switch
  port; PulseClient uses requests.Session() for HTTP keep-alive; add
  shlex.quote() around interface names (defense-in-depth)

- Security: suppress buttons use data-* attrs + delegated click handler
  instead of inline onclick with Jinja2 variable interpolation; remove
  | safe filter from user-controlled fields in suppressions.html;
  setDuration() takes explicit el param instead of implicit event global

- db.py: thread-local connection reuse with ping(reconnect=True) to
  avoid a new TCP handshake per query

- .gitignore: add config.json (contains credentials), __pycache__

- README: full rewrite covering architecture, all 4 pages, alert logic,
  config reference, deployment, troubleshooting, security notes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 15:39:48 -05:00
0c0150f698 Complete rewrite: full-featured network monitoring dashboard
- Two-service architecture: Flask web app (gandalf.service) + background
  polling daemon (gandalf-monitor.service)
- Monitor polls Prometheus node_network_up for physical NIC states on all
  6 hypervisors (added storage-01 at 10.10.10.11:9100)
- UniFi API monitoring for switches, APs, and gateway device status
- Ping reachability for hosts without node_exporter (pbs only now)
- Smart baseline: interfaces first seen as down are never alerted on;
  only UP→DOWN regressions trigger tickets
- Cluster-wide P1 ticket when 3+ hosts have genuine simultaneous
  interface regressions (guards against false positives on startup)
- Tinker Tickets integration with 24-hour hash-based deduplication
- Alert suppression: manual toggle or timed windows (30m/1h/4h/8h)
- Authelia SSO via forward-auth headers, admin group required
- Network topology: Internet → UDM-Pro → Agg Switch (10G DAC) →
  PoE Switch (10G DAC) → Hosts
- MariaDB schema, suppression management UI, host/interface cards

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 23:03:18 -05:00