- Add /api/avatar endpoint querying lldap for user jpegPhoto; disk cache
with sentinel pattern avoids repeat LDAP hits for users without photos
- Add ldap3 dependency and ldap config block to config.json
- Wire lt-avatar img overlay in base.html with capture-phase error
fallback (lt-avatar-img-err) to reveal initials when image is absent
- Fix lt-avatar CSS shim: position:relative + absolute inset on img
(local base.css was missing these; added to style.css)
- Replace all empty-state paragraphs with proper lt-empty-state markup
(icon + title + body) across index, suppressions, inspector, app.js
- Add lt-spinner--cyan next to refresh button; shows during refreshAll()
- Replace inspector panel-section-title with lt-divider throughout
- Add data-tooltip attributes to SFP DOM metrics, TX/RX/Carrier/Duplex/
Auto-neg/Error labels in links.html and inspector panel
- Add tooltips to events table column headers (Sev, First Seen, Failures)
- Fix links.html host panel timestamp (was reading sample.updated which
is always undefined; now uses data.updated)
- Fix UniFi status text casing (Online→ONLINE to match server render)
- Remove dead topo-status-* class manipulation from updateTopology()
- Always render alert-count-badge; toggle display:none when count is 0
- Fix double UniFi get_devices() call in monitor.py run loop
- Fix chip-critical animation (was using green pulse-glow; now red)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add links_exclude_ips to monitor config; collect() skips any Prometheus
instance whose IP is in that list, preventing LXC containers from
appearing on the links/inspector pages as phantom hosts.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- _collect_snapshot() and _process_interfaces() now skip any Prometheus
instance not explicitly listed in config.json hosts[]. LXC app servers
(postgresql, matrix, etc.) report node_exporter metrics but are not
infrastructure hosts Gandalf should display or alert on.
- Add PBS (10.10.10.3) to config hosts[] with prometheus_instance;
remove from ping_hosts (node_exporter already running on PBS, now
added to Prometheus scrape config as job pbs-node).
- The _instance_map membership check is now consistent across snapshot,
alerting, and ethtool SSH collection.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Two-service architecture: Flask web app (gandalf.service) + background
polling daemon (gandalf-monitor.service)
- Monitor polls Prometheus node_network_up for physical NIC states on all
6 hypervisors (added storage-01 at 10.10.10.11:9100)
- UniFi API monitoring for switches, APs, and gateway device status
- Ping reachability for hosts without node_exporter (pbs only now)
- Smart baseline: interfaces first seen as down are never alerted on;
only UP→DOWN regressions trigger tickets
- Cluster-wide P1 ticket when 3+ hosts have genuine simultaneous
interface regressions (guards against false positives on startup)
- Tinker Tickets integration with 24-hour hash-based deduplication
- Alert suppression: manual toggle or timed windows (30m/1h/4h/8h)
- Authelia SSO via forward-auth headers, admin group required
- Network topology: Internet → UDM-Pro → Agg Switch (10G DAC) →
PoE Switch (10G DAC) → Hosts
- MariaDB schema, suppression management UI, host/interface cards
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>