- Add /api/avatar endpoint querying lldap for user jpegPhoto; disk cache
with sentinel pattern avoids repeat LDAP hits for users without photos
- Add ldap3 dependency and ldap config block to config.json
- Wire lt-avatar img overlay in base.html with capture-phase error
fallback (lt-avatar-img-err) to reveal initials when image is absent
- Fix lt-avatar CSS shim: position:relative + absolute inset on img
(local base.css was missing these; added to style.css)
- Replace all empty-state paragraphs with proper lt-empty-state markup
(icon + title + body) across index, suppressions, inspector, app.js
- Add lt-spinner--cyan next to refresh button; shows during refreshAll()
- Replace inspector panel-section-title with lt-divider throughout
- Add data-tooltip attributes to SFP DOM metrics, TX/RX/Carrier/Duplex/
Auto-neg/Error labels in links.html and inspector panel
- Add tooltips to events table column headers (Sev, First Seen, Failures)
- Fix links.html host panel timestamp (was reading sample.updated which
is always undefined; now uses data.updated)
- Fix UniFi status text casing (Online→ONLINE to match server render)
- Remove dead topo-status-* class manipulation from updateTopology()
- Always render alert-count-badge; toggle display:none when count is 0
- Fix double UniFi get_devices() call in monitor.py run loop
- Fix chip-critical animation (was using green pulse-glow; now red)
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
lt-alert:
- Replace custom .stale-banner with lt-alert lt-alert--warning in app.js
and links.html; remove stale-banner CSS, reuse lt-alert margin rule
lt-progress:
- Replace custom .traffic-bar-track/.traffic-bar-fill in links.html with
lt-progress from base.css; TX uses default (orange), RX uses --cyan,
both flip to --red when utilisation >85% (trafficBarClass helper)
- Keep traffic layout classes (.traffic-section/.traffic-row etc.) for structure
Suppression type badges:
- Map target_type to distinct badge colors: host→badge-warning (orange),
interface→badge-info (cyan), unifi_device→badge-purple (new alias using
--accent-purple from base.css), all→badge-critical (red)
- Applied in both server-rendered table (Jinja2 dict lookup) and
renderActiveRows() JS
Topology animated down-wire:
- Add data-host attribute to .topo-v2-wire-10g/.topo-v2-wire-1g elements
- updateTopology() toggles .wire-down class on the 10G drop-wire when
host.status === 'down'
- .wire-down CSS: animated repeating-linear-gradient dashed red line
via wire-dash-anim @keyframes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- app.js: guard events array with || [] before .filter() to prevent crash on null
- app.js: show warning toast when /api/network or /api/status fail (was silent)
- app.js: add aria-label to all dynamically-generated suppress buttons
- index.html: add aria-label to server-rendered suppress buttons
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Add lt-glitch + data-text to brand title (signature glitch effect)
- Add mobile nav drawer (lt-nav-drawer) and hamburger button (lt-menu-btn)
- Add VT323 font, theme toggle button (lt-theme-btn), footer with hints
- Add command palette overlay + lt.cmdPalette.init() with nav commands
- Add keyboard shortcuts help modal and skip link
- Move base.js to <head> so lt.* is available for inline scripts
- Fix suppress modal: lt-modal-backdrop → lt-modal-overlay (base.css class)
- Fix modal open/close: use .is-open / aria-hidden instead of inline style
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Replace custom phosphor-green terminal aesthetic with the lt-* component
system from base.css/base.js. All templates now inherit the LotusGuild
multi-accent Anduril palette via variable aliases in style.css, and use
lt-header, lt-nav, lt-card, lt-table, lt-btn, lt-modal, lt-badge etc.
Custom components (topology, inspector chassis, link debug, SFP panels)
are preserved with color values updated to base.css palette variables.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Wrap window.fetch so any 401 triggers window.location.reload(),
sending the browser back through the Authelia proxy to the login page.
Covers all pages since app.js is loaded by base.html.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- get_active_events() now takes limit/offset (default 200) to cap unbounded queries
- count_active_events() added to return total for pagination display
- /api/events supports ?limit=, ?offset=, ?status= query params (max 1000)
- /api/status includes total_active count alongside paginated events list
- index() route passes total_active to template for server-side truncation notice
- Show "Showing X of Y" notice in dashboard when events are truncated
- Suppression POST validates: reason ≤500 chars, target_name/detail ≤255 chars
- _purge_old_jobs_loop runs purge_old_resolved_events(90d) once per day
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Replace flat topology with tiered bus-bar layout: Internet → UDM-Pro → SVG fork → USW-Agg + Pro 24 PoE → dual-homed servers
- Show 10G VLAN90 (Ceph) bus from USW-Agg and 1G DHCP management bus from Pro 24 PoE per host
- Add per-host drop wires (solid 10G + dashed 1G) with correct rack positions
- Mark large1 as off-rack (dashed border), ZimaBoards as off-rack mon-01/mon-02
- Add topology legend, inter-switch 10G ISL indicator
- Add recently resolved events section (last 24h) to dashboard
- Add last_seen column and relative timestamps to events table
- Add stale data banner when monitoring data >15 min old
- Improve inspector chassis with port speed labels, LLDP neighbor info, mounting ears, chassis legend
- Add duplex/speed mismatch warnings and carrier changes to path debug panel
- Bump updateTopology() to handle both topo-v2-status-* and topo-status-* classes
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- events table: add Last Seen column; show relative times ("3h ago") with
absolute timestamp on hover; update updateEventsTable() in app.js to match
- links.html: add error/drop/flap alert badges to interface and port card headers
- links.html: PoE power bar (draw/max ratio with colour-coded fill) and poe_mode
- links.html: stale data warning banner when link_stats are >2 minutes old
- links.html: improved error handler shows HTTP status instead of generic message
- links.html: fix collapse state persisted to localStorage (was sessionStorage,
lost on browser restart); fix collapseAll/expandAll to also persist state
- inspector.html: duplex mismatch and speed mismatch warnings in path debug panel
- inspector.html: carrier changes added to server column of path debug
- style.css: new classes — .link-alert-badge, .poe-bar-*, .path-mismatch-alert,
.error-state; fix .stale-banner to use CSS variables
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
static/app.js:
- Browser tab title updates to show alert count: '(3 CRIT) GANDALF' or '(2 WARN) GANDALF'
- Stale monitoring banner: injected into .main if last_check > 15 min old,
warns operator that the monitor daemon may be down
static/style.css:
- .stale-banner: amber top-border warning strip
app.py:
- /health now checks DB connectivity and monitor freshness (last_check age)
Returns 503 + degraded status if DB unreachable or monitor stale >20min
db.py:
- cleanup_expired_suppressions(): marks time-limited suppressions inactive when
expires_at <= NOW() (was only filtered in SELECTs, never marked inactive)
- purge_old_resolved_events(days=90): deletes old resolved events to prevent
unbounded table growth
monitor.py:
- Calls cleanup_expired_suppressions() and purge_old_resolved_events() each cycle
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
app.py:
- Context processor injects config.ticket_api.web_url into all templates
(falls back to 'http://t.lotusguild.org/ticket/' if not set in config)
templates/base.html:
- Inject GANDALF_CONFIG JS global with ticket_web_url before app.js loads
static/app.js:
- Use GANDALF_CONFIG.ticket_web_url instead of hardcoded domain
templates/index.html:
- Use {{ config.ticket_api.web_url }} Jinja var instead of hardcoded domain
monitor.py:
- CLUSTER_NAME constant kept as default; NetworkMonitor now reads cluster_name
from config monitor.cluster_name, falling back to the constant
- All CLUSTER_NAME references inside class methods replaced with self.cluster_name
templates/inspector.html:
- pollDiagnostic() .catch() now clears interval and shows error message instead
of silently ignoring network failures during active polling
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- Two-service architecture: Flask web app (gandalf.service) + background
polling daemon (gandalf-monitor.service)
- Monitor polls Prometheus node_network_up for physical NIC states on all
6 hypervisors (added storage-01 at 10.10.10.11:9100)
- UniFi API monitoring for switches, APs, and gateway device status
- Ping reachability for hosts without node_exporter (pbs only now)
- Smart baseline: interfaces first seen as down are never alerted on;
only UP→DOWN regressions trigger tickets
- Cluster-wide P1 ticket when 3+ hosts have genuine simultaneous
interface regressions (guards against false positives on startup)
- Tinker Tickets integration with 24-hour hash-based deduplication
- Alert suppression: manual toggle or timed windows (30m/1h/4h/8h)
- Authelia SSO via forward-auth headers, admin group required
- Network topology: Internet → UDM-Pro → Agg Switch (10G DAC) →
PoE Switch (10G DAC) → Hosts
- MariaDB schema, suppression management UI, host/interface cards
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>