Commit Graph

8 Commits

Author SHA1 Message Date
jared c45dd007d1 Fix field name mismatches, add events filter, in-place suppression refresh
Lint / Python (flake8) (push) Failing after 50s
Lint / JS (eslint) (push) Successful in 7s
Test / Python Tests (pytest) (push) Successful in 51s
Lint / Notify on failure (push) Successful in 2s
Lint / Deploy (push) Has been skipped
Security / Python Security (bandit) (push) Failing after 59s
- links.html: fix all field name bugs (auto_negotiation→autoneg, full_duplex,
  tx/rx_errors/drops_per_sec→_rate, tx/rx_bytes_per_sec→_rate, poe_total_w/poe_max_w
  computed from ports, renderUnifiSwitches uses top-level updated timestamp)
- suppressions.html: in-place DOM refresh after create/remove (no page reload),
  datalist autocomplete for target names, form reset after submit
- inspector.html: ESC key closes detail panel via lt.keys.on
- index.html: events filter bar with search input + severity pills (All/Critical/Warning),
  MutationObserver re-applies filter after dynamic updates
- style.css: g-section-actions, events-filter-bar, sev-pills layout
- app.js/db.py/monitor.py: carry forward prior session fixes (Promise.allSettled,
  daemon_ok, stale connection handling, double Prometheus call, self.cfg fix)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 23:35:02 -04:00
jared b6cd168542 Clean up suppressions.html: standardise pill attribute and remove inline onclick
Lint / Python (flake8) (push) Failing after 1m26s
Lint / JS (eslint) (push) Successful in 12s
Security / Python Security (bandit) (push) Failing after 1m59s
Test / Python Tests (pytest) (push) Successful in 53s
Lint / Notify on failure (push) Successful in 2s
Lint / Deploy (push) Has been skipped
- Rename data-dur → data-duration to match index.html/app.js convention
- Replace onclick="removeSuppression({{ s.id }})" with data-action="remove-sup" data-sup-id delegation
- Scope pill delegation to #create-suppression-form to avoid cross-page conflicts

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-19 00:01:52 -04:00
jared 293edd674e Integrate TDS v1.2 lt.* APIs throughout app
Lint / Python (flake8) (push) Failing after 58s
Lint / JS (eslint) (push) Successful in 7s
Security / Python Security (bandit) (push) Failing after 44s
Test / Python Tests (pytest) (push) Successful in 1m24s
Lint / Notify on failure (push) Successful in 2s
Lint / Deploy (push) Has been skipped
- app.js: replace raw fetch/escHtml/fmtRelTime with lt.api, lt.escHtml, lt.time.ago; modal open/close via lt.modal; add _toIso() for timestamp normalisation
- index.html: data-action="refresh", data-duration pills, lt.autoRefresh.start, remove local fmtRelTime
- suppressions.html: lt.api.post/delete, data-dur pill delegation
- base.html: user avatar with initials, admin badge, lt.keys.on('r') replaces manual keydown handler
- base.css: add dot-*, chip, row-state aliases so apps can use unprefixed class names

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 23:46:44 -04:00
jared e8de40250a Restructure app to use LotusGuild Terminal Design System v1.2
Lint / Python (flake8) (push) Failing after 45s
Lint / JS (eslint) (push) Successful in 7s
Security / Python Security (bandit) (push) Failing after 1m22s
Test / Python Tests (pytest) (push) Successful in 51s
Lint / Notify on failure (push) Successful in 2s
Lint / Deploy (push) Has been skipped
Replace custom phosphor-green terminal aesthetic with the lt-* component
system from base.css/base.js. All templates now inherit the LotusGuild
multi-accent Anduril palette via variable aliases in style.css, and use
lt-header, lt-nav, lt-card, lt-table, lt-btn, lt-modal, lt-badge etc.
Custom components (topology, inspector chassis, link debug, SFP panels)
are preserved with color values updated to base.css palette variables.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-04-18 21:01:20 -04:00
jared af26407363 Fix setDur implicit event, title XSS, hardcoded pulse URL, suppress error toast
- suppressions.html: setDur() now takes explicit element param instead of relying
  on implicit global event.target (which fails outside direct click handlers)
- suppressions.html: removeSuppression() now shows error toast on failed DELETE
- templates/index.html: escape description in title attribute with |e filter
  to prevent attribute breakout on quotes in description text
- diagnose.py: derive Pulse execution URL from pulse_client.url instead of
  hardcoding http://pulse.lotusguild.org

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-13 14:36:55 -04:00
jared 0278dad502 feat: inspector page, link debug enhancements, security hardening
- Add /inspector page: visual model-accurate switch chassis diagrams
  (USF5P, USL8A, US24PRO, USPPDUP, USMINI), clickable port blocks
  with color coding (green=up, amber=PoE, cyan=uplink, grey=down),
  detail panel with stats/PoE/LLDP, LLDP-based path debug side-by-side

- Link Debug: port number badges (#N), LLDP neighbor line, PoE class/max,
  collapsible host/switch panels with sessionStorage persistence

- monitor.py: collect LLDP neighbor map + PoE class/max/mode per switch
  port; PulseClient uses requests.Session() for HTTP keep-alive; add
  shlex.quote() around interface names (defense-in-depth)

- Security: suppress buttons use data-* attrs + delegated click handler
  instead of inline onclick with Jinja2 variable interpolation; remove
  | safe filter from user-controlled fields in suppressions.html;
  setDuration() takes explicit el param instead of implicit event global

- db.py: thread-local connection reuse with ping(reconnect=True) to
  avoid a new TCP handshake per query

- .gitignore: add config.json (contains credentials), __pycache__

- README: full rewrite covering architecture, all 4 pages, alert logic,
  config reference, deployment, troubleshooting, security notes

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-03 15:39:48 -05:00
jared fa7512a2c2 feat: terminal aesthetic rewrite + link debug page
- Full dark terminal aesthetic (Pulse/TinkerTickets style):
  - #0a0a0a background, #00ff41 green, #ffb000 amber, #00ffff cyan
  - CRT scanline overlay, phosphor glow, ASCII corner pseudoelements
  - Bracket-notation badges [CRITICAL], monospace font throughout
  - style.css, base.html, index.html, suppressions.html all rewritten

- New Link Debug page (/links, /api/links):
  - Per-host, per-interface cards with speed/duplex/port type/auto-neg
  - Traffic bars (TX cyan, RX green) with rate labels
  - Error/drop counters, carrier change history
  - SFP/DOM optical panel: vendor, temp, voltage, bias, TX/RX power dBm bars
  - RX-TX delta shown; color-coded warn/crit thresholds
  - Auto-refresh every 60s, anchor-jump to #hostname

- LinkStatsCollector in monitor.py:
  - SSHes to each host (one connection, all ifaces batched)
  - Parses ethtool + ethtool -m (SFP DOM) output
  - Merges with Prometheus traffic/error/carrier metrics
  - Stores as link_stats in monitor_state table

- config.json: added ssh section for ethtool collection
- app.js: terminal chip style consistency (uppercase, ● bullet)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-02 12:43:11 -05:00
jared 0c0150f698 Complete rewrite: full-featured network monitoring dashboard
- Two-service architecture: Flask web app (gandalf.service) + background
  polling daemon (gandalf-monitor.service)
- Monitor polls Prometheus node_network_up for physical NIC states on all
  6 hypervisors (added storage-01 at 10.10.10.11:9100)
- UniFi API monitoring for switches, APs, and gateway device status
- Ping reachability for hosts without node_exporter (pbs only now)
- Smart baseline: interfaces first seen as down are never alerted on;
  only UP→DOWN regressions trigger tickets
- Cluster-wide P1 ticket when 3+ hosts have genuine simultaneous
  interface regressions (guards against false positives on startup)
- Tinker Tickets integration with 24-hour hash-based deduplication
- Alert suppression: manual toggle or timed windows (30m/1h/4h/8h)
- Authelia SSO via forward-auth headers, admin group required
- Network topology: Internet → UDM-Pro → Agg Switch (10G DAC) →
  PoE Switch (10G DAC) → Hosts
- MariaDB schema, suppression management UI, host/interface cards

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
2026-03-01 23:03:18 -05:00