Complete rewrite: full-featured network monitoring dashboard
- Two-service architecture: Flask web app (gandalf.service) + background polling daemon (gandalf-monitor.service)
- Monitor polls Prometheus node_network_up for physical NIC states on all 6 hypervisors (added storage-01 at 10.10.10.11:9100)
- UniFi API monitoring for switches, APs, and gateway device status
- Ping reachability for hosts without node_exporter (pbs only now)
- Smart baseline: interfaces first seen as down are never alerted on; only UP→DOWN regressions trigger tickets
- Cluster-wide P1 ticket when 3+ hosts have genuine simultaneous interface regressions (guards against false positives on startup)
- Tinker Tickets integration with 24-hour hash-based deduplication
- Alert suppression: manual toggle or timed windows (30m/1h/4h/8h)
- Authelia SSO via forward-auth headers, admin group required
- Network topology: Internet → UDM-Pro → Agg Switch (10G DAC) → PoE Switch (10G DAC) → Hosts
- MariaDB schema, suppression management UI, host/interface cards

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
218
README.md
@@ -2,61 +2,199 @@
|
||||
|
||||
> Because it shall not let problems pass!
|
||||
|
||||
## Multiple Distributed Servers Approach
|
||||
Network monitoring dashboard for the LotusGuild Proxmox cluster.
|
||||
Deployed on **LXC 157** (monitor-02 / 10.10.10.9), reachable at `gandalf.lotusguild.org`.
|
||||
|
||||
This architecture represents the most robust implementation approach for the system.
|
||||
---
|
||||
|
||||
### Core Components
|
||||
## Architecture
|
||||
|
||||
1. Multiple monitoring nodes across different network segments
|
||||
2. Distributed database for sharing state
|
||||
3. Consensus mechanism for alert verification
|
||||
Gandalf is two processes that share a MariaDB database:
|
||||
|
||||
### System Architecture
|
||||
| Process | Service | Role |
|---|---|---|
| `app.py` | `gandalf.service` | Flask web dashboard (gunicorn, port 8000) |
| `monitor.py` | `gandalf-monitor.service` | Background polling daemon |
|
||||
|
||||
#### A. Monitoring Layer
|
||||
```
[Prometheus :9090]  ──▶
                         monitor.py ──▶ MariaDB ◀── app.py ──▶ nginx ──▶ Authelia ──▶ Browser
[UniFi Controller]  ──▶
```
|
||||
|
||||
- Multiple monitoring nodes in different locations/segments
|
||||
- Each node runs independent health checks
|
||||
- Mix of internal and external perspectives
|
||||
### Data Sources
|
||||
|
||||
#### B. Data Collection
|
||||
| Source | What it monitors |
|---|---|
| **Prometheus** (`10.10.10.48:9090`) | Physical NIC link state (`node_network_up`) for 5 Proxmox hypervisors |
| **UniFi API** (`https://10.10.10.1`) | Switch, AP, and gateway device status |
| **Ping** | pbs (10.10.10.3) and storage-01 (10.10.10.11); no node_exporter |
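
As a rough illustration of how the Prometheus source in the table above could be consumed, the sketch below turns an instant-query response for `node_network_up` into a per-host map of link states. The response shape follows the Prometheus HTTP API (`/api/v1/query`); the `parse_link_states` helper, the `HOSTS` mapping, and the sample payload are assumptions made for this example and are not taken from `monitor.py`.

```python
# Hypothetical instance-to-hostname mapping, in the spirit of the `hosts` config key.
HOSTS = {"10.10.10.2:9100": "large1", "10.10.10.8:9100": "micro1"}


def parse_link_states(payload: dict, hosts: dict) -> dict:
    """Convert a Prometheus instant-query response into {host: {device: is_up}}."""
    if payload.get("status") != "success":
        raise ValueError("Prometheus query did not succeed")
    states = {}
    for sample in payload["data"]["result"]:
        labels = sample["metric"]
        host = hosts.get(labels.get("instance"))
        if host is None:
            continue  # instance is not listed in the mapping, so skip it
        # Each instant-vector sample carries [timestamp, "<value as string>"].
        is_up = float(sample["value"][1]) == 1.0
        states.setdefault(host, {})[labels.get("device", "")] = is_up
    return states


# Made-up response in the documented instant-vector format.
SAMPLE = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {"metric": {"device": "eno1", "instance": "10.10.10.2:9100"},
             "value": [1700000000, "1"]},
            {"metric": {"device": "eno2", "instance": "10.10.10.2:9100"},
             "value": [1700000000, "0"]},
            {"metric": {"device": "eth0", "instance": "10.10.10.99:9100"},
             "value": [1700000000, "1"]},
        ],
    },
}

link_states = parse_link_states(SAMPLE, HOSTS)
```

Keeping the parsing separate from the HTTP call makes it straightforward to exercise against a captured response without a running Prometheus server.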
|
||||
|
||||
Each node collects:
|
||||
- Link status
|
||||
- Latency measurements
|
||||
- Error rates
|
||||
- Bandwidth utilization
|
||||
- Device health metrics
|
||||
### Monitored Hosts (Prometheus / node_exporter)
|
||||
|
||||
#### C. Consensus Mechanism
|
||||
| Host | Instance |
|---|---|
| large1 | 10.10.10.2:9100 |
| compute-storage-01 | 10.10.10.4:9100 |
| micro1 | 10.10.10.8:9100 |
| monitor-02 | 10.10.10.9:9100 |
| compute-storage-gpu-01 | 10.10.10.10:9100 |
|
||||
|
||||
- Multiple nodes must agree before declaring an outage
|
||||
- Voting system implementation:
|
||||
- 2/3 node agreement required for issue confirmation
|
||||
- Weighted checks based on type
|
||||
- Time-based consensus requirements (X seconds persistence)
|
||||
---
|
||||
|
||||
#### D. Alert Verification
|
||||
## Features
|
||||
|
||||
- Cross-reference multiple data points
|
||||
- Check from different network paths
|
||||
- Verify both ends of connections
|
||||
- Consider network topology
|
||||
- **Interface monitoring** – tracks link state for all physical NICs via Prometheus
- **UniFi device monitoring** – detects offline switches, APs, and gateways
- **Ping reachability** – covers hosts without node_exporter
- **Cluster-wide detection** – creates a separate P1 ticket when 3+ hosts have simultaneous interface failures (likely a switch failure)
- **Smart baseline tracking** – interfaces that are down on first observation (unused ports) are never alerted on; only regressions from UP→DOWN trigger tickets
- **Ticket creation** – integrates with Tinker Tickets (`t.lotusguild.org`) with 24-hour deduplication
- **Alert suppression** – manual (until removed) or timed windows (30 min / 1 hr / 4 hr / 8 hr)
- **Authelia SSO** – restricted to `admin` group via forward-auth headers
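
The 24-hour ticket deduplication in the list above might look roughly like the following. This is a minimal sketch under stated assumptions: the fields that go into the hash, the use of SHA-256, and the in-memory `TicketDeduplicator` class are all hypothetical, since the actual logic in `monitor.py` is not shown in this section.

```python
import hashlib
import time

DEDUP_WINDOW_SECONDS = 24 * 60 * 60  # 24-hour window, as described in the README


def ticket_fingerprint(event_type: str, target_name: str, target_detail: str = "") -> str:
    """Hash the fields that identify an issue (the field choice is an assumption)."""
    raw = "|".join((event_type, target_name, target_detail))
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class TicketDeduplicator:
    """Remember when each fingerprint last produced a ticket (hypothetical helper)."""

    def __init__(self, window: float = DEDUP_WINDOW_SECONDS):
        self.window = window
        self._seen = {}  # fingerprint -> timestamp of the last ticket

    def should_create(self, fingerprint: str, now: float = None) -> bool:
        now = time.time() if now is None else now
        last = self._seen.get(fingerprint)
        if last is not None and now - last < self.window:
            return False  # a ticket for this issue already exists inside the window
        self._seen[fingerprint] = now
        return True


dedup = TicketDeduplicator()
fp = ticket_fingerprint("interface_down", "large1", "eno1")
first = dedup.should_create(fp, now=0)
repeat = dedup.should_create(fp, now=3600)
```

A persistent store would be needed for the window to survive daemon restarts; the dictionary here only illustrates the comparison.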
|
||||
|
||||
#### E. Redundancy
|
||||
---
|
||||
|
||||
- Eliminates single points of failure
|
||||
- Nodes distributed across availability zones
|
||||
- Independent power and network paths
|
||||
## Alert Logic
|
||||
|
||||
#### F. Central Coordination
|
||||
### Ticket Triggers
|
||||
|
||||
- Distributed database for state sharing
|
||||
- Leader election for coordinating responses
|
||||
- Backup coordinators ready to take over
|
||||
| Condition | Priority |
|---|---|
| UniFi device offline (2+ consecutive checks) | P2 High |
| Proxmox host NIC link-down regression (2+ consecutive checks) | P2 High |
| Host unreachable via ping (2+ consecutive checks) | P2 High |
| 3+ hosts simultaneously reporting interface failures | P1 Critical |
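
To make the trigger table concrete, here is a small sketch of how the two thresholds could combine. The `classify` function and its input format are assumptions for illustration; in particular, whether the cluster count uses only hosts that already passed the failure threshold is a guess, not something the README states.

```python
FAILURE_THRESHOLD = 2  # default of monitor.failure_threshold
CLUSTER_THRESHOLD = 3  # default of monitor.cluster_threshold


def classify(failures: dict) -> list:
    """Map {host: consecutive failed checks} to a list of (priority, target) tickets."""
    # A host only counts once it has failed on enough consecutive checks.
    confirmed = sorted(h for h, count in failures.items() if count >= FAILURE_THRESHOLD)
    tickets = [("P2", host) for host in confirmed]
    # Enough simultaneous host failures suggests a shared cause such as a switch.
    if len(confirmed) >= CLUSTER_THRESHOLD:
        tickets.append(("P1", "cluster"))
    return tickets


single = classify({"large1": 2, "micro1": 1})
cluster = classify({"large1": 2, "micro1": 3, "monitor-02": 2})
```

With the defaults, a single flapping host yields one P2 ticket, while three confirmed hosts additionally yield the separate P1 cluster ticket.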
|
||||
|
||||
### Additional Features
|
||||
### Suppression Targets
|
||||
|
||||
- Alarm suppression capabilities
|
||||
- Ticket creation system integration
|
||||
| Type | Suppresses |
|---|---|
| `host` | All interface alerts for a named host |
| `interface` | A specific NIC on a specific host |
| `unifi_device` | A specific UniFi device |
| `all` | Everything (global maintenance mode) |
|
||||
|
||||
Suppressions can be manual (persist until removed) or timed (auto-expire).
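
The matching and expiry rules can be sketched in memory as follows. This loosely mirrors the order of checks in `db.is_suppressed` (global rule, then name-level rule with no detail, then exact detail match), but the `Rule` dataclass and the list-based lookup are assumptions made for the example; the real implementation queries the `suppression_rules` table.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Rule:
    target_type: str
    target_name: str = ""
    target_detail: str = ""
    expires_at: Optional[datetime] = None  # None means manual (no expiry)


def is_active(rule: Rule, now: datetime) -> bool:
    """Manual rules never expire; timed rules lapse once expires_at has passed."""
    return rule.expires_at is None or rule.expires_at > now


def is_suppressed(rules, target_type, target_name, target_detail, now) -> bool:
    for rule in rules:
        if not is_active(rule, now):
            continue
        if rule.target_type == "all":
            return True  # global maintenance mode
        if rule.target_type != target_type or rule.target_name != target_name:
            continue
        # An empty detail covers the whole target; otherwise the detail must match.
        if rule.target_detail in ("", target_detail):
            return True
    return False


now = datetime(2026, 1, 1, 12, 0)
rules = [
    Rule("interface", "large1", "eno1", expires_at=now + timedelta(minutes=30)),
    Rule("host", "micro1"),
]
timed_hit = is_suppressed(rules, "interface", "large1", "eno1", now)
```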
|
||||
|
||||
---
|
||||
|
||||
## Configuration
|
||||
|
||||
**`config.json`** – shared by both processes:
|
||||
|
||||
| Key | Description |
|---|---|
| `unifi.api_key` | UniFi API key from controller |
| `prometheus.url` | Prometheus base URL |
| `database.*` | MariaDB credentials |
| `ticket_api.api_key` | Tinker Tickets Bearer token |
| `monitor.poll_interval` | Seconds between checks (default: 120) |
| `monitor.failure_threshold` | Consecutive failures before ticketing (default: 2) |
| `monitor.cluster_threshold` | Hosts with failures to trigger cluster alert (default: 3) |
| `monitor.ping_hosts` | Hosts checked via ping (no node_exporter) |
| `hosts` | Maps Prometheus instance labels to hostnames |
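
As an illustration of how these keys might be consumed, the sketch below applies the documented defaults and builds the instance-to-hostname map from the `hosts` list. The `load_settings` helper, the `REQUIRED` tuple, and the placeholder values in `EXAMPLE` are hypothetical; only the key names and defaults come from the table above and from `config.json`.

```python
MONITOR_DEFAULTS = {
    "poll_interval": 120,
    "failure_threshold": 2,
    "cluster_threshold": 3,
    "ping_hosts": [],
}
# Top-level sections assumed to be mandatory for this sketch.
REQUIRED = ("unifi", "prometheus", "database", "ticket_api", "hosts")


def load_settings(cfg: dict):
    """Return (monitor settings with defaults applied, {instance: hostname})."""
    missing = [key for key in REQUIRED if key not in cfg]
    if missing:
        raise KeyError("missing config sections: " + ", ".join(missing))
    monitor = {**MONITOR_DEFAULTS, **cfg.get("monitor", {})}
    instance_map = {h["prometheus_instance"]: h["name"] for h in cfg["hosts"]}
    return monitor, instance_map


# Placeholder configuration; secrets are deliberately fake.
EXAMPLE = {
    "unifi": {"api_key": "CHANGE_ME"},
    "prometheus": {"url": "http://10.10.10.48:9090"},
    "database": {"host": "10.10.10.50", "user": "gandalf", "password": "CHANGE_ME", "name": "gandalf"},
    "ticket_api": {"api_key": "CHANGE_ME"},
    "monitor": {"poll_interval": 60},
    "hosts": [{"name": "large1", "ip": "10.10.10.2", "prometheus_instance": "10.10.10.2:9100"}],
}

monitor_cfg, instances = load_settings(EXAMPLE)
```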
|
||||
|
||||
---
|
||||
|
||||
## Deployment (LXC 157)
|
||||
|
||||
### 1. Database (MariaDB LXC 149 at 10.10.10.50)
|
||||
|
||||
```sql
CREATE DATABASE gandalf CHARACTER SET utf8mb4 COLLATE utf8mb4_unicode_ci;
CREATE USER 'gandalf'@'10.10.10.61' IDENTIFIED BY 'your_password';
GRANT ALL PRIVILEGES ON gandalf.* TO 'gandalf'@'10.10.10.61';
FLUSH PRIVILEGES;
```

Then import the schema:

```bash
mysql -h 10.10.10.50 -u gandalf -p gandalf < schema.sql
```
|
||||
|
||||
### 2. LXC 157 – Install dependencies
|
||||
|
||||
```bash
pip3 install -r requirements.txt
```
|
||||
|
||||
### 3. Deploy files
|
||||
|
||||
```bash
# -r is needed because templates/ and static/ are directories
cp -r app.py db.py monitor.py config.json templates/ static/ /var/www/html/prod/
```
|
||||
|
||||
### 4. Configure secrets in `config.json`
|
||||
|
||||
- `database.password` – set the gandalf DB password
|
||||
- `ticket_api.api_key` – copy from tinker tickets admin panel
|
||||
|
||||
### 5. Install the monitor service
|
||||
|
||||
```bash
cp gandalf-monitor.service /etc/systemd/system/
systemctl daemon-reload
systemctl enable gandalf-monitor
systemctl start gandalf-monitor
```

Update existing `gandalf.service` to use a single worker:

```
ExecStart=/usr/bin/python3 -m gunicorn --workers 1 --bind 127.0.0.1:8000 app:app
```
|
||||
|
||||
### 6. Authelia rule
|
||||
|
||||
Add to `/etc/authelia/configuration.yml` access_control rules:
|
||||
```yaml
- domain: gandalf.lotusguild.org
  policy: one_factor
  subject:
    - group:admin
```
|
||||
|
||||
Reload Authelia: `systemctl reload authelia`
|
||||
|
||||
### 7. NPM proxy host
|
||||
|
||||
- Domain: `gandalf.lotusguild.org`
|
||||
- Forward to: `http://10.10.10.61:80` (nginx on LXC 157)
|
||||
- Enable Authelia forward auth
|
||||
- WebSockets: **not required**
|
||||
|
||||
---
|
||||
|
||||
## Service Management
|
||||
|
||||
```bash
# Monitor daemon
systemctl status gandalf-monitor
journalctl -u gandalf-monitor -f

# Web server
systemctl status gandalf
journalctl -u gandalf -f

# Restart both after config/code changes
systemctl restart gandalf-monitor gandalf
```
|
||||
|
||||
---
|
||||
|
||||
## Troubleshooting
|
||||
|
||||
**Monitor not creating tickets**
|
||||
- Check `config.json` → `ticket_api.api_key` is set
|
||||
- Check `journalctl -u gandalf-monitor` for errors
|
||||
|
||||
**Baseline re-initializing on every restart**
|
||||
- `interface_baseline` is stored in the `monitor_state` DB table; it persists across restarts
|
||||
|
||||
**Interface always showing as "initial_down"**
|
||||
- That interface was down on the first poll after the monitor started
|
||||
- It will begin tracking once it comes up; or manually update the baseline in DB if needed
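
The baseline behaviour described here can be sketched as a small state update. This is an assumption-laden illustration: the `update_baseline` function, the `host/interface` key format, and the `"up"` state name are hypothetical, and only the `initial_down` label comes from the README.

```python
def update_baseline(baseline: dict, key: str, is_up: bool) -> bool:
    """Update the stored state for `key`; return True only on an UP→DOWN regression."""
    previous = baseline.get(key)
    if previous is None:
        # First observation: record the state, never alert.
        baseline[key] = "up" if is_up else "initial_down"
        return False
    if is_up:
        # The link is (or has come) up, so it is tracked from now on.
        baseline[key] = "up"
        return False
    # The link is down: this is a regression only if it was previously seen up.
    return previous == "up"


baseline = {}
first_seen_down = update_baseline(baseline, "large1/eno2", False)
came_up = update_baseline(baseline, "large1/eno2", True)
went_down = update_baseline(baseline, "large1/eno2", False)
```

An interface stuck in `initial_down` therefore stays silent until it has been observed up at least once, which matches the troubleshooting note above.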
|
||||
|
||||
**Prometheus data missing for a host**
|
||||
- Verify node_exporter is running: `systemctl status prometheus-node-exporter`
|
||||
- Check Prometheus targets: `http://10.10.10.48:9090/targets`
|
||||
|
||||
313
app.py
@@ -1,144 +1,207 @@
|
||||
import logging
|
||||
"""Gandalf – Global Advanced Network Detection And Link Facilitator.
|
||||
|
||||
Flask web application serving the monitoring dashboard and suppression
|
||||
management UI. Authentication via Authelia forward-auth headers.
|
||||
All monitoring and alerting is handled by the separate monitor.py daemon.
|
||||
"""
|
||||
import json
|
||||
import platform
|
||||
import subprocess
|
||||
import threading
|
||||
import time
|
||||
from datetime import datetime
|
||||
from flask import Flask, render_template, jsonify
|
||||
import requests
|
||||
from urllib3.exceptions import InsecureRequestWarning
|
||||
import logging
|
||||
from functools import wraps
|
||||
|
||||
from flask import Flask, jsonify, redirect, render_template, request, url_for
|
||||
|
||||
import db
|
||||
|
||||
logging.basicConfig(
|
||||
level=logging.INFO,
|
||||
format='%(asctime)s %(levelname)s %(name)s %(message)s',
|
||||
)
|
||||
logger = logging.getLogger('gandalf.web')
|
||||
|
||||
logging.basicConfig(level=logging.DEBUG)
|
||||
logger = logging.getLogger(__name__)
|
||||
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)
|
||||
app = Flask(__name__)
|
||||
device_status = {}
|
||||
|
||||
def load_config():
|
||||
_cfg = None
|
||||
|
||||
|
||||
def _config() -> dict:
|
||||
global _cfg
|
||||
if _cfg is None:
|
||||
with open('config.json') as f:
|
||||
return json.load(f)
|
||||
_cfg = json.load(f)
|
||||
return _cfg
|
||||
|
||||
class UnifiAPI:
|
||||
def __init__(self, config):
|
||||
self.base_url = config['unifi']['controller']
|
||||
self.session = requests.Session()
|
||||
self.session.verify = False
|
||||
self.headers = {
|
||||
'X-API-KEY': config['unifi']['api_key'],
|
||||
'Accept': 'application/json'
|
||||
}
|
||||
self.site_id = "default"
|
||||
|
||||
def get_devices(self):
|
||||
try:
|
||||
url = f"{self.base_url}/proxy/network/v2/api/site/{self.site_id}/device"
|
||||
response = self.session.get(url, headers=self.headers)
|
||||
response.raise_for_status()
|
||||
# ---------------------------------------------------------------------------
|
||||
# Auth helpers
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
# Log raw response
|
||||
logger.debug(f"Response status: {response.status_code}")
|
||||
logger.debug(f"Response headers: {response.headers}")
|
||||
logger.debug(f"Raw response text: {response.text}")
|
||||
|
||||
devices_data = response.json()
|
||||
logger.debug(f"Parsed JSON: {devices_data}")
|
||||
|
||||
# Extract network_devices from the response
|
||||
network_devices = devices_data.get('network_devices', [])
|
||||
|
||||
devices = []
|
||||
for device in network_devices:
|
||||
devices.append({
|
||||
'name': device.get('name', 'Unknown'),
|
||||
'ip': device.get('ip', '0.0.0.0'),
|
||||
'type': device.get('type', 'unknown'),
|
||||
'connection_type': 'fiber' if device.get('uplink', {}).get('media') == 'sfp' else 'copper',
|
||||
'critical': True if device.get('type') in ['udm', 'usw'] else False,
|
||||
'device_id': device.get('mac')
|
||||
})
|
||||
|
||||
logger.debug(f"Processed devices: {devices}")
|
||||
return devices
|
||||
|
||||
except Exception as e:
|
||||
logger.error(f"Error fetching devices: {e}")
|
||||
logger.exception("Full traceback:")
|
||||
return []
|
||||
|
||||
def get_device_details(self, device_id):
|
||||
try:
|
||||
url = f"{self.base_url}/proxy/network/v2/api/site/{self.site_id}/device/{device_id}"
|
||||
response = self.session.get(url, headers=self.headers)
|
||||
response.raise_for_status()
|
||||
return response.json()
|
||||
except Exception as e:
|
||||
logger.error(f"Failed to get device details: {e}")
|
||||
return None
|
||||
|
||||
def get_device_diagnostics(self, device):
|
||||
details = self.get_device_details(device['device_id'])
|
||||
if not details:
|
||||
return {'state': 'ERROR', 'error': 'Failed to fetch device details'}
|
||||
|
||||
diagnostics = {
|
||||
'state': details.get('state', 'unknown'),
|
||||
'interfaces': {
|
||||
'ports': {}
|
||||
}
|
||||
def _get_user() -> dict:
|
||||
return {
|
||||
'username': request.headers.get('Remote-User', ''),
|
||||
'name': request.headers.get('Remote-Name', ''),
|
||||
'email': request.headers.get('Remote-Email', ''),
|
||||
'groups': [
|
||||
g.strip()
|
||||
for g in request.headers.get('Remote-Groups', '').split(',')
|
||||
if g.strip()
|
||||
],
|
||||
}
|
||||
|
||||
# Parse port information
|
||||
for port in details.get('port_table', []):
|
||||
diagnostics['interfaces']['ports'][f"Port {port.get('port_idx')}"] = {
|
||||
'state': 'up' if port.get('up') else 'down',
|
||||
'speed': {
|
||||
'current': port.get('speed', 0),
|
||||
'max': port.get('max_speed', 0)
|
||||
},
|
||||
'poe': port.get('poe_enable', False),
|
||||
'media': port.get('media', 'unknown')
|
||||
}
|
||||
|
||||
return diagnostics
|
||||
def require_auth(f):
|
||||
@wraps(f)
|
||||
def wrapper(*args, **kwargs):
|
||||
user = _get_user()
|
||||
if not user['username']:
|
||||
return (
|
||||
'<h1>401 – Not authenticated</h1>'
|
||||
'<p>Please access Gandalf through '
|
||||
'<a href="https://auth.lotusguild.org">auth.lotusguild.org</a>.</p>',
|
||||
401,
|
||||
)
|
||||
allowed = _config().get('auth', {}).get('allowed_groups', ['admin'])
|
||||
if not any(g in allowed for g in user['groups']):
|
||||
return (
|
||||
f'<h1>403 – Access denied</h1>'
|
||||
f'<p>Your account ({user["username"]}) is not in an allowed group '
|
||||
f'({", ".join(allowed)}).</p>',
|
||||
403,
|
||||
)
|
||||
return f(*args, **kwargs)
|
||||
return wrapper
|
||||
|
||||
def _parse_interfaces(self, interfaces):
|
||||
result = {
|
||||
'ports': {},
|
||||
'radios': {}
|
||||
}
|
||||
for port in interfaces:
|
||||
result['ports'][f"port_{port['index']}"] = {
|
||||
'state': port['up'] and 'up' or 'down',
|
||||
'speed': {
|
||||
'current': port.get('speed', 0),
|
||||
'max': port.get('max_speed', 0)
|
||||
}
|
||||
}
|
||||
return result
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Page routes
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@app.route('/')
|
||||
def home():
|
||||
config = load_config()
|
||||
unifi = UnifiAPI(config)
|
||||
devices = unifi.get_devices()
|
||||
return render_template('index.html', devices=devices)
|
||||
@require_auth
|
||||
def index():
|
||||
user = _get_user()
|
||||
events = db.get_active_events()
|
||||
summary = db.get_status_summary()
|
||||
snapshot_raw = db.get_state('network_snapshot')
|
||||
last_check = db.get_state('last_check', 'Never')
|
||||
snapshot = json.loads(snapshot_raw) if snapshot_raw else {}
|
||||
suppressions = db.get_active_suppressions()
|
||||
return render_template(
|
||||
'index.html',
|
||||
user=user,
|
||||
events=events,
|
||||
summary=summary,
|
||||
snapshot=snapshot,
|
||||
last_check=last_check,
|
||||
suppressions=suppressions,
|
||||
)
|
||||
|
||||
|
||||
@app.route('/suppressions')
|
||||
@require_auth
|
||||
def suppressions_page():
|
||||
user = _get_user()
|
||||
active = db.get_active_suppressions()
|
||||
history = db.get_suppression_history(limit=50)
|
||||
snapshot_raw = db.get_state('network_snapshot')
|
||||
snapshot = json.loads(snapshot_raw) if snapshot_raw else {}
|
||||
return render_template(
|
||||
'suppressions.html',
|
||||
user=user,
|
||||
active=active,
|
||||
history=history,
|
||||
snapshot=snapshot,
|
||||
)
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# API routes
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
@app.route('/api/status')
|
||||
def status():
|
||||
return jsonify(device_status)
|
||||
@require_auth
|
||||
def api_status():
|
||||
return jsonify({
|
||||
'summary': db.get_status_summary(),
|
||||
'last_check': db.get_state('last_check', 'Never'),
|
||||
'events': db.get_active_events(),
|
||||
})
|
||||
|
||||
|
||||
@app.route('/api/network')
|
||||
@require_auth
|
||||
def api_network():
|
||||
raw = db.get_state('network_snapshot')
|
||||
if raw:
|
||||
try:
|
||||
return jsonify(json.loads(raw))
|
||||
except Exception:
|
||||
pass
|
||||
return jsonify({'hosts': {}, 'unifi': [], 'updated': None})
|
||||
|
||||
|
||||
@app.route('/api/events')
|
||||
@require_auth
|
||||
def api_events():
|
||||
return jsonify({
|
||||
'active': db.get_active_events(),
|
||||
'resolved': db.get_recent_resolved(hours=24, limit=30),
|
||||
})
|
||||
|
||||
|
||||
@app.route('/api/suppressions', methods=['GET'])
|
||||
@require_auth
|
||||
def api_get_suppressions():
|
||||
return jsonify(db.get_active_suppressions())
|
||||
|
||||
|
||||
@app.route('/api/suppressions', methods=['POST'])
|
||||
@require_auth
|
||||
def api_create_suppression():
|
||||
user = _get_user()
|
||||
data = request.get_json(silent=True) or {}
|
||||
|
||||
target_type = data.get('target_type', 'host')
|
||||
target_name = (data.get('target_name') or '').strip()
|
||||
target_detail = (data.get('target_detail') or '').strip()
|
||||
reason = (data.get('reason') or '').strip()
|
||||
expires_minutes = data.get('expires_minutes') # None = manual/permanent
|
||||
|
||||
if target_type not in ('host', 'interface', 'unifi_device', 'all'):
|
||||
return jsonify({'error': 'Invalid target_type'}), 400
|
||||
if target_type != 'all' and not target_name:
|
||||
return jsonify({'error': 'target_name required'}), 400
|
||||
if not reason:
|
||||
return jsonify({'error': 'reason required'}), 400
|
||||
|
||||
sup_id = db.create_suppression(
|
||||
target_type=target_type,
|
||||
target_name=target_name,
|
||||
target_detail=target_detail,
|
||||
reason=reason,
|
||||
suppressed_by=user['username'],
|
||||
expires_minutes=int(expires_minutes) if expires_minutes else None,
|
||||
)
|
||||
logger.info(
|
||||
f'Suppression #{sup_id} created by {user["username"]}: '
|
||||
f'{target_type}/{target_name}/{target_detail} – {reason}'
|
||||
)
|
||||
return jsonify({'success': True, 'id': sup_id})
|
||||
|
||||
|
||||
@app.route('/api/suppressions/<int:sup_id>', methods=['DELETE'])
|
||||
@require_auth
|
||||
def api_delete_suppression(sup_id: int):
|
||||
user = _get_user()
|
||||
db.deactivate_suppression(sup_id)
|
||||
logger.info(f'Suppression #{sup_id} removed by {user["username"]}')
|
||||
return jsonify({'success': True})
|
||||
|
||||
|
||||
@app.route('/health')
|
||||
def health():
|
||||
"""Health check endpoint (no auth)."""
|
||||
return jsonify({'status': 'ok', 'service': 'gandalf'})
|
||||
|
||||
@app.route('/api/diagnostics')
|
||||
def get_diagnostics():
|
||||
config = load_config()
|
||||
unifi = UnifiAPI(config)
|
||||
devices = unifi.get_devices()
|
||||
diagnostics = {}
|
||||
for device in devices:
|
||||
diagnostics[device['name']] = unifi.get_device_diagnostics(device)
|
||||
return jsonify(diagnostics)
|
||||
|
||||
if __name__ == '__main__':
|
||||
status_thread = threading.Thread(target=update_status, daemon=True)
|
||||
status_thread.start()
|
||||
app.run(debug=True)
|
||||
app.run(debug=True, host='0.0.0.0', port=5000)
|
||||
|
||||
58
config.json
@@ -4,5 +4,61 @@
|
||||
"api_key": "kyPfIsAVie3hwMD4Bc1MjAu8N7HVPIb8",
|
||||
"site_id": "default"
|
||||
},
|
||||
"check_interval": 30
|
||||
"prometheus": {
|
||||
"url": "http://10.10.10.48:9090"
|
||||
},
|
||||
"database": {
|
||||
"host": "10.10.10.50",
|
||||
"port": 3306,
|
||||
"user": "gandalf",
|
||||
"password": "Gandalf2026Lotus",
|
||||
"name": "gandalf"
|
||||
},
|
||||
"ticket_api": {
|
||||
"url": "http://10.10.10.45/create_ticket_api.php",
|
||||
"api_key": "5acc5d3c647b84f7c6f59082ce4450ee772e2d1633238b960136f653d20c93af"
|
||||
},
|
||||
"auth": {
|
||||
"allowed_groups": ["admin"]
|
||||
},
|
||||
"monitor": {
|
||||
"poll_interval": 120,
|
||||
"failure_threshold": 2,
|
||||
"cluster_threshold": 3,
|
||||
"ping_hosts": [
|
||||
{"name": "pbs", "ip": "10.10.10.3"}
|
||||
]
|
||||
},
|
||||
"hosts": [
|
||||
{
|
||||
"name": "large1",
|
||||
"ip": "10.10.10.2",
|
||||
"prometheus_instance": "10.10.10.2:9100"
|
||||
},
|
||||
{
|
||||
"name": "compute-storage-01",
|
||||
"ip": "10.10.10.4",
|
||||
"prometheus_instance": "10.10.10.4:9100"
|
||||
},
|
||||
{
|
||||
"name": "micro1",
|
||||
"ip": "10.10.10.8",
|
||||
"prometheus_instance": "10.10.10.8:9100"
|
||||
},
|
||||
{
|
||||
"name": "monitor-02",
|
||||
"ip": "10.10.10.9",
|
||||
"prometheus_instance": "10.10.10.9:9100"
|
||||
},
|
||||
{
|
||||
"name": "compute-storage-gpu-01",
|
||||
"ip": "10.10.10.10",
|
||||
"prometheus_instance": "10.10.10.10:9100"
|
||||
},
|
||||
{
|
||||
"name": "storage-01",
|
||||
"ip": "10.10.10.11",
|
||||
"prometheus_instance": "10.10.10.11:9100"
|
||||
}
|
||||
]
|
||||
}
|
||||
304
db.py
Normal file
@@ -0,0 +1,304 @@
|
||||
"""Database operations for Gandalf network monitor."""
|
||||
import json
|
||||
import logging
|
||||
from contextlib import contextmanager
|
||||
from datetime import datetime, timedelta
|
||||
from typing import Optional
|
||||
|
||||
import pymysql
|
||||
import pymysql.cursors
|
||||
|
||||
logger = logging.getLogger(__name__)
|
||||
|
||||
_config_cache = None
|
||||
|
||||
|
||||
def _config() -> dict:
|
||||
global _config_cache
|
||||
if _config_cache is None:
|
||||
with open('config.json') as f:
|
||||
_config_cache = json.load(f)['database']
|
||||
return _config_cache
|
||||
|
||||
|
||||
@contextmanager
|
||||
def get_conn():
|
||||
cfg = _config()
|
||||
conn = pymysql.connect(
|
||||
host=cfg['host'],
|
||||
port=cfg.get('port', 3306),
|
||||
user=cfg['user'],
|
||||
password=cfg['password'],
|
||||
database=cfg['name'],
|
||||
autocommit=True,
|
||||
cursorclass=pymysql.cursors.DictCursor,
|
||||
connect_timeout=10,
|
||||
charset='utf8mb4',
|
||||
)
|
||||
try:
|
||||
yield conn
|
||||
finally:
|
||||
conn.close()
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Monitor state (key/value store)
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def set_state(key: str, value) -> None:
|
||||
if not isinstance(value, str):
|
||||
value = json.dumps(value, default=str)
|
||||
with get_conn() as conn:
|
||||
with conn.cursor() as cur:
|
||||
cur.execute(
|
||||
"""INSERT INTO monitor_state (key_name, value)
|
||||
VALUES (%s, %s)
|
||||
ON DUPLICATE KEY UPDATE value=VALUES(value), updated_at=NOW()""",
|
||||
(key, value),
|
||||
)
|
||||
|
||||
|
||||
def get_state(key: str, default=None):
|
||||
with get_conn() as conn:
|
||||
with conn.cursor() as cur:
|
||||
cur.execute('SELECT value FROM monitor_state WHERE key_name=%s', (key,))
|
||||
row = cur.fetchone()
|
||||
return row['value'] if row else default
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Interface baseline tracking
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def get_baseline() -> dict:
|
||||
raw = get_state('interface_baseline')
|
||||
if raw:
|
||||
try:
|
||||
return json.loads(raw)
|
||||
except Exception:
|
||||
pass
|
||||
return {}
|
||||
|
||||
|
||||
def set_baseline(baseline: dict) -> None:
|
||||
set_state('interface_baseline', json.dumps(baseline))
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Network events
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def upsert_event(
|
||||
event_type: str,
|
||||
severity: str,
|
||||
source_type: str,
|
||||
target_name: str,
|
||||
target_detail: str,
|
||||
description: str,
|
||||
) -> tuple:
|
||||
"""Insert or update a network event. Returns (id, is_new, consecutive_failures)."""
|
||||
detail = target_detail or ''
|
||||
with get_conn() as conn:
|
||||
with conn.cursor() as cur:
|
||||
cur.execute(
|
||||
"""SELECT id, consecutive_failures FROM network_events
|
||||
WHERE event_type=%s AND target_name=%s AND target_detail=%s
|
||||
AND resolved_at IS NULL LIMIT 1""",
|
||||
(event_type, target_name, detail),
|
||||
)
|
||||
existing = cur.fetchone()
|
||||
|
||||
if existing:
|
||||
new_count = existing['consecutive_failures'] + 1
|
||||
cur.execute(
|
||||
"""UPDATE network_events
|
||||
SET last_seen=NOW(), consecutive_failures=%s, description=%s
|
||||
WHERE id=%s""",
|
||||
(new_count, description, existing['id']),
|
||||
)
|
||||
return existing['id'], False, new_count
|
||||
else:
|
||||
cur.execute(
|
||||
"""INSERT INTO network_events
|
||||
(event_type, severity, source_type, target_name, target_detail, description)
|
||||
VALUES (%s, %s, %s, %s, %s, %s)""",
|
||||
(event_type, severity, source_type, target_name, detail, description),
|
||||
)
|
||||
return cur.lastrowid, True, 1
|
||||
|
||||
|
||||
def resolve_event(event_type: str, target_name: str, target_detail: str = '') -> None:
|
||||
detail = target_detail or ''
|
||||
with get_conn() as conn:
|
||||
with conn.cursor() as cur:
|
||||
cur.execute(
|
||||
"""UPDATE network_events SET resolved_at=NOW()
|
||||
WHERE event_type=%s AND target_name=%s AND target_detail=%s
|
||||
AND resolved_at IS NULL""",
|
||||
(event_type, target_name, detail),
|
||||
)
|
||||
|
||||
|
||||
def set_ticket_id(event_id: int, ticket_id: str) -> None:
|
||||
with get_conn() as conn:
|
||||
with conn.cursor() as cur:
|
||||
cur.execute(
|
||||
'UPDATE network_events SET ticket_id=%s WHERE id=%s',
|
||||
(ticket_id, event_id),
|
||||
)
|
||||
|
||||
|
||||
def get_active_events() -> list:
|
||||
with get_conn() as conn:
|
||||
with conn.cursor() as cur:
|
||||
cur.execute(
|
||||
"""SELECT * FROM network_events
|
||||
WHERE resolved_at IS NULL
|
||||
ORDER BY
|
||||
FIELD(severity,'critical','warning','info'),
|
||||
first_seen DESC"""
|
||||
)
|
||||
rows = cur.fetchall()
|
||||
for r in rows:
|
||||
for k in ('first_seen', 'last_seen'):
|
||||
if r.get(k) and hasattr(r[k], 'isoformat'):
|
||||
r[k] = r[k].isoformat()
|
||||
return rows
|
||||
|
||||
|
||||
def get_recent_resolved(hours: int = 24, limit: int = 50) -> list:
|
||||
with get_conn() as conn:
|
||||
with conn.cursor() as cur:
|
||||
cur.execute(
|
||||
"""SELECT * FROM network_events
|
||||
WHERE resolved_at IS NOT NULL
|
||||
AND resolved_at > DATE_SUB(NOW(), INTERVAL %s HOUR)
|
||||
ORDER BY resolved_at DESC LIMIT %s""",
|
||||
(hours, limit),
|
||||
)
|
||||
rows = cur.fetchall()
|
||||
for r in rows:
|
||||
for k in ('first_seen', 'last_seen', 'resolved_at'):
|
||||
if r.get(k) and hasattr(r[k], 'isoformat'):
|
||||
r[k] = r[k].isoformat()
|
||||
return rows
|
||||
|
||||
|
||||
def get_status_summary() -> dict:
|
||||
with get_conn() as conn:
|
||||
with conn.cursor() as cur:
|
||||
cur.execute(
|
||||
"""SELECT severity, COUNT(*) as cnt FROM network_events
|
||||
WHERE resolved_at IS NULL GROUP BY severity"""
|
||||
)
|
||||
counts = {r['severity']: r['cnt'] for r in cur.fetchall()}
|
||||
return {
|
||||
'critical': counts.get('critical', 0),
|
||||
'warning': counts.get('warning', 0),
|
||||
'info': counts.get('info', 0),
|
||||
}
|
||||
|
||||
|
||||
# ---------------------------------------------------------------------------
|
||||
# Suppression rules
|
||||
# ---------------------------------------------------------------------------
|
||||
|
||||
def get_active_suppressions() -> list:
|
||||
with get_conn() as conn:
|
||||
with conn.cursor() as cur:
|
||||
cur.execute(
|
||||
"""SELECT * FROM suppression_rules
|
||||
WHERE active=TRUE AND (expires_at IS NULL OR expires_at > NOW())
|
||||
ORDER BY created_at DESC"""
|
||||
)
|
||||
rows = cur.fetchall()
|
||||
for r in rows:
|
||||
for k in ('created_at', 'expires_at'):
|
||||
if r.get(k) and hasattr(r[k], 'isoformat'):
|
||||
r[k] = r[k].isoformat()
|
||||
return rows
|
||||
|
||||
|
||||
def get_suppression_history(limit: int = 50) -> list:
|
||||
with get_conn() as conn:
|
||||
with conn.cursor() as cur:
|
||||
cur.execute(
|
||||
'SELECT * FROM suppression_rules ORDER BY created_at DESC LIMIT %s',
|
||||
(limit,),
|
||||
)
|
||||
rows = cur.fetchall()
|
||||
for r in rows:
|
||||
for k in ('created_at', 'expires_at'):
|
||||
if r.get(k) and hasattr(r[k], 'isoformat'):
|
||||
r[k] = r[k].isoformat()
|
||||
return rows
|
||||
|
||||
|
||||
def create_suppression(
|
||||
target_type: str,
|
||||
target_name: str,
|
||||
target_detail: str,
|
||||
reason: str,
|
||||
suppressed_by: str,
|
||||
expires_minutes: Optional[int] = None,
|
||||
) -> int:
|
||||
expires_at = None
|
||||
if expires_minutes:
|
||||
expires_at = datetime.utcnow() + timedelta(minutes=int(expires_minutes))
|
||||
with get_conn() as conn:
|
||||
with conn.cursor() as cur:
|
||||
cur.execute(
|
||||
"""INSERT INTO suppression_rules
|
||||
(target_type, target_name, target_detail, reason, suppressed_by, expires_at, active)
|
||||
VALUES (%s, %s, %s, %s, %s, %s, TRUE)""",
|
||||
(target_type, target_name or '', target_detail or '', reason, suppressed_by, expires_at),
|
||||
)
|
||||
return cur.lastrowid
|
||||
|
||||
|
||||
def deactivate_suppression(sup_id: int) -> None:
|
||||
with get_conn() as conn:
|
||||
with conn.cursor() as cur:
|
||||
cur.execute(
|
||||
'UPDATE suppression_rules SET active=FALSE WHERE id=%s', (sup_id,)
|
||||
)
|
||||
|
||||
|
||||
def is_suppressed(target_type: str, target_name: str, target_detail: str = '') -> bool:
    """Return True if an active, unexpired rule covers the given target.

    Rules are checked from broadest to narrowest: global ('all'), then
    name-level (empty target_detail), then an exact target_detail match.
    """
    with get_conn() as conn:
        with conn.cursor() as cur:
            # Global suppression (all)
            cur.execute(
                """SELECT id FROM suppression_rules
                   WHERE active=TRUE AND (expires_at IS NULL OR expires_at > NOW())
                     AND target_type='all' LIMIT 1"""
            )
            if cur.fetchone():
                return True

            if not target_name:
                return False

            # Host-level suppression (covers all interfaces on that host)
            cur.execute(
                """SELECT id FROM suppression_rules
                   WHERE active=TRUE AND (expires_at IS NULL OR expires_at > NOW())
                     AND target_type=%s AND target_name=%s
                     AND (target_detail IS NULL OR target_detail='') LIMIT 1""",
                (target_type, target_name),
            )
            if cur.fetchone():
                return True

            # Interface/device-specific suppression
            if target_detail:
                cur.execute(
                    """SELECT id FROM suppression_rules
                       WHERE active=TRUE AND (expires_at IS NULL OR expires_at > NOW())
                         AND target_type=%s AND target_name=%s AND target_detail=%s LIMIT 1""",
                    (target_type, target_name, target_detail),
                )
                if cur.fetchone():
                    return True

    return False
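The lookup order in `is_suppressed` (global rule, then a name-level rule with an empty `target_detail`, then an exact `target_detail` match) can be illustrated without a database. The following is only a sketch under stated assumptions: `is_suppressed_in_memory`, `rule_is_live`, the dict-shaped rules, and the host names are hypothetical and not part of this commit; the real function runs the SQL shown above.

```python
from datetime import datetime, timedelta
from typing import Dict, List


def rule_is_live(rule: Dict, now: datetime) -> bool:
    """Mirror of `active=TRUE AND (expires_at IS NULL OR expires_at > NOW())`."""
    return bool(rule['active']) and (rule['expires_at'] is None or rule['expires_at'] > now)


def is_suppressed_in_memory(
    rules: List[Dict],
    target_type: str,
    target_name: str,
    target_detail: str,
    now: datetime,
) -> bool:
    """Hypothetical in-memory version of the three-step precedence check."""
    live = [r for r in rules if rule_is_live(r, now)]

    # 1. A live 'all' rule suppresses everything.
    if any(r['target_type'] == 'all' for r in live):
        return True
    if not target_name:
        return False

    for r in live:
        if r['target_type'] != target_type or r['target_name'] != target_name:
            continue
        # 2. Empty detail: the rule covers every detail under this name.
        if not r['target_detail']:
            return True
        # 3. Otherwise the detail (e.g. interface name) must match exactly.
        if target_detail and r['target_detail'] == target_detail:
            return True
    return False


now = datetime(2024, 1, 1, 12, 0, 0)
rules = [
    # Manual rule (no expiry) covering every interface on a hypothetical host.
    {'target_type': 'interface', 'target_name': 'pve-01', 'target_detail': '',
     'active': True, 'expires_at': None},
    # Timed rule that expired one minute ago.
    {'target_type': 'unifi_device', 'target_name': 'ap-attic', 'target_detail': '',
     'active': True, 'expires_at': now - timedelta(minutes=1)},
]
print(is_suppressed_in_memory(rules, 'interface', 'pve-01', 'eno1', now))
print(is_suppressed_in_memory(rules, 'unifi_device', 'ap-attic', '', now))
```

Passing `now` explicitly keeps the sketch deterministic; the SQL version relies on the database clock instead.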
22
gandalf-monitor.service
Normal file
@@ -0,0 +1,22 @@
[Unit]
Description=Gandalf Network Monitor Daemon
Documentation=https://gitea.lotusguild.org/LotusGuild/gandalf
After=network-online.target
Wants=network-online.target

[Service]
Type=simple
User=www-data
WorkingDirectory=/var/www/html/prod
ExecStart=/usr/bin/python3 /var/www/html/prod/monitor.py
Restart=on-failure
RestartSec=30
TimeoutStopSec=10

# Logging
StandardOutput=journal
StandardError=journal
SyslogIdentifier=gandalf-monitor

[Install]
WantedBy=multi-user.target
479
monitor.py
Normal file
@@ -0,0 +1,479 @@
#!/usr/bin/env python3
"""Gandalf network monitor daemon.

Polls Prometheus (node_exporter) and the UniFi controller for network
interface and device state. Creates tickets in Tinker Tickets when issues
are detected, with deduplication and suppression support.

Run as a separate systemd service alongside the Flask web app.
"""
import json
import logging
import re
import subprocess
import time
from datetime import datetime
from typing import Dict, List, Optional

import requests
from urllib3.exceptions import InsecureRequestWarning

import db

requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s %(levelname)s %(name)s %(message)s',
)
logger = logging.getLogger('gandalf.monitor')

# --------------------------------------------------------------------------
# Interface filtering
# --------------------------------------------------------------------------
_SKIP_PREFIXES = (
    'lo', 'veth', 'tap', 'fwbr', 'fwln', 'fwpr',
    'docker', 'dummy', 'br-', 'virbr', 'vmbr',
)
_VLAN_SUFFIX = re.compile(r'\.\d+$')


def is_physical_interface(name: str) -> bool:
    """Return True for physical/bond interfaces worth monitoring."""
    # str.startswith accepts a tuple of prefixes.
    if name.startswith(_SKIP_PREFIXES):
        return False
    if _VLAN_SUFFIX.search(name):
        return False
    return True


# --------------------------------------------------------------------------
# Prometheus client
# --------------------------------------------------------------------------
class PrometheusClient:
    def __init__(self, url: str):
        self.url = url.rstrip('/')

    def query(self, promql: str) -> list:
        try:
            resp = requests.get(
                f'{self.url}/api/v1/query',
                params={'query': promql},
                timeout=15,
            )
            resp.raise_for_status()
            data = resp.json()
            if data.get('status') == 'success':
                return data['data']['result']
        except Exception as e:
            logger.error(f'Prometheus query failed ({promql!r}): {e}')
        return []

    def get_interface_states(self) -> Dict[str, Dict[str, bool]]:
        """Return {instance: {device: is_up}} for physical interfaces."""
        results = self.query('node_network_up')
        hosts: Dict[str, Dict[str, bool]] = {}
        for r in results:
            instance = r['metric'].get('instance', '')
            device = r['metric'].get('device', '')
            if not is_physical_interface(device):
                continue
            hosts.setdefault(instance, {})[device] = (r['value'][1] == '1')
        return hosts


# --------------------------------------------------------------------------
# UniFi client
# --------------------------------------------------------------------------
class UnifiClient:
    def __init__(self, cfg: dict):
        self.base_url = cfg['controller']
        self.site_id = cfg.get('site_id', 'default')
        self.session = requests.Session()
        self.session.verify = False
        self.headers = {
            'X-API-KEY': cfg['api_key'],
            'Accept': 'application/json',
        }

    def get_devices(self) -> Optional[List[dict]]:
        """Return list of UniFi devices, or None if the controller is unreachable."""
        try:
            url = f'{self.base_url}/proxy/network/v2/api/site/{self.site_id}/device'
            resp = self.session.get(url, headers=self.headers, timeout=15)
            resp.raise_for_status()
            data = resp.json()
            devices = []
            for d in data.get('network_devices', []):
                state = d.get('state', 1)
                devices.append({
                    'name': d.get('name') or d.get('mac', 'unknown'),
                    'mac': d.get('mac', ''),
                    'ip': d.get('ip', ''),
                    'type': d.get('type', 'unknown'),
                    'model': d.get('model', ''),
                    'state': state,
                    'connected': state == 1,
                })
            return devices
        except Exception as e:
            logger.error(f'UniFi API error: {e}')
            return None


# --------------------------------------------------------------------------
# Ticket client
# --------------------------------------------------------------------------
class TicketClient:
    def __init__(self, cfg: dict):
        self.url = cfg.get('url', '')
        self.api_key = cfg.get('api_key', '')

    def create(self, title: str, description: str, priority: str = '2') -> Optional[str]:
        if not self.api_key or not self.url:
            logger.warning('Ticket API not configured – skipping ticket creation')
            return None
        try:
            resp = requests.post(
                self.url,
                json={
                    'title': title,
                    'description': description,
                    'status': 'Open',
                    'priority': priority,
                    'category': 'Network',
                    'type': 'Issue',
                },
                headers={'Authorization': f'Bearer {self.api_key}'},
                timeout=15,
            )
            resp.raise_for_status()
            data = resp.json()
            if data.get('success'):
                tid = data['ticket_id']
                logger.info(f'Created ticket #{tid}: {title}')
                return tid
            if data.get('existing_ticket_id'):
                logger.info(f'Duplicate suppressed by API – existing #{data["existing_ticket_id"]}')
                return data['existing_ticket_id']
            logger.warning(f'Unexpected ticket API response: {data}')
        except Exception as e:
            logger.error(f'Ticket creation failed: {e}')
        return None


# --------------------------------------------------------------------------
# Helpers
# --------------------------------------------------------------------------
def ping(ip: str, count: int = 3, timeout: int = 2) -> bool:
    """Return True if `ip` answers ICMP echo (any failure counts as unreachable)."""
    try:
        r = subprocess.run(
            ['ping', '-c', str(count), '-W', str(timeout), ip],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=30,
        )
        return r.returncode == 0
    except Exception:
        return False


def _now_utc() -> str:
    return datetime.utcnow().strftime('%Y-%m-%d %H:%M:%S UTC')


# --------------------------------------------------------------------------
# Monitor
# --------------------------------------------------------------------------
CLUSTER_NAME = 'proxmox-cluster'


class NetworkMonitor:
    def __init__(self):
        with open('config.json') as f:
            self.cfg = json.load(f)

        prom_url = self.cfg['prometheus']['url']
        self.prom = PrometheusClient(prom_url)
        self.unifi = UnifiClient(self.cfg['unifi'])
        self.tickets = TicketClient(self.cfg.get('ticket_api', {}))

        mon = self.cfg.get('monitor', {})
        self.poll_interval = mon.get('poll_interval', 120)
        self.fail_thresh = mon.get('failure_threshold', 2)
        self.cluster_thresh = mon.get('cluster_threshold', 3)

        # Build Prometheus instance → hostname lookup
        self._instance_map: Dict[str, str] = {
            h['prometheus_instance']: h['name']
            for h in self.cfg.get('hosts', [])
            if 'prometheus_instance' in h
        }

    def _hostname(self, instance: str) -> str:
        return self._instance_map.get(instance, instance.split(':')[0])

    # ------------------------------------------------------------------
    # Interface monitoring (Prometheus)
    # ------------------------------------------------------------------
    def _process_interfaces(self, states: Dict[str, Dict[str, bool]]) -> None:
        baseline = db.get_baseline()
        new_baseline = {k: dict(v) for k, v in baseline.items()}
        # Only count hosts with genuine regressions (UP→DOWN) toward cluster threshold
        hosts_with_regression: List[str] = []

        for instance, ifaces in states.items():
            host = self._hostname(instance)
            new_baseline.setdefault(host, {})
            host_has_regression = False

            for iface, is_up in ifaces.items():
                prev = baseline.get(host, {}).get(iface)  # 'up', 'initial_down', or None

                if is_up:
                    new_baseline[host][iface] = 'up'
                    db.resolve_event('interface_down', host, iface)
                else:
                    if prev is None:
                        # First observation is down – could be unused port, don't alert
                        new_baseline[host][iface] = 'initial_down'

                    elif prev == 'initial_down':
                        # Persistently down since first observation – no alert
                        pass

                    else:  # prev == 'up'
                        # Regression: was UP, now DOWN
                        host_has_regression = True
                        sup = (
                            db.is_suppressed('interface', host, iface) or
                            db.is_suppressed('host', host)
                        )
                        event_id, is_new, consec = db.upsert_event(
                            'interface_down', 'critical', 'prometheus',
                            host, iface,
                            f'Interface {iface} on {host} went link-down ({_now_utc()})',
                        )
                        if not sup and consec >= self.fail_thresh:
                            self._ticket_interface(event_id, is_new, host, iface, consec)

            if host_has_regression:
                hosts_with_regression.append(host)

        db.set_baseline(new_baseline)

        # Cluster-wide check – only genuine regressions count
        if len(hosts_with_regression) >= self.cluster_thresh:
            sup = db.is_suppressed('all', '')
            event_id, is_new, consec = db.upsert_event(
                'cluster_network_issue', 'critical', 'prometheus',
                CLUSTER_NAME, '',
                f'{len(hosts_with_regression)} hosts reporting simultaneous interface failures: '
                f'{", ".join(hosts_with_regression)}',
            )
            if not sup and is_new:
                title = (
                    f'[{CLUSTER_NAME}][auto][production][issue][network][cluster-wide] '
                    f'Multiple hosts reporting interface failures'
                )
                desc = (
                    f'Cluster Network Alert\n{"=" * 40}\n\n'
                    f'Affected hosts: {", ".join(hosts_with_regression)}\n'
                    f'Detected: {_now_utc()}\n\n'
                    f'{len(hosts_with_regression)} Proxmox hosts simultaneously reported '
                    f'interface regressions (link-down on interfaces previously known UP).\n'
                    f'This likely indicates a switch or upstream network failure.\n\n'
                    f'Please check the core and management switches immediately.'
                )
                tid = self.tickets.create(title, desc, priority='1')
                if tid:
                    db.set_ticket_id(event_id, tid)
        else:
            db.resolve_event('cluster_network_issue', CLUSTER_NAME, '')

    def _ticket_interface(
        self, event_id: int, is_new: bool, host: str, iface: str, consec: int
    ) -> None:
        title = (
            f'[{host}][auto][production][issue][network][single-node] '
            f'Interface {iface} link-down'
        )
        desc = (
            f'Network Interface Alert\n{"=" * 40}\n\n'
            f'Host: {host}\n'
            f'Interface: {iface}\n'
            f'Detected: {_now_utc()}\n'
            f'Consecutive check failures: {consec}\n\n'
            f'Interface {iface} on {host} is reporting link-down state via '
            f'Prometheus node_exporter.\n\n'
            f'Note: {host} may still be reachable via its other network interface.\n'
            f'Please inspect the cable/SFP/switch port for {host}/{iface}.'
        )
        tid = self.tickets.create(title, desc, priority='2')
        # Store the ticket id whenever one is returned. Gating on is_new would
        # skip it when failure_threshold > 1, because the event is no longer
        # new by the cycle in which the ticket is first created.
        if tid:
            db.set_ticket_id(event_id, tid)

    # ------------------------------------------------------------------
    # UniFi device monitoring
    # ------------------------------------------------------------------
    def _process_unifi(self, devices: Optional[List[dict]]) -> None:
        if devices is None:
            logger.warning('UniFi API unreachable this cycle')
            return

        for d in devices:
            name = d['name']
            if not d['connected']:
                sup = db.is_suppressed('unifi_device', name)
                event_id, is_new, consec = db.upsert_event(
                    'unifi_device_offline', 'critical', 'unifi',
                    name, d.get('type', ''),
                    f'UniFi {name} ({d.get("ip","")}) offline ({_now_utc()})',
                )
                if not sup and consec >= self.fail_thresh:
                    self._ticket_unifi(event_id, is_new, d)
            else:
                db.resolve_event('unifi_device_offline', name, d.get('type', ''))

    def _ticket_unifi(self, event_id: int, is_new: bool, device: dict) -> None:
        name = device['name']
        title = (
            f'[{name}][auto][production][issue][network][single-node] '
            f'UniFi device offline'
        )
        desc = (
            f'UniFi Device Alert\n{"=" * 40}\n\n'
            f'Device: {name}\n'
            f'Type: {device.get("type","unknown")}\n'
            f'Model: {device.get("model","")}\n'
            f'Last Known IP: {device.get("ip","unknown")}\n'
            f'Detected: {_now_utc()}\n\n'
            f'The UniFi device {name} is offline per the UniFi controller.\n'
            f'Please check power and cable connectivity.'
        )
        tid = self.tickets.create(title, desc, priority='2')
        # Store the ticket id whenever one is returned; with
        # failure_threshold > 1 the event is no longer new at this point.
        if tid:
            db.set_ticket_id(event_id, tid)

    # ------------------------------------------------------------------
    # Ping-only hosts (no node_exporter)
    # ------------------------------------------------------------------
    def _process_ping_hosts(self) -> None:
        for h in self.cfg.get('monitor', {}).get('ping_hosts', []):
            name, ip = h['name'], h['ip']
            reachable = ping(ip)

            if not reachable:
                sup = db.is_suppressed('host', name)
                event_id, is_new, consec = db.upsert_event(
                    'host_unreachable', 'critical', 'ping',
                    name, ip,
                    f'Host {name} ({ip}) unreachable via ping ({_now_utc()})',
                )
                if not sup and consec >= self.fail_thresh:
                    self._ticket_unreachable(event_id, is_new, name, ip, consec)
            else:
                db.resolve_event('host_unreachable', name, ip)

    def _ticket_unreachable(
        self, event_id: int, is_new: bool, name: str, ip: str, consec: int
    ) -> None:
        title = (
            f'[{name}][auto][production][issue][network][single-node] '
            f'Host unreachable'
        )
        desc = (
            f'Host Reachability Alert\n{"=" * 40}\n\n'
            f'Host: {name}\n'
            f'IP: {ip}\n'
            f'Detected: {_now_utc()}\n'
            f'Consecutive check failures: {consec}\n\n'
            f'Host {name} ({ip}) is not responding to ping from the Gandalf monitor.\n'
            f'This host does not have a Prometheus node_exporter, so interface-level '
            f'detail is unavailable.\n\n'
            f'Please check the host power, management interface, and network connectivity.'
        )
        tid = self.tickets.create(title, desc, priority='2')
        # Store the ticket id whenever one is returned; with
        # failure_threshold > 1 the event is no longer new at this point.
        if tid:
            db.set_ticket_id(event_id, tid)

    # ------------------------------------------------------------------
    # Snapshot collection (for dashboard)
    # ------------------------------------------------------------------
    def _collect_snapshot(self) -> dict:
        iface_states = self.prom.get_interface_states()
        unifi_devices = self.unifi.get_devices() or []

        hosts = {}
        for instance, ifaces in iface_states.items():
            host = self._hostname(instance)
            phys = dict(ifaces)
            up_count = sum(1 for v in phys.values() if v)
            total = len(phys)
            if total == 0 or up_count == total:
                status = 'up'
            elif up_count == 0:
                status = 'down'
            else:
                status = 'degraded'

            hosts[host] = {
                'ip': instance.split(':')[0],
                'interfaces': {k: ('up' if v else 'down') for k, v in phys.items()},
                'status': status,
                'source': 'prometheus',
            }

        for h in self.cfg.get('monitor', {}).get('ping_hosts', []):
            name, ip = h['name'], h['ip']
            reachable = ping(ip, count=1, timeout=2)
            hosts[name] = {
                'ip': ip,
                'interfaces': {},
                'status': 'up' if reachable else 'down',
                'source': 'ping',
            }

        return {
            'hosts': hosts,
            'unifi': unifi_devices,
            'updated': datetime.utcnow().isoformat(),
        }

    # ------------------------------------------------------------------
    # Main loop
    # ------------------------------------------------------------------
    def run(self) -> None:
        logger.info(
            f'Gandalf monitor started – poll_interval={self.poll_interval}s '
            f'fail_thresh={self.fail_thresh}'
        )
        while True:
            try:
                logger.info('Starting network check cycle')

                # 1. Collect and store snapshot for dashboard
                snapshot = self._collect_snapshot()
                db.set_state('network_snapshot', snapshot)
                db.set_state('last_check', _now_utc())

                # 2. Process alerts (separate Prometheus call for fresh data)
                iface_states = self.prom.get_interface_states()
                self._process_interfaces(iface_states)

                unifi_devices = self.unifi.get_devices()
                self._process_unifi(unifi_devices)

                self._process_ping_hosts()

                logger.info('Network check cycle complete')

            except Exception as e:
                logger.error(f'Monitor loop error: {e}', exc_info=True)

            time.sleep(self.poll_interval)


if __name__ == '__main__':
    monitor = NetworkMonitor()
    monitor.run()
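The baseline rule in `_process_interfaces` (an interface first seen down is recorded as `initial_down` and never alerted on; only an interface whose baseline is `up` can regress) can be reduced to a small pure function. This is a sketch for illustration only: `classify`, `find_regressions`, and the host names are hypothetical, and the database calls, suppression checks, and ticketing of the real method are left out.

```python
from typing import Dict, List, Optional, Tuple

Baseline = Dict[str, Dict[str, str]]


def classify(prev: Optional[str], is_up: bool) -> Tuple[str, bool]:
    """Return (baseline state to store, whether this is an UP→DOWN regression)."""
    if is_up:
        return 'up', False
    if prev is None or prev == 'initial_down':
        # Down on first sight, or still down since first sight: not an alert.
        return 'initial_down', False
    # Was up, now down. The baseline stays 'up' so the regression keeps
    # being reported on later cycles until the link comes back.
    return 'up', True


def find_regressions(
    baseline: Baseline, states: Dict[str, Dict[str, bool]]
) -> Tuple[Baseline, List[str]]:
    """Return the updated baseline and the hosts with at least one regression."""
    new_baseline = {host: dict(ifaces) for host, ifaces in baseline.items()}
    regressed: List[str] = []
    for host, ifaces in states.items():
        new_baseline.setdefault(host, {})
        host_regressed = False
        for iface, is_up in ifaces.items():
            state, regression = classify(baseline.get(host, {}).get(iface), is_up)
            new_baseline[host][iface] = state
            host_regressed = host_regressed or regression
        if host_regressed:
            regressed.append(host)
    return new_baseline, regressed


baseline = {'pve-01': {'eno1': 'up'}, 'pve-02': {'eno1': 'up'}, 'pve-03': {'eno1': 'up'}}
states = {
    'pve-01': {'eno1': False, 'eno2': False},  # eno2 has never been seen before
    'pve-02': {'eno1': False},
    'pve-03': {'eno1': True},
}
new_baseline, regressed = find_regressions(baseline, states)
print(regressed)
print(new_baseline['pve-01'])
```

With the default `cluster_threshold` of 3, the two regressed hosts in this example would produce per-interface events but no cluster-wide ticket.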
5
requirements.txt
Normal file
@@ -0,0 +1,5 @@
flask>=2.2.0
gunicorn>=20.1.0
pymysql>=1.1.0
requests>=2.31.0
urllib3>=2.0.0
50
schema.sql
Normal file
@@ -0,0 +1,50 @@
-- Gandalf Network Monitor – Database Schema
-- Run on MariaDB LXC 149 (10.10.10.50)

CREATE DATABASE IF NOT EXISTS gandalf
    CHARACTER SET utf8mb4
    COLLATE utf8mb4_unicode_ci;

USE gandalf;

-- ── Network events (open and resolved alerts) ─────────────────────────
CREATE TABLE IF NOT EXISTS network_events (
    id                   INT AUTO_INCREMENT PRIMARY KEY,
    event_type           VARCHAR(60) NOT NULL,
    severity             ENUM('critical','warning','info') NOT NULL DEFAULT 'warning',
    source_type          VARCHAR(20) NOT NULL,              -- 'prometheus', 'unifi', 'ping'
    target_name          VARCHAR(255) NOT NULL,             -- hostname or device name
    target_detail        VARCHAR(255) NOT NULL DEFAULT '',  -- interface name, device type, IP
    description          TEXT,
    first_seen           TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    last_seen            TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
    resolved_at          TIMESTAMP NULL,
    consecutive_failures INT NOT NULL DEFAULT 1,
    ticket_id            VARCHAR(20) NULL,

    INDEX idx_active (resolved_at),
    INDEX idx_target (target_name, target_detail),
    INDEX idx_type   (event_type)
) ENGINE=InnoDB;

-- ── Suppression rules ─────────────────────────────────────────────────
CREATE TABLE IF NOT EXISTS suppression_rules (
    id            INT AUTO_INCREMENT PRIMARY KEY,
    target_type   VARCHAR(50) NOT NULL,              -- 'host', 'interface', 'unifi_device', 'all'
    target_name   VARCHAR(255) NOT NULL DEFAULT '',
    target_detail VARCHAR(255) NOT NULL DEFAULT '',
    reason        TEXT NOT NULL,
    suppressed_by VARCHAR(255) NOT NULL,
    created_at    TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    expires_at    TIMESTAMP NULL,                    -- NULL = manual (never auto-expires)
    active        BOOLEAN NOT NULL DEFAULT TRUE,

    INDEX idx_active_exp (active, expires_at)
) ENGINE=InnoDB;

-- ── Monitor state (key/value store for snapshot + baseline) ───────────
CREATE TABLE IF NOT EXISTS monitor_state (
    key_name   VARCHAR(100) PRIMARY KEY,
    value      MEDIUMTEXT NOT NULL,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP
) ENGINE=InnoDB;
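The `monitor_state` table is a plain key/value store, and `monitor.py` writes both a dict (`network_snapshot`) and a string (`last_check`) to it through `db.set_state`. The implementation of `db.set_state` is not shown in this part of the diff, so the following is only a guess at one way such a write could be prepared: `build_state_upsert` and `UPSERT_SQL` are hypothetical names, and JSON is an assumed serialisation format.

```python
import json
from typing import Any, Tuple

# MariaDB upsert keyed on the table's primary key (key_name).
UPSERT_SQL = (
    'INSERT INTO monitor_state (key_name, value) VALUES (%s, %s) '
    'ON DUPLICATE KEY UPDATE value = VALUES(value)'
)


def build_state_upsert(key: str, value: Any) -> Tuple[str, Tuple[str, str]]:
    """Return (sql, params) for storing `value` under `key`.

    Strings are stored as-is; anything else is serialised to JSON, with
    `default=str` as a fallback for values such as datetimes.
    """
    payload = value if isinstance(value, str) else json.dumps(value, default=str)
    return UPSERT_SQL, (key, payload)


sql, params = build_state_upsert('network_snapshot', {'hosts': {}, 'unifi': []})
print(params)
```

The returned pair is shaped for a DB-API call such as `cur.execute(sql, params)`; the `updated_at` column needs no explicit value because the schema sets it on insert and update.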
371
static/app.js
@@ -1,147 +1,272 @@
-// Initialization
-const UPDATE_INTERVALS = {
-    deviceStatus: 30000,
-    diagnostics: 60000
-};
+'use strict';

-// Core update functions
-function updateDeviceStatus() {
-    console.log('Fetching device status...');
-    fetch('/api/status')
-        .then(response => response.json())
-        .then(data => {
-            console.log('Received status data:', data);
-            Object.entries(data).forEach(([deviceName, status]) => {
-                const deviceElement = document.querySelector(`.device-status[data-device-name="${deviceName}"]`);
-                if (deviceElement) {
-                    const indicator = deviceElement.querySelector('.status-indicator');
-                    indicator.className = `status-indicator status-${status ? 'up' : 'down'}`;
+// ── Toast notifications ───────────────────────────────────────────────
+function showToast(msg, type = 'success') {
+    let container = document.querySelector('.toast-container');
+    if (!container) {
+        container = document.createElement('div');
+        container.className = 'toast-container';
+        document.body.appendChild(container);
     }
-            });
-        });
+    const toast = document.createElement('div');
+    toast.className = `toast toast-${type}`;
+    toast.textContent = msg;
+    container.appendChild(toast);
+    setTimeout(() => toast.remove(), 3500);
 }

-function toggleInterfaces(header) {
-    const list = header.nextElementSibling;
-    const icon = header.querySelector('.expand-icon');
-    list.classList.toggle('collapsed');
-    icon.style.transform = list.classList.contains('collapsed') ? 'rotate(-90deg)' : 'rotate(0deg)';
+// ── Dashboard auto-refresh ────────────────────────────────────────────
+async function refreshAll() {
+    try {
+        const [netResp, statusResp] = await Promise.all([
+            fetch('/api/network'),
+            fetch('/api/status'),
+        ]);
+        if (!netResp.ok || !statusResp.ok) return;
+
+        const net = await netResp.json();
+        const status = await statusResp.json();
+
+        updateHostGrid(net.hosts || {});
+        updateUnifiTable(net.unifi || []);
+        updateEventsTable(status.events || []);
+        updateStatusBar(status.summary || {}, status.last_check || '');
+        updateTopology(net.hosts || {});
+
+    } catch (e) {
+        console.warn('Refresh failed:', e);
+    }
 }

-function updateInterfaceStatus(deviceName, interfaces) {
-    const interfaceList = document.querySelector(`.interface-group[data-device-name="${deviceName}"] .interface-list`);
-    if (interfaceList && interfaces) {
-        interfaceList.innerHTML = '';
-        Object.entries(interfaces.ports || {}).forEach(([portName, port]) => {
-            interfaceList.innerHTML += `
-                <div class="interface-item">
-                    <span class="port-name">${portName}</span>
-                    <span class="port-speed">${port.speed.current}/${port.speed.max} Mbps</span>
-                    <span class="port-status ${port.state}">${port.state}</span>
+function updateStatusBar(summary, lastCheck) {
+    const bar = document.querySelector('.status-chips');
+    if (!bar) return;
+    const chips = [];
+    if (summary.critical) chips.push(`<span class="chip chip-critical">⬤ ${summary.critical} Critical</span>`);
+    if (summary.warning) chips.push(`<span class="chip chip-warning">⬤ ${summary.warning} Warning</span>`);
+    if (!summary.critical && !summary.warning) chips.push('<span class="chip chip-ok">✔ All systems nominal</span>');
+    bar.innerHTML = chips.join('');
+
+    const lc = document.getElementById('last-check');
+    if (lc && lastCheck) lc.textContent = `Last check: ${lastCheck}`;
+}
+
+function updateHostGrid(hosts) {
+    for (const [name, host] of Object.entries(hosts)) {
+        const card = document.querySelector(`.host-card[data-host="${CSS.escape(name)}"]`);
+        if (!card) continue;
+
+        // Update card border class
+        card.className = card.className.replace(/host-card-(up|down|degraded|unknown)/g, '');
+        card.classList.add(`host-card-${host.status}`);
+
+        // Update status dot in header
+        const dot = card.querySelector('.host-status-dot');
+        if (dot) dot.className = `host-status-dot dot-${host.status}`;
+
+        // Update interface rows
+        const ifaceList = card.querySelector('.iface-list');
+        if (ifaceList && host.interfaces && Object.keys(host.interfaces).length > 0) {
+            ifaceList.innerHTML = Object.entries(host.interfaces)
+                .sort(([a], [b]) => a.localeCompare(b))
+                .map(([iface, state]) => `
+                    <div class="iface-row">
+                        <span class="iface-dot dot-${state}"></span>
+                        <span class="iface-name">${escHtml(iface)}</span>
+                        <span class="iface-state state-${state}">${state}</span>
                     </div>
-            `;
-        });
+                `).join('');
         }
     }
+}

-function updateSystemHealth(deviceName, diagnostics) {
-    const metricsContainer = document.querySelector(`.health-metrics[data-device-name="${deviceName}"] .metrics-list`);
-    if (metricsContainer && diagnostics) {
-        const cpu = metricsContainer.querySelector('.cpu');
-        const memory = metricsContainer.querySelector('.memory');
-        const temperature = metricsContainer.querySelector('.temperature');
-
-        cpu.innerHTML = `CPU: ${diagnostics.system?.cpu || 'N/A'}%`;
-        memory.innerHTML = `Memory: ${diagnostics.system?.memory || 'N/A'}%`;
-        temperature.innerHTML = `Temp: ${diagnostics.system?.temperature || 'N/A'}°C`;
+function updateTopology(hosts) {
+    document.querySelectorAll('.topo-host').forEach(node => {
+        const name = node.dataset.host;
+        const host = hosts[name];
+        if (!host) return;
+        node.className = node.className.replace(/topo-status-(up|down|degraded|unknown)/g, '');
+        node.classList.add(`topo-status-${host.status}`);
+        const badge = node.querySelector('.topo-badge');
+        if (badge) {
+            badge.className = `topo-badge topo-badge-${host.status}`;
+            badge.textContent = host.status;
         }
-}
-
-function updateSystemMetrics() {
-    fetch('/api/metrics')
-        .then(response => response.json())
-        .then(data => {
-            updateInterfaceStatus(data.interfaces);
-            updatePowerMetrics(data.power);
-            updateSystemHealth(data.health);
     });
 }

-//Metric updates like interfaces, power, and health
+function updateUnifiTable(devices) {
+    const tbody = document.querySelector('#unifi-table tbody');
+    if (!tbody || !devices.length) return;

-function updateDiagnostics() {
-    fetch('/api/diagnostics')
-        .then(response => response.json())
-        .then(data => {
-            Object.entries(data).forEach(([deviceName, diagnostics]) => {
-                updateInterfaceStatus(deviceName, diagnostics.interfaces);
-                updateSystemHealth(deviceName, diagnostics);
-            });
-        });
+    tbody.innerHTML = devices.map(d => {
+        const statusClass = d.connected ? '' : 'row-critical';
+        const dotClass = d.connected ? 'dot-up' : 'dot-down';
+        const statusText = d.connected ? 'Online' : 'Offline';
+        const suppressBtn = !d.connected
+            ? `<button class="btn-sm btn-suppress"
+                   onclick="openSuppressModal('unifi_device','${escHtml(d.name)}','')">🔕 Suppress</button>`
+            : '';
+        return `
+            <tr class="${statusClass}">
+                <td><span class="${dotClass}"></span> ${statusText}</td>
+                <td><strong>${escHtml(d.name)}</strong></td>
+                <td>${escHtml(d.type)}</td>
+                <td>${escHtml(d.model)}</td>
+                <td>${escHtml(d.ip)}</td>
+                <td>${suppressBtn}</td>
+            </tr>`;
+    }).join('');
 }

-// Element creation functions
-function createDiagnosticElement(device, diagnostics) {
-    const element = document.createElement('div');
-    element.className = `diagnostic-item ${diagnostics.connection_type}-diagnostic`;
+function updateEventsTable(events) {
+    const wrap = document.getElementById('events-table-wrap');
+    if (!wrap) return;

-    const content = `
-        <h3>${device}</h3>
-        <div class="diagnostic-details">
-            <div class="status-group">
-                <span class="label">Status:</span>
-                <span class="value ${diagnostics.state.toLowerCase()}">${diagnostics.state}</span>
-            </div>
-            <div class="firmware-group">
-                <span class="label">Firmware:</span>
-                <span class="value">${diagnostics.firmware.version}</span>
-            </div>
-            ${createInterfaceHTML(diagnostics.interfaces)}
-        </div>
-    `;
-
-    element.innerHTML = content;
-    return element;
+    const active = events.filter(e => e.severity !== 'info');
+    if (!active.length) {
+        wrap.innerHTML = '<p class="empty-state">No active alerts ✔</p>';
+        return;
     }

-function createInterfaceHTML(interfaces) {
-    let html = '<div class="interfaces-group">';
+    const rows = active.map(e => {
+        const supType = e.event_type === 'unifi_device_offline' ? 'unifi_device'
+                      : e.event_type === 'interface_down' ? 'interface'
+                      : 'host';
+        const ticket = e.ticket_id
+            ? `<a href="http://t.lotusguild.org/ticket/${e.ticket_id}" target="_blank"
+                  class="ticket-link">#${e.ticket_id}</a>`
+            : '–';
+        return `
+            <tr class="row-${e.severity}">
+                <td><span class="badge badge-${e.severity}">${e.severity}</span></td>
+                <td>${escHtml(e.event_type.replace(/_/g,' '))}</td>
+                <td><strong>${escHtml(e.target_name)}</strong></td>
+                <td>${escHtml(e.target_detail || '–')}</td>
+                <td class="desc-cell" title="${escHtml(e.description || '')}">${escHtml((e.description||'').substring(0,60))}${(e.description||'').length>60?'…':''}</td>
+                <td class="ts-cell">${escHtml(e.first_seen||'')}</td>
+                <td>${e.consecutive_failures}</td>
+                <td>${ticket}</td>
+                <td>
+                    <button class="btn-sm btn-suppress"
+                        onclick="openSuppressModal('${supType}','${escHtml(e.target_name)}','${escHtml(e.target_detail||'')}')">
+                        🔕
+                    </button>
+                </td>
+            </tr>`;
+    }).join('');

-    // Add port information
-    Object.entries(interfaces.ports || {}).forEach(([portName, port]) => {
-        html += `
-            <div class="interface-item">
-                <span class="label">${portName}:</span>
-                <span class="value">${port.speed.current}/${port.speed.max} Mbps</span>
-                <span class="state ${port.state.toLowerCase()}">${port.state}</span>
-            </div>
-        `;
+    wrap.innerHTML = `
+        <table class="data-table" id="events-table">
+            <thead>
+                <tr>
+                    <th>Severity</th><th>Type</th><th>Target</th><th>Detail</th>
+                    <th>Description</th><th>First Seen</th><th>Failures</th><th>Ticket</th><th>Actions</th>
+                </tr>
+            </thead>
+            <tbody>${rows}</tbody>
+        </table>`;
+}

// ── Suppression modal (dashboard) ────────────────────────────────────
|
||||
function openSuppressModal(type, name, detail) {
|
||||
const modal = document.getElementById('suppress-modal');
|
||||
if (!modal) return;
|
||||
|
||||
document.getElementById('sup-type').value = type;
|
||||
document.getElementById('sup-name').value = name;
|
||||
document.getElementById('sup-detail').value = detail;
|
||||
document.getElementById('sup-reason').value = '';
|
||||
document.getElementById('sup-expires').value = '';
|
||||
|
||||
updateSuppressForm();
|
||||
modal.style.display = 'flex';
|
||||
|
||||
document.querySelectorAll('#suppress-modal .pill').forEach(p => p.classList.remove('active'));
|
||||
const manualPill = document.querySelector('#suppress-modal .pill-manual');
|
||||
if (manualPill) manualPill.classList.add('active');
|
||||
const hint = document.getElementById('duration-hint');
|
||||
if (hint) hint.textContent = 'Suppression will persist until manually removed.';
|
||||
}
|
||||
|
||||
function closeSuppressModal() {
|
||||
const modal = document.getElementById('suppress-modal');
|
||||
if (modal) modal.style.display = 'none';
|
||||
}
|
||||
|
||||
function updateSuppressForm() {
|
||||
const type = document.getElementById('sup-type').value;
|
||||
const nameGrp = document.getElementById('sup-name-group');
|
||||
const detailGrp = document.getElementById('sup-detail-group');
|
||||
if (nameGrp) nameGrp.style.display = (type === 'all') ? 'none' : '';
|
||||
if (detailGrp) detailGrp.style.display = (type === 'interface') ? '' : 'none';
|
||||
}
|
||||
|
||||
function setDuration(mins) {
|
||||
document.getElementById('sup-expires').value = mins || '';
|
||||
|
||||
document.querySelectorAll('#suppress-modal .pill').forEach(p => p.classList.remove('active'));
|
||||
event.currentTarget.classList.add('active');
|
||||
|
||||
const hint = document.getElementById('duration-hint');
|
||||
if (hint) {
|
||||
if (mins) {
|
||||
const h = Math.floor(mins / 60), m = mins % 60;
|
||||
hint.textContent = `Expires in ${h ? h + 'h ' : ''}${m ? m + 'm' : ''}.`;
|
||||
} else {
|
||||
hint.textContent = 'Suppression will persist until manually removed.';
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
async function submitSuppress(e) {
|
||||
e.preventDefault();
|
||||
const type = document.getElementById('sup-type').value;
|
||||
const name = document.getElementById('sup-name').value;
|
||||
const detail = document.getElementById('sup-detail').value;
|
||||
const reason = document.getElementById('sup-reason').value;
|
||||
const expires = document.getElementById('sup-expires').value;
|
||||
|
||||
if (!reason.trim()) { showToast('Reason is required', 'error'); return; }
|
||||
if (type !== 'all' && !name.trim()) { showToast('Target name is required', 'error'); return; }
|
||||
|
||||
try {
|
||||
const resp = await fetch('/api/suppressions', {
|
||||
method: 'POST',
|
||||
headers: { 'Content-Type': 'application/json' },
|
||||
body: JSON.stringify({
|
||||
target_type: type,
|
||||
target_name: name,
|
||||
target_detail: detail,
|
||||
reason: reason,
|
||||
expires_minutes: expires ? parseInt(expires, 10) : null,
|
||||
}),
|
||||
});
|
||||
const data = await resp.json();
|
||||
if (data.success) {
|
||||
closeSuppressModal();
|
||||
showToast('Suppression applied ✔', 'success');
|
||||
setTimeout(refreshAll, 500);
|
||||
} else {
|
||||
showToast(data.error || 'Failed to apply suppression', 'error');
|
||||
}
|
||||
} catch (err) {
|
||||
showToast('Network error', 'error');
|
||||
}
|
||||
}
|
||||
|
||||
// ── Close modal on backdrop click ─────────────────────────────────────
|
||||
document.addEventListener('click', e => {
|
||||
const modal = document.getElementById('suppress-modal');
|
||||
if (modal && e.target === modal) closeSuppressModal();
|
||||
});
|
||||
|
||||
// Add radio information
|
||||
Object.entries(interfaces.radios || {}).forEach(([radioName, radio]) => {
|
||||
html += `
|
||||
<div class="interface-item">
|
||||
<span class="label">${radioName}:</span>
|
||||
<span class="value">${radio.standard} - Ch${radio.channel} (${radio.width})</span>
|
||||
</div>
|
||||
`;
|
||||
});
|
||||
|
||||
html += '</div>';
|
||||
return html;
|
||||
// ── Utility ───────────────────────────────────────────────────────────
|
||||
function escHtml(str) {
|
||||
if (str === null || str === undefined) return '';
|
||||
return String(str)
|
||||
.replace(/&/g, '&amp;')
|
||||
.replace(/</g, '&lt;')
|
||||
.replace(/>/g, '&gt;')
|
||||
.replace(/"/g, '&quot;');
|
||||
}
|
||||
|
||||
// Initialize updates
|
||||
function initializeUpdates() {
|
||||
// Set update intervals
|
||||
setInterval(updateDeviceStatus, UPDATE_INTERVALS.deviceStatus);
|
||||
setInterval(updateDiagnostics, UPDATE_INTERVALS.diagnostics);
|
||||
|
||||
// Initial updates
|
||||
updateDeviceStatus();
|
||||
updateDiagnostics();
|
||||
}
|
||||
|
||||
// Start the application
|
||||
initializeUpdates();
|
||||
847
static/style.css
@@ -1,222 +1,747 @@
|
||||
/* ── Variables ──────────────────────────────────────────────────────── */
|
||||
:root {
|
||||
--primary-color: #006FFF;
|
||||
--secondary-color: #00439C;
|
||||
--background-color: #f8f9fa;
|
||||
--card-background: #ffffff;
|
||||
--text-color: #2c3e50;
|
||||
--border-radius: 12px;
|
||||
--blue: #006FFF;
|
||||
--blue-dark: #00439C;
|
||||
--blue-dim: rgba(0,111,255,.1);
|
||||
--green: #10B981;
|
||||
--red: #EF4444;
|
||||
--orange: #F59E0B;
|
||||
--yellow: #FBBF24;
|
||||
--grey: #6B7280;
|
||||
--grey-lt: #F3F4F6;
|
||||
--border: #E5E7EB;
|
||||
--text: #111827;
|
||||
--text-sub: #6B7280;
|
||||
--card-bg: #FFFFFF;
|
||||
--bg: #F8FAFC;
|
||||
--radius: 10px;
|
||||
--shadow: 0 1px 3px rgba(0,0,0,.08), 0 4px 12px rgba(0,0,0,.06);
|
||||
--font: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, sans-serif;
|
||||
--mono: 'SF Mono', 'Fira Code', Consolas, monospace;
|
||||
}
|
||||
|
||||
/* ── Reset ──────────────────────────────────────────────────────────── */
|
||||
*, *::before, *::after { box-sizing: border-box; margin: 0; padding: 0; }
|
||||
|
||||
body {
|
||||
font-family: 'Inter', -apple-system, sans-serif;
|
||||
background-color: var(--background-color);
|
||||
color: var(--text-color);
|
||||
margin: 0;
|
||||
padding: 0;
|
||||
font-family: var(--font);
|
||||
background: var(--bg);
|
||||
color: var(--text);
|
||||
font-size: 14px;
|
||||
line-height: 1.5;
|
||||
}
|
||||
|
||||
.container {
|
||||
max-width: 1400px;
|
||||
margin: 0 auto;
|
||||
padding: 20px;
|
||||
}
|
||||
a { color: var(--blue); text-decoration: none; }
|
||||
a:hover { text-decoration: underline; }
|
||||
|
||||
.header {
|
||||
background: linear-gradient(to right, var(--primary-color), var(--secondary-color));
|
||||
/* ── Navbar ─────────────────────────────────────────────────────────── */
|
||||
.navbar {
|
||||
background: linear-gradient(135deg, var(--blue-dark) 0%, var(--blue) 100%);
|
||||
color: white;
|
||||
padding: 20px;
|
||||
border-radius: var(--border-radius);
|
||||
margin-bottom: 30px;
|
||||
}
|
||||
|
||||
.metrics-container {
|
||||
display: grid;
|
||||
grid-template-columns: repeat(auto-fit, minmax(350px, 1fr));
|
||||
gap: 25px;
|
||||
margin-top: 20px;
|
||||
}
|
||||
|
||||
.metric-card {
|
||||
background: var(--card-background);
|
||||
padding: 25px;
|
||||
border-radius: var(--border-radius);
|
||||
box-shadow: 0 4px 6px rgba(0,0,0,0.07);
|
||||
transition: transform 0.2s ease;
|
||||
}
|
||||
|
||||
.metric-card:hover {
|
||||
transform: translateY(-5px);
|
||||
}
|
||||
|
||||
.device-status {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 10px;
|
||||
margin: 10px 0;
|
||||
gap: 24px;
|
||||
padding: 0 24px;
|
||||
height: 56px;
|
||||
box-shadow: 0 2px 8px rgba(0,0,0,.2);
|
||||
}
|
||||
|
||||
.status-indicator {
|
||||
width: 12px;
|
||||
height: 12px;
|
||||
border-radius: 50%;
|
||||
}
|
||||
|
||||
.status-up {
|
||||
background-color: #10B981;
|
||||
}
|
||||
|
||||
.status-down {
|
||||
background-color: #EF4444;
|
||||
}
|
||||
|
||||
.diagnostics-panel {
|
||||
margin-top: 15px;
|
||||
}
|
||||
|
||||
.diagnostic-item {
|
||||
padding: 10px;
|
||||
border-left: 4px solid var(--primary-color);
|
||||
margin: 10px 0;
|
||||
background: rgba(0,111,255,0.1);
|
||||
}
|
||||
|
||||
.fiber-diagnostic {
|
||||
border-color: #10B981;
|
||||
}
|
||||
|
||||
.copper-diagnostic {
|
||||
border-color: #F59E0B;
|
||||
}
|
||||
|
||||
.device-info {
|
||||
.nav-brand {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 8px;
|
||||
flex-shrink: 0;
|
||||
}
|
||||
|
||||
.nav-logo { font-size: 20px; }
|
||||
|
||||
.nav-title {
|
||||
font-weight: 700;
|
||||
font-size: 16px;
|
||||
letter-spacing: .05em;
|
||||
}
|
||||
|
||||
.nav-sub {
|
||||
font-size: 11px;
|
||||
opacity: .7;
|
||||
font-weight: 400;
|
||||
}
|
||||
|
||||
.nav-links {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 4px;
|
||||
flex: 1;
|
||||
}
|
||||
|
||||
.device-details {
|
||||
font-size: 0.8em;
|
||||
color: #666;
|
||||
.nav-link {
|
||||
color: rgba(255,255,255,.8);
|
||||
padding: 6px 14px;
|
||||
border-radius: 6px;
|
||||
font-size: 13px;
|
||||
transition: background .15s, color .15s;
|
||||
}
|
||||
|
||||
.diagnostic-details {
|
||||
display: grid;
|
||||
gap: 15px;
|
||||
padding: 10px;
|
||||
.nav-link:hover, .nav-link.active {
|
||||
background: rgba(255,255,255,.15);
|
||||
color: white;
|
||||
text-decoration: none;
|
||||
}
|
||||
|
||||
.status-group, .firmware-group, .interfaces-group {
|
||||
.nav-user {
|
||||
font-size: 12px;
|
||||
opacity: .8;
|
||||
}
|
||||
|
||||
/* ── Main layout ─────────────────────────────────────────────────────── */
|
||||
.main { max-width: 1400px; margin: 0 auto; padding: 24px 20px; }
|
||||
|
||||
.page-header { margin-bottom: 24px; }
|
||||
.page-title { font-size: 22px; font-weight: 700; }
|
||||
.page-sub { color: var(--text-sub); margin-top: 4px; }
|
||||
|
||||
/* ── Status bar ──────────────────────────────────────────────────────── */
|
||||
.status-bar {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
align-items: center;
|
||||
justify-content: space-between;
|
||||
gap: 16px;
|
||||
background: var(--card-bg);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: var(--radius);
|
||||
padding: 12px 20px;
|
||||
margin-bottom: 24px;
|
||||
box-shadow: var(--shadow);
|
||||
}
|
||||
|
||||
.status-chips { display: flex; gap: 8px; flex-wrap: wrap; }
|
||||
|
||||
.chip {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
gap: 6px;
|
||||
padding: 5px 12px;
|
||||
border-radius: 20px;
|
||||
font-size: 13px;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.chip-critical { background: rgba(239,68,68,.12); color: var(--red); border: 1px solid rgba(239,68,68,.3); }
|
||||
.chip-warning { background: rgba(245,158,11,.12); color: var(--orange); border: 1px solid rgba(245,158,11,.3); }
|
||||
.chip-ok { background: rgba(16,185,129,.12); color: var(--green); border: 1px solid rgba(16,185,129,.3); }
|
||||
|
||||
.status-meta {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 12px;
|
||||
white-space: nowrap;
|
||||
}
|
||||
|
||||
.last-check { font-size: 12px; color: var(--text-sub); }
|
||||
|
||||
.btn-refresh {
|
||||
background: var(--blue-dim);
|
||||
border: 1px solid rgba(0,111,255,.3);
|
||||
color: var(--blue);
|
||||
border-radius: 6px;
|
||||
padding: 4px 12px;
|
||||
font-size: 12px;
|
||||
cursor: pointer;
|
||||
transition: background .15s;
|
||||
}
|
||||
.btn-refresh:hover { background: rgba(0,111,255,.2); }
|
||||
|
||||
/* ── Sections ────────────────────────────────────────────────────────── */
|
||||
.section { margin-bottom: 32px; }
|
||||
|
||||
.section-title {
|
||||
font-size: 16px;
|
||||
font-weight: 700;
|
||||
margin-bottom: 14px;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 8px;
|
||||
}
|
||||
|
||||
.interface-item {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 10px;
|
||||
.section-badge {
|
||||
font-size: 11px;
|
||||
font-weight: 600;
|
||||
background: var(--red);
|
||||
color: white;
|
||||
padding: 2px 7px;
|
||||
border-radius: 10px;
|
||||
}
|
||||
|
||||
.label {
|
||||
font-weight: 500;
|
||||
color: #666;
|
||||
.section-badge:not(.badge-critical) {
|
||||
background: var(--grey);
|
||||
}
|
||||
|
||||
.value {
|
||||
font-family: monospace;
|
||||
/* ── Topology diagram ────────────────────────────────────────────────── */
|
||||
.topology {
|
||||
background: var(--card-bg);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: var(--radius);
|
||||
padding: 20px 16px 16px;
|
||||
margin-bottom: 20px;
|
||||
text-align: center;
|
||||
box-shadow: var(--shadow);
|
||||
overflow-x: auto;
|
||||
}
|
||||
.interface-header {
|
||||
|
||||
.topo-row {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
justify-content: center;
|
||||
gap: 16px;
|
||||
flex-wrap: wrap;
|
||||
}
|
||||
|
||||
.topo-row-internet { margin-bottom: 4px; }
|
||||
.topo-hosts-row { flex-wrap: wrap; gap: 12px; }
|
||||
|
||||
.topo-connectors {
|
||||
display: flex;
|
||||
justify-content: center;
|
||||
gap: 80px;
|
||||
height: 20px;
|
||||
margin: 2px 0;
|
||||
}
|
||||
|
||||
.topo-connectors.single { gap: 0; }
|
||||
.topo-connectors.wide { gap: 60px; }
|
||||
|
||||
.topo-line {
|
||||
width: 2px;
|
||||
height: 100%;
|
||||
background: var(--border);
|
||||
}
|
||||
|
||||
.topo-line-labeled {
|
||||
position: relative;
|
||||
}
|
||||
.topo-line-labeled::after {
|
||||
content: attr(data-link-label);
|
||||
position: absolute;
|
||||
left: 6px;
|
||||
top: 50%;
|
||||
transform: translateY(-50%);
|
||||
font-size: 10px;
|
||||
color: var(--text-sub);
|
||||
white-space: nowrap;
|
||||
}
|
||||
|
||||
.topo-node {
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
align-items: center;
|
||||
padding: 10px;
|
||||
cursor: pointer;
|
||||
background: rgba(0,111,255,0.05);
|
||||
gap: 4px;
|
||||
padding: 8px 14px;
|
||||
border-radius: 8px;
|
||||
margin-bottom: 5px;
|
||||
border: 1.5px solid var(--border);
|
||||
background: var(--grey-lt);
|
||||
min-width: 100px;
|
||||
font-size: 12px;
|
||||
position: relative;
|
||||
transition: border-color .2s;
|
||||
}
|
||||
|
||||
.interface-header:hover {
|
||||
background: rgba(0,111,255,0.1);
|
||||
.topo-internet {
|
||||
border-color: var(--blue);
|
||||
background: var(--blue-dim);
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.interface-list {
|
||||
max-height: 500px;
|
||||
overflow-y: auto;
|
||||
transition: max-height 0.3s ease-out;
|
||||
.topo-switch {
|
||||
border-color: var(--blue);
|
||||
background: var(--blue-dim);
|
||||
}
|
||||
|
||||
.interface-list.collapsed {
|
||||
max-height: 0;
|
||||
.topo-host { cursor: default; }
|
||||
|
||||
.topo-icon { font-size: 16px; }
|
||||
|
||||
.topo-label {
|
||||
font-weight: 500;
|
||||
font-size: 11px;
|
||||
text-align: center;
|
||||
}
|
||||
|
||||
.topo-badge {
|
||||
font-size: 10px;
|
||||
padding: 2px 6px;
|
||||
border-radius: 4px;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.topo-badge-up { background: rgba(16,185,129,.15); color: var(--green); }
|
||||
.topo-badge-down { background: rgba(239,68,68,.15); color: var(--red); }
|
||||
.topo-badge-degraded { background: rgba(245,158,11,.15); color: var(--orange); }
|
||||
|
||||
.topo-status-up { border-color: var(--green); }
|
||||
.topo-status-down { border-color: var(--red); }
|
||||
.topo-status-degraded { border-color: var(--orange); }
|
||||
|
||||
.topo-status-up { border-color: var(--green); }
|
||||
.topo-status-dot {
|
||||
width: 8px; height: 8px;
|
||||
border-radius: 50%;
|
||||
background: var(--grey);
|
||||
position: absolute;
|
||||
top: 6px; right: 6px;
|
||||
}
|
||||
|
||||
/* ── Host cards ──────────────────────────────────────────────────────── */
|
||||
.host-grid {
|
||||
display: grid;
|
||||
grid-template-columns: repeat(auto-fill, minmax(240px, 1fr));
|
||||
gap: 14px;
|
||||
}
|
||||
|
||||
.host-card {
|
||||
background: var(--card-bg);
|
||||
border: 1.5px solid var(--border);
|
||||
border-radius: var(--radius);
|
||||
padding: 14px;
|
||||
box-shadow: var(--shadow);
|
||||
transition: border-color .2s, box-shadow .2s;
|
||||
}
|
||||
|
||||
.host-card:hover { box-shadow: 0 4px 16px rgba(0,0,0,.1); }
|
||||
|
||||
.host-card-up { border-left: 4px solid var(--green); }
|
||||
.host-card-down { border-left: 4px solid var(--red); }
|
||||
.host-card-degraded { border-left: 4px solid var(--orange); }
|
||||
|
||||
.host-card-header { margin-bottom: 10px; }
|
||||
|
||||
.host-name-row {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 7px;
|
||||
margin-bottom: 4px;
|
||||
}
|
||||
|
||||
.host-name {
|
||||
font-weight: 700;
|
||||
font-size: 14px;
|
||||
}
|
||||
|
||||
.host-meta {
|
||||
display: flex;
|
||||
gap: 8px;
|
||||
align-items: center;
|
||||
}
|
||||
|
||||
.host-ip {
|
||||
font-family: var(--mono);
|
||||
font-size: 11px;
|
||||
color: var(--text-sub);
|
||||
}
|
||||
|
||||
.host-source {
|
||||
font-size: 10px;
|
||||
padding: 1px 6px;
|
||||
border-radius: 4px;
|
||||
font-weight: 600;
|
||||
background: var(--grey-lt);
|
||||
color: var(--text-sub);
|
||||
}
|
||||
|
||||
.source-prometheus { color: #E6522C; background: rgba(230,82,44,.1); }
|
||||
.source-ping { color: var(--blue); background: var(--blue-dim); }
|
||||
|
||||
.iface-list {
|
||||
border-top: 1px solid var(--border);
|
||||
padding-top: 8px;
|
||||
margin-bottom: 10px;
|
||||
}
|
||||
|
||||
.iface-row {
|
||||
display: flex;
|
||||
align-items: center;
|
||||
gap: 7px;
|
||||
padding: 3px 0;
|
||||
}
|
||||
|
||||
.iface-name {
|
||||
font-family: var(--mono);
|
||||
font-size: 12px;
|
||||
flex: 1;
|
||||
color: var(--text);
|
||||
}
|
||||
|
||||
.iface-state {
|
||||
font-size: 11px;
|
||||
font-weight: 600;
|
||||
}
|
||||
|
||||
.state-up { color: var(--green); }
|
||||
.state-down { color: var(--red); }
|
||||
|
||||
.host-ping-note {
|
||||
font-size: 11px;
|
||||
color: var(--text-sub);
|
||||
font-style: italic;
|
||||
margin-bottom: 10px;
|
||||
padding-top: 6px;
|
||||
border-top: 1px solid var(--border);
|
||||
}
|
||||
|
||||
.host-actions {
|
||||
border-top: 1px solid var(--border);
|
||||
padding-top: 8px;
|
||||
}
|
||||
|
||||
/* ── Status dots ─────────────────────────────────────────────────────── */
|
||||
.host-status-dot, .iface-dot, .dot-up, .dot-down, .dot-degraded, .dot-unknown {
|
||||
display: inline-block;
|
||||
width: 10px;
|
||||
height: 10px;
|
||||
border-radius: 50%;
|
||||
flex-shrink: 0;
|
||||
}
|
||||
|
||||
.dot-up, .host-status-dot.dot-up { background: var(--green); box-shadow: 0 0 0 2px rgba(16,185,129,.2); }
|
||||
.dot-down, .host-status-dot.dot-down { background: var(--red); box-shadow: 0 0 0 2px rgba(239,68,68,.2); animation: pulse-red 2s infinite; }
|
||||
.dot-degraded { background: var(--orange); box-shadow: 0 0 0 2px rgba(245,158,11,.2); }
|
||||
.dot-unknown { background: var(--grey); }
|
||||
|
||||
@keyframes pulse-red {
|
||||
0%,100% { box-shadow: 0 0 0 2px rgba(239,68,68,.2); }
|
||||
50% { box-shadow: 0 0 0 5px rgba(239,68,68,.4); }
|
||||
}
|
||||
|
||||
/* ── Badges ──────────────────────────────────────────────────────────── */
|
||||
.badge {
|
||||
display: inline-block;
|
||||
padding: 2px 8px;
|
||||
border-radius: 6px;
|
||||
font-size: 11px;
|
||||
font-weight: 700;
|
||||
text-transform: uppercase;
|
||||
letter-spacing: .04em;
|
||||
}
|
||||
|
||||
.badge-critical { background: rgba(239,68,68,.12); color: var(--red); }
|
||||
.badge-warning { background: rgba(245,158,11,.12); color: var(--orange); }
|
||||
.badge-info { background: rgba(0,111,255,.1); color: var(--blue); }
|
||||
.badge-ok { background: rgba(16,185,129,.12); color: var(--green); }
|
||||
.badge-neutral { background: var(--grey-lt); color: var(--grey); }
|
||||
.badge-suppressed { background: rgba(107,114,128,.12); color: var(--grey); font-size: 14px; padding: 0; }
|
||||
|
||||
/* ── Tables ──────────────────────────────────────────────────────────── */
|
||||
.table-wrap {
|
||||
background: var(--card-bg);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: var(--radius);
|
||||
box-shadow: var(--shadow);
|
||||
overflow: hidden;
|
||||
}
|
||||
|
||||
.interface-item {
|
||||
display: grid;
|
||||
grid-template-columns: 1fr 1fr auto;
|
||||
padding: 8px;
|
||||
border-bottom: 1px solid #eee;
|
||||
.data-table {
|
||||
width: 100%;
|
||||
border-collapse: collapse;
|
||||
}
|
||||
|
||||
.data-table th {
|
||||
background: var(--grey-lt);
|
||||
padding: 10px 14px;
|
||||
text-align: left;
|
||||
font-size: 11px;
|
||||
font-weight: 700;
|
||||
color: var(--text-sub);
|
||||
text-transform: uppercase;
|
||||
letter-spacing: .06em;
|
||||
border-bottom: 1px solid var(--border);
|
||||
white-space: nowrap;
|
||||
}
|
||||
|
||||
.data-table td {
|
||||
padding: 10px 14px;
|
||||
border-bottom: 1px solid var(--border);
|
||||
vertical-align: middle;
|
||||
}
|
||||
|
||||
.data-table tr:last-child td { border-bottom: none; }
|
||||
|
||||
.data-table tr:hover td { background: rgba(0,111,255,.03); }
|
||||
|
||||
.row-critical td { background: rgba(239,68,68,.04); }
|
||||
.row-critical td:first-child { border-left: 3px solid var(--red); }
|
||||
|
||||
.row-warning td { background: rgba(245,158,11,.04); }
|
||||
.row-warning td:first-child { border-left: 3px solid var(--orange); }
|
||||
|
||||
.row-resolved td { opacity: .6; }
|
||||
|
||||
.data-table-sm td, .data-table-sm th { padding: 7px 12px; font-size: 12px; }
|
||||
|
||||
.ts-cell { font-family: var(--mono); font-size: 11px; color: var(--text-sub); }
|
||||
.desc-cell { max-width: 300px; white-space: nowrap; overflow: hidden; text-overflow: ellipsis; }
|
||||
.ticket-link { font-family: var(--mono); font-weight: 600; }
|
||||
|
||||
.empty-state { padding: 32px; text-align: center; color: var(--text-sub); }
|
||||
.empty-row td { text-align: center; color: var(--text-sub); }
|
||||
|
||||
/* ── Buttons ─────────────────────────────────────────────────────────── */
|
||||
.btn {
|
||||
display: inline-flex;
|
||||
align-items: center;
|
||||
gap: 6px;
|
||||
padding: 8px 16px;
|
||||
border-radius: 6px;
|
||||
border: none;
|
||||
cursor: pointer;
|
||||
font-size: 13px;
|
||||
font-weight: 600;
|
||||
transition: opacity .15s, background .15s;
|
||||
}
|
||||
|
||||
.port-status {
|
||||
padding: 4px 8px;
|
||||
.btn:hover { opacity: .88; }
|
||||
.btn:active { opacity: .75; }
|
||||
|
||||
.btn-primary { background: var(--blue); color: white; }
|
||||
.btn-secondary { background: var(--grey-lt); color: var(--text); border: 1px solid var(--border); }
|
||||
.btn-danger { background: rgba(239,68,68,.1); color: var(--red); border: 1px solid rgba(239,68,68,.2); }
|
||||
.btn-lg { padding: 10px 20px; font-size: 14px; }
|
||||
|
||||
.btn-sm {
|
||||
padding: 3px 8px;
|
||||
font-size: 11px;
|
||||
border-radius: 5px;
|
||||
cursor: pointer;
|
||||
border: none;
|
||||
font-weight: 600;
|
||||
transition: opacity .15s;
|
||||
}
|
||||
|
||||
.btn-suppress {
|
||||
background: rgba(107,114,128,.1);
|
||||
color: var(--grey);
|
||||
border: 1px solid var(--border) !important;
|
||||
}
|
||||
|
||||
.btn-suppress:hover { background: rgba(107,114,128,.2); }
|
||||
|
||||
.btn-danger.btn-sm {
|
||||
background: rgba(239,68,68,.1);
|
||||
color: var(--red);
|
||||
border: 1px solid rgba(239,68,68,.2) !important;
|
||||
}
|
||||
|
||||
/* ── Modal ───────────────────────────────────────────────────────────── */
|
||||
.modal-overlay {
|
||||
position: fixed;
|
||||
inset: 0;
|
||||
background: rgba(0,0,0,.45);
|
||||
z-index: 100;
|
||||
display: flex;
|
||||
align-items: center;
|
||||
justify-content: center;
|
||||
backdrop-filter: blur(2px);
|
||||
}
|
||||
|
||||
.modal {
|
||||
background: var(--card-bg);
|
||||
border-radius: 12px;
|
||||
box-shadow: 0 20px 60px rgba(0,0,0,.2);
|
||||
width: 480px;
|
||||
max-width: 95vw;
|
||||
padding: 24px;
|
||||
}
|
||||
|
||||
.modal-header {
|
||||
display: flex;
|
||||
justify-content: space-between;
|
||||
align-items: center;
|
||||
margin-bottom: 20px;
|
||||
}
|
||||
|
||||
.modal-header h3 { font-size: 17px; font-weight: 700; }
|
||||
|
||||
.modal-close {
|
||||
background: none;
|
||||
border: none;
|
||||
cursor: pointer;
|
||||
font-size: 18px;
|
||||
color: var(--text-sub);
|
||||
line-height: 1;
|
||||
padding: 2px 6px;
|
||||
border-radius: 4px;
|
||||
font-size: 0.8em;
|
||||
font-weight: 500;
|
||||
transition: background .15s;
|
||||
}
|
||||
|
||||
.port-status.up {
|
||||
background-color: #10B981;
|
||||
.modal-close:hover { background: var(--grey-lt); }
|
||||
|
||||
.modal-actions {
|
||||
display: flex;
|
||||
gap: 10px;
|
||||
justify-content: flex-end;
|
||||
margin-top: 20px;
|
||||
padding-top: 16px;
|
||||
border-top: 1px solid var(--border);
|
||||
}
|
||||
|
||||
/* ── Forms ───────────────────────────────────────────────────────────── */
|
||||
.form-card {
|
||||
background: var(--card-bg);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: var(--radius);
|
||||
padding: 20px;
|
||||
box-shadow: var(--shadow);
|
||||
}
|
||||
|
||||
.form-row {
|
||||
display: flex;
|
||||
gap: 16px;
|
||||
flex-wrap: wrap;
|
||||
margin-bottom: 14px;
|
||||
}
|
||||
|
||||
.form-row-align { align-items: flex-end; }
|
||||
|
||||
.form-group { display: flex; flex-direction: column; gap: 5px; min-width: 180px; flex: 1; }
|
||||
.form-group-wide { flex: 3; }
|
||||
.form-group-submit { flex: 0 0 auto; min-width: unset; }
|
||||
|
||||
.form-group label {
|
||||
font-size: 12px;
|
||||
font-weight: 600;
|
||||
color: var(--text-sub);
|
||||
text-transform: uppercase;
|
||||
letter-spacing: .05em;
|
||||
}
|
||||
|
||||
.form-group input,
|
||||
.form-group select {
|
||||
padding: 8px 10px;
|
||||
border: 1px solid var(--border);
|
||||
border-radius: 6px;
|
||||
font-size: 13px;
|
||||
background: white;
|
||||
color: var(--text);
|
||||
transition: border-color .15s, box-shadow .15s;
|
||||
}
|
||||
|
||||
.form-group input:focus,
|
||||
.form-group select:focus {
|
||||
outline: none;
|
||||
border-color: var(--blue);
|
||||
box-shadow: 0 0 0 3px var(--blue-dim);
|
||||
}
|
||||
|
||||
.form-hint { font-size: 11px; color: var(--text-sub); margin-top: 2px; }
|
||||
.required { color: var(--red); }
|
||||
|
||||
/* ── Duration pills ──────────────────────────────────────────────────── */
|
||||
.duration-pills {
|
||||
display: flex;
|
||||
gap: 6px;
|
||||
flex-wrap: wrap;
|
||||
margin-bottom: 6px;
|
||||
}
|
||||
|
||||
.pill {
|
||||
padding: 5px 12px;
|
||||
border-radius: 20px;
|
||||
border: 1.5px solid var(--border);
|
||||
background: white;
|
||||
font-size: 12px;
|
||||
font-weight: 600;
|
||||
cursor: pointer;
|
||||
color: var(--text-sub);
|
||||
transition: all .15s;
|
||||
}
|
||||
|
||||
.pill:hover { border-color: var(--blue); color: var(--blue); }
|
||||
|
||||
.pill.active,
|
||||
.pill-manual.active {
|
||||
background: var(--blue);
|
||||
border-color: var(--blue);
|
||||
color: white;
|
||||
}
|
||||
|
||||
.port-status.down {
|
||||
background-color: #EF4444;
|
||||
color: white;
|
||||
}
|
||||
|
||||
.expand-icon {
|
||||
transition: transform 0.3s ease;
|
||||
}
|
||||
|
||||
.collapsed + .expand-icon {
|
||||
transform: rotate(-90deg);
|
||||
}
|
||||
|
||||
.port-speed {
|
||||
font-family: monospace;
|
||||
color: var(--secondary-color);
|
||||
}
|
||||
.metrics-list {
|
||||
/* ── Targets grid (suppressions page) ───────────────────────────────── */
|
||||
.targets-grid {
|
||||
display: grid;
|
||||
grid-template-columns: repeat(3, 1fr);
|
||||
gap: 15px;
|
||||
margin-top: 10px;
|
||||
grid-template-columns: repeat(auto-fill, minmax(200px, 1fr));
|
||||
gap: 12px;
|
||||
}
|
||||
|
||||
.metric-item {
|
||||
background: rgba(0,111,255,0.1);
|
||||
padding: 10px;
|
||||
.target-card {
|
||||
background: var(--card-bg);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: 8px;
|
||||
text-align: center;
|
||||
}
|
||||
.online {
|
||||
color: #10B981;
|
||||
padding: 12px;
|
||||
}
|
||||
|
||||
.offline {
|
||||
color: #EF4444;
|
||||
.target-name {
|
||||
font-weight: 700;
|
||||
font-size: 14px;
|
||||
margin-bottom: 4px;
|
||||
}
|
||||
|
||||
.interface-grid {
|
||||
display: grid;
|
||||
grid-template-columns: repeat(auto-fit, minmax(250px, 1fr));
|
||||
gap: 15px;
|
||||
.target-type {
|
||||
font-size: 11px;
|
||||
color: var(--text-sub);
|
||||
margin-bottom: 8px;
|
||||
}
|
||||
|
||||
.metric-value {
|
||||
font-family: monospace;
|
||||
font-size: 1.2em;
|
||||
color: var(--primary-color);
|
||||
.target-ifaces {
|
||||
display: flex;
|
||||
flex-wrap: wrap;
|
||||
gap: 4px;
|
||||
}
|
||||
|
||||
.iface-chip {
|
||||
font-family: var(--mono);
|
||||
font-size: 10px;
|
||||
background: var(--grey-lt);
|
||||
border-radius: 4px;
|
||||
padding: 1px 6px;
|
||||
color: var(--text-sub);
|
||||
}
|
||||
|
||||
/* ── Card (generic) ──────────────────────────────────────────────────── */
|
||||
.card {
|
||||
background: var(--card-bg);
|
||||
border: 1px solid var(--border);
|
||||
border-radius: var(--radius);
|
||||
padding: 20px;
|
||||
box-shadow: var(--shadow);
|
||||
}
|
||||
|
||||
/* ── Toast notifications ─────────────────────────────────────────────── */
|
||||
.toast-container {
|
||||
position: fixed;
|
||||
bottom: 24px;
|
||||
right: 24px;
|
||||
z-index: 200;
|
||||
display: flex;
|
||||
flex-direction: column;
|
||||
gap: 10px;
|
||||
}
|
||||
|
||||
.toast {
|
||||
padding: 12px 20px;
|
||||
border-radius: 8px;
|
||||
font-size: 13px;
|
||||
font-weight: 600;
|
||||
box-shadow: 0 4px 16px rgba(0,0,0,.15);
|
||||
animation: slide-in .2s ease;
|
||||
}
|
||||
|
||||
.toast-success { background: #065f46; color: white; }
|
||||
.toast-error { background: #7f1d1d; color: white; }
|
||||
|
||||
@keyframes slide-in {
|
||||
from { transform: translateX(120%); opacity: 0; }
|
||||
to { transform: translateX(0); opacity: 1; }
|
||||
}
|
||||
|
||||
/* ── Responsive ──────────────────────────────────────────────────────── */
|
||||
@media (max-width: 768px) {
|
||||
.host-grid { grid-template-columns: 1fr; }
|
||||
.topology { display: none; }
|
||||
.form-row { flex-direction: column; }
|
||||
.status-bar { flex-direction: column; align-items: flex-start; }
|
||||
}
|
||||
|
||||
36
templates/base.html
Normal file
@@ -0,0 +1,36 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<title>{% block title %}GANDALF{% endblock %}</title>
<link rel="stylesheet" href="{{ url_for('static', filename='style.css') }}">
</head>
<body>
<nav class="navbar">
<div class="nav-brand">
<span class="nav-logo">⚡</span>
<span class="nav-title">GANDALF</span>
<span class="nav-sub">Network Monitor</span>
</div>
<div class="nav-links">
<a href="{{ url_for('index') }}" class="nav-link {% if request.endpoint == 'index' %}active{% endif %}">
Dashboard
</a>
<a href="{{ url_for('suppressions_page') }}" class="nav-link {% if request.endpoint == 'suppressions_page' %}active{% endif %}">
Suppressions
</a>
</div>
<div class="nav-user">
<span class="nav-user-name">{{ user.name or user.username }}</span>
</div>
</nav>

<main class="main">
{% block content %}{% endblock %}
</main>

<script src="{{ url_for('static', filename='app.js') }}"></script>
{% block scripts %}{% endblock %}
</body>
</html>
@@ -1,69 +1,289 @@
<!DOCTYPE html>
<html>
<head>
<title>GANDALF - Network Monitor</title>
<link rel="stylesheet" href="{{ url_for('static', filename='style.css') }}">
<link href="https://fonts.googleapis.com/css2?family=Inter:wght@400;500;600&display=swap" rel="stylesheet">
</head>
<body>
<div class="container">
<div class="header">
<h1>GANDALF (Global Advanced Network Detection And Link Facilitator)</h1>
<p>Ubiquiti Network Management Dashboard</p>
</div>
{% extends "base.html" %}
{% block title %}Dashboard – GANDALF{% endblock %}

<div class="metrics-container">
<div class="metric-card">
<h2>Network Overview</h2>
<div id="network-health">
{%- for device in devices %}
<div class="device-status" data-device-name="{{ device.name }}">
<span class="status-indicator"></span>
<div class="device-info">
<span class="device-name">{{ device.name }}</span>
<span class="device-details">{{ device.ip }}</span>
<span class="device-type">{{ device.type }} ({{ device.connection_type }})</span>
{% if device.critical %}
<span class="critical-badge">Critical</span>
{% block content %}

<!-- ── Status bar ─────────────────────────────────────────────────────── -->
<div class="status-bar">
<div class="status-chips">
{% if summary.critical %}
<span class="chip chip-critical">⬤ {{ summary.critical }} Critical</span>
{% endif %}
{% if summary.warning %}
<span class="chip chip-warning">⬤ {{ summary.warning }} Warning</span>
{% endif %}
{% if not summary.critical and not summary.warning %}
<span class="chip chip-ok">✔ All systems nominal</span>
{% endif %}
</div>
</div>
{%- endfor %}
<div class="status-meta">
<span class="last-check" id="last-check">Last check: {{ last_check }}</span>
<button class="btn-refresh" onclick="refreshAll()">↻ Refresh</button>
</div>
</div>

<div class="metric-card expandable">
<h2>Interface Status</h2>
<div id="interface-details">
{%- for device in devices %}
<div class="interface-group" data-device-name="{{ device.name }}">
<div class="interface-header" onclick="toggleInterfaces(this)">
<h3>{{ device.name }}</h3>
<span class="expand-icon">▼</span>
<!-- ── Network topology + host grid ──────────────────────────────────── -->
<section class="section">
<h2 class="section-title">Network Hosts</h2>

<!-- Simple topology diagram -->
<div class="topology" id="topology-diagram">
<div class="topo-row topo-row-internet">
<div class="topo-node topo-internet">🌐 Internet</div>
</div>
<div class="interface-list collapsed"></div>
<div class="topo-connectors single">
<div class="topo-line"></div>
</div>
{%- endfor %}
<div class="topo-row">
<div class="topo-node topo-unifi" id="topo-gateway">
<span class="topo-icon">⬡</span>
<span class="topo-label">UDM-Pro</span>
<span class="topo-status-dot" data-topo-target="gateway"></span>
</div>
</div>
<div class="topo-connectors single">
<div class="topo-line topo-line-labeled" data-link-label="10G DAC"></div>
</div>
<div class="topo-row">
<div class="topo-node topo-switch" id="topo-switch-agg">
<span class="topo-icon">⬡</span>
<span class="topo-label">Agg Switch</span>
<span class="topo-status-dot" data-topo-target="switch-agg"></span>
</div>
</div>
<div class="topo-connectors single">
<div class="topo-line topo-line-labeled" data-link-label="10G DAC"></div>
</div>
<div class="topo-row">
<div class="topo-node topo-switch" id="topo-switch-poe">
<span class="topo-icon">⬡</span>
<span class="topo-label">PoE Switch</span>
<span class="topo-status-dot" data-topo-target="switch-poe"></span>
</div>
</div>
<div class="topo-connectors wide">
{% for name in snapshot.hosts %}
<div class="topo-line"></div>
{% endfor %}
</div>
<div class="topo-row topo-hosts-row">
{% for name, host in snapshot.hosts.items() %}
<div class="topo-node topo-host topo-status-{{ host.status }}" data-host="{{ name }}">
<span class="topo-icon">▣</span>
<span class="topo-label">{{ name }}</span>
<span class="topo-badge topo-badge-{{ host.status }}">{{ host.status }}</span>
</div>
{% endfor %}
</div>
</div>

<div class="metric-card">
<h2>System Health</h2>
<div id="system-metrics">
{%- for device in devices %}
<div class="health-metrics" data-device-name="{{ device.name }}">
<h3>{{ device.name }}</h3>
<div class="metrics-list">
<div class="metric-item cpu"></div>
<div class="metric-item memory"></div>
<div class="metric-item temperature"></div>
<!-- Host cards -->
<div class="host-grid" id="host-grid">
{% for name, host in snapshot.hosts.items() %}
{% set suppressed = suppressions | selectattr('target_name', 'equalto', name) | list %}
<div class="host-card host-card-{{ host.status }}" data-host="{{ name }}">
<div class="host-card-header">
<div class="host-name-row">
<span class="host-status-dot dot-{{ host.status }}"></span>
<span class="host-name">{{ name }}</span>
{% if suppressed %}
<span class="badge badge-suppressed" title="Suppressed">🔕</span>
{% endif %}
</div>
<div class="host-meta">
<span class="host-ip">{{ host.ip }}</span>
<span class="host-source source-{{ host.source }}">{{ host.source }}</span>
</div>
</div>
{%- endfor %}

{% if host.interfaces %}
<div class="iface-list">
{% for iface, state in host.interfaces.items() | sort %}
<div class="iface-row">
<span class="iface-dot dot-{{ state }}"></span>
<span class="iface-name">{{ iface }}</span>
<span class="iface-state state-{{ state }}">{{ state }}</span>
</div>
{% endfor %}
</div>
{% else %}
<div class="host-ping-note">Monitored via ping only</div>
{% endif %}

<div class="host-actions">
<button class="btn-sm btn-suppress"
onclick="openSuppressModal('host', '{{ name }}', '')"
title="Suppress alerts for this host">
🔕 Suppress Host
</button>
</div>
</div>
{% else %}
<p class="empty-state">No host data yet – monitor is initializing.</p>
{% endfor %}
</div>
</section>

<!-- ── UniFi devices ──────────────────────────────────────────────────── -->
{% if snapshot.unifi %}
<section class="section">
<h2 class="section-title">UniFi Devices</h2>
<div class="table-wrap">
<table class="data-table" id="unifi-table">
<thead>
<tr>
<th>Status</th>
<th>Name</th>
<th>Type</th>
<th>Model</th>
<th>IP</th>
<th>Actions</th>
</tr>
</thead>
<tbody>
{% for d in snapshot.unifi %}
<tr class="{% if not d.connected %}row-critical{% endif %}">
<td>
<span class="dot-{{ 'up' if d.connected else 'down' }}"></span>
{{ 'Online' if d.connected else 'Offline' }}
</td>
<td><strong>{{ d.name }}</strong></td>
<td>{{ d.type }}</td>
<td>{{ d.model }}</td>
<td>{{ d.ip }}</td>
<td>
{% if not d.connected %}
<button class="btn-sm btn-suppress"
onclick="openSuppressModal('unifi_device', '{{ d.name }}', '')">
🔕 Suppress
</button>
{% endif %}
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
</section>
{% endif %}

<!-- ── Active alerts ─────────────────────────────────────────────────── -->
<section class="section">
<h2 class="section-title">
Active Alerts
{% if summary.critical or summary.warning %}
<span class="section-badge badge-critical">{{ (summary.critical or 0) + (summary.warning or 0) }} open</span>
{% endif %}
</h2>
<div class="table-wrap" id="events-table-wrap">
{% if events %}
<table class="data-table" id="events-table">
<thead>
<tr>
<th>Severity</th>
<th>Type</th>
<th>Target</th>
<th>Detail</th>
<th>Description</th>
<th>First Seen</th>
<th>Failures</th>
<th>Ticket</th>
<th>Actions</th>
</tr>
</thead>
<tbody>
{% for e in events %}
{% if e.severity != 'info' %}
<tr class="row-{{ e.severity }}">
<td><span class="badge badge-{{ e.severity }}">{{ e.severity }}</span></td>
<td>{{ e.event_type | replace('_', ' ') }}</td>
<td><strong>{{ e.target_name }}</strong></td>
<td>{{ e.target_detail or '–' }}</td>
<td class="desc-cell" title="{{ e.description }}">{{ e.description | truncate(60) }}</td>
<td class="ts-cell">{{ e.first_seen }}</td>
<td>{{ e.consecutive_failures }}</td>
<td>
{% if e.ticket_id %}
<a href="http://t.lotusguild.org/ticket/{{ e.ticket_id }}" target="_blank"
class="ticket-link">#{{ e.ticket_id }}</a>
{% else %}–{% endif %}
</td>
<td>
<button class="btn-sm btn-suppress"
onclick="openSuppressModal('{{ 'unifi_device' if e.event_type == 'unifi_device_offline' else 'interface' if e.event_type == 'interface_down' else 'host' }}', '{{ e.target_name }}', '{{ e.target_detail or '' }}')"
title="Suppress this alert">
🔕
</button>
</td>
</tr>
{% endif %}
{% else %}
<tr class="empty-row">
<td colspan="9" class="empty-state">No active alerts ✔</td>
</tr>
{% endfor %}
</tbody>
</table>
{% else %}
<p class="empty-state">No active alerts ✔</p>
{% endif %}
</div>
</section>

<!-- ── Quick-suppress modal ───────────────────────────────────────────── -->
<div id="suppress-modal" class="modal-overlay" style="display:none">
<div class="modal">
<div class="modal-header">
<h3>Suppress Alert</h3>
<button class="modal-close" onclick="closeSuppressModal()">✕</button>
</div>
<form id="suppress-form" onsubmit="submitSuppress(event)">
<div class="form-group">
<label>Target Type</label>
<select id="sup-type" name="target_type" onchange="updateSuppressForm()">
<option value="host">Host (all interfaces)</option>
<option value="interface">Specific Interface</option>
<option value="unifi_device">UniFi Device</option>
<option value="all">Everything (global maintenance)</option>
</select>
</div>
<div class="form-group" id="sup-name-group">
<label>Target Name</label>
<input type="text" id="sup-name" name="target_name" placeholder="e.g. large1">
</div>
<div class="form-group" id="sup-detail-group">
<label>Interface Name <span class="form-hint">(for interface type)</span></label>
<input type="text" id="sup-detail" name="target_detail" placeholder="e.g. enp35s0">
</div>
<div class="form-group">
<label>Reason <span class="required">*</span></label>
<input type="text" id="sup-reason" name="reason" placeholder="e.g. Planned switch reboot" required>
</div>
<div class="form-group">
<label>Duration</label>
<div class="duration-pills">
<button type="button" class="pill" onclick="setDuration(30)">30 min</button>
<button type="button" class="pill" onclick="setDuration(60)">1 hr</button>
<button type="button" class="pill" onclick="setDuration(240)">4 hr</button>
<button type="button" class="pill" onclick="setDuration(480)">8 hr</button>
<button type="button" class="pill pill-manual active" onclick="setDuration(null)">Manual</button>
</div>
<input type="hidden" id="sup-expires" name="expires_minutes" value="">
<div class="form-hint" id="duration-hint">Suppression will persist until manually removed.</div>
</div>
<div class="modal-actions">
<button type="button" class="btn btn-secondary" onclick="closeSuppressModal()">Cancel</button>
<button type="submit" class="btn btn-primary">Apply Suppression</button>
</div>
</form>
</div>
</div>
<script src="{{ url_for('static', filename='app.js') }}"></script>
</body>
</html>

{% endblock %}

{% block scripts %}
<script>
// Auto-refresh every 30 seconds
setInterval(refreshAll, 30000);
</script>
{% endblock %}
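The Suppress button in the Active Alerts table chooses its target type with an inline Jinja conditional on `e.event_type`. As a rough sketch only, the same mapping can be written as a plain JavaScript helper that is easier to test outside the template. The name `suppressTargetFor` is hypothetical and is not part of this commit; the event type strings are the ones that appear in the template.

```javascript
// Hypothetical helper (not part of this commit): mirrors the inline Jinja
// conditional used on the Suppress button of each alert row.
function suppressTargetFor(eventType) {
  if (eventType === 'unifi_device_offline') return 'unifi_device';
  if (eventType === 'interface_down') return 'interface';
  // Every other event type falls back to a host-level suppression.
  return 'host';
}

// Example: an interface alert opens the modal preset to "interface".
console.log(suppressTargetFor('interface_down'));
```

Keeping the mapping in one function would let the template pass only the raw event type, at the cost of moving the decision from render time to click time.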
252
templates/suppressions.html
Normal file
@@ -0,0 +1,252 @@
{% extends "base.html" %}
{% block title %}Suppressions – GANDALF{% endblock %}

{% block content %}

<div class="page-header">
<h1 class="page-title">Alert Suppressions</h1>
<p class="page-sub">Manage maintenance windows and alert suppression rules.</p>
</div>

<!-- ── Create suppression ─────────────────────────────────────────────── -->
<section class="section">
<h2 class="section-title">Create Suppression</h2>
<div class="card form-card">
<form id="create-suppression-form" onsubmit="createSuppression(event)">
<div class="form-row">
<div class="form-group">
<label for="s-type">Target Type <span class="required">*</span></label>
<select id="s-type" name="target_type" onchange="onTypeChange()">
<option value="host">Host (all interfaces)</option>
<option value="interface">Specific Interface</option>
<option value="unifi_device">UniFi Device</option>
<option value="all">Global (suppress everything)</option>
</select>
</div>

<div class="form-group" id="name-group">
<label for="s-name">Target Name <span class="required">*</span></label>
<input type="text" id="s-name" name="target_name"
placeholder="hostname or device name" autocomplete="off">
</div>

<div class="form-group" id="detail-group" style="display:none">
<label for="s-detail">Interface Name</label>
<input type="text" id="s-detail" name="target_detail"
placeholder="e.g. enp35s0 or bond0" autocomplete="off">
</div>
</div>

<div class="form-row">
<div class="form-group form-group-wide">
<label for="s-reason">Reason <span class="required">*</span></label>
<input type="text" id="s-reason" name="reason"
placeholder="e.g. Planned switch maintenance, replacing SFP on large1/enp43s0"
required>
</div>
</div>

<div class="form-row form-row-align">
<div class="form-group">
<label>Duration</label>
<div class="duration-pills">
<button type="button" class="pill" onclick="setDur(30)">30 min</button>
<button type="button" class="pill" onclick="setDur(60)">1 hr</button>
<button type="button" class="pill" onclick="setDur(240)">4 hr</button>
<button type="button" class="pill" onclick="setDur(480)">8 hr</button>
<button type="button" class="pill pill-manual active" onclick="setDur(null)">Manual ∞</button>
</div>
<input type="hidden" id="s-expires" name="expires_minutes" value="">
<div class="form-hint" id="s-dur-hint">
This suppression will persist until manually removed.
</div>
</div>
<div class="form-group form-group-submit">
<button type="submit" class="btn btn-primary btn-lg">
🔕 Apply Suppression
</button>
</div>
</div>
</form>
</div>
</section>

<!-- ── Active suppressions ────────────────────────────────────────────── -->
<section class="section">
<h2 class="section-title">
Active Suppressions
<span class="section-badge">{{ active | length }}</span>
</h2>
{% if active %}
<div class="table-wrap">
<table class="data-table" id="active-sup-table">
<thead>
<tr>
<th>Type</th>
<th>Target</th>
<th>Detail</th>
<th>Reason</th>
<th>By</th>
<th>Created</th>
<th>Expires</th>
<th>Actions</th>
</tr>
</thead>
<tbody>
{% for s in active %}
<tr id="sup-row-{{ s.id }}">
<td><span class="badge badge-info">{{ s.target_type }}</span></td>
<td>{{ s.target_name or '<em>all</em>' | safe }}</td>
<td>{{ s.target_detail or '–' }}</td>
<td>{{ s.reason }}</td>
<td>{{ s.suppressed_by }}</td>
<td class="ts-cell">{{ s.created_at }}</td>
<td class="ts-cell">
{% if s.expires_at %}{{ s.expires_at }}{% else %}<em>manual</em>{% endif %}
</td>
<td>
<button class="btn-sm btn-danger"
onclick="removeSuppression({{ s.id }})">
Remove
</button>
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
{% else %}
<p class="empty-state">No active suppressions.</p>
{% endif %}
</section>

<!-- ── Suppression history ────────────────────────────────────────────── -->
<section class="section">
<h2 class="section-title">History <span class="section-badge">{{ history | length }}</span></h2>
{% if history %}
<div class="table-wrap">
<table class="data-table data-table-sm">
<thead>
<tr>
<th>Type</th>
<th>Target</th>
<th>Detail</th>
<th>Reason</th>
<th>By</th>
<th>Created</th>
<th>Expires</th>
<th>Active</th>
</tr>
</thead>
<tbody>
{% for s in history %}
<tr class="{% if not s.active %}row-resolved{% endif %}">
<td>{{ s.target_type }}</td>
<td>{{ s.target_name or 'all' }}</td>
<td>{{ s.target_detail or '–' }}</td>
<td>{{ s.reason }}</td>
<td>{{ s.suppressed_by }}</td>
<td class="ts-cell">{{ s.created_at }}</td>
<td class="ts-cell">
{% if s.expires_at %}{{ s.expires_at }}{% else %}<em>manual</em>{% endif %}
</td>
<td>
{% if s.active %}
<span class="badge badge-ok">Yes</span>
{% else %}
<span class="badge badge-neutral">No</span>
{% endif %}
</td>
</tr>
{% endfor %}
</tbody>
</table>
</div>
{% else %}
<p class="empty-state">No suppression history yet.</p>
{% endif %}
</section>

<!-- ── Available targets reference ───────────────────────────────────── -->
<section class="section">
<h2 class="section-title">Available Targets</h2>
<div class="targets-grid">
{% for name, host in snapshot.hosts.items() %}
<div class="target-card">
<div class="target-name">{{ name }}</div>
<div class="target-type">Proxmox Host</div>
{% if host.interfaces %}
<div class="target-ifaces">
{% for iface in host.interfaces.keys() | sort %}
<code class="iface-chip">{{ iface }}</code>
{% endfor %}
</div>
{% else %}
<div class="target-type">ping-only</div>
{% endif %}
</div>
{% endfor %}
</div>
</section>

{% endblock %}

{% block scripts %}
<script>
// Show or hide the name and interface fields to match the selected target type.
function onTypeChange() {
  const t = document.getElementById('s-type').value;
  const nameGrp = document.getElementById('name-group');
  const detailGrp = document.getElementById('detail-group');
  nameGrp.style.display = (t === 'all') ? 'none' : '';
  detailGrp.style.display = (t === 'interface') ? '' : 'none';
  document.getElementById('s-name').required = (t !== 'all');
}

// Store the chosen duration (minutes, or null for manual) and update the hint.
function setDur(mins) {
  document.getElementById('s-expires').value = mins || '';
  document.querySelectorAll('.duration-pills .pill').forEach(p => p.classList.remove('active'));
  // Called from inline onclick handlers, so the clicked pill is window.event.target.
  window.event?.target?.classList.add('active');
  const hint = document.getElementById('s-dur-hint');
  if (mins) {
    const h = Math.floor(mins / 60), m = mins % 60;
    // Join only the non-zero parts so 60 minutes reads "1h." rather than "1h .".
    const parts = [h ? `${h}h` : '', m ? `${m}m` : ''].filter(Boolean);
    hint.textContent = `Suppression expires in ${parts.join(' ')}.`;
  } else {
    hint.textContent = 'This suppression will persist until manually removed.';
  }
}

// POST the form as JSON, then reload so the new suppression shows in the tables.
async function createSuppression(e) {
  e.preventDefault();
  const form = e.target;
  const payload = {
    target_type: form.target_type.value,
    target_name: form.target_name ? form.target_name.value : '',
    target_detail: document.getElementById('s-detail').value,
    reason: form.reason.value,
    expires_minutes: form.expires_minutes.value ? parseInt(form.expires_minutes.value, 10) : null,
  };
  const resp = await fetch('/api/suppressions', {
    method: 'POST',
    headers: {'Content-Type': 'application/json'},
    body: JSON.stringify(payload),
  });
  const data = await resp.json();
  if (data.success) {
    showToast('Suppression applied', 'success');
    setTimeout(() => location.reload(), 800);
  } else {
    showToast(data.error || 'Error applying suppression', 'error');
  }
}

// Delete a suppression after confirmation and drop its row from the active table.
async function removeSuppression(id) {
  if (!confirm('Remove this suppression?')) return;
  const resp = await fetch(`/api/suppressions/${id}`, { method: 'DELETE' });
  const data = await resp.json();
  if (data.success) {
    document.getElementById(`sup-row-${id}`)?.remove();
    showToast('Suppression removed', 'success');
  } else {
    // Surface failures instead of leaving the row in place with no feedback.
    showToast(data.error || 'Error removing suppression', 'error');
  }
}
</script>
{% endblock %}
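The inline script above mixes DOM access with two pieces of pure logic: turning a minute count into hint text, and shaping the JSON body posted to `/api/suppressions`. A minimal sketch of that logic as standalone functions follows, which may make it easier to test without a browser. The function names `formatDurationHint` and `buildSuppressionPayload` are hypothetical and not part of this commit. Clearing fields that do not apply to the selected target type is an assumption of this sketch, not behaviour of the committed code.

```javascript
// Hypothetical helpers, not part of this commit.

// Format a duration in minutes the way the hint under the duration pills reads.
function formatDurationHint(mins) {
  if (!mins) return 'This suppression will persist until manually removed.';
  const h = Math.floor(mins / 60);
  const m = mins % 60;
  const parts = [];
  if (h) parts.push(`${h}h`);
  if (m) parts.push(`${m}m`);
  return `Suppression expires in ${parts.join(' ')}.`;
}

// Build the JSON body in the same shape createSuppression() posts.
// Assumption: fields that do not apply to the chosen type are sent empty.
function buildSuppressionPayload(fields) {
  const type = fields.targetType;
  const minutes = Number.parseInt(fields.expiresMinutes, 10);
  return {
    target_type: type,
    target_name: type === 'all' ? '' : (fields.targetName || '').trim(),
    target_detail: type === 'interface' ? (fields.targetDetail || '').trim() : '',
    reason: (fields.reason || '').trim(),
    // An empty hidden field means "manual", which the form sends as null.
    expires_minutes: Number.isNaN(minutes) ? null : minutes,
  };
}

console.log(formatDurationHint(240));
console.log(JSON.stringify(buildSuppressionPayload({
  targetType: 'interface',
  targetName: 'large1',
  targetDetail: 'enp35s0',
  reason: 'Replacing SFP',
  expiresMinutes: '60',
})));
```

With helpers like these, `setDur` and `createSuppression` would only read form values and write results back to the DOM.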