hwmonDaemon/grafana-dashboard.json at 6d959eff0275e4ca596ca416de1e7b76f3f53de8


Jared Vititoe 0f8918fb8b Add Ceph cluster monitoring and Prometheus metrics export

- Add comprehensive Ceph cluster health monitoring
  - Check cluster health status (HEALTH_OK/WARN/ERR)
  - Monitor cluster usage with configurable thresholds
  - Track OSD status (up/down) per node
  - Separate cluster-wide vs node-specific issues
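
The checks listed above could be sketched roughly as follows. This is an illustration only, not the daemon's actual code: the function names, the 70%/85% thresholds, and the shape of `osd_states` (hostname mapped to OSD name mapped to "up"/"down") are all assumptions made for the example.

```python
from typing import Dict, List, Tuple


def classify_usage(percent_used: float, warn: float = 70.0, crit: float = 85.0) -> str:
    """Map cluster usage to a level; the default thresholds are hypothetical."""
    if percent_used >= crit:
        return "critical"
    if percent_used >= warn:
        return "warning"
    return "ok"


def split_issues(
    health_status: str,
    percent_used: float,
    osd_states: Dict[str, Dict[str, str]],
    hostname: str,
) -> Tuple[List[str], List[str]]:
    """Return (cluster_wide, node_specific) issue lists for one node."""
    cluster_wide: List[str] = []
    node_specific: List[str] = []

    # Overall health and usage describe the whole cluster, not this node.
    if health_status != "HEALTH_OK":
        cluster_wide.append(f"Ceph cluster health is {health_status}")
    level = classify_usage(percent_used)
    if level != "ok":
        cluster_wide.append(f"Ceph cluster usage {level}: {percent_used:.1f}% used")

    # Only OSDs hosted on this node are reported as node-specific.
    for osd, state in sorted(osd_states.get(hostname, {}).items()):
        if state != "up":
            node_specific.append(f"{osd} is {state} on {hostname}")

    return cluster_wide, node_specific
```

Keeping the two lists separate at this stage lets later steps treat them differently, for example when deciding whether a hostname belongs in the ticket.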

- Cluster-wide ticket deduplication
  - Add [cluster-wide] scope tag for Ceph issues
  - Cluster-wide issues deduplicate across all nodes
  - Node-specific issues (OSD down) include hostname
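
One way to picture the deduplication rule above is as a key derived from the ticket title. The sketch below is a guess at the idea, not the repository's implementation: `ticket_title`, `dedup_key`, and the `[hostname]` tag format for node-specific issues are hypothetical; only the `[cluster-wide]` tag comes from this commit message.

```python
import hashlib

CLUSTER_WIDE_TAG = "[cluster-wide]"


def ticket_title(issue: str, hostname: str, cluster_wide: bool) -> str:
    """Prefix the issue with a scope tag (the node tag format is assumed)."""
    scope = CLUSTER_WIDE_TAG if cluster_wide else f"[{hostname}]"
    return f"{scope} {issue}"


def dedup_key(issue: str, hostname: str, cluster_wide: bool) -> str:
    """Hash the title so identical tickets collapse to one key."""
    title = ticket_title(issue, hostname, cluster_wide)
    return hashlib.sha256(title.encode("utf-8")).hexdigest()


# A cluster-wide issue yields the same key no matter which node reports it.
same = dedup_key("Ceph HEALTH_WARN", "node1", True) == dedup_key("Ceph HEALTH_WARN", "node2", True)

# A node-specific issue keeps the hostname, so each node gets its own key.
distinct = dedup_key("osd.1 is down", "node1", False) != dedup_key("osd.1 is down", "node2", False)
```

Because the hostname is left out of cluster-wide titles, every node running the daemon would compute the same key for the same cluster problem.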

- Add Prometheus metrics export
  - export_prometheus_metrics() method
  - write_prometheus_metrics() for textfile collector
  - --metrics CLI flag to output metrics to stdout
  - --export-json CLI flag to export health report as JSON
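
For readers unfamiliar with the textfile collector, a minimal sketch of the general technique follows. It is not the code behind export_prometheus_metrics() or write_prometheus_metrics(): the helper names, the `hostname` label, and the metric name in the usage note are assumptions. It renders gauges in the Prometheus text exposition format and writes the file atomically so a scraper never reads a half-written file.

```python
import os
import tempfile
from typing import Dict


def format_metrics(metrics: Dict[str, float], hostname: str) -> str:
    """Render gauges in Prometheus text format (label name is assumed)."""
    lines = []
    for name, value in sorted(metrics.items()):
        lines.append(f"# TYPE {name} gauge")
        lines.append(f'{name}{{hostname="{hostname}"}} {value}')
    return "\n".join(lines) + "\n"


def write_textfile(text: str, path: str) -> None:
    """Write to a temp file in the target directory, then rename into place."""
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as handle:
            handle.write(text)
        # mkstemp creates the file as 0600; widen it so a collector
        # running as another user can read the result.
        os.chmod(tmp_path, 0o644)
        # os.replace is atomic when source and target share a filesystem.
        os.replace(tmp_path, path)
    except BaseException:
        if os.path.exists(tmp_path):
            os.unlink(tmp_path)
        raise
```

A call such as `write_textfile(format_metrics({"hwmon_ceph_health_ok": 1}, "node1"), "/path/to/textfile/dir/hwmon.prom")` would produce one gauge; both the metric name and the path here are placeholders.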

- Add Grafana dashboard template (grafana-dashboard.json)
- Add .gitignore

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

2026-01-17 15:54:16 -05:00

17 KiB
