# GANDALF (Global Advanced Network Detection And Link Facilitator)

> Because it shall not let problems pass!

## Multiple Distributed Servers Approach

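This architecture is the most robust way to deploy the system: monitoring is spread across multiple independent servers, so no single node failure or local network fault can raise or hide an alert on its own.
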
### Core Components

1. Multiple monitoring nodes across different network segments
2. Distributed database for sharing state
3. Consensus mechanism for alert verification

### System Architecture

#### A. Monitoring Layer

- Multiple monitoring nodes in different locations/segments
- Each node runs independent health checks
- Mix of internal and external perspectives

#### B. Data Collection

Each node collects:

- Link status
- Latency measurements
- Error rates
- Bandwidth utilization
- Device health metrics

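
The measurements listed above could travel together as one record per health check. The following is a minimal sketch, assuming a Python implementation; the `NodeSample` class, its field names, and the units are illustrative assumptions rather than GANDALF's actual schema.

```python
from dataclasses import dataclass, field, asdict
import json
import time


@dataclass
class NodeSample:
    """One health-check report from one monitoring node (hypothetical schema)."""
    node_id: str
    target: str                    # link or device being checked
    link_up: bool                  # link status
    latency_ms: float              # latency measurement
    error_rate: float              # fraction of failed probes, 0.0 to 1.0
    bandwidth_utilization: float   # fraction of capacity in use, 0.0 to 1.0
    device_health: dict = field(default_factory=dict)   # e.g. CPU load
    timestamp: float = field(default_factory=time.time)

    def to_json(self) -> str:
        """Serialize the sample so it can be written to the shared state store."""
        return json.dumps(asdict(self), sort_keys=True)


# Example values are made up for illustration.
sample = NodeSample(
    node_id="node-a",
    target="core-sw1:uplink",
    link_up=True,
    latency_ms=4.2,
    error_rate=0.0,
    bandwidth_utilization=0.37,
    device_health={"cpu_percent": 12.5},
    timestamp=1700000000.0,
)
encoded = sample.to_json()
```

Serializing to JSON keeps the record independent of whichever distributed database ends up holding the shared state.
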
#### C. Consensus Mechanism

- Multiple nodes must agree before an outage is declared
- Voting system implementation:
  - At least 2/3 of nodes must agree to confirm an issue
  - Checks are weighted by type
  - Time-based consensus: the condition must persist for X seconds

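
One way the voting rules above might combine is sketched below. It is an illustration under stated assumptions, not GANDALF's implementation: the check weights, the 30-second value standing in for X, and the names `Vote`, `failure_agreed`, and `OutageDetector` are all hypothetical. Here the two-thirds threshold is applied to the summed weight of the votes, which reduces to a plain node count when every check carries the same weight.

```python
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical weights per check type; the design only says checks are weighted.
CHECK_WEIGHTS = {"link_status": 1.0, "latency": 0.5, "error_rate": 0.5}
PERSISTENCE_SECONDS = 30.0  # assumed stand-in for the unspecified "X seconds"


@dataclass(frozen=True)
class Vote:
    node_id: str
    check_type: str
    failed: bool


def failure_agreed(votes: List[Vote]) -> bool:
    """Return True when at least 2/3 of the total vote weight reports a failure."""
    total = sum(CHECK_WEIGHTS[v.check_type] for v in votes)
    failing = sum(CHECK_WEIGHTS[v.check_type] for v in votes if v.failed)
    # Cross-multiplied form of failing / total >= 2/3, avoiding a division.
    return total > 0 and 3 * failing >= 2 * total


class OutageDetector:
    """Declares an outage only after agreement has persisted long enough."""

    def __init__(self, persistence: float = PERSISTENCE_SECONDS) -> None:
        self.persistence = persistence
        self.failing_since: Optional[float] = None

    def update(self, votes: List[Vote], now: float) -> bool:
        if not failure_agreed(votes):
            self.failing_since = None   # agreement lost, restart the clock
            return False
        if self.failing_since is None:
            self.failing_since = now    # first round in which nodes agree
        return now - self.failing_since >= self.persistence
```

Calling `update` once per polling round with the latest votes keeps the persistence clock running only while the nodes continue to agree, so a brief blip that clears before the window elapses never becomes an outage.
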
#### D. Alert Verification

- Cross-reference multiple data points
- Check from different network paths
- Verify both ends of connections
- Consider network topology

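
The cross-checks above can be pictured as a small classification step. The sketch below is only an assumption about how such a step might look: the `Observation` record, the `"a"`/`"b"` endpoint labels, and the three result strings are hypothetical, and topology awareness is left out for brevity.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Observation:
    path: str        # label for the network path the probe took
    endpoint: str    # which end of the connection was probed: "a" or "b"
    reachable: bool


def verify_link_alert(observations: List[Observation]) -> str:
    """Classify a suspected link failure from cross-referenced probes.

    "confirmed": every probe failed, both ends were probed, and the
                 failures span at least two distinct paths.
    "partial":   some probes failed but the evidence is incomplete,
                 which may point at a path-specific problem.
    "cleared":   no probe failed.
    """
    failed = [o for o in observations if not o.reachable]
    if not failed:
        return "cleared"
    ends_down = {o.endpoint for o in failed}
    any_success = any(o.reachable for o in observations)
    failed_paths = {o.path for o in failed}
    if ends_down == {"a", "b"} and not any_success and len(failed_paths) >= 2:
        return "confirmed"
    return "partial"
```

Requiring failures on two distinct paths keeps a fault on the monitoring path itself from being reported as a failure of the monitored link.
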
#### E. Redundancy

- Eliminates single points of failure
- Nodes distributed across availability zones
- Independent power and network paths

#### F. Central Coordination

- Distributed database for state sharing
- Leader election for coordinating responses
- Backup coordinators ready to take over

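
To make the leader and backup roles concrete, here is a deliberately simplified sketch. It assumes each node records a heartbeat timestamp in the shared database and that every node applies the same rule: the lowest node ID with a recent heartbeat leads. This is a stand-in chosen for brevity, not a full consensus protocol, and `elect`, `HEARTBEAT_TIMEOUT`, and the node names are hypothetical.

```python
from typing import Dict, List, Optional, Tuple

HEARTBEAT_TIMEOUT = 10.0  # seconds; assumed value for illustration


def elect(
    heartbeats: Dict[str, float],
    now: float,
    timeout: float = HEARTBEAT_TIMEOUT,
) -> Tuple[Optional[str], List[str]]:
    """Pick a leader and an ordered list of backup coordinators.

    heartbeats maps node ID to the time that node last reported in.
    Nodes whose heartbeat is older than `timeout` are treated as down.
    """
    alive = sorted(n for n, seen in heartbeats.items() if now - seen <= timeout)
    if not alive:
        return None, []
    # Lowest ID leads; the remaining live nodes are backups, in takeover order.
    return alive[0], alive[1:]


heartbeats = {"node-b": 100.0, "node-a": 95.0, "node-c": 80.0}
leader, backups = elect(heartbeats, now=101.0)
```

Because the rule is deterministic, every node that reads the same heartbeat table reaches the same answer, and the first backup takes over as soon as the leader's heartbeat goes stale.
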
### Additional Features

- Alarm suppression capabilities
- Ticket creation system integration
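
How suppression and ticketing fit together is not specified here, so the following is only a sketch under assumptions: suppression is modeled as per-target time windows (for example, planned maintenance), and ticket creation is any callable supplied by the integration. `SuppressionWindow`, `is_suppressed`, and `handle_outage` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class SuppressionWindow:
    target: str    # link or device the window applies to
    start: float   # epoch seconds, inclusive
    end: float     # epoch seconds, exclusive


def is_suppressed(target: str, now: float, windows: List[SuppressionWindow]) -> bool:
    """Return True if an active window covers this target at time `now`."""
    return any(w.target == target and w.start <= now < w.end for w in windows)


def handle_outage(
    target: str,
    now: float,
    windows: List[SuppressionWindow],
    create_ticket: Callable[[str], None],
) -> bool:
    """Open a ticket unless the alarm is suppressed; return True if one was opened."""
    if is_suppressed(target, now, windows):
        return False
    create_ticket(target)
    return True
```

Passing the ticket function in as a parameter keeps the suppression logic independent of any particular ticketing system.
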