diff --git a/README.md b/README.md
index 4eba10c..2b7b156 100644
--- a/README.md
+++ b/README.md
@@ -425,7 +425,7 @@ Periodic `TLS/TCP socket error: Connection reset by peer` in coturn logs. Normal
 - [x] LiveKit ICE port range expanded to 50000-51000
 - [x] LiveKit TURN TTL reduced to 1h
 - [x] LiveKit VP9/AV1 codecs enabled
-- [x] Synapse presence disabled (`presence: enabled: false`) — eliminates federation lag spikes caused by presence EDU bursts to 50+ remote servers
+- [x] TCP retransmit timeout lowered (`tcp_retries2=8`, `tcp_syn_retries=4`, `tcp_keepalive_probes=3`) — stalled outbound federation connections to slow/dead remote servers (e.g. `exp.farm`) now fail in ~90s instead of ~15 min, preventing federation queue blockage from presence EDU fan-outs
 - [ ] BBR congestion control — must be applied on Proxmox host
 
 ### Auth & SSO
@@ -522,7 +522,7 @@ Periodic `TLS/TCP socket error: Connection reset by peer` in coturn logs. Normal
 
 > **`/sync` long-poll:** The Matrix `/sync` endpoint is a long-poll (clients hold it open ≤30s). It is excluded from the High Response Time alert to prevent false positives.
 
-> **Synapse Event Processing Lag** can fire transiently after a Synapse restart while processors drain their backlog. Self-resolves in 10–20 minutes. Root cause of recurring lag spikes was Synapse presence EDU bursts — fixed by disabling presence in `homeserver.yaml` (`presence: enabled: false`).
+> **Synapse Event Processing Lag** can fire transiently after a Synapse restart while processors drain their backlog. Self-resolves in 10–20 minutes. Root cause of recurring lag spikes was presence EDU fan-outs to 50+ remote federated servers — when a slow server (e.g. `exp.farm`) hangs a TCP connection, the default `tcp_retries2=15` means Linux retries for ~15 minutes, blocking the federation queue. Fixed by lowering `tcp_retries2=8` in `/etc/sysctl.d/99-matrix-tuning.conf` (stalled connections now fail in ~90s).
 
 ---
 
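
The ~15 min and ~90s figures in the changed lines can be sanity-checked with a little arithmetic. The sketch below is a rough model, not the kernel's actual logic: it assumes the retransmission timeout starts at 200 ms, doubles after every attempt, and is capped at 120 s. The function name `retransmit_timeout` and its parameters are made up for illustration. On a real connection the starting timeout depends on the measured round-trip time, so the result is best read as a lower bound.

```python
def retransmit_timeout(retries: int, rto_min: float = 0.2, rto_max: float = 120.0) -> float:
    """Estimate how many seconds pass before a stalled TCP connection is dropped.

    Assumptions (not taken from the README): the timeout starts at rto_min,
    doubles after each attempt, and never exceeds rto_max.
    """
    total = 0.0
    rto = rto_min
    # One wait for the original segment plus one wait per retransmission.
    for _ in range(retries + 1):
        total += rto
        rto = min(rto * 2, rto_max)
    return total


if __name__ == "__main__":
    for retries in (15, 8):
        seconds = retransmit_timeout(retries)
        print(f"tcp_retries2={retries}: about {seconds:.0f}s ({seconds / 60:.1f} min)")
```

Under these assumptions `tcp_retries2=15` comes out at roughly 925 s, which matches the ~15 min quoted above. `tcp_retries2=8` comes out at roughly 100 s, the same ballpark as the ~90s figure.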