Pushes to `main` on `LotusGuild/matrix` automatically deploy to the relevant LXC(s) via Gitea webhooks. All 4 LXCs are fully independent — each runs its own webhook listener and deploys only its own files. No cross-LXC SSH dependencies.
### How It Works
1. Push to `LotusGuild/matrix` on Gitea
2. Gitea fires webhooks to all 4 LXCs simultaneously (HMAC-SHA256 validated)
3. Each LXC runs `/usr/local/bin/matrix-deploy.sh` via the `webhook` binary
4. The script runs `git fetch` and `git reset --hard origin/main`, checks which files changed, and deploys only the relevant ones
5. Logs to `/var/log/matrix-deploy.log` on each LXC
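
The HMAC-SHA256 validation in step 2 is handled by the `webhook` binary itself. The sketch below only illustrates the check in Python. It assumes Gitea's `X-Gitea-Signature` header, which carries the hex-encoded HMAC-SHA256 of the raw request body; the function name is hypothetical and not part of this repo.

```python
import hashlib
import hmac


def signature_is_valid(secret: str, body: bytes, signature_hex: str) -> bool:
    """Return True if signature_hex is the HMAC-SHA256 of body under secret."""
    expected = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    received = signature_hex.strip().lower()
    # compare_digest runs in constant time, so a mismatch does not reveal
    # how many leading characters of the signature were correct.
    return hmac.compare_digest(expected.encode("ascii"), received.encode("utf-8"))
```

Because each LXC has its own secret, a signature that validates on one listener would be rejected by the others.
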
### Per-LXC Webhook Endpoints
| LXC | Service | IP | Port | Deploys When Changed |
> LXC 151 uses port **9500** because ports 9000–9004 are occupied by Synapse and Hookshot.
### What Each Deploy Does
**LXC 151 — hookshot/livekit:**
- `hookshot/*.js` changed → runs `hookshot/deploy.sh` (pushes transform functions to Matrix room state via API, requires `MATRIX_TOKEN` in `/etc/matrix-deploy.env`)
- `systemd/livekit-server.service` changed → copies file, `daemon-reload`, sets `/run/livekit-restart-pending` flag (actual restart deferred — see Livekit Graceful Restart below)
**LXC 106 — cinny:**
- `cinny/config.json` → copies to `/var/www/html/config.json`
- `landing/index.html` → copies to `/var/www/matrix-landing/index.html`, `nginx -s reload`
**LXC 110 — draupnir:**
- `draupnir/production.yaml` → extracts live `accessToken` from existing config, overwrites from repo, restores token via `sed`, restarts `draupnir.service`
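
The per-LXC rules above amount to a mapping from changed paths to deploy actions. The following is a minimal Python sketch of that idea, assuming the list of changed paths comes from something like `git diff --name-only` between the old and new `HEAD`. The real `matrix-deploy.sh` is a shell script and may detect changes differently; the rule table and action labels here are illustrative only.

```python
from fnmatch import fnmatch

# Illustrative rules: (LXC id, path pattern, action label).
# The labels are descriptions made up for this sketch, not real commands.
RULES = [
    ("151", "hookshot/*.js", "run hookshot/deploy.sh"),
    ("151", "systemd/livekit-server.service", "install unit, set restart-pending flag"),
    ("106", "cinny/config.json", "copy to /var/www/html/config.json"),
    ("106", "landing/index.html", "copy landing page, reload nginx"),
    ("110", "draupnir/production.yaml", "merge config, restart draupnir.service"),
]


def actions_for(lxc, changed_paths):
    """Return the actions this LXC should run, in rule order, without duplicates."""
    actions = []
    for rule_lxc, pattern, action in RULES:
        if rule_lxc != lxc:
            continue
        # One matching file is enough to trigger the action once.
        if any(fnmatch(path, pattern) for path in changed_paths):
            actions.append(action)
    return actions
```

A push that touches only `hookshot/` files therefore produces an empty action list on LXC 106 and LXC 110, which is what keeps the LXCs independent of each other.
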
### Installed Components (per LXC)
- `webhook` binary (Debian package `webhook` v2.8.0) listening on respective port
- `/etc/webhook/hooks.json` — unique HMAC-SHA256 secret per LXC
- `/usr/local/bin/matrix-deploy.sh` — deploy script from this repo
- `/etc/systemd/system/webhook.service` — enabled and running
- `/opt/matrix-config/` — clone of this repo
- `/var/log/matrix-deploy.log` — deploy log
**LXC 151 additionally:**
- `/etc/matrix-deploy.env` — `MATRIX_TOKEN`, `MATRIX_SERVER`, `MATRIX_ROOM` (not in git)
### Livekit Graceful Restart
Killing `livekit-server` while a call is active drops every participant, so the restart is deferred until the server is idle:
1. Deploy to LXC 151 copies the new `livekit-server.service` and sets a `/run/livekit-restart-pending` flag
2. `livekit-graceful-restart.timer` runs every 5 minutes
3. The timer script counts established TCP connections on port 7881 (`ss -tn state established`)
4. If zero connections → restarts livekit-server and clears the flag
5. If connections exist → logs and exits, retries in 5 minutes
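
The decision made on each timer run can be sketched as follows. This is a Python illustration, not the actual timer script. It assumes the output format of `ss -tn state established` (a header row, then `Recv-Q Send-Q Local-Address:Port Peer-Address:Port` columns); the function names and the sample addresses are hypothetical.

```python
def count_established(ss_output, port=7881):
    """Count rows of `ss -tn state established` whose local port equals `port`."""
    count = 0
    for line in ss_output.splitlines()[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 4:
            continue
        # The local address is the third column; rsplit also copes with
        # bracketed IPv6 addresses such as [::1]:7881.
        local_port = fields[2].rsplit(":", 1)[-1]
        if local_port == str(port):
            count += 1
    return count


def should_restart(flag_present, ss_output):
    """Restart only when a restart is pending and nobody is connected."""
    return flag_present and count_established(ss_output) == 0


# Sample output with documentation-range addresses, for illustration only.
SAMPLE = """\
Recv-Q Send-Q Local Address:Port Peer Address:Port
0      0      192.0.2.10:7881    203.0.113.7:51544
0      0      192.0.2.10:22      198.51.100.4:40112
"""
```
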
---
## Access Token Rotation
The `MATRIX_TOKEN` in `/etc/matrix-deploy.env` on LXC 151 is an access token for the Jared user account. It is used to push hookshot transforms to Matrix room state, which requires power level ≥ 50 in the Spam and Stuff room.
The token in `draupnir/production.yaml` in this repo is **intentionally redacted** (`accessToken: REDACTED`). The deploy script on LXC 110 extracts the live token from the running config before overwriting from the repo, then restores it.
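
The extract, overwrite, and restore sequence can be illustrated with a short Python sketch. The real deploy script does this with `sed`; the version below assumes the token sits on a single `accessToken: <value>` line with no trailing comment, and the function name is hypothetical.

```python
import re

# Matches one "accessToken: value" line; [ \t] keeps the match on a single line.
TOKEN_LINE = re.compile(r"^([ \t]*accessToken:[ \t]*)(\S+)[ \t]*$", re.MULTILINE)


def merge_config(live_text, repo_text):
    """Return repo_text with its accessToken value replaced by the live one."""
    live = TOKEN_LINE.search(live_text)
    if live is None:
        raise ValueError("no accessToken found in the live config")
    token = live.group(2)
    # A function replacement avoids backslash escapes in the token being
    # interpreted as regex group references.
    merged, replaced = TOKEN_LINE.subn(
        lambda m: m.group(1) + token, repo_text, count=1
    )
    if replaced != 1:
        raise ValueError("no accessToken placeholder found in the repo config")
    return merged
```

Failing loudly when either side lacks the key matters here: silently writing `REDACTED` into the running config would leave Draupnir unable to log in after the restart.
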
**To rotate the hookshot deploy token (LXC 151):**
1. Generate a new token via Synapse admin API or Cinny → Settings → Security → Manage Sessions
2. SSH to LXC 151 (via `ssh root@10.10.10.4` then `pct enter 151`): `nano /etc/matrix-deploy.env`
When a Matrix client user clicks "Report" on a message, Synapse receives a `POST /_matrix/client/v3/rooms/{roomId}/report/{eventId}` request and stores the report internally. To forward these to the Draupnir management room, a Synapse Python module must be installed on LXC 151.
**Draupnir web server** is enabled (port 8080). The endpoint is:
```
POST http://10.10.10.24:8080/_matrix/draupnir/1/report/{roomId}/{eventId}
```
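
Room and event IDs contain characters such as `!`, `$`, and `:`, so they should be percent-encoded when substituted into that path. The helper below is a hypothetical sketch that only builds the URL; the request body Draupnir expects is not described here and is deliberately left out.

```python
from urllib.parse import quote

DRAUPNIR_BASE = "http://10.10.10.24:8080"  # host and port from the endpoint above


def report_url(room_id, event_id, base=DRAUPNIR_BASE):
    """Build the Draupnir report URL, percent-encoding both identifiers."""
    # safe="" also encodes "/", so an ID can never add extra path segments.
    room = quote(room_id, safe="")
    event = quote(event_id, safe="")
    return f"{base}/_matrix/draupnir/1/report/{room}/{event}"
```
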
**To complete Synapse integration (one-time, on LXC 151):**
> Until the Synapse module is installed, abuse reports are stored in Synapse's DB but do NOT appear in the management room. The Draupnir web server is running and ready to receive forwarded reports.
`chat.lotusguild.org` serves a custom Lotus Guild fork of the official `cinnyapp/cinny` main branch. The fork lives at `code.lotusguild.org/LotusGuild/cinny` and tracks upstream through an `upstream` remote (`git remote add upstream https://github.com/cinnyapp/cinny.git`).
**Upstream monitoring (daily at noon):**
- `cinny-upstream-check.sh` hits the GitHub API and compares the latest `cinnyapp/cinny` main commit against the stored SHA in `/var/lib/cinny-monitor/last-upstream-commit`
- If new commits exist, sends a Matrix message to Spam and Stuff with an `@jared:matrix.lotusguild.org` ping and a link to the commit
- Does **not** auto-build — you review the diff and decide when to merge
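
The comparison at the heart of the check can be sketched in Python. `cinny-upstream-check.sh` itself is a shell script, so this is an illustration rather than its implementation. It assumes the JSON body returned by `GET https://api.github.com/repos/cinnyapp/cinny/commits/main`, which includes `sha` and `html_url` fields; the function name is hypothetical.

```python
import json
from pathlib import Path
from typing import Optional


def check_upstream(api_response: str, state_file: Path) -> Optional[str]:
    """Return the commit URL if upstream moved since the last check, else None."""
    commit = json.loads(api_response)
    latest = commit["sha"]
    previous = state_file.read_text().strip() if state_file.exists() else ""
    if latest == previous:
        return None
    # Remember the new SHA so the same commit is only announced once.
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(latest + "\n")
    return commit["html_url"]
```

In this sketch a missing state file counts as a change, so the very first run would send a notification.
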
**Cinny-build webhook token** (for LotusBot `!cinny-update`): stored in `deploy/hooks-lxc106.json` (`cinny-build` hook, header `X-Build-Token`). LotusBot must POST to `http://10.10.10.6:9000/hooks/cinny-build` with this header.
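
A caller other than LotusBot could prepare the same request as follows. This is a minimal sketch using only the Python standard library; the empty request body is an assumption, since only the URL and the `X-Build-Token` header are specified above, and `build_request` is a hypothetical name.

```python
import urllib.request

CINNY_BUILD_URL = "http://10.10.10.6:9000/hooks/cinny-build"


def build_request(token):
    """Prepare (but do not send) the POST that triggers a Cinny build."""
    return urllib.request.Request(
        CINNY_BUILD_URL,
        data=b"",  # assumed: the hook needs no payload, only the header
        headers={"X-Build-Token": token},
        method="POST",
    )


# Sending it would look like:
#   urllib.request.urlopen(build_request(token), timeout=10)
```
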
**Why 8GB RAM:** Vite's build process needs ~6GB of Node heap (`--max_old_space_size=6144`) for the rendering-chunks phase. With the previous 4GB allocation, the build was OOM-killed during rendering.
All custom code lives in `src/app/` on the `lotus` branch of `code.lotusguild.org/LotusGuild/cinny`. Changes survive upstream merges as long as they do not conflict with upstream changes to the same files.
| Feature | Files | Notes |
|---------|-------|-------|
| **Element Call embed** | `src/app/plugins/call/`, `src/app/hooks/useCallEmbed.ts`, `src/app/components/CallEmbedProvider.tsx` | EC 0.19.3 (`@element-hq/element-call-embedded`), dist copied to `public/element-call/` by vite |
| **DM calls** | `src/app/features/room/Room.tsx`, `src/app/features/room/RoomViewHeader.tsx` | Phone button in DM room header; `useCallStart(true)` passes `intent: StartedByUser`; Room.tsx switches to CallView layout when DM has active call |
| **Picture-in-picture call** | `src/app/components/CallEmbedProvider.tsx` | When navigating away from the call room, the embed shrinks to a 280×158px PiP in the bottom-right. Click navigates back. Implemented via `useEffect` imperatively overriding styles on `callEmbedRef.current` — cannot use a wrapper div because `useCallEmbedPlacementSync` writes `top/left/width/height` directly onto that element |
| **Auto-revert spotlight on screenshare** | `src/app/plugins/call/CallControl.ts` (`onControlMutation`) | When screenshare starts, EC normally forces spotlight view. We detect the `screenshare` button going `primary` and, after 600ms, click `gridButton` to revert to grid layout |
| **PTT (Push-to-Talk)** | `src/app/features/call/CallControls.tsx`, `src/app/state/settings.ts` | Hold-to-talk key (default: Space, configurable). Mutes mic on join; holds mic open while key is held. Badge shows `PTT — Hold SPACE` / `● Live`. Listens on both main window and EC iframe `contentWindow` for key events |
| **PTT badge theming** | `src/app/features/call/CallControls.tsx` | Plain folds `Chip` by default; neon terminal style (`#00FF88`/`#FF6B00`, JetBrains Mono) when `lotusTerminal` setting is on |
| **GIF picker** | `src/app/components/GifPicker.tsx`, `src/app/features/room/RoomInput.tsx` | Giphy JS/React SDK (`@giphy/react-components`, `@giphy/js-fetch-api`, `styled-components`). API key in `config.json` → `gifApiKey`. GIF button appears next to Send only when `gifApiKey` is set. Sends GIF as `m.image` (fetches blob → `mx.uploadContent` → `mx.sendMessage`). `FocusTrap` handles click-outside / Escape to close |
| **GIF picker terminal theme** | `src/app/components/GifPicker.tsx` | When `lotusTerminal` is on: dark navy background (`#060c14`), orange dim border, 4px radius, `// GIF_SEARCH` header, injected `<style>` overrides Giphy SDK SearchBar input (dark bg, orange border/focus ring, JetBrains Mono), custom orange scrollbar |
| **Terminal Design System toggle** | `src/app/state/settings.ts`, `src/app/features/settings/` | `lotusTerminal` boolean setting. When enabled: PTT badge and GIF picker use LotusGuild Terminal Design System aesthetics |
| **LiveKit codec config** | `/etc/livekit/config.yaml` (LXC 151) | `enabled_codecs`: VP8, H264, VP9, Opus, RED for better quality and redundancy |
**Key config values (`/opt/lotus-cinny/config.json`, root — vite copies this to dist):**
```json
{
  "defaultHomeserver": 0,
  "homeserverList": ["matrix.lotusguild.org"],
  "allowCustomHomeservers": false,
  "gifApiKey": "AqqDuQwZNjYttz7Mn6ME4JH1bJIuZ5CO"
}
```
> Note: The root `/opt/lotus-cinny/config.json` is what matters — vite copies it to `dist/`. `public/config.json` is not used.
LXC 151 can migrate between Proxmox nodes via HA. After migration, the old livekit-server process on the source node can leave a stale entry holding port 7881 on the destination. Fixed in `livekit-server.service` via:
`net.ipv4.tcp_congestion_control = bbr` must be set on the Proxmox host, not inside an unprivileged LXC. All other sysctl tuning (TCP/UDP buffers, fin_timeout) is applied inside LXC 151.
- [x] TCP retransmit timeout lowered (`tcp_retries2=5`, `tcp_syn_retries=4`, `tcp_keepalive_probes=3`) — stalled outbound federation connections now fail in ~15-30s instead of ~15 min
- [x] Unreachable routes added for servers with asymmetric connectivity (can reach us but we can't reach their federation port) — prevents 90s TCP hangs from being added to lag; defined in `/etc/network/interfaces` post-up hooks and survive reboots (bark.lgbt ×2, parodia.dev, chat.ohaa.xyz, matrix.k8ekat.dev)
- [x] Stuck `device_lists_remote_resync` entries cleared for dead-server users (@dalite:bark.lgbt, @arndot:matrix.goch.social) — device list resync was firing every 30s
- [x] Terms of Service / consent enforcement — `require_at_registration: false`, `block_events_error` set; new users cannot send messages until they explicitly accept via `/_matrix/consent`; Synapse sends a Server Notice DM with the consent URL on first blocked send
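
The retransmit figures in the checklist above can be sanity-checked with the model used in the Linux kernel documentation for `tcp_retries2`: the retransmission timeout (RTO) doubles after every unacknowledged send, bounded by a 200ms minimum and a 120s maximum. The sketch below is a rough estimate under those assumptions, not a measurement; real RTO values depend on the measured round-trip time.

```python
TCP_RTO_MIN = 0.2    # seconds, Linux minimum retransmission timeout
TCP_RTO_MAX = 120.0  # seconds, Linux maximum retransmission timeout


def hypothetical_timeout(retries, initial_rto=TCP_RTO_MIN):
    """Seconds until an established connection is abandoned.

    Sums the initial RTO plus one doubled (and capped) RTO per retry,
    i.e. the wait after the original send and after each retransmission.
    """
    return sum(min(initial_rto * 2**i, TCP_RTO_MAX) for i in range(retries + 1))


default_s = hypothetical_timeout(15)                # kernel default
tuned_s = hypothetical_timeout(5)                   # tcp_retries2=5, best case
tuned_slow_s = hypothetical_timeout(5, initial_rto=0.4)
```

With the default of 15 retries this gives about 924.6s (roughly 15 minutes). With `tcp_retries2=5` it gives about 12.6s at the minimum RTO and about 25.2s if the RTO starts at 400ms, which is consistent with the range quoted in the checklist.
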
> **Disk I/O:** All servers use Ceph-backed storage. Per-device disk I/O metrics are meaningless — use Network I/O panels to see actual storage traffic.
> **`/sync` long-poll:** The Matrix `/sync` endpoint is a long-poll (clients hold it open ≤30s). It is excluded from the High Response Time alert to prevent false positives.
> **Synapse Event Processing Lag** alert fires when `synapse_event_processing_lag > 300s` for 15 consecutive minutes (threshold raised from 120s/5m to reduce noise from normal federation backoff cycling).
>
> Root cause: several federated servers (bark.lgbt, parodia.dev, etc.) have asymmetric connectivity — they can reach us but we cannot reach their federation ports. Each inbound transaction they send resets our backoff to 0, triggering a new outbound connection attempt that hangs for ~90s (TCP `User timeout`). This causes the lag metric to spike. Mitigations in place:
> 1. `tcp_retries2=5` in `/etc/sysctl.d/99-matrix-tuning.conf` — TCP hangs now fail in ~15-30s
> 2. `ip route add unreachable <ip>` in `/etc/network/interfaces` post-up — outbound connections to these servers fail in 0ms (ICMP unreachable)
> 3. Alert threshold raised to 300s/15m — only fires for genuine outages, not normal 10-min backoff cycles
>
> To find new offending servers: `grep "User timeout\|ConnectingCancell" /var/log/matrix-synapse/homeserver.log | grep -oP "\[([^\]]+)\]" | sort | uniq -c | sort -rn | head -20`
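
The same tally can be produced in Python when the log needs further processing. This sketch mirrors the `grep` pipeline above: it keeps lines containing either marker string, extracts every bracketed token, and counts them. The function name is hypothetical, and the exact layout of Synapse log lines is not assumed beyond what the pipeline itself relies on.

```python
import re
from collections import Counter

BRACKETED = re.compile(r"\[[^\]]+\]")           # same pattern as grep -oP
MARKERS = ("User timeout", "ConnectingCancell")  # same alternation as the first grep


def top_offenders(lines, limit=20):
    """Return the most frequent bracketed tokens on timeout-related lines."""
    counts = Counter()
    for line in lines:
        if any(marker in line for marker in MARKERS):
            counts.update(BRACKETED.findall(line))
    return counts.most_common(limit)


# Typical use against the live log:
#   with open("/var/log/matrix-synapse/homeserver.log", errors="replace") as fh:
#       for token, hits in top_offenders(fh):
#           print(hits, token)
```
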