feat(call): in-call soundboard, quality controls, room call-permissions

Element Call is now consumed as our self-built fork (@lotusguild/element-call-embedded); wire up its previously-dormant capabilities and document the fork as live.

Soundboard (P5-15): a call-bar button plays user-uploaded audio clips into the call as a real published track (io.lotus.inject_audio) plus local playback. Clips are uploadable like emoji/sticker packs, stored in io.lotus.soundboard account data (synced across devices). Gated by a Settings toggle + volume.

Quality controls (P5-31): per-user mic/screenshare bitrate + screenshare framerate (Settings -> Calls), applied via io.lotus.set_quality clamped to any room cap. Room admins set caps and hard call-permissions (allow_screenshare / allow_camera) in Room Settings -> Voice; the call bar hides blocked buttons.

- New: CallSoundboard, useSoundboard, soundboardClips; RoomQuality, useCallQuality, callQuality (+ unit tests).
- Optimistic-write RoomQuality admin UI (no stale-state clobber).
- Docs: mark EC fork live across README/FEATURES/TODO/BUGS/TESTING; add D2 manual-test steps.

Numeric quality caps are client-cooperative; screenshare/camera permissions are hard-enforced server-side (see LotusGuild/matrix voice-limit-guard).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 22:34:17 -04:00
parent 02b2ce8109
commit 7c06b27c73
22 changed files with 1259 additions and 120 deletions
@@ -48,6 +48,9 @@ Built and gate-green; verify per [LOTUS_TESTING.md](./LOTUS_TESTING.md), then th
 | Desktop — proactive update notifications (Tauri)                                  | J1                |
 | Remind Me Later                                                                   | K1                |
 | Mobile Bookmarks access                                                           | E5                |
+| In-Call Soundboard (P5-15, uploadable clips → real call inject)                   | D2-7              |
+| Call Quality Controls (P5-31, user + room-admin caps)                             | D2-8              |
+| Call Permissions (P5-31, hard server-side screenshare/camera policy)              | D2-9              |

 ---

@@ -72,32 +75,32 @@ Status: `[ ]` pending · `[~]` in progress · `[x]` completed

 ### Confirmed facts

-| Finding                                                                                                                                             | Impact                                                                         |
-| --------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------ |
-| **MSC flags ON:** `msc4140` · `msc3771` · `msc3440.stable` · `msc4133.stable` · `simplified_msc3575` · `msc4222` · `msc3266` · `msc3401_matrix_rtc` | All safe to use now                                                            |
-| **MSC flags OFF:** `msc4306` (thread subscriptions) · `msc3882` · `msc3912` · `msc4155`                                                             | These features are BLOCKED                                                     |
-| **MSC3266** room summary: flag `msc3266_enabled: true` set but `GET /v1/rooms/{id}/summary` still returns 404 (M_UNRECOGNIZED)                      | Room Preview BLOCKED — endpoint not implemented in Synapse 1.155               |
-| **MSC3892** relation redaction: not in flags                                                                                                        | Reaction Redaction feature BLOCKED                                             |
-| **MSC4260** report user: `POST /_matrix/client/v3/users/{userId}/report` returns **200** ✅                                                         | **Report User UNBLOCKED** — endpoint live since Synapse 1.133; ready to build  |
-| **MSC4151** report room: HTTP 405 on GET = endpoint exists (POST only)                                                                              | Report Room live ✅                                                            |
-| `folds AvatarImage` does NOT accept children                                                                                                        | Add frame/overlay inside `UserAvatar.tsx` itself — optional `frameName` prop   |
-| No in-app toast system exists (was)                                                                                                                 | Built `ToastProvider` + Jotai queue; at `App.tsx:65`                           |
-| `useUnverifiedDeviceCount()` hook exists                                                                                                            | `src/app/hooks/useDeviceVerificationStatus.ts:65-106`                          |
-| Voice player: `AudioContent.tsx:44-223`                                                                                                             | Playback rate on hidden `<audio>` at line 217                                  |
-| `CallControl.setMicrophone(bool)` at `CallControl.ts:206-212`                                                                                       | For AFK auto-mute                                                              |
-| `CallControl.toggleSound()` at `CallControl.ts:230-251`                                                                                             | Push-to-deafen — just wire a hotkey to this                                    |
-| matrix-js-sdk has NO arbitrary profile field methods                                                                                                | Use `mx.http.authedRequest()` for MSC4133                                      |
-| Sanitizer (`sanitize.ts`) allows table, div, span, a, code, hr                                                                                      | LFG HTML card is safe locally; test on Element/FluffyChat                      |
-| Sanitizer STRIPS `<math>`/MathML tags                                                                                                               | Math/LaTeX task must also modify sanitizer                                     |
-| Service worker EXISTS at `src/sw.ts`                                                                                                                | Quick-reply task: add `notificationclick` handler                              |
-| `knockSupported()` utility exists at `matrix.ts:376-391`                                                                                            | Knock UX: only need "Request to Join" in `RoomIntro.tsx`                       |
-| `KeywordMessages.tsx` already has custom keyword push rules                                                                                         | Full push rule editor: only non-keyword rule types need new UI                 |
-| `getMatrixToRoom()` in `matrix-to.ts` generates invite URLs                                                                                         | Invite link: just add QR code to room settings                                 |
-| Cindy CANNOT inject audio into EC call stream                                                                                                       | In-call soundboard must be redesigned as local-only                            |
-| Folds uses vanilla-extract in non-TDS, NOT CSS custom properties                                                                                    | Custom accent color: must create new vanilla-extract theme variant dynamically |
-| Theme presets need ~50 CSS custom properties each                                                                                                   | Significant design work before coding                                          |
-| `useCallSpeakers.ts` CSS MutationObserver polling                                                                                                   | Visual speaking indicator: TDS ring animation on top of existing data          |
-| MSC3489/3672 live location: BOTH false on server                                                                                                    | Live Location BLOCKED                                                          |
+| Finding                                                                                                                                                  | Impact                                                                                                  |
+| -------------------------------------------------------------------------------------------------------------------------------------------------------- | ------------------------------------------------------------------------------------------------------- |
+| **MSC flags ON:** `msc4140` · `msc3771` · `msc3440.stable` · `msc4133.stable` · `simplified_msc3575` · `msc4222` · `msc3266` · `msc3401_matrix_rtc`      | All safe to use now                                                                                     |
+| **MSC flags OFF:** `msc4306` (thread subscriptions) · `msc3882` · `msc3912` · `msc4155`                                                                  | These features are BLOCKED                                                                              |
+| **MSC3266** room summary: flag `msc3266_enabled: true` set but `GET /v1/rooms/{id}/summary` still returns 404 (M_UNRECOGNIZED)                           | Room Preview BLOCKED — endpoint not implemented in Synapse 1.155                                        |
+| **MSC3892** relation redaction: not in flags                                                                                                             | Reaction Redaction feature BLOCKED                                                                      |
+| **MSC4260** report user: `POST /_matrix/client/v3/users/{userId}/report` returns **200** ✅                                                              | **Report User UNBLOCKED** — endpoint live since Synapse 1.133; ready to build                           |
+| **MSC4151** report room: HTTP 405 on GET = endpoint exists (POST only)                                                                                   | Report Room live ✅                                                                                     |
+| `folds AvatarImage` does NOT accept children                                                                                                             | Add frame/overlay inside `UserAvatar.tsx` itself — optional `frameName` prop                            |
+| No in-app toast system exists (was)                                                                                                                      | Built `ToastProvider` + Jotai queue; at `App.tsx:65`                                                    |
+| `useUnverifiedDeviceCount()` hook exists                                                                                                                 | `src/app/hooks/useDeviceVerificationStatus.ts:65-106`                                                   |
+| Voice player: `AudioContent.tsx:44-223`                                                                                                                  | Playback rate on hidden `<audio>` at line 217                                                           |
+| `CallControl.setMicrophone(bool)` at `CallControl.ts:206-212`                                                                                            | For AFK auto-mute                                                                                       |
+| `CallControl.toggleSound()` at `CallControl.ts:230-251`                                                                                                  | Push-to-deafen — just wire a hotkey to this                                                             |
+| matrix-js-sdk has NO arbitrary profile field methods                                                                                                     | Use `mx.http.authedRequest()` for MSC4133                                                               |
+| Sanitizer (`sanitize.ts`) allows table, div, span, a, code, hr                                                                                           | LFG HTML card is safe locally; test on Element/FluffyChat                                               |
+| Sanitizer STRIPS `<math>`/MathML tags                                                                                                                    | Math/LaTeX task must also modify sanitizer                                                              |
+| Service worker EXISTS at `src/sw.ts`                                                                                                                     | Quick-reply task: add `notificationclick` handler                                                       |
+| `knockSupported()` utility exists at `matrix.ts:376-391`                                                                                                 | Knock UX: only need "Request to Join" in `RoomIntro.tsx`                                                |
+| `KeywordMessages.tsx` already has custom keyword push rules                                                                                              | Full push rule editor: only non-keyword rule types need new UI                                          |
+| `getMatrixToRoom()` in `matrix-to.ts` generates invite URLs                                                                                              | Invite link: just add QR code to room settings                                                          |
+| ~~Cindy CANNOT inject audio into EC call stream~~ **UNBLOCKED by EC fork** — `io.lotus.inject_audio` widget action publishes a clip as a real call track | In-call soundboard CAN now mix into the call (no longer local-only); needs cinny UI to drive the action |
+| Folds uses vanilla-extract in non-TDS, NOT CSS custom properties                                                                                         | Custom accent color: must create new vanilla-extract theme variant dynamically                          |
+| Theme presets need ~50 CSS custom properties each                                                                                                        | Significant design work before coding                                                                   |
+| `useCallSpeakers.ts` CSS MutationObserver polling                                                                                                        | Visual speaking indicator: TDS ring animation on top of existing data                                   |
+| MSC3489/3672 live location: BOTH false on server                                                                                                         | Live Location BLOCKED                                                                                   |

 ---

@@ -266,12 +269,17 @@ Features:

 ---

-### [ ] P5-15 · In-Call Soundboard
+### [~] P5-15 · In-Call Soundboard — IMPLEMENTED (⚠️ awaiting live verification, D2-7)

-**What:** Grid of short audio clips playable into the call audio stream via Web Audio API (AudioBufferSourceNode → MediaStreamDestinationNode → mixed with mic). Built-in clips + user-uploadable custom clips (stored as mxc://). Accessible from call controls bar.  
-**[AUDIT REQUIRED]** Verify the Element Call integration exposes the mic MediaStream for mixing. This is the highest-risk part of this feature.  
-**🔱 [EC-FORK]** Owning the EC source (see [`HANDOFF_ELEMENT_CALL_FORK.md`](./HANDOFF_ELEMENT_CALL_FORK.md)) would unblock real audio-injection — a proper soundboard mixed into the call — which is impossible against the prebuilt bundle today.  
-**Complexity:** High.
+**What:** Soundboard button in the call controls bar → popout grid of the user's clips; clicking one plays it **into the call** as a real published track (peers hear it) and locally (presser hears it). Clips are **user-uploadable, just like custom emojis/stickers**.  
+**🔱 [EC-FORK] Fork side + cinny side DONE.** The fork ships `io.lotus.inject_audio` (`LotusWidgetActions.InjectAudio`, allow-listed in `widget.ts`), armed via the `lotusAudioInject=1` flag; it publishes a clip as a separate LiveKit track — a **real** in-call soundboard mixed into the call, not local-only. cinny now drives it.  
+**Shipped (cinny):**
+
+- Clips stored in `io.lotus.soundboard` account data → **synced across devices like emoji/sticker packs** (`useSoundboard` hook; `AccountDataEvent.LotusSoundboard`).
+- Upload audio (≤1 MB, ≤40 clips) → `mx.uploadContent` → mxc; play resolves mxc → authed download → `blob:` object URL (the widget can't fetch authenticated media itself) → `control.injectAudio(url, volume)` + local playback.
+- `CallSoundboard.tsx` popout in the call bar (upload / play / delete), gated on the `soundboardEnabled` setting (Settings → General → Calls, + volume slider).  
+  **Remaining:** a dedicated Settings management page (optional — upload/delete already live in the popout); a small default clip set; live verification (D2-7). Files: `utils/soundboardClips.ts`, `hooks/useSoundboard.ts`, `features/call/CallSoundboard.tsx`, `plugins/call/CallControl.ts#injectAudio`.  
+  **Complexity:** Medium — done.
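
The upload limits listed above (≤1 MB per clip, ≤40 clips) could be checked before calling `mx.uploadContent` roughly as follows. This is a minimal sketch, not the code in `utils/soundboardClips.ts`: the helper name `checkClipUpload`, the result type, the reading of "1 MB" as 1024 × 1024 bytes, and the `audio/*` MIME check are all assumptions made for illustration.

```typescript
// Limits quoted in the TODO entry: at most 1 MB per clip, at most 40 clips.
// Assumption: "1 MB" is read as 1024 * 1024 bytes.
const MAX_CLIP_BYTES = 1024 * 1024;
const MAX_CLIPS = 40;

type UploadCheck = { ok: true } | { ok: false; reason: string };

// Hypothetical helper; a browser File satisfies the { size, type } shape.
function checkClipUpload(
  file: { size: number; type: string },
  existingClipCount: number,
): UploadCheck {
  if (existingClipCount >= MAX_CLIPS) {
    return { ok: false, reason: `Soundboard is full (${MAX_CLIPS} clips).` };
  }
  // Assumption: only audio/* MIME types are accepted.
  if (!file.type.startsWith("audio/")) {
    return { ok: false, reason: "Not an audio file." };
  }
  if (file.size > MAX_CLIP_BYTES) {
    return { ok: false, reason: "Clip is larger than 1 MB." };
  }
  return { ok: true };
}
```

Returning a reason string rather than throwing lets the popout show the rejection inline; whether the real UI does that is not stated in the source.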

 ---

@@ -287,26 +295,38 @@ Features:
 ### [x] P5-30 · Advanced ML Noise Suppression (Krisp-style)

 **What:** High-end background noise cancellation using a pre-trained ML model (RNNoise) running in the browser. Removes dogs, fans, and keyboard clicks from the mic stream.  
-**Shipped:** 3-tier setting (Off / Browser-native / ML) in Settings → General → Calls. ML tier injects a same-origin pre-init shim into the vendored Element Call `index.html` that monkeypatches `getUserMedia` and routes the captured mic through an RNNoise `AudioWorklet` before LiveKit publishes — no EC fork required. See LOTUS_FEATURES.md → "Noise Suppression (Advanced Multi-Tier)".  
-**🔱 [EC-FORK]** Once we own the EC source (see [`HANDOFF_ELEMENT_CALL_FORK.md`](./HANDOFF_ELEMENT_CALL_FORK.md)), denoise should become a first-class audio stage **inside** EC instead of an `index.html` getUserMedia monkeypatch — more robust, survives reconnects (fixes the A7 mic-after-reconnect bug), and removes the build-time injection hack.  
-**Key decision:** LiveKit's Krisp filter is LiveKit-Cloud-only (we self-host the SFU); EC's own RNNoise PR #3892 is unmerged. The shim is the same post-capture pipeline #3892 uses, executed from the realm we control, so it survives EC version bumps.  
-**AEC note (resolved-as-accepted):** WebAudio capture routing can weaken browser AEC — same tradeoff as EC's upstream feature; mitigated by keeping `echoCancellation`/`autoGainControl` on the raw capture and labeling the tier "beta".
+**Shipped:** 3-tier setting (Off / Browser-native / ML) in Settings → General → Calls.  
+**🔱 [EC-FORK] DONE — moved in-source (2026-06).** ML denoise is now a first-class audio stage **inside** the forked Element Call: a LiveKit `TrackProcessor<Audio>` activated by `lotusDenoiseSource=1` (cinny sets it when ML is selected). The old build-time `getUserMedia`/`index.html` monkeypatch is **removed**. Because EC re-runs the processor on every (re)publish, denoise now **survives reconnects and mic-device switches** — this is the A7 fix (see `LOTUS_BUGS.md` A7, `LOTUS_TESTING.md` §D2-1). The processor degrades to the raw mic rather than going silent.  
+**Key decision:** LiveKit's Krisp filter is LiveKit-Cloud-only (we self-host the SFU); EC's own RNNoise PR #3892 is unmerged. Owning the fork let us implement the in-source stage directly.

-**Model Roadmap (priority order):**
+**Models — all in-source in the fork:**

- [ ] **Verify DTLN** (16 kHz narrowband fix) in a real call before investing further — wired but unverified.
- [ ] **DeepFilterNet 3** — best self-hostable upgrade: Rust→WASM, CPU real-time, 48 kHz fullband. Effort: self-host `df_bg.wasm` + DFN3 ONNX model, wire a 48 kHz worklet.
- [ ] **Desktop-only / HW-gated:** FRCRN or NVIDIA Maxine (RTX/Tensor only) — impossible in-browser; would run in Tauri Rust backend + bridge a virtual mic into the webview. Must detect capability and only offer on supported hardware; web falls back to RNNoise.
+- [x] **RNNoise** (48 kHz, default) · **Speex** (48 kHz) · **DTLN** (16 kHz) · **DeepFilterNet 3** (48 kHz) — all four wired and selectable.
+- [ ] **Open verification:** real-call **audio-quality** comparison across the four models (RNNoise output is known-weak). Track under the denoise quality project, `LOTUS_TESTING.md` §D2-1 / J2.
+- [ ] **Desktop-only / HW-gated (future):** FRCRN or NVIDIA Maxine (RTX/Tensor only) — impossible in-browser; would run in the Tauri Rust backend + bridge a virtual mic into the webview. Detect capability; web falls back to RNNoise.
 - **Excluded:** Krisp (LiveKit Cloud only); FRCRN/Maxine on web (GPU/server-bound).

 ---

-### [ ] P5-31 · Granular Voice & Screenshare Quality Controls (Discord-style)
+### [~] P5-31 · Granular Voice & Screenshare Quality Controls — IMPLEMENTED (⚠️ awaiting live verification, D2-8)

-**What:** Let users (or room admins via room settings) adjust audio bitrates (e.g., 64kbps to 512kbps) and screenshare quality (resolution: 720p/1080p/Source, framerate: 15/30/60fps).  
-**Note:** Requires tight integration with the LiveKit SFU and custom state events for per-room quality caps.  
-**[AUDIT REQUIRED]** Must verify if current `lk-jwt-service` can be extended with custom bitrate/resolution claims or if a new sidecar (similar to `voice-limit-guard`) is needed for server-side enforcement.  
-**Complexity:** Extreme.
+**What:** Let users (and room admins) adjust audio bitrate and screenshare bitrate/framerate.  
+**🔱 [EC-FORK] Fork side + client side DONE.** The fork ships `io.lotus.set_quality` (`LotusWidgetActions.SetQuality`) that applies audio/screenshare encoding params (`RTCRtpSender.setParameters`, all simulcast encodings, re-applied on `TrackUnmuted`/republish) inside EC. cinny now drives it.
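
As a rough illustration of "applies encoding params to all simulcast encodings", the sketch below writes one cap onto every encoding of a sender and pushes the result back with `setParameters`. It is not the fork's code: `applyEncodingCaps`, `SenderLike`, and `EncodingLike` are hypothetical names, and the structural interfaces stand in for the browser's `RTCRtpSender` so the example runs outside a browser. Re-applying on `TrackUnmuted`/republish would mean calling the same function again from those event handlers.

```typescript
// Minimal structural stand-ins for the WebRTC types (assumption: only the
// fields used here matter; a real RTCRtpSender exposes the same two methods).
interface EncodingLike {
  maxBitrate?: number; // bits per second
  maxFramerate?: number; // frames per second
}
interface SenderParamsLike {
  encodings: EncodingLike[];
}
interface SenderLike {
  getParameters(): SenderParamsLike;
  setParameters(params: SenderParamsLike): Promise<void>;
}

// Hypothetical helper: write the same caps onto every (simulcast) encoding.
async function applyEncodingCaps(
  sender: SenderLike,
  caps: { maxBitrate?: number; maxFramerate?: number },
): Promise<void> {
  // setParameters must be given the object returned by getParameters,
  // so the encodings are mutated in place rather than rebuilt.
  const params = sender.getParameters();
  for (const encoding of params.encodings) {
    if (caps.maxBitrate !== undefined) encoding.maxBitrate = caps.maxBitrate;
    if (caps.maxFramerate !== undefined) encoding.maxFramerate = caps.maxFramerate;
  }
  await sender.setParameters(params);
}
```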
+
+**Shipped (cinny):**
+
+1. **User settings** (Settings → General → Calls): Microphone Bitrate, Screenshare Bitrate, Screenshare Framerate (`callAudioBitrate` / `screenshareBitrate` / `screenshareFramerate`).
+2. **Room-admin caps**: `io.lotus.room_quality` state event (`StateEvent.LotusRoomQuality`) + `RoomQuality.tsx` in Room Settings → General → Voice (mirrors `RoomVoiceLimit`).
+3. **Apply logic**: `useCallQuality` (wired in `CallEmbedProvider`'s `CallUtils`) builds `min(user setting, room cap)` and sends `io.lotus.set_quality` on join / when settings change (`utils/callQuality.ts`, unit-tested).
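
The `min(user setting, room cap)` rule in step 3 can be sketched as a small pure function. The field names and the `effectiveQuality` helper are hypothetical (the real shapes live in `utils/callQuality.ts` and may differ), and the handling of unset values is an assumption: an unset user value falls back to the cap, and an unset cap leaves the user value unchanged.

```typescript
// Hypothetical payload shape; units are assumptions for illustration.
interface QualityValues {
  audioBitrate?: number | undefined; // bits per second
  screenshareBitrate?: number | undefined; // bits per second
  screenshareFramerate?: number | undefined; // frames per second
}

// Clamp one value against an optional cap.
function clampToCap(
  user: number | undefined,
  cap: number | undefined,
): number | undefined {
  if (user === undefined) return cap; // assumption: no preference means "use the cap"
  if (cap === undefined) return user; // no room cap: the user's choice stands
  return Math.min(user, cap);
}

// Build the values that would be sent with io.lotus.set_quality.
function effectiveQuality(user: QualityValues, roomCap: QualityValues): QualityValues {
  return {
    audioBitrate: clampToCap(user.audioBitrate, roomCap.audioBitrate),
    screenshareBitrate: clampToCap(user.screenshareBitrate, roomCap.screenshareBitrate),
    screenshareFramerate: clampToCap(user.screenshareFramerate, roomCap.screenshareFramerate),
  };
}
```

For example, a user asking for 128 kbps audio in a room capped at 64 kbps would end up sending 64 kbps, while a 60 fps screenshare preference passes through unchanged if the room sets no framerate cap.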
+
+**Server-side enforcement (DONE — matrix repo):** extended `voice-limit-guard.py` (LXC 151) to also read `io.lotus.room_quality` and hard-enforce a **publish-source policy** for ALL clients.
+
+- **Reality (researched, primary-source, LiveKit 1.9.11):** numeric bitrate/fps caps **cannot** be hard-enforced server-side — LiveKit is a pure SFU (forwards, never transcodes); there is NO bitrate/fps field in the JWT grant, `RoomConfiguration`, server `limit:` config, or any admin RPC, and stock Element Call ignores room metadata / custom claims for publish quality. So numeric caps stay **cooperative** (our fork honors them via `min()` → `set_quality`, already shipped).
+- **What IS hard-enforced cross-client:** `VideoGrant.canPublishSources`. The guard holds the LiveKit secret, so when `io.lotus.room_quality` sets `allow_screenshare:false` / `allow_camera:false` it re-signs the issued JWT with a narrowed source list → the SFU refuses those tracks for **every** client (Element, FluffyChat, our fork). Mic always kept. Fail-open; unit-tested (`livekit/test_voice_limit_guard.py`). Admin UI: Room Settings → Voice → **Call Permissions** switches. cinny also hides the blocked buttons.
+- **Live (mid-call) enforcement — DONE:** the JWT re-sign covers new joins; for participants **already in the call**, a background reconcile loop in the guard calls LiveKit `UpdateParticipant` every ~3 s to narrow `canPublishSources`, which unpublishes an in-progress screenshare/camera **server-side for all clients** and blocks re-publish (verified LiveKit 1.9.11 auto-unpublishes on permission narrowing). Only removes forbidden sources (never grants), preserves other permission flags, no-ops once compliant. So flipping a room audio-only kills live cameras/screenshares within ~one interval.
+- **Not enforceable / deferred:** numeric server enforcement (impossible — see above); screenshare **resolution** control (`set_quality` covers bitrate + framerate; resolution needs a `getDisplayMedia` hook inside the fork).
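
The narrowing rule described above (only remove forbidden sources, never grant, no-op once compliant) is sketched below in TypeScript for illustration; the actual guard is `voice-limit-guard.py` and its code is not shown here. `narrowSources` and `RoomPolicy` are hypothetical names. Two assumptions are baked in: an empty `canPublishSources` list is treated as "all sources allowed", and `allow_screenshare: false` also removes screenshare audio.

```typescript
type Source = "camera" | "microphone" | "screen_share" | "screen_share_audio";

const ALL_SOURCES: readonly Source[] = [
  "camera",
  "microphone",
  "screen_share",
  "screen_share_audio",
];

// Subset of io.lotus.room_quality relevant to publish permissions.
interface RoomPolicy {
  allow_screenshare?: boolean;
  allow_camera?: boolean;
}

function forbiddenSources(policy: RoomPolicy): Set<Source> {
  const forbidden = new Set<Source>();
  if (policy.allow_camera === false) forbidden.add("camera");
  if (policy.allow_screenshare === false) {
    forbidden.add("screen_share");
    forbidden.add("screen_share_audio"); // assumption: audio goes with the share
  }
  return forbidden;
}

// Returns the narrowed list, or null when nothing needs to change.
// Caveat: if every listed source is forbidden the result is empty, which must
// not be sent as-is under the "empty means all" assumption.
function narrowSources(
  current: readonly Source[],
  policy: RoomPolicy,
): Source[] | null {
  const forbidden = forbiddenSources(policy);
  if (forbidden.size === 0) return null; // room allows everything
  // Assumption: an empty list means "all sources", so start from the full set.
  const base = current.length > 0 ? current : ALL_SOURCES;
  const kept = base.filter((source) => !forbidden.has(source));
  // Already compliant: the reconcile loop has nothing to do.
  if (current.length > 0 && kept.length === current.length) return null;
  return kept;
}
```

Because the function only filters, it can never add a source a participant did not already have, which matches the "never grants" property stated above.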
+
+**Complexity:** DONE — client (cooperative numeric caps) + server (hard publish-source policy). Only the physically-impossible numeric server enforcement is out of scope.

 ---

@@ -540,7 +560,7 @@ Exhaustive, low-level implementation details for backlog items. Follow these pat

 > ⚠️ **[Gemini_Found — CORRECTED]** Gemini originally suggested using LiveKit's `LocalAudioTrack.replaceTrack()` to mix audio into the call stream. This is **not possible** from Lotus Chat's realm: Element Call runs in a **cross-origin iframe** controlled via `matrix-widget-api` (postMessage). LiveKit's JS SDK and its `LocalAudioTrack` live inside EC's sandboxed context — inaccessible from our code. This directly contradicts the confirmed constraint already listed in the Server Capabilities table: _"Cindy CANNOT inject audio into EC call stream — In-call soundboard must be redesigned as local-only."_ The soundboard must be a local-playback-only feature (output through the user's speakers, not mixed into the call audio stream).
 >
-> 🔱 **[EC-FORK — partial correction]** The "cross-origin" claim above is **outdated**: EC is now **same-origin** / self-hosted (`iframe.sandbox` has `allow-same-origin`; we read `contentDocument`). The _practical_ blocker still holds — LiveKit's `LocalAudioTrack` lives in EC's **module scope** (not on `window`), so it's unreachable from cinny even same-origin. **Owning the EC source** (see [`HANDOFF_ELEMENT_CALL_FORK.md`](./HANDOFF_ELEMENT_CALL_FORK.md)) is the path to a real call-audio-inject API, which would unblock a true in-call soundboard.
+> 🔱 **[EC-FORK — RESOLVED]** Both the original claim and the earlier "practical blocker still holds" correction are now **outdated**. EC is same-origin **and** we own the source, so cinny does not need to reach into EC's module scope; instead the fork **exposes the inject point itself**: the `io.lotus.inject_audio` widget action (`LotusWidgetActions.InjectAudio`) publishes a clip as a separate LiveKit track from inside EC. A **real** in-call soundboard (mixed into the call, not local-only) is therefore unblocked, and the cinny-side UI that drives the action is now implemented (see P5-15 above), pending live verification (D2-7).

 ---