diff --git a/HANDOFF_ELEMENT_CALL_FORK.md b/HANDOFF_ELEMENT_CALL_FORK.md
new file mode 100644
index 000000000..e01d47eee
--- /dev/null
+++ b/HANDOFF_ELEMENT_CALL_FORK.md
@@ -0,0 +1,270 @@
+# HANDOFF — Forking & Self-Building Element Call ("Lotus Call")
+
+> **Audience:** a fresh Claude/engineer session with **no prior context** on this
+> project. Read this top-to-bottom before touching anything. This document is the
+> single source of truth for the Element Call (EC) fork initiative.
+>
+> **Status:** NOT STARTED. This is the planning/handoff artifact. Created
+> 2026-06 from the Lotus Chat (`LotusGuild/cinny`) repo.
+
+---
+
+## 0. TL;DR / The Goal
+
+We embed **Element Call** (the Matrix group-VoIP/video app) inside Lotus Chat to
+power voice/video channels. Today we consume Element's **pre-compiled npm
+bundle** and can only steer it from the outside (a limited widget API + fragile
+same-origin DOM hacks). Several in-call problems are **unfixable from outside**
+because they live in EC's compiled JS.
+
+**We want true ownership: fork `element-hq/element-call`, build it from source
+ourselves, host our build, and replace the npm bundle with our fork.** Then
+every in-call behavior becomes editable code.
+
+**This requires standing up a brand-new repo and build pipeline for our EC fork.**
+
+---
+
+## 1. Why fork? (What we cannot fix today)
+
+These came out of live testing and are documented in `LOTUS_BUGS.md` →
+"Known Element Call iframe limitations":
+
+| Issue | What's wrong | Why outside-fixes fail |
+| :----------------------------------------------------- | :------------------------------------------------------------------------- | :---------------------------------------------------------------------------------------------------------------------------------------------------------- |
+| **A6** — avatar decorations in-call | Our profile-decoration overlays don't appear on in-call video tiles | The video grid is rendered by EC's React app inside the iframe. We can only inject overlay DOM (fragile) — we can't make it a first-class part of the tile. |
+| **A5** — focus camera / fullscreen during screenshare | Can't reliably spotlight a participant's camera while someone screenshares | EC's **layout logic** (screenshare priority, spotlight) is compiled JS we don't control. We currently DOM-click tiles as a hack. |
+| **A7** — mic dead after EC's "Reconnect" | After EC's own mid-call reconnect, the local mic isn't re-published | EC's reconnect/track-republish path is internal. (Partly entangled with our denoise shim — see §6.) |
+| Native theming | EC's UI doesn't match Lotus design; we inject CSS hacks | Real theming needs source-level component/token changes. |
+| Decorations, custom controls, custom layouts, branding | all blocked | all require source access |
+
+**Bottom line:** the iframe is **same-origin** (we self-host it), so we can read
+and even write its DOM — but we **do not own its source**, so we can't change its
+**behavior/logic**, only poke at its rendered output. Forking removes that wall.
+
+---
+
+## 2. How EC is integrated TODAY (the current architecture)
+
+Understand this fully before changing it — the fork must slot into the same
+integration seams.
+
+### 2.1 Where the EC bundle comes from
+
+- npm package: **`@element-hq/element-call-embedded`**, pinned to **`0.20.1`** in
+  `cinny/package.json` (line ~104).
+- It ships a **pre-built `dist/`**. At cinny build time,
+  `vite-plugin-static-copy` copies that `dist/` flat into
+  **`public/element-call/`** (see `cinny/vite.config.js`, the `copyFiles`
+  target with `rename: { stripBase: 4 }` — note the stripBase gotcha documented
+  there; getting this wrong 404s the widget).
+- It is **NOT committed** to git (`git ls-files public/element-call` → 0). It's a
+  build artifact materialized from `node_modules`.
+
+### 2.2 How EC is loaded & controlled
+
+- The widget iframe `src` is **same-origin**:
+  `${BASE_URL}/public/element-call/index.html?` (see
+  `cinny/src/app/plugins/call/CallEmbed.ts`, `getWidget()` /
+  `getIframe()`). Sandbox: `allow-forms allow-scripts allow-same-origin
+allow-popups allow-modals allow-downloads`; `allow="microphone; camera;
+display-capture; autoplay; clipboard-write;"`.
+- **Control surface #1 — the official widget API** (`matrix-widget-api`):
+  `ClientWidgetApi` + a custom `CallWidgetDriver`. This is the robust,
+  version-stable channel (theme change, hangup, capabilities, timeline events).
+  Files: `plugins/call/CallEmbed.ts`, `plugins/call/CallWidgetDriver.ts`,
+  `plugins/call/utils.ts` (capabilities), `plugins/call/CallControl.ts`.
+- **Control surface #2 — same-origin DOM poking** (fragile, version-coupled):
+  reading `iframe.contentDocument` to detect speakers/mute state and
+  `.click()`-ing tiles to focus a camera. Files:
+  `hooks/useCallSpeakers.ts` (reads `[data-muted]`, `[data-video-fit]`),
+  `plugins/call/CallControl.ts` (`focusCameraParticipant` — tile selectors).
+  **These selectors break on every EC version bump.** A fork lets us replace
+  these hacks with real APIs/props.
+- **Control surface #3 — URL params + build-time injection** for our denoise
+  shim (see §6).
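
As a rough illustration of the iframe setup described in §2.2, the sketch below assembles the `src`, `sandbox`, and `allow` values listed above. It is not taken from `CallEmbed.ts`: the function, interface, and constant names are hypothetical, and only the attribute values and the `/public/element-call/index.html` path come from this document.

```typescript
// Hypothetical sketch: names are illustrative, not the real CallEmbed.ts API.
interface CallIframeAttrs {
  src: string;
  sandbox: string;
  allow: string;
}

// Sandbox tokens and permission-policy features as listed in §2.2.
const SANDBOX_TOKENS: string[] = [
  "allow-forms",
  "allow-scripts",
  "allow-same-origin",
  "allow-popups",
  "allow-modals",
  "allow-downloads",
];
const ALLOW_FEATURES: string[] = [
  "microphone",
  "camera",
  "display-capture",
  "autoplay",
  "clipboard-write",
];

function buildCallIframeAttrs(
  baseUrl: string,
  params: Record<string, string> = {},
): CallIframeAttrs {
  // Drop trailing slashes so the path is joined with exactly one "/".
  const base = baseUrl.replace(/\/+$/, "");
  const query = Object.entries(params)
    .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
    .join("&");
  return {
    // Same-origin path: the EC dist is served from our own public/ folder.
    src: `${base}/public/element-call/index.html?${query}`,
    sandbox: SANDBOX_TOKENS.join(" "),
    allow: ALLOW_FEATURES.map((f) => `${f};`).join(" "),
  };
}
```

Keeping the values in one pure function like this would make it easy to diff the attributes against what the fork's build expects, but the real wiring lives in `getWidget()` / `getIframe()` and may differ.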
+
+### 2.3 Full file inventory (everything that touches EC in cinny)
+
+Plugin / core:
+
+- `src/app/plugins/call/CallEmbed.ts` — iframe creation, widget API wiring, theme sync, hangup, load watchdog/self-heal, denoise URL params.
+- `src/app/plugins/call/CallControl.ts` — control state + **DOM-poking** (`focusCameraParticipant`, spotlight).
+- `src/app/plugins/call/CallControl.tsx` _(call-status variant)_ and `features/call-status/CallControl.tsx`.
+- `src/app/plugins/call/CallWidgetDriver.ts` — widget driver (capabilities, event relay).
+- `src/app/plugins/call/utils.ts` — widget capabilities set.
+- `src/app/plugins/call/hooks.ts`, `index.ts` — plugin exports/hooks.
+- `src/app/state/callEmbed.ts` — jotai atoms for the active embed.
+
+React / UI:
+
+- `src/app/components/CallEmbedProvider.tsx` — the big one: incoming-call ring/banner, RTCNotification + **RTCDecline** listeners, PiP, mute badges, fullscreen, ringtones.
+- `src/app/features/call/CallView.tsx` — prescreen lobby vs joined (the iframe placement target), load-error recovery UI.
+- `src/app/features/call/CallControls.tsx` — in-call control bar (mic/cam/deafen/screenshare/fullscreen/more/PiP).
+- `src/app/features/call/CallMemberCard.tsx` — **lobby** participant roster (this is where `AvatarDecoration` works today; in-call grid is EC's).
+- `src/app/features/call/PrescreenControls.tsx` — join controls.
+- `src/app/features/call-status/*` — `CallStatus.tsx`, `MemberGlance.tsx` (the "Focus camera" menu lives here), `LiveChip.tsx`.
+- `src/app/features/room-nav/RoomNavItem.tsx`, `features/room/Room.tsx`, `features/room/RoomViewHeader.tsx`, `pages/client/space/Space.tsx`, `pages/CallStatusRenderer.tsx`, `pages/Router.tsx` — call entry points / status surfacing.
+
+Hooks:
+
+- `src/app/hooks/useCallEmbed.ts`, `useCall.ts`, `useCallSpeakers.ts` (DOM-poking), `useCallJoinLeaveSounds.ts`, `useAfkAutoMute.ts`.
+
+Build:
+
+- `cinny/vite.config.js` — `copyFiles` (EC dist copy) + `lotusDenoise()` plugin (denoise asset copy + index.html shim injection, in `closeBundle`).
+
+Utils:
+
+- `src/app/utils/ringtones.ts`, `utils/denoisePipeline.ts`, `utils/lotusDenoiseUtils.ts`.
+
+---
+
+## 3. Hosting / infra context (the OTHER repo)
+
+There are **two repos**:
+
+1. **`LotusGuild/cinny`** (`/root/code/cinny`) — this Lotus Chat fork. Consumes EC.
+2. **`LotusGuild/matrix`** (`/root/code/matrix`) — the **infra/homeserver** repo.
+   Subdirs: `livekit/` (the SFU EC talks to), `deploy/`, `draupnir/`,
+   `hookshot/`, `landing/`, `matrixbot/`, `systemd/`. Gitea remote
+   `code.lotusguild.org/LotusGuild/matrix`, branch `main`.
+
+EC needs a **LiveKit SFU** + the **livekit-jwt-service**; those live in
+`matrix/livekit/`. A self-hosted EC build must be configured to point at our
+homeserver (`matrix.lotusguild.org` / synapse) and our LiveKit. EC's runtime
+`config.json` (homeserver, livekit URL, feature flags) is part of what we'll own
+once we build it ourselves.
+
+Deployment today: `chat.lotusguild.org` (the cinny web build, which embeds EC at
+`/public/element-call/`). cinny-desktop (`LotusGuild/cinny-desktop`, a Tauri
+wrapper, bumped by cinny CI) embeds the same.
+
+---
+
+## 4. The plan (proposed — confirm with the user before executing)
+
+### Decision: **YES, create a new repo.** `LotusGuild/element-call`
+
+Rationale: EC is a large standalone app (React + LiveKit client SDK + matrixRTC +
+its own Vite build + heavy deps). Keep it out of cinny so cinny's build stays
+clean — cinny keeps consuming a **built EC `dist/`**, exactly as today, just
+sourced from **our fork** instead of npm.
+
+### Phase 0 — Recon (no code)
+
+- Fork `github.com/element-hq/element-call` → `LotusGuild/element-call` on Gitea.
+- Pin to the upstream tag matching **0.20.1** (`element-call-embedded` 0.20.1's
+  corresponding `element-call` release) so behavior matches what's shipping now.
+  Verify the embedded-package version ↔ element-call repo tag mapping.
+- Read EC's own build docs: it builds the "embedded" widget bundle (the thing
+  currently published as `@element-hq/element-call-embedded`). Reproduce that
+  build locally and confirm the output matches `public/element-call/` today.
+- **License:** element-call is **AGPL-3.0**, same as Lotus Chat — compatible.
+  Our fork must remain AGPL and publish source.
+
+### Phase 1 — Reproduce current behavior from our fork (parity, no features)
+
+- Build our fork's embedded bundle; wire cinny to consume it instead of the npm
+  package (see §5 for the consumption options). Smoke-test: a call works exactly
+  as today (web + desktop), denoise shim still injects, widget API + theme still
+  work. **No behavior change yet** — this de-risks the swap.
+
+### Phase 2 — Replace the outside hacks with source-level features
+
+Tackle the §1 issues in EC's source:
+
+- **A6:** render avatar decorations as part of the video-tile component
+  (read decoration data we pass in via widget data / URL param / a small bridge).
+- **A5:** fix focus/spotlight + screenshare-coexistence in EC's layout code;
+  expose a clean widget action so cinny can trigger it (kill the DOM `.click()`).
+- **A7:** fix mic re-publish on reconnect; reconcile with our denoise shim (§6) —
+  ideally move denoise INTO the fork as a real audio-processing step instead of a
+  `getUserMedia` monkeypatch.
+- Native Lotus theming/branding at the source (kill the injected-CSS hacks).
+- Then retire the DOM-poking in `useCallSpeakers.ts` / `CallControl.ts` in favor
+  of real widget messages.
+
+### Phase 3 — Maintenance posture
+
+- Decide rebase cadence vs. upstream element-call releases. Keep customizations
+  isolated (feature flags / minimal-diff patches) to ease rebasing.
+- CI in the new repo builds + publishes the embedded dist as a versioned
+  artifact; cinny CI consumes a pinned version.
+
+---
+
+## 5. How cinny should consume the fork (pick one — decide with user)
+
+1. **Private npm package** (mirror the current model): our fork's CI publishes
+   `@lotusguild/element-call-embedded` to a registry; cinny depends on it and
+   `viteStaticCopy` keeps working almost unchanged. _Cleanest swap; needs a
+   registry._
+2. **Git submodule + build in cinny CI:** add the fork as a submodule, build it
+   during cinny's build, copy its `dist/` to `public/element-call/`. _No
+   registry; heavier cinny CI._
+3. **CI artifact copy:** fork CI uploads a `dist` tarball; cinny CI downloads a
+   pinned version at build. _Decoupled; needs artifact plumbing._
+
+**Recommendation: Option 1** — it changes the least in cinny (just swap the
+package name in `package.json` + the `viteStaticCopy` src path) and preserves the
+clean cinny/EC separation.
+
+---
+
+## 6. The denoise shim — critical interaction (don't break this)
+
+Lotus ships ML noise suppression by **injecting a same-origin pre-init shim into
+EC's `index.html` at build time** (cinny `vite.config.js` → `lotusDenoise()`,
+`closeBundle`). The shim monkeypatches `getUserMedia` **before EC captures the
+mic** and routes audio through RNNoise/Speex/DTLN AudioWorklets, then EC/LiveKit
+publishes the processed track. It's activated via URL params
+(`lotusDenoise=ml&lotusModel=…&lotusGate=…`) set in `CallEmbed.ts`.
+
+- Assets copied to `public/element-call/denoise/` at build (sapphi RNNoise/Speex/
+  gate worklets + `@workadventure/noise-suppression` DTLN tree).
+- Related: `utils/denoisePipeline.ts`, `utils/lotusDenoiseUtils.ts`,
+  `settings/general/DenoiseTester.tsx`, `VoiceMessageRecorder.tsx`.
+- **Known issues:** denoise quality is still poor (tracked separately); and the
+  mic-after-reconnect bug (A7) is suspected to involve the shim's getUserMedia
+  patch handing back a stale processed stream when EC re-acquires the mic.
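
To make the suspected A7 failure mode concrete, here is a minimal sketch of the kind of guard a `getUserMedia` wrapper could apply before reusing a cached processed stream. This is an assumption-laden illustration, not the shim's actual code: `TrackLike`, `StreamLike`, `isReusable`, and `reuseOrRebuild` are hypothetical names, and the only real API surface relied on is that a `MediaStreamTrack` reports `readyState` as `"live"` or `"ended"` and that a `MediaStream` exposes `getTracks()`.

```typescript
// Hypothetical sketch of a stale-stream guard; not the real Lotus denoise shim.
type TrackState = "live" | "ended";

// Structural subsets of MediaStreamTrack / MediaStream, so the logic is
// testable outside a browser.
interface TrackLike {
  readyState: TrackState;
}
interface StreamLike {
  getTracks(): TrackLike[];
}

// A cached stream is only safe to hand back if it still has tracks and
// every one of them is live; an ended track means the capture was torn down.
function isReusable(stream: StreamLike | null): boolean {
  if (stream === null) return false;
  const tracks = stream.getTracks();
  return tracks.length > 0 && tracks.every((t) => t.readyState === "live");
}

// Reuse the cached processed stream when it is still live; otherwise rebuild
// the pipeline (raw capture plus worklet chain) via the supplied callback.
async function reuseOrRebuild<S extends StreamLike>(
  cached: S | null,
  rebuild: () => Promise<S>,
): Promise<S> {
  if (cached !== null && isReusable(cached)) return cached;
  return rebuild();
}
```

Whether the real bug is a stale cache at all is still unconfirmed; the point of the sketch is only that a reconnect-safe wrapper has to re-check track liveness on every re-acquire rather than trusting an earlier result.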
+
+**Once we own the fork, the right move is to make denoise a first-class
+audio-processing stage inside EC** (not an index.html monkeypatch) — more robust,
+survives reconnects, and removes the build-time injection hack. Until then, the
+fork's `index.html` must remain injectable the same way, or the shim must be
+re-homed into the fork.
+
+---
+
+## 7. Doc-accuracy notes / corrections for the new session
+
+- `LOTUS_TODO.md` (~line 533) calls EC a **"cross-origin iframe"** — **outdated.**
+  EC is **same-origin** today (self-hosted under our domain;
+  `iframe.sandbox` includes `allow-same-origin`; we read `contentDocument`). The
+  _practical_ point it makes still holds: **LiveKit's `LocalAudioTrack` lives in
+  EC's module scope**, not on `window`, so we can't reach it from cinny even
+  same-origin — which is exactly why the in-call soundboard had to be
+  local-playback-only, and another reason to fork (a fork could expose a real
+  audio-inject API).
+- `LOTUS_FEATURES.md` documents the EC upgrade history (0.16.3 → 0.19.4 →
+  0.20.1), the dark-mode CSS injection, and AFK auto-mute — all relevant prior
+  art for what the fork must preserve.
+- `LOTUS_TESTING.md` §D is the **EC regression sweep** to re-run after the fork
+  swap (Phase 1 parity check).
+
+---
+
+## 8. First actions for the new session
+
+1. Read this file, then skim §2.3's files in `cinny` to internalize the seams.
+2. Confirm with the user: new repo name, consumption model (§5), rebase cadence.
+3. Phase 0: fork element-call, map 0.20.1 ↔ element-call tag, reproduce the
+   embedded build locally, diff against `public/element-call/`.
+4. Phase 1: wire cinny to the fork, run `LOTUS_TESTING.md` §D parity sweep.
+5. Only then start Phase 2 features (A5/A6/A7, theming, denoise-in-source).
+
+**Cross-references:** `LOTUS_BUGS.md` (EC limitations + verify queue),
+`LOTUS_TODO.md` (denoise/soundboard constraints), `LOTUS_FEATURES.md` (EC history),
+`LOTUS_TESTING.md` §D (regression sweep). Infra: `/root/code/matrix` (`livekit/`,
+`deploy/`).
diff --git a/LOTUS_BUGS.md b/LOTUS_BUGS.md
index d1efbd961..ae59a5b82 100644
--- a/LOTUS_BUGS.md
+++ b/LOTUS_BUGS.md
@@ -38,9 +38,15 @@ Implemented and gate-green; confirm each per `LOTUS_TESTING.md`, then delete the
 
 ## 🧩 Known Element Call iframe limitations (not fixable from our side)
 
-The in-call participant grid is rendered **inside EC's separate-origin iframe**,
-which we can style/place around but cannot inject UI into. Consequences from
-testing:
+> 🔱 **[EC-FORK]** These are the motivating issues for the **Element Call fork
+> initiative** — see [`HANDOFF_ELEMENT_CALL_FORK.md`](./HANDOFF_ELEMENT_CALL_FORK.md).
+> Once we build EC from our own source, A5/A6/A7 below become normal code fixes.
+> (Correction: the iframe is actually **same-origin** / self-hosted — we just
+> don't own EC's compiled source today.)
+
+The in-call participant grid is rendered **inside EC's iframe** (a pre-built npm
+bundle we don't own), which we can style/place around but cannot change the logic
+of. Consequences from testing:
 
 - **A5 — "Focus camera":** EC already supports native tile-pinning (click a video
   tile). Our bottom-bar "Focus camera" is a programmatic wrapper that clicks that
diff --git a/LOTUS_FEATURES.md b/LOTUS_FEATURES.md
index e5ef926ce..b03ff2a3f 100644
--- a/LOTUS_FEATURES.md
+++ b/LOTUS_FEATURES.md
@@ -322,6 +322,11 @@ Users can set a custom background color for `@mention` chips that highlight thei
 
 ## Voice / Video Call Improvements
 
+> 🔱 **[EC-FORK]** Element Call is embedded as a **pre-built npm bundle** today.
+> The plan to fork & self-build it from source for true ownership — and which of
+> the items below would move into our EC source — is in
+> [`HANDOFF_ELEMENT_CALL_FORK.md`](./HANDOFF_ELEMENT_CALL_FORK.md).
+
 ### Element Call Upgrade
 
 Upgraded embedded Element Call widget from **0.16.3** to **0.19.4**.
diff --git a/LOTUS_TODO.md b/LOTUS_TODO.md
index a5b910bf2..41461ffc8 100644
--- a/LOTUS_TODO.md
+++ b/LOTUS_TODO.md
@@ -264,6 +264,7 @@ Features:
 
 **What:** Grid of short audio clips playable into the call audio stream via Web Audio API (AudioBufferSourceNode → MediaStreamDestinationNode → mixed with mic). Built-in clips + user-uploadable custom clips (stored as mxc://). Accessible from call controls bar.
 **[AUDIT REQUIRED]** Verify the Element Call integration exposes the mic MediaStream for mixing. This is the highest-risk part of this feature.
+**🔱 [EC-FORK]** Owning the EC source (see [`HANDOFF_ELEMENT_CALL_FORK.md`](./HANDOFF_ELEMENT_CALL_FORK.md)) would unblock real audio-injection — a proper soundboard mixed into the call — which is impossible against the prebuilt bundle today.
 **Complexity:** High.
 
 ---
@@ -281,6 +282,7 @@ Features:
 
 **What:** High-end background noise cancellation using a pre-trained ML model (RNNoise) running in the browser. Removes dogs, fans, and keyboard clicks from the mic stream.
 **Shipped:** 3-tier setting (Off / Browser-native / ML) in Settings → General → Calls. ML tier injects a same-origin pre-init shim into the vendored Element Call `index.html` that monkeypatches `getUserMedia` and routes the captured mic through an RNNoise `AudioWorklet` before LiveKit publishes — no EC fork required. See LOTUS_FEATURES.md → "Noise Suppression (Advanced Multi-Tier)".
+**🔱 [EC-FORK]** Once we own the EC source (see [`HANDOFF_ELEMENT_CALL_FORK.md`](./HANDOFF_ELEMENT_CALL_FORK.md)), denoise should become a first-class audio stage **inside** EC instead of an `index.html` getUserMedia monkeypatch — more robust, survives reconnects (fixes the A7 mic-after-reconnect bug), and removes the build-time injection hack.
 **Key decision:** LiveKit's Krisp filter is LiveKit-Cloud-only (we self-host the SFU); EC's own RNNoise PR #3892 is unmerged. The shim is the same post-capture pipeline #3892 uses, executed from the realm we control, so it survives EC version bumps.
 **AEC note (resolved-as-accepted):** WebAudio capture routing can weaken browser AEC — same tradeoff as EC's upstream feature; mitigated by keeping `echoCancellation`/`autoGainControl` on the raw capture and labeling the tier "beta".
 
@@ -531,6 +533,8 @@ Exhaustive, low-level implementation details for backlog items. Follow these pat
 - Pass the destination's `.stream` to the call bridge.
 
 > ⚠️ **[Gemini_Found — CORRECTED]** Gemini originally suggested using LiveKit's `LocalAudioTrack.replaceTrack()` to mix audio into the call stream. This is **not possible** from Lotus Chat's realm: Element Call runs in a **cross-origin iframe** controlled via `matrix-widget-api` (postMessage). LiveKit's JS SDK and its `LocalAudioTrack` live inside EC's sandboxed context — inaccessible from our code. This directly contradicts the confirmed constraint already listed in the Server Capabilities table: _"Cindy CANNOT inject audio into EC call stream — In-call soundboard must be redesigned as local-only."_ The soundboard must be a local-playback-only feature (output through the user's speakers, not mixed into the call audio stream).
+>
+> 🔱 **[EC-FORK — partial correction]** The "cross-origin" claim above is **outdated**: EC is now **same-origin** / self-hosted (`iframe.sandbox` has `allow-same-origin`; we read `contentDocument`). The _practical_ blocker still holds — LiveKit's `LocalAudioTrack` lives in EC's **module scope** (not on `window`), so it's unreachable from cinny even same-origin. **Owning the EC source** (see [`HANDOFF_ELEMENT_CALL_FORK.md`](./HANDOFF_ELEMENT_CALL_FORK.md)) is the path to a real call-audio-inject API, which would unblock a true in-call soundboard.
 
 ---
 
diff --git a/README.md b/README.md
index e4168600f..1218adc22 100644
--- a/README.md
+++ b/README.md
@@ -144,6 +144,23 @@ The source code lives in `/root/code/cinny`. All changes should be made on the `
 
 See [LOTUS_FEATURES.md](LOTUS_FEATURES.md) for the full feature changelog and [LOTUS_TODO.md](LOTUS_TODO.md) for the work backlog.
 
+### 🔱 Planned: Element Call fork ("Lotus Call")
+
+Voice/video channels embed **Element Call**. Today it's a **pre-built npm bundle**
+(`@element-hq/element-call-embedded` 0.20.1) copied to `public/element-call/` and
+served same-origin; we steer it via the `matrix-widget-api` plus fragile DOM
+hacks. Because we don't own its compiled source, several in-call issues (avatar
+decorations on tiles, camera focus/fullscreen during screenshare, mic recovery
+after reconnect, native theming, real call-audio injection) are unfixable from
+outside.
+
+**The plan is to fork `element-hq/element-call` into a new `LotusGuild/element-call`
+repo, build it from source, and host our own build** for true ownership. The full
+self-contained plan and integration map — written for a fresh session with no
+prior context — is in **[`HANDOFF_ELEMENT_CALL_FORK.md`](HANDOFF_ELEMENT_CALL_FORK.md)**.
+Infra/hosting notes also live in the `LotusGuild/matrix` repo README. Search the
+docs for the **`[EC-FORK]`** tag to find every related note.
+
 ### Build
 
 ```bash