docs: log E2EE key-sync issues (KE-1..4) + tester checklist

LOTUS_BUGS.md: new Encryption/E2EE section tagged EXTREME complexity +
planning-session-required for a senior-engineer deep dive — OTK upload
conflict storm (KE-1), Element Call media-key distribution failures causing
audio/video dropouts (KE-2), a timeline decryption error (KE-3), and
MatrixRTC delayed-event timeouts (KE-4). All observed live 2026-06-30; not
caused by the EC fork work. Plus a non-developer ELEMENT_CALL_TEST_CHECKLIST.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 17:37:01 -04:00
parent efcee88f05
commit 84ce9843ff
2 changed files with 174 additions and 0 deletions
+122
@@ -0,0 +1,122 @@
# Voice/Video Call — Testing Checklist 🎧
Thanks for helping test! We just upgraded the voice/video call system. Please run
through the checks below and tell us what happened.
**What you need:**
- 2 people (you + a friend), each on their own device, in the same call. A few
checks need one of you to have a **camera** and to **share your screen**.
- About 15 to 20 minutes.
**How to report:** for each item just say ✅ (worked) or ❌ (didn't), and for any
❌ tell us what you saw. If something looks broken, a screenshot helps a lot.
---
## ⭐ Most important — please do these first
### 1. Your microphone keeps working after a connection hiccup
This is the biggest thing we changed, so test it carefully.
1. Join a call with your friend and talk for a few seconds (make sure they hear you).
2. Now **turn off your WiFi / internet for about 10 seconds**, then turn it back on.
(The call will show a "Connection lost / reconnecting" message — that's expected.)
3. Once it reconnects, **start talking again.**
- **Good if:** your friend can still hear you normally after it reconnects, without
you having to leave and rejoin the call.
- **Tell us if:** your friend can't hear you after reconnecting, or your voice
sounds broken/robotic/muffled until you leave and rejoin.
### 2. Microphone quality / noise removal sounds normal
1. In a call, just talk normally for a bit.
2. If there's background noise (fan, typing, TV), notice whether it's reduced.
- **Good if:** your voice is clear and there's no silence, echo, or robotic warble.
- **Tell us if:** there are dropouts, echo, an "underwater" or metallic sound, or your
mic is silent even though you're talking.
### 3. Switching your microphone mid-call
1. While in a call, open call **Settings** and change your microphone to a
different one (e.g. headset ↔ built-in), then back.
2. Talk after each switch.
- **Good if:** your friend keeps hearing you after each switch.
- **Tell us if:** your audio cuts out or doesn't come back after switching.
### 4. All the call buttons still work
Go down the call control bar and tap each one, checking it actually does the thing:
- [ ] **Mute / unmute mic** (icon changes AND your friend stops/starts hearing you)
- [ ] **Camera on / off**
- [ ] **Deafen / sound** toggle (you stop/start hearing others)
- [ ] **Share screen** start and stop (including the "Share your screen?" prompt)
- [ ] **Full screen** on and off
- [ ] **"More" (⋮) menu** → the **Reactions**, **Settings**, and **Grid/Spotlight**
options each open the right thing
- [ ] **Leave / End call** — leaves cleanly
- **Tell us if:** any button does nothing when you tap it (tell us which one).
---
## 👀 Please also check these
### 5. The "who's talking" highlight points at the right person
1. In a call, have your friend talk, then you talk.
- **Good if:** the highlight / glow appears around the person who is actually
talking, not someone else.
- **Tell us if:** the wrong person lights up, or nobody lights up when talking.
### 6. Mute badges show on the right person
1. Have your friend mute their mic.
- **Good if:** any "muted" indicator shows next to the person who is muted.
- **Tell us if:** it shows on the wrong person or doesn't update.
### 7. Focus a camera while someone is sharing their screen
_(Needs: one person sharing screen, another with camera on.)_
1. Person A **shares their screen.**
2. Person B turns their **camera on.**
3. Use the **"Focus camera"** option (from a participant's menu) on Person B.
- **Good if:** Person B's camera becomes the highlighted/spotlighted view
**alongside or over** the shared screen.
- **Tell us if:** nothing happens, or it throws you out of the screen share, or
you get an error.
### 8. Avatar decorations show on call tiles
_(Needs: someone in the call has an avatar decoration set in Settings → Profile.)_
1. Have a person with a **profile decoration** join with their **camera off** (so
their avatar/picture shows instead of video).
- **Good if:** their decoration (the frame/ring/effect around their picture)
shows on their tile **inside the call**, like it does elsewhere in the app.
- **Tell us if:** the decoration is missing, cut off, or in the wrong place.
### 9. The call screen looks right
1. Just look at the overall call screen.
- **Good if:** backgrounds, colors, and layout look normal: nothing is a weird
black box, see-through in a bad way, or overlapping.
- **Tell us if:** anything looks visually broken or out of place.
---
## 🙏 Thank you!
If a call ever sounds bad for **everyone** (not just you), let us know right away —
that's the one we most want to hear about quickly, and we can switch back fast.
+52
@@ -80,6 +80,58 @@ Items from testing, with their fork-level fix path:
- **N127 — ML denoise shim is never injected in `vite dev`.** The `lotusDenoise` plugin injects only on `closeBundle` (build), so ML noise suppression is silently inactive during local dev. Add a dev-mode injection (`configureServer` / `transformIndexHtml`). Dev-only impact. _Note: this **dissolves entirely** once denoise moves in-source in the fork (A7 fix) — there is then no build-time injection to be missing in dev._
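The dev-mode injection suggested in N127 could look roughly like the sketch below. It is a minimal illustration, not the project's actual plugin: the plugin name `lotus-denoise-dev`, the shim URL `/denoise-shim.js`, and the local `HtmlTag` / `DevInjectPlugin` types are assumptions made so the snippet stands alone. In a real config the types would come from `vite` (`Plugin`, `HtmlTagDescriptor`). It relies on Vite's `apply: "serve"` option and the `transformIndexHtml` hook returning tag descriptors.

```typescript
// Local structural stand-ins for Vite's Plugin / HtmlTagDescriptor types,
// declared here only so the sketch compiles without importing vite.
interface HtmlTag {
  tag: string;
  attrs?: Record<string, string | boolean>;
  injectTo?: "head" | "body" | "head-prepend" | "body-prepend";
}

interface DevInjectPlugin {
  name: string;
  apply: "serve";
  transformIndexHtml: () => HtmlTag[];
}

// Hypothetical helper: the shim path and plugin name are placeholders.
function lotusDenoiseDev(shimSrc: string = "/denoise-shim.js"): DevInjectPlugin {
  return {
    name: "lotus-denoise-dev",
    // Restrict the plugin to the dev server so the existing build-time
    // closeBundle injection is left untouched.
    apply: "serve",
    // Ask Vite to prepend a <script> tag for the shim to <head> whenever
    // index.html is served in dev.
    transformIndexHtml() {
      return [{ tag: "script", attrs: { src: shimSrc }, injectTo: "head-prepend" }];
    },
  };
}
```

As the note in N127 says, this would become unnecessary once denoise moves in-source in the fork.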
### 🧨 Encryption / E2EE — ⚠️ EXTREME COMPLEXITY · 🧠 PLANNING SESSION REQUIRED · 👤 SENIOR ENGINEER
> **Observed live in prod 2026-06-30** on `chat.lotusguild.org` during a 2-person
> **Element Call** (E2EE enabled). These span **client rust-crypto (via
> `matrix-js-sdk@41.6.0-rc.0`) ↔ Synapse ↔ Element Call's MatrixRTC E2EE** and are
> very likely **interrelated** (see KE-1 → KE-2). Do **not** spot-fix — they need
> a dedicated cross-system planning session with the homeserver owner. Capture
> full client console + a Synapse-side trace for the same call before starting.
> **None of these are caused by the EC fork work** (the issues reproduce on the
> old build; the local mic/denoise path is unrelated to key distribution).
- **KE-1 — One-time-key (OTK) upload conflict storm (CRITICAL, root-cause candidate).**
`POST /_matrix/client/v3/keys/upload` returns `400 M_UNKNOWN: One time key
signed_curve25519:AAAAAAAAAGQ already exists. Old key: {…} new key: {…}`,
repeating **continuously** (many per second). The client repeatedly tries to publish an
OTK at a key id the server already holds **with a different value**, i.e. the
rust-crypto key store and Synapse have **diverged OTK state**. Impact: floods
the crypto outgoing-request loop and is the prime suspect for the downstream
missing-key failures (no fresh OTKs ⇒ no new Olm sessions ⇒ undecryptable
to-device key events). _Investigate:_ device/key-store reset-or-restore
mismatch, OTK id-counter desync, RC-SDK (`41.6.0-rc.0`) regression, or a
Synapse OTK bug. Repro signature: grep console for `already exists`.
**Extreme — planning session.**
- **KE-2 — Element Call media keys not arriving/decrypting → audio & video cut out (CRITICAL).**
`MissingKey: missing key at index N for participant @user`, `skipping decryption
due to missing key`, `MissingKey: key set not found for @user at index 0`, and
rust-crypto `WARN … Received an unexpected encrypted to-device event …
event_type="io.element.call.encryption_keys"`. EC distributes per-participant
media keys as **encrypted to-device `io.element.call.encryption_keys`** events;
these aren't being received/decrypted in order, so remote LiveKit audio/video
can't be decrypted — **this is the "friend's audio cuts out occasionally"
symptom.** Almost certainly downstream of **KE-1** (broken Olm sessions). Spans
EC's MatrixRTC E2EE + rust-crypto to-device + Synapse. **Extreme — planning
session.**
- **KE-3 — Timeline decryption error: missing `algorithm` field (HIGH).**
`Error decrypting event (… type=m.room.encrypted …): DecryptionError[msg:
missing field 'algorithm' at line 1 column 138 …]`. A malformed/legacy
encrypted event (or a serialization mismatch in the RC SDK) that rust-crypto
can't parse. Lower frequency than KE-1/2 but a distinct decode-path failure —
capture the offending event id (`$SASBBzoqj…` seen) and inspect its raw content.
- **KE-4 — MatrixRTC delayed-event / membership timeouts (MEDIUM-HIGH, reliability).**
`[MembershipManager] Network local timeout error while sending event, immediate
retry … AbortError: Restart delayed event timed out before the HS responded`,
with repeated `org.matrix.msc4157.update_delayed_event`. This is an MSC4140/4157
delayed-event reliability problem against `matrix.lotusguild.org`; it can cause stale/ghost
call membership and missed leave events. May be partly **homeserver
responsiveness**; correlate with Synapse latency/load. Include in the same
planning session since it shares the call-reliability + HS-interaction surface.
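When preparing the console capture for the planning session, the log signatures quoted in KE-1 through KE-4 can be counted with a small script. The following is a rough sketch under stated assumptions: it expects the console export as an array of single-line strings, the patterns are taken only from the excerpts above (real logs may be worded differently), and `triageConsoleLog`, `SIGNATURES`, and the type names are hypothetical helpers, not part of `matrix-js-sdk` or Element Call.

```typescript
type IssueId = "KE-1" | "KE-2" | "KE-3" | "KE-4";

// Substrings copied from the log excerpts in this section; adjust as needed.
const SIGNATURES: ReadonlyArray<{ id: IssueId; pattern: RegExp }> = [
  { id: "KE-1", pattern: /One time key (\S+) already exists/ },
  { id: "KE-2", pattern: /MissingKey|skipping decryption due to missing key|io\.element\.call\.encryption_keys/ },
  { id: "KE-3", pattern: /missing field 'algorithm'/ },
  { id: "KE-4", pattern: /Restart delayed event timed out|update_delayed_event/ },
];

interface TriageResult {
  counts: Record<IssueId, number>;
  // Distinct OTK ids the server reported as already existing (KE-1 only).
  conflictingOtkIds: string[];
}

function triageConsoleLog(lines: ReadonlyArray<string>): TriageResult {
  const counts: Record<IssueId, number> = { "KE-1": 0, "KE-2": 0, "KE-3": 0, "KE-4": 0 };
  const otkIds = new Set<string>();

  for (const line of lines) {
    // Each line is attributed to the first signature that matches it.
    for (const sig of SIGNATURES) {
      const match = sig.pattern.exec(line);
      if (!match) continue;
      counts[sig.id] += 1;
      // For KE-1 the first capture group holds the conflicting key id.
      if (sig.id === "KE-1" && match[1]) otkIds.add(match[1]);
      break;
    }
  }
  return { counts, conflictingOtkIds: Array.from(otkIds) };
}
```

A high KE-1 count concentrated on a handful of key ids would be consistent with the diverged-OTK-state hypothesis. Comparing KE-1 and KE-2 counts over the same call only gives a first hint at the suspected KE-1 → KE-2 link; it does not prove it.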
### Security & Privacy
- **N97 — Access token stored in plaintext `localStorage`** (`state/sessions.ts`), vulnerable to XSS; device ID likewise. Architectural — needs a token-protection / session-storage redesign.