Files
cinny/LOTUS_BUGS.md
T
jared 7c06b27c73
CI / Build & Quality Checks (push) Successful in 10m49s
CI / Trigger Desktop Build (push) Successful in 8s
feat(call): in-call soundboard, quality controls, room call-permissions
Element Call is now consumed as our self-built fork
(@lotusguild/element-call-embedded); wire up its previously-dormant
capabilities and document the fork as live.

Soundboard (P5-15): a call-bar button plays user-uploaded audio clips into the
call as a real published track (io.lotus.inject_audio) plus local playback.
Clips are uploadable like emoji/sticker packs, stored in io.lotus.soundboard
account data (synced across devices). Gated by a Settings toggle + volume.

Quality controls (P5-31): per-user mic/screenshare bitrate + screenshare
framerate (Settings -> Calls), applied via io.lotus.set_quality clamped to any
room cap. Room admins set caps and hard call-permissions (allow_screenshare /
allow_camera) in Room Settings -> Voice; the call bar hides blocked buttons.

- New: CallSoundboard, useSoundboard, soundboardClips; RoomQuality,
  useCallQuality, callQuality (+ unit tests).
- Optimistic-write RoomQuality admin UI (no stale-state clobber).
- Docs: mark EC fork live across README/FEATURES/TODO/BUGS/TESTING; add D2
  manual-test steps.

Numeric quality caps are client-cooperative; screenshare/camera permissions are
hard-enforced server-side (see LotusGuild/matrix voice-limit-guard).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-06-30 22:34:17 -04:00

160 lines
15 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Lotus Chat — Open Bugs & Technical Debt
**Only OPEN and awaiting-verification items live here.** Resolved findings
(fixed-and-verified, false-positives, won't-fix) have been removed to keep this
actionable — the full history is in git. Items fixed in code but not yet
verified in a real environment are in **Needs Verification** below and have
step-by-step checks in [`LOTUS_TESTING.md`](./LOTUS_TESTING.md).
> Design rules for any fix here: follow the **Native-Cinny Law** and **TDS
> Design Law** in [`LOTUS_TODO.md`](./LOTUS_TODO.md).
---
## ⚠️ Needs Verification — fixed in code, awaiting live testing
Implemented and gate-green; confirm each per `LOTUS_TESTING.md`, then delete the row.
| ID | Item | File / area | Test |
| :--- | :------------------------------------------------------------------------------------- | :--------------------------------------------------- | :-------------------------------------------------------------------------------- |
| #2 | Chat-background animation flicker (`contain:paint`) | `lotus/chatBackground.ts` | F1 |
| #4 | Ringtone re-fixes: classic loudness + caller decline notice (A2 ✓ live) | `CallEmbedProvider.tsx`, `ringtones.ts` | A1,A3,A4 |
| #6 | Background vs. seasonal theme mutual exclusion | `state/settings.ts`, `General.tsx` | F2 |
| #7 | Composer toolbar touch targets (≥44px) | `room/RoomInput.tsx` | E1 |
| #8 | Room Settings horizontal overflow (mobile) | `components/page/style.css.ts` | E2 |
| #9 | Modal fullscreen on mobile (`useModalStyle`) | 22+ modal files | E3 |
| #10 | Composer not hidden by keyboard (`100dvh`) | `src/index.css` | E4 |
| #12 | PiP "All muted" badge re-fixed (was firing on any single mute) | `hooks/useCallSpeakers.ts` | G1 |
| N96 | Call-recovery overlay single "Back" button | `call/CallView.tsx` | A7 |
| N95 | AFK-monitor mic released on mute (OS indicator clears) | `hooks/useAfkAutoMute.ts` | L1 |
| N108 | Maskable PWA icons (Android adaptive) | `public/manifest.json` + `res/android/maskable-*` | L2 |
| EC | EC iframe load watchdog + self-heal + recovery UI | `plugins/call/CallEmbed.ts`, `CallView.tsx` | A7 |
| N105 | Notification clicks work after tab close (SW `notificationclick` + `showNotification`) | `sw.ts`, `utils/dom.ts`, `ClientNonUIFeatures.tsx` | get a msg notif, close the tab, click it → app focuses/opens + routes to the room |
| Gal | MediaGallery lazy-decrypt (true virtualization deferred) | `room/MediaGallery.tsx` | H1 |
| a11y | aria-labels: edit-history / reaction / thread / reply | `message/*` (`FallbackContent`, `Reaction`, `Reply`) | I |
**Verified working in live testing (2026-06):** A2, B1B4, C1, C3, D (mic/camera/deafen/screenshare/fullscreen/more-menu/PiP). Denoise quality in D is still poor — tracked under the denoise project, not a regression.
---
## 🧩 Element Call source-level items — now actionable via the fork
> 🔱 **[EC-FORK]** **UPDATE 2026-06-30: Phase 2 IMPLEMENTED.** We own and
> self-build Element Call (`LotusGuild/element-call` →
> `@lotusguild/element-call-embedded@0.20.1-lotus.1`, cinny wired). A5/A6/A7
> below are **fixed in the fork** — they are now ⚠️ awaiting **live
> verification** (`LOTUS_TESTING.md` §D2), not open work. See
> [`HANDOFF_ELEMENT_CALL_FORK.md`](./HANDOFF_ELEMENT_CALL_FORK.md) §10. Delete each
> row once verified live.
The in-call participant grid is rendered **inside EC's app** — now editable source
(previously a prebuilt npm bundle we could only style around). Status of the items
from testing:
- **A5 — "Focus camera": ⚠️ FIXED in fork, awaiting verify (D2-3).** cinny now
sends an `io.lotus.focus_participant` widget action that pins a participant in
EC's layout (coexisting with / overriding the screenshare spotlight); the old
`.click()`-the-tile DOM hack in `CallControl.ts` is deleted.
- **A6 — avatar decorations in-call: ⚠️ FIXED in fork, awaiting verify (D2-4).**
cinny pushes `io.lotus.decorations` (per-user APNG URLs) and the fork renders
them on EC's participant video-tile avatars — not just our pre-join lobby roster.
- **A7 — mic dead after EC's "Reconnect": ⚠️ FIXED in fork, awaiting verify
(D2-1).** Denoise moved into EC's mic-capture/publish pipeline as a first-class
LiveKit `TrackProcessor` (flag `lotusDenoiseSource=1`); EC re-runs it on every
(re)publish, so reconnects keep denoise alive natively. The build-time
`getUserMedia`/`index.html` injection (the root cause) is removed. **Highest
blast radius — everyone's mic; verify D2-1 carefully.**
---
## 🔴 Open — Actionable
### Calls / Audio
- ~~**N127 — ML denoise shim is never injected in `vite dev`.**~~ **RESOLVED (dissolved by the A7 denoise cutover).** `vite.config.js` no longer injects a getUserMedia shim at all — the forked Element Call runs ML denoise in-source as a LiveKit `TrackProcessor` (activated by `lotusDenoiseSource=1`), so there is no build-time injection that could be missing in dev. Nothing to fix.
### 🧨 Encryption / E2EE — ⚠️ EXTREME COMPLEXITY · 🧠 PLANNING SESSION REQUIRED · 👤 SENIOR ENGINEER
> **Observed live in prod 2026-06-30** on `chat.lotusguild.org` during a 2-person
> **Element Call** (E2EE enabled). These span **client rust-crypto (via
> `matrix-js-sdk@41.6.0-rc.0`) ↔ Synapse ↔ Element Call's MatrixRTC E2EE** and are
> very likely **interrelated** (see KE-1 → KE-2). Do **not** spot-fix — they need
> a dedicated cross-system planning session with the homeserver owner. Capture
> full client console + a synapse-side trace for the same call before starting.
> **None of these are caused by the EC fork work** (the issues reproduce on the
> old build; the local mic/denoise path is unrelated to key distribution).
- **KE-1 — One-time-key (OTK) upload conflict storm (CRITICAL, root-cause candidate).**
`POST /_matrix/client/v3/keys/upload` returns `400 M_UNKNOWN: One time key
signed_curve25519:AAAAAAAAAGQ already exists. Old key: {…} new key: {…}`
firing **continuously** (many/sec). The client repeatedly tries to publish an
OTK at a key id the server already holds **with a different value**, i.e. the
rust-crypto key store and Synapse have **diverged OTK state**. Impact: floods
the crypto outgoing-request loop and is the prime suspect for the downstream
missing-key failures (no fresh OTKs ⇒ no new Olm sessions ⇒ undecryptable
to-device key events). _Investigate:_ device/key-store reset-or-restore
mismatch, OTK id-counter desync, RC-SDK (`41.6.0-rc.0`) regression, or a
Synapse OTK bug. Repro signature: grep console for `already exists`.
**Extreme — planning session.**
- **KE-2 — Element Call media keys not arriving/decrypting → audio & video cut out (CRITICAL).**
`MissingKey: missing key at index N for participant @user`, `skipping decryption
due to missing key`, `MissingKey: key set not found for @user at index 0`, and
rust-crypto `WARN … Received an unexpected encrypted to-device event …
event_type="io.element.call.encryption_keys"`. EC distributes per-participant
media keys as **encrypted to-device `io.element.call.encryption_keys`** events;
these aren't being received/decrypted in order, so remote LiveKit audio/video
can't be decrypted — **this is the "friend's audio cuts out occasionally"
symptom.** Almost certainly downstream of **KE-1** (broken Olm sessions). Spans
EC's MatrixRTC E2EE + rust-crypto to-device + Synapse. **Extreme — planning
session.**
- **KE-3 — Timeline decryption error: missing `algorithm` field (HIGH).**
`Error decrypting event (… type=m.room.encrypted …): DecryptionError[msg:
missing field 'algorithm' at line 1 column 138 …]`. A malformed/legacy
encrypted event (or a serialization mismatch in the RC SDK) that rust-crypto
can't parse. Lower frequency than KE-1/2 but a distinct decode-path failure —
capture the offending event id (`$SASBBzoqj…` seen) and inspect its raw content.
- **KE-4 — MatrixRTC delayed-event / membership timeouts (MEDIUM-HIGH, reliability).**
`[MembershipManager] Network local timeout error while sending event, immediate
retry … AbortError: Restart delayed event timed out before the HS responded`,
with repeated `org.matrix.msc4157.update_delayed_event`. MSC4140/4157
delayed-event reliability against `matrix.lotusguild.org` — can cause stale/ghost
call membership and missed leave events. May be partly **homeserver
responsiveness**; correlate with synapse latency/load. Include in the same
planning session since it shares the call-reliability + HS-interaction surface.
### Security & Privacy
- **N97 — Access token stored in plaintext `localStorage`** (`state/sessions.ts`), vulnerable to XSS; device ID likewise. Architectural — needs a token-protection / session-storage redesign.
- **Session writes are non-atomic and not cross-tab synced** (`state/sessions.ts`) — risks inconsistent state / races across tabs.
- **Persisted PII without encryption:** user status message + expiry (`settings/account/Profile.tsx`), unsent composer drafts (`room/RoomInput.tsx`). Leak risk on shared devices.
### PWA / Offline / Notifications
- **N107 — SW has no `push` handler** — Web Push delivery is entirely non-functional. Needs a `push` listener + a Matrix push-gateway integration.
- **No app-asset caching strategy** (`src/sw.ts`) — no offline capability.
- ~~**`manifest: false`** may block PWA install~~ — **verified OK (2026-06):** `index.html` links `/manifest.json`, which exists in `public/` and is copied to `dist/`; VitePWA intentionally doesn't generate one. Not a bug.
### Dependencies & Build
- **`matrix-js-sdk` pinned to a Release Candidate** (`41.6.0-rc.0`); `@atlaskit` and build tools (`vite`, `typescript`, `eslint`) on unstable/experimental pins — review for stable versions; RC SDK is a tree-shaking/bundle-size risk.
- **Build-time overhead:** `lotusDenoise` does heavy sequential `fs` work in `closeBundle`; `viteStaticCopy` config is complex with redundant renames — could be streamlined.
### Code Hygiene / DevEx
- **Automated test suite — 545 tests across 62 modules, a hard CI gate.** `npm test` runs Node's built-in runner via `tsx` (not vitest — Vite 8 is ahead of vitest's range) and **blocks the build job on failure**. Broad pure-logic coverage: utils (common, regex, sanitize/XSS, time, matrix, matrix-uia, mimeTypes, sort, accentColor, findAndReplace, AsyncSearch, ASCIILexicalTable, keyboard, room, matrix-crypto, featureCheck, syntaxHighlight, imageCompression, user-agent, callSounds), state (settings, sessions, recentSearches, upload, typingMembers, lists, room-list, toast, scheduledMessages, backupRestore, callEmbed/callPreferences, spaceRooms, …), plugins (matrix-to, call/utils, via-servers, bad-words, recent-emoji, custom-emoji, markdown block/inline/utils), OIDC (cs-api, useParsedLoginFlows, oidcState), lotus/avatarDecorations, message-search, search filters. Prevention work has caught + fixed **4 real bugs** (`findAndReplace` infinite-loop; `getSettings` crash-on-load when storage is blocked; `isMacOS` never matching modern Macs; `isMLDenoiseSupported` throwing `ReferenceError` instead of returning false on browsers lacking the `AudioWorkletNode` binding). **Next:** component/integration tests (the untestable-under-tsx DOM/React surface).
- **Extensive `as any` casts** across `src/` — gradual typing cleanup.
- **`types/matrix/` mirrors SDK types** instead of importing them — drift risk.
- **Hardcoded CDN URL** should move to an env var (the decoration CDN is now single-sourced in `avatarDecorations.ts`, but the literal is still in-repo).
- **`patch-folds.mjs` edits `node_modules` directly** — consider `patch-package`.
- **Infra docs:** `contrib/nginx` lacks security headers (HSTS/CSP) + uses rewrites over `try_files`; `contrib/caddy` has a placeholder path. CI/CD (`prod-deploy.yml`): sequential deploy, aggressive 1-min Netlify timeout, `package-manager-cache: false`.
- **README:** keep the fork-sync version + logo path current. (`CONTRIBUTING.md` is intentionally left as upstream Cinny's — not a Lotus concern.)
- **Architecture notes (low priority):** deep `features/` + `hooks/` nesting, many small coupled hooks, possible dead CSS/components, `SpacingVariant` / `DropTarget` recipe simplification.
- **Git workflow (forward-looking):** keep commits scoped — past monolithic "fix all bugs" commits and inconsistent prefixes hurt `git bisect`.
### Big Projects
- **#5 — Seasonal themes & chat-background redesign.** Current backgrounds are basic CSS; goal is high-fidelity, research-backed, GPU-accelerated designs (layered `oklch`, `backdrop-filter`, `contain:paint`) with WCAG-AA overlay contrast. Treat each as its own design sprint.