Files
cinny/LOTUS_BUGS.md
T

167 lines
15 KiB
Markdown
Raw Normal View History

# Lotus Chat — Open Bugs & Technical Debt
**Only OPEN and awaiting-verification items live here.** Resolved findings
(fixed-and-verified, false-positives, won't-fix) have been removed to keep this
actionable — the full history is in git. Items fixed in code but not yet
verified in a real environment are in **Needs Verification** below and have
step-by-step checks in [`LOTUS_TESTING.md`](./LOTUS_TESTING.md).
> Design rules for any fix here: follow the **Native-Cinny Law** and **TDS
> Design Law** in [`LOTUS_TODO.md`](./LOTUS_TODO.md).
---
## ⚠️ Needs Verification — fixed in code, awaiting live testing
Implemented and gate-green; confirm each per `LOTUS_TESTING.md`, then delete the row.
| ID | Item | File / area | Test |
| :--- | :------------------------------------------------------------------------------------- | :--------------------------------------------------- | :-------------------------------------------------------------------------------- |
| #2 | Chat-background animation flicker (`contain:paint`) | `lotus/chatBackground.ts` | F1 |
| #4 | Ringtone re-fixes: classic loudness + caller decline notice (A2 ✓ live) | `CallEmbedProvider.tsx`, `ringtones.ts` | A1,A3,A4 |
| #6 | Background vs. seasonal theme mutual exclusion | `state/settings.ts`, `General.tsx` | F2 |
| #7 | Composer toolbar touch targets (≥44px) | `room/RoomInput.tsx` | E1 |
| #8 | Room Settings horizontal overflow (mobile) | `components/page/style.css.ts` | E2 |
| #9 | Modal fullscreen on mobile (`useModalStyle`) | 22+ modal files | E3 |
| #10 | Composer not hidden by keyboard (`100dvh`) | `src/index.css` | E4 |
| #12 | PiP "All muted" badge re-fixed (was firing on any single mute) | `hooks/useCallSpeakers.ts` | G1 |
| N96 | Call-recovery overlay single "Back" button | `call/CallView.tsx` | A7 |
| N95 | AFK-monitor mic released on mute (OS indicator clears) | `hooks/useAfkAutoMute.ts` | L1 |
| N108 | Maskable PWA icons (Android adaptive) | `public/manifest.json` + `res/android/maskable-*` | L2 |
| EC | EC iframe load watchdog + self-heal + recovery UI | `plugins/call/CallEmbed.ts`, `CallView.tsx` | A7 |
| N105 | Notification clicks work after tab close (SW `notificationclick` + `showNotification`) | `sw.ts`, `utils/dom.ts`, `ClientNonUIFeatures.tsx` | get a msg notif, close the tab, click it → app focuses/opens + routes to the room |
| Gal | MediaGallery lazy-decrypt (true virtualization deferred) | `room/MediaGallery.tsx` | H1 |
| a11y | aria-labels: edit-history / reaction / thread / reply | `message/*` (`FallbackContent`, `Reaction`, `Reply`) | I |
**Verified working in live testing (2026-06):** A2, B1B4, C1, C3, D (mic/camera/deafen/screenshare/fullscreen/more-menu/PiP). Denoise quality in D is still poor — tracked under the denoise project, not a regression.
---
## 🧩 Element Call source-level items — now actionable via the fork
> 🔱 **[EC-FORK]** **UPDATE 2026-06-29: the fork is live.** We now own and
> self-build Element Call (`LotusGuild/element-call` →
> `@lotusguild/element-call-embedded`, Phase 1 done & cinny wired). A5/A6/A7
> below are **no longer "won't fix"** — they are ordinary source changes. See
> [`HANDOFF_ELEMENT_CALL_FORK.md`](./HANDOFF_ELEMENT_CALL_FORK.md) §10 + the Phase
> 2 work list. (The iframe is **same-origin** / self-hosted; the old blocker was
> that we didn't own EC's compiled source — which we now do.)
The in-call participant grid is rendered **inside EC's app**. Previously a
pre-built npm bundle we could only style/place around; now editable source.
Items from testing, with their fork-level fix path:
- **A5 — "Focus camera":** EC supports native tile-pinning. Our bottom-bar "Focus
camera" is a programmatic wrapper that **`.click()`s the tile** today
(`CallControl.ts` `focusCameraParticipant`), and during a screenshare EC
spotlights the shared screen so a camera pin may not override it. **Fork fix:**
add an `io.lotus.focus_participant` widget action that pins a participant in
EC's layout (coexisting with / overriding the screenshare spotlight); cinny
sends it via the widget API and the DOM-click hack is deleted. _Status: Open —
Actionable (Phase 2)._
- **A6 — avatar decorations in-call:** decorations render on **our** pre-join
lobby roster (`CallMemberCard`) but not on EC's in-call video tiles. **Fork
fix:** render the decoration APNG inside EC's participant-tile component, fed
decoration slugs via widget member data. _Status: Open — Actionable (Phase 2)._
- **A7 — mic dead after EC's "Reconnect":** the mid-call "Connection lost /
Reconnect" screen is **EC's own** (our load watchdog only covers an initial
hung load). After EC reconnects, the mic isn't re-published through our denoise
`getUserMedia` shim until a clean End+rejoin. **Fork fix:** move denoise into
EC's mic-capture/publish pipeline as a first-class audio stage — EC re-runs it
on every (re)publish, so reconnects keep denoise alive natively, and the
build-time `index.html` injection is removed. _Status: Open — Actionable
(Phase 2); root cause is the `getUserMedia` monkeypatch, not EC itself._
---
## 🔴 Open — Actionable
2026-06-17 20:37:54 -04:00
### Calls / Audio
- **N127 — ML denoise shim is never injected in `vite dev`.** The `lotusDenoise` plugin injects only on `closeBundle` (build), so ML noise suppression is silently inactive during local dev. Add a dev-mode injection (`configureServer` / `transformIndexHtml`). Dev-only impact. _Note: this **dissolves entirely** once denoise moves in-source in the fork (A7 fix) — there is then no build-time injection to be missing in dev._
2026-06-17 20:37:54 -04:00
### 🧨 Encryption / E2EE — ⚠️ EXTREME COMPLEXITY · 🧠 PLANNING SESSION REQUIRED · 👤 SENIOR ENGINEER
> **Observed live in prod 2026-06-30** on `chat.lotusguild.org` during a 2-person
> **Element Call** (E2EE enabled). These span **client rust-crypto (via
> `matrix-js-sdk@41.6.0-rc.0`) ↔ Synapse ↔ Element Call's MatrixRTC E2EE** and are
> very likely **interrelated** (see KE-1 → KE-2). Do **not** spot-fix — they need
> a dedicated cross-system planning session with the homeserver owner. Capture
> full client console + a synapse-side trace for the same call before starting.
> **None of these are caused by the EC fork work** (the issues reproduce on the
> old build; the local mic/denoise path is unrelated to key distribution).
- **KE-1 — One-time-key (OTK) upload conflict storm (CRITICAL, root-cause candidate).**
`POST /_matrix/client/v3/keys/upload` returns `400 M_UNKNOWN: One time key
signed_curve25519:AAAAAAAAAGQ already exists. Old key: {…} new key: {…}`
firing **continuously** (many/sec). The client repeatedly tries to publish an
OTK at a key id the server already holds **with a different value**, i.e. the
rust-crypto key store and Synapse have **diverged OTK state**. Impact: floods
the crypto outgoing-request loop and is the prime suspect for the downstream
missing-key failures (no fresh OTKs ⇒ no new Olm sessions ⇒ undecryptable
to-device key events). _Investigate:_ device/key-store reset-or-restore
mismatch, OTK id-counter desync, RC-SDK (`41.6.0-rc.0`) regression, or a
Synapse OTK bug. Repro signature: grep console for `already exists`.
**Extreme — planning session.**
- **KE-2 — Element Call media keys not arriving/decrypting → audio & video cut out (CRITICAL).**
`MissingKey: missing key at index N for participant @user`, `skipping decryption
due to missing key`, `MissingKey: key set not found for @user at index 0`, and
rust-crypto `WARN … Received an unexpected encrypted to-device event …
event_type="io.element.call.encryption_keys"`. EC distributes per-participant
media keys as **encrypted to-device `io.element.call.encryption_keys`** events;
these aren't being received/decrypted in order, so remote LiveKit audio/video
can't be decrypted — **this is the "friend's audio cuts out occasionally"
symptom.** Almost certainly downstream of **KE-1** (broken Olm sessions). Spans
EC's MatrixRTC E2EE + rust-crypto to-device + Synapse. **Extreme — planning
session.**
- **KE-3 — Timeline decryption error: missing `algorithm` field (HIGH).**
`Error decrypting event (… type=m.room.encrypted …): DecryptionError[msg:
missing field 'algorithm' at line 1 column 138 …]`. A malformed/legacy
encrypted event (or a serialization mismatch in the RC SDK) that rust-crypto
can't parse. Lower frequency than KE-1/2 but a distinct decode-path failure —
capture the offending event id (`$SASBBzoqj…` seen) and inspect its raw content.
- **KE-4 — MatrixRTC delayed-event / membership timeouts (MEDIUM-HIGH, reliability).**
`[MembershipManager] Network local timeout error while sending event, immediate
retry … AbortError: Restart delayed event timed out before the HS responded`,
with repeated `org.matrix.msc4157.update_delayed_event`. MSC4140/4157
delayed-event reliability against `matrix.lotusguild.org` — can cause stale/ghost
call membership and missed leave events. May be partly **homeserver
responsiveness**; correlate with synapse latency/load. Include in the same
planning session since it shares the call-reliability + HS-interaction surface.
### Security & Privacy
- **N97 — Access token stored in plaintext `localStorage`** (`state/sessions.ts`), vulnerable to XSS; device ID likewise. Architectural — needs a token-protection / session-storage redesign.
- **Session writes are non-atomic and not cross-tab synced** (`state/sessions.ts`) — risks inconsistent state / races across tabs.
- **Persisted PII without encryption:** user status message + expiry (`settings/account/Profile.tsx`), unsent composer drafts (`room/RoomInput.tsx`). Leak risk on shared devices.
2026-06-17 20:37:54 -04:00
### PWA / Offline / Notifications
- **N107 — SW has no `push` handler** — Web Push delivery is entirely non-functional. Needs a `push` listener + a Matrix push-gateway integration.
- **No app-asset caching strategy** (`src/sw.ts`) — no offline capability.
- ~~**`manifest: false`** may block PWA install~~ — **verified OK (2026-06):** `index.html` links `/manifest.json`, which exists in `public/` and is copied to `dist/`; VitePWA intentionally doesn't generate one. Not a bug.
2026-06-17 20:37:54 -04:00
### Dependencies & Build
- **`matrix-js-sdk` pinned to a Release Candidate** (`41.6.0-rc.0`); `@atlaskit` and build tools (`vite`, `typescript`, `eslint`) on unstable/experimental pins — review for stable versions; RC SDK is a tree-shaking/bundle-size risk.
- **Build-time overhead:** `lotusDenoise` does heavy sequential `fs` work in `closeBundle`; `viteStaticCopy` config is complex with redundant renames — could be streamlined.
2026-06-17 20:37:54 -04:00
### Code Hygiene / DevEx
- **Automated test suite — 231 tests across ~26 modules, a hard CI gate.** `npm test` runs Node's built-in runner via `tsx` (not vitest — Vite 8 is ahead of vitest's range) and **blocks the build job on failure**. Broad pure-logic coverage: utils (common, regex, sanitize/XSS, time, matrix, matrix-uia, mimeTypes, sort, accentColor, findAndReplace, AsyncSearch, ASCIILexicalTable, keyboard, room, matrix-crypto, featureCheck), state (settings, sessions, recentSearches, upload, typingMembers), plugins (matrix-to, call/utils, markdown/utils), lotus/avatarDecorations, search filters. Prevention work has caught + fixed **2 real bugs** (`findAndReplace` infinite-loop; `getSettings` crash-on-load when storage is blocked). **Next:** component/integration tests (the untestable-under-tsx DOM/React surface).
- **Extensive `as any` casts** across `src/` — gradual typing cleanup.
- **`types/matrix/` mirrors SDK types** instead of importing them — drift risk.
- **Hardcoded CDN URL** should move to an env var (the decoration CDN is now single-sourced in `avatarDecorations.ts`, but the literal is still in-repo).
- **`patch-folds.mjs` edits `node_modules` directly** — consider `patch-package`.
- **Infra docs:** `contrib/nginx` lacks security headers (HSTS/CSP) + uses rewrites over `try_files`; `contrib/caddy` has a placeholder path. CI/CD (`prod-deploy.yml`): sequential deploy, aggressive 1-min Netlify timeout, `package-manager-cache: false`.
- **README:** keep the fork-sync version + logo path current. (`CONTRIBUTING.md` is intentionally left as upstream Cinny's — not a Lotus concern.)
- **Architecture notes (low priority):** deep `features/` + `hooks/` nesting, many small coupled hooks, possible dead CSS/components, `SpacingVariant` / `DropTarget` recipe simplification.
- **Git workflow (forward-looking):** keep commits scoped — past monolithic "fix all bugs" commits and inconsistent prefixes hurt `git bisect`.
2026-06-17 20:37:54 -04:00
### Big Projects
- **#5 — Seasonal themes & chat-background redesign.** Current backgrounds are basic CSS; goal is high-fidelity, research-backed, GPU-accelerated designs (layered `oklch`, `backdrop-filter`, `contain:paint`) with WCAG-AA overlay contrast. Treat each as its own design sprint.