Files
cinny/LOTUS_BUGS.md
T
jared f589182709
CI / Build & Quality Checks (push) Successful in 10m57s
CI / Trigger Desktop Build (push) Successful in 7s
docs: deep-audit wave dispositions in LOTUS_BUGS
Dep triage recorded (zero shipped exposure; SDK now 41.7.0 stable; dompurify
removed); Needs Verification rows for the audit-wave fixes (scheduled-cancel,
emoji lazy-load, SW precache, desktop CSP smoke).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-07-02 00:19:50 -04:00

171 lines
17 KiB
Markdown
Raw Blame History

This file contains ambiguous Unicode characters
This file contains Unicode characters that might be confused with other characters. If you think that this is intentional, you can safely ignore this warning. Use the Escape button to reveal them.
# Lotus Chat — Open Bugs & Technical Debt
**Only OPEN and awaiting-verification items live here.** Resolved findings
(fixed-and-verified, false-positives, won't-fix) have been removed to keep this
actionable — the full history is in git. Items fixed in code but not yet
verified in a real environment are in **Needs Verification** below and have
step-by-step checks in [`LOTUS_TESTING.md`](./LOTUS_TESTING.md).
> Design rules for any fix here: follow the **Native-Cinny Law** and **TDS
> Design Law** in [`LOTUS_TODO.md`](./LOTUS_TODO.md).
---
## ⚠️ Needs Verification — fixed in code, awaiting live testing
Implemented and gate-green; confirm each per `LOTUS_TESTING.md`, then delete the row.
| ID | Item | File / area | Test |
| :--- | :------------------------------------------------------------------------------------- | :--------------------------------------------------- | :-------------------------------------------------------------------------------- |
| #2 | Chat-background animation flicker (`contain:paint`) | `lotus/chatBackground.ts` | F1 |
| #4 | Ringtone re-fixes: classic loudness + caller decline notice (A2 ✓ live) | `CallEmbedProvider.tsx`, `ringtones.ts` | A1,A3,A4 |
| #6 | Background vs. seasonal theme mutual exclusion | `state/settings.ts`, `General.tsx` | F2 |
| #7 | Composer toolbar touch targets (≥44px) | `room/RoomInput.tsx` | E1 |
| #8 | Room Settings horizontal overflow (mobile) | `components/page/style.css.ts` | E2 |
| #9 | Modal fullscreen on mobile (`useModalStyle`) | 22+ modal files | E3 |
| #10 | Composer not hidden by keyboard (`100dvh`) | `src/index.css` | E4 |
| #12 | PiP "All muted" badge re-fixed (was firing on any single mute) | `hooks/useCallSpeakers.ts` | G1 |
| N96 | Call-recovery overlay single "Back" button | `call/CallView.tsx` | A7 |
| N95 | AFK-monitor mic released on mute (OS indicator clears) | `hooks/useAfkAutoMute.ts` | L1 |
| N108 | Maskable PWA icons (Android adaptive) | `public/manifest.json` + `res/android/maskable-*` | L2 |
| EC | EC iframe load watchdog + self-heal + recovery UI | `plugins/call/CallEmbed.ts`, `CallView.tsx` | A7 |
| N105 | Notification clicks work after tab close (SW `notificationclick` + `showNotification`) | `sw.ts`, `utils/dom.ts`, `ClientNonUIFeatures.tsx` | get a msg notif, close the tab, click it → app focuses/opens + routes to the room |
| Gal | MediaGallery lazy-decrypt (true virtualization deferred) | `room/MediaGallery.tsx` | H1 |
| a11y | aria-labels: edit-history / reaction / thread / reply | `message/*` (`FallbackContent`, `Reaction`, `Reply`) | I |
| P3-8 | Thread Panel (side drawer, chips, threaded receipts, thread composer) | `features/room/thread/*`, `RoomTimeline/RoomInput` | 6-step checklist in LOTUS_TODO §P3-8 |
| P4-4 | KaTeX math (`$…$`, `$$…$$`, data-mx-maths; lazy chunk) | `utils/mathParse.ts`, `components/math/` | send `$x^2$`, `$$\int f$$`, `$5 and $10` (stays text), math inside code block (stays text) |
| P4-8 | Encrypted-search cache (opt-in toggle, clear button, logout wipe) | `utils/searchCache.ts`, message-search | enable in search panel → search → reload → coverage persists; logout wipes |
| N97a | Session blob migration + cross-tab logout sync | `state/sessions.ts`, `useSessionSync` | login on old build → new build migrates; logout in tab A → tab B drops to auth |
| P4-1 | Slack-style thread notifications (participating default, All/Mentions/Mute, badge math) | `utils/threadNotifications.ts`, `ClientNonUIFeatures`, `roomToUnread` | 6-step checklist in LOTUS_TODO §P4-1 |
| AW-1 | Scheduled-message cancel no longer ghost-sends (error row on failure) | `ScheduledMessagesTray.tsx` | schedule → cancel with network cut → item stays + error; retry works |
| AW-2 | Emoji lazy-load (search/autocomplete/recents fill in; board opens fast) | `plugins/emoji.ts` + consumers | first emoji-board open of a session: grid+search populate; reactions still label |
| AW-3 | SW precache (repeat-visit near-instant; deploys still picked up immediately) | `sw.ts`, `vite.config.js` | load app twice (2nd = cached assets); deploy → reload picks new version |
| AW-4 | Desktop CSP tighten + Escape/panel fixes + thread Jump to Latest | `tauri.conf.json`, Room/ThreadPanel | desktop: boots, avatars/media load, VT323 font renders, location maps embed, calls connect, deep links work |
**Verified working in live testing (2026-06):** A2, B1B4, C1, C3, D (mic/camera/deafen/screenshare/fullscreen/more-menu/PiP). Denoise quality in D is still poor — tracked under the denoise project, not a regression.
---
## 🧩 Element Call source-level items — now actionable via the fork
> 🔱 **[EC-FORK]** **UPDATE 2026-06-30: Phase 2 IMPLEMENTED.** We own and
> self-build Element Call (`LotusGuild/element-call` →
> `@lotusguild/element-call-embedded@0.20.1-lotus.1`, cinny wired). A5/A6/A7
> below are **fixed in the fork** — they are now ⚠️ awaiting **live
> verification** (`LOTUS_TESTING.md` §D2), not open work. See
> [`HANDOFF_ELEMENT_CALL_FORK.md`](./HANDOFF_ELEMENT_CALL_FORK.md) §10. Delete each
> row once verified live.
The in-call participant grid is rendered **inside EC's app** — now editable source
(previously a prebuilt npm bundle we could only style around). Status of the items
from testing:
- **A5 — "Focus camera": ⚠️ FIXED in fork, awaiting verify (D2-3).** cinny now
sends an `io.lotus.focus_participant` widget action that pins a participant in
EC's layout (coexisting with / overriding the screenshare spotlight); the old
`.click()`-the-tile DOM hack in `CallControl.ts` is deleted.
- **A6 — avatar decorations in-call: ⚠️ FIXED in fork, awaiting verify (D2-4).**
cinny pushes `io.lotus.decorations` (per-user APNG URLs) and the fork renders
them on EC's participant video-tile avatars — not just our pre-join lobby roster.
- **A7 — mic dead after EC's "Reconnect": ⚠️ FIXED in fork, awaiting verify
(D2-1).** Denoise moved into EC's mic-capture/publish pipeline as a first-class
LiveKit `TrackProcessor` (flag `lotusDenoiseSource=1`); EC re-runs it on every
(re)publish, so reconnects keep denoise alive natively. The build-time
`getUserMedia`/`index.html` injection (the root cause) is removed. **Highest
blast radius — everyone's mic; verify D2-1 carefully.**
---
## 🔴 Open — Actionable
### 🧨 Encryption / E2EE — ⚠️ EXTREME COMPLEXITY · 🧠 PLANNING SESSION REQUIRED · 👤 SENIOR ENGINEER
> 🧰 **Investigation kit ready (2026-07):** [`LOTUS_E2EE_INVESTIGATION.md`](./LOTUS_E2EE_INVESTIGATION.md)
> has the per-KE capture runbook (console signatures, synapse-side queries, the
> KE-1→KE-2 causality decision tree, ranked remediations), and the client now
> ships a **Crypto Diagnostics** capture helper (Settings) — run it during the
> next affected call and download the report before starting any fix.
> **Observed live in prod 2026-06-30** on `chat.lotusguild.org` during a 2-person
> **Element Call** (E2EE enabled). These span **client rust-crypto (via
> `matrix-js-sdk@41.6.0-rc.0`) ↔ Synapse ↔ Element Call's MatrixRTC E2EE** and are
> very likely **interrelated** (see KE-1 → KE-2). Do **not** spot-fix — they need
> a dedicated cross-system planning session with the homeserver owner. Capture
> full client console + a synapse-side trace for the same call before starting.
> **None of these are caused by the EC fork work** (the issues reproduce on the
> old build; the local mic/denoise path is unrelated to key distribution).
- **KE-1 — One-time-key (OTK) upload conflict storm (CRITICAL, root-cause candidate).**
`POST /_matrix/client/v3/keys/upload` returns `400 M_UNKNOWN: One time key
signed_curve25519:AAAAAAAAAGQ already exists. Old key: {…} new key: {…}`
firing **continuously** (many/sec). The client repeatedly tries to publish an
OTK at a key id the server already holds **with a different value**, i.e. the
rust-crypto key store and Synapse have **diverged OTK state**. Impact: floods
the crypto outgoing-request loop and is the prime suspect for the downstream
missing-key failures (no fresh OTKs ⇒ no new Olm sessions ⇒ undecryptable
to-device key events). _Investigate:_ device/key-store reset-or-restore
mismatch, OTK id-counter desync, RC-SDK (`41.6.0-rc.0`) regression, or a
Synapse OTK bug. Repro signature: grep console for `already exists`.
**Extreme — planning session.**
- **KE-2 — Element Call media keys not arriving/decrypting → audio & video cut out (CRITICAL).**
`MissingKey: missing key at index N for participant @user`, `skipping decryption
due to missing key`, `MissingKey: key set not found for @user at index 0`, and
rust-crypto `WARN … Received an unexpected encrypted to-device event …
event_type="io.element.call.encryption_keys"`. EC distributes per-participant
media keys as **encrypted to-device `io.element.call.encryption_keys`** events;
these aren't being received/decrypted in order, so remote LiveKit audio/video
can't be decrypted — **this is the "friend's audio cuts out occasionally"
symptom.** Almost certainly downstream of **KE-1** (broken Olm sessions). Spans
EC's MatrixRTC E2EE + rust-crypto to-device + Synapse. **Extreme — planning
session.**
- **KE-3 — Timeline decryption error: missing `algorithm` field (HIGH).**
`Error decrypting event (… type=m.room.encrypted …): DecryptionError[msg:
missing field 'algorithm' at line 1 column 138 …]`. A malformed/legacy
encrypted event (or a serialization mismatch in the RC SDK) that rust-crypto
can't parse. Lower frequency than KE-1/2 but a distinct decode-path failure —
capture the offending event id (`$SASBBzoqj…` seen) and inspect its raw content.
- **KE-4 — MatrixRTC delayed-event / membership timeouts (MEDIUM-HIGH, reliability).**
`[MembershipManager] Network local timeout error while sending event, immediate
retry … AbortError: Restart delayed event timed out before the HS responded`,
with repeated `org.matrix.msc4157.update_delayed_event`. MSC4140/4157
delayed-event reliability against `matrix.lotusguild.org` — can cause stale/ghost
call membership and missed leave events. May be partly **homeserver
responsiveness**; correlate with synapse latency/load. Include in the same
planning session since it shares the call-reliability + HS-interaction surface.
### Security & Privacy
- **N97 — Access token stored in plaintext `localStorage`** (`state/sessions.ts`), vulnerable to XSS; device ID likewise. Architectural — needs a token-protection / session-storage redesign.
- **Session writes are non-atomic and not cross-tab synced** (`state/sessions.ts`) — risks inconsistent state / races across tabs.
- **Persisted PII without encryption:** user status message + expiry (`settings/account/Profile.tsx`), unsent composer drafts (`room/RoomInput.tsx`). Leak risk on shared devices.
### PWA / Offline / Notifications
- **N107 — SW has no `push` handler** — Web Push delivery is entirely non-functional. Needs a `push` listener + a Matrix push-gateway integration.
- **No app-asset caching strategy** (`src/sw.ts`) — no offline capability.
- ~~**`manifest: false`** may block PWA install~~ — **verified OK (2026-06):** `index.html` links `/manifest.json`, which exists in `public/` and is copied to `dist/`; VitePWA intentionally doesn't generate one. Not a bug.
### Dependencies & Build
- ~~**`matrix-js-sdk` pinned to a Release Candidate**~~ — **done (2026-07):** moved to `41.7.0` stable (crypto-wasm 18.3.1 security bump). Deep-audit dep triage: all 16 npm advisories are dev-only/unreachable/dead-dep — zero shipped exposure; dead `dompurify` removed. `@atlaskit`/build-tool pins remain review-worthy but low priority.
- **Build-time overhead:** `lotusDenoise` does heavy sequential `fs` work in `closeBundle`; `viteStaticCopy` config is complex with redundant renames — could be streamlined.
### Code Hygiene / DevEx
- **Automated test suite — 561+ tests across 65+ modules, a hard CI gate.** `npm test` runs Node's built-in runner via `tsx` (not vitest — Vite 8 is ahead of vitest's range) and **blocks the build job on failure**. Broad pure-logic coverage: utils (common, regex, sanitize/XSS, time, matrix, matrix-uia, mimeTypes, sort, accentColor, findAndReplace, AsyncSearch, ASCIILexicalTable, keyboard, room, matrix-crypto, featureCheck, syntaxHighlight, imageCompression, user-agent, callSounds), state (settings, sessions, recentSearches, upload, typingMembers, lists, room-list, toast, scheduledMessages, backupRestore, callEmbed/callPreferences, spaceRooms, …), plugins (matrix-to, call/utils, via-servers, bad-words, recent-emoji, custom-emoji, markdown block/inline/utils), OIDC (cs-api, useParsedLoginFlows, oidcState), lotus/avatarDecorations, message-search, search filters. Prevention work has caught + fixed **4 real bugs** (`findAndReplace` infinite-loop; `getSettings` crash-on-load when storage is blocked; `isMacOS` never matching modern Macs; `isMLDenoiseSupported` throwing `ReferenceError` instead of returning false on browsers lacking the `AudioWorkletNode` binding). **Next:** component/integration tests (the untestable-under-tsx DOM/React surface).
- **Extensive `as any` casts** across `src/` — gradual typing cleanup.
- **`types/matrix/` mirrors SDK types** instead of importing them — drift risk.
- ~~**Hardcoded CDN URL** should move to an env var~~ — **done:** `avatarDecorations.ts` already honors a `VITE_DECORATION_CDN` env override (lines 14-16); the in-repo literal is only the default. Nothing left.
- **`patch-folds.mjs` edits `node_modules` directly** — consider `patch-package`.
- **Infra docs:** `contrib/nginx` lacks security headers (HSTS/CSP) + uses rewrites over `try_files`; `contrib/caddy` has a placeholder path. CI/CD (`prod-deploy.yml`): sequential deploy, aggressive 1-min Netlify timeout, `package-manager-cache: false`.
- **README:** keep the fork-sync version + logo path current. (`CONTRIBUTING.md` is intentionally left as upstream Cinny's — not a Lotus concern.)
- **Architecture notes (low priority):** deep `features/` + `hooks/` nesting, many small coupled hooks, possible dead CSS/components, `SpacingVariant` / `DropTarget` recipe simplification.
- **Git workflow (forward-looking):** keep commits scoped — past monolithic "fix all bugs" commits and inconsistent prefixes hurt `git bisect`.
### Big Projects
- ~~**#5 — Seasonal themes & chat-background redesign.**~~ **DONE (2026-06/07):** 11 seasonal/holiday overlays shipped and later toned down + given a settings preview grid; all 19 chat backgrounds redesigned (Carbon + Aurora kept per user preference), one design sprint each, GPU-friendly CSS with `prefers-reduced-motion` + pause toggle. Remaining polish rides normal bug flow, not a "big project."