# Lotus Chat — Open Bugs & Technical Debt

**Only OPEN and awaiting-verification items live here.** Resolved findings
(fixed-and-verified, false-positives, won't-fix) have been removed to keep this
actionable — the full history is in git. Items fixed in code but not yet
verified in a real environment are in **Needs Verification** below and have
step-by-step checks in [`LOTUS_TESTING.md`](./LOTUS_TESTING.md).

> Design rules for any fix here: follow the **Native-Cinny Law** and **TDS
> Design Law** in [`LOTUS_TODO.md`](./LOTUS_TODO.md).

---

## ⚠️ Needs Verification — fixed in code, awaiting live testing

Implemented and gate-green; confirm each per `LOTUS_TESTING.md`, then delete the row.

| ID | Item | File / area | Test |
| :--- | :--- | :--- | :--- |
| #2 | Chat-background animation flicker (`contain:paint`) | `lotus/chatBackground.ts` | F1 |
| #4 | Ringtone re-fixes: classic loudness + caller decline notice (A2 ✓ live) | `CallEmbedProvider.tsx`, `ringtones.ts` | A1, A3, A4 |
| #6 | Background vs. seasonal theme mutual exclusion | `state/settings.ts`, `General.tsx` | F2 |
| #7 | Composer toolbar touch targets (≥44px) | `room/RoomInput.tsx` | E1 |
| #8 | Room Settings horizontal overflow (mobile) | `components/page/style.css.ts` | E2 |
| #9 | Modal fullscreen on mobile (`useModalStyle`) | 22+ modal files | E3 |
| #10 | Composer not hidden by keyboard (`100dvh`) | `src/index.css` | E4 |
| #12 | PiP "All muted" badge re-fixed (was firing on any single mute) | `hooks/useCallSpeakers.ts` | G1 |
| N96 | Call-recovery overlay single "Back" button | `call/CallView.tsx` | A7 |
| N95 | AFK-monitor mic released on mute (OS indicator clears) | `hooks/useAfkAutoMute.ts` | L1 |
| N108 | Maskable PWA icons (Android adaptive) | `public/manifest.json` + `res/android/maskable-*` | L2 |
| EC | EC iframe load watchdog + self-heal + recovery UI | `plugins/call/CallEmbed.ts`, `CallView.tsx` | A7 |
| N105 | Notification clicks work after tab close (SW `notificationclick` + `showNotification`) | `sw.ts`, `utils/dom.ts`, `ClientNonUIFeatures.tsx` | get a msg notif, close the tab, click it → app focuses/opens + routes to the room |
| Gal | MediaGallery lazy-decrypt (true virtualization deferred) | `room/MediaGallery.tsx` | H1 |
| a11y | aria-labels: edit-history / reaction / thread / reply | `message/*` (`FallbackContent`, `Reaction`, `Reply`) | I |
| P3-8 | Thread Panel (side drawer, chips, threaded receipts, thread composer) | `features/room/thread/*`, `RoomTimeline/RoomInput` | 6-step checklist in LOTUS_TODO §P3-8 |
| P4-4 | KaTeX math (`$…$`, `$$…$$`, data-mx-maths; lazy chunk) | `utils/mathParse.ts`, `components/math/` | send `$x^2$`, `$$\int f$$`, `$5 and $10` (stays text), math inside code block (stays text) |
| P4-8 | Encrypted-search cache (opt-in toggle, clear button, logout wipe) | `utils/searchCache.ts`, message-search | enable in search panel → search → reload → coverage persists; logout wipes |
| N97a | Session blob migration + cross-tab logout sync | `state/sessions.ts`, `useSessionSync` | login on old build → new build migrates; logout in tab A → tab B drops to auth |
| P4-1 | Slack-style thread notifications (participating default, All/Mentions/Mute, badge math) | `utils/threadNotifications.ts`, `ClientNonUIFeatures`, `roomToUnread` | 6-step checklist in LOTUS_TODO §P4-1 |
| AW-1 | Scheduled-message cancel no longer ghost-sends (error row on failure) | `ScheduledMessagesTray.tsx` | schedule → cancel with network cut → item stays + error; retry works |
| AW-2 | Emoji lazy-load (search/autocomplete/recents fill in; board opens fast) | `plugins/emoji.ts` + consumers | first emoji-board open of a session: grid+search populate; reactions still label |
| AW-3 | SW precache (repeat-visit near-instant; deploys still picked up immediately) | `sw.ts`, `vite.config.js` | load app twice (2nd = cached assets); deploy → reload picks new version |
| AW-4 | Desktop CSP tighten + Escape/panel fixes + thread Jump to Latest | `tauri.conf.json`, Room/ThreadPanel | desktop: boots, avatars/media load, VT323 font renders, location maps embed, calls connect, deep links work |

**Verified working in live testing (2026-06):** A2, B1–B4, C1, C3, D (mic/camera/deafen/screenshare/fullscreen/more-menu/PiP). Denoise quality in D is still poor — tracked under the denoise project, not a regression.

---

## 🧩 Element Call source-level items — now actionable via the fork

> 🔱 **[EC-FORK]** **UPDATE 2026-06-30: Phase 2 IMPLEMENTED.** We own and
> self-build Element Call (`LotusGuild/element-call` →
> `@lotusguild/element-call-embedded@0.20.1-lotus.1`, cinny wired). A5/A6/A7
> below are **fixed in the fork** — they are now ⚠️ awaiting **live
> verification** (`LOTUS_TESTING.md` §D2), not open work. See
> [`HANDOFF_ELEMENT_CALL_FORK.md`](./HANDOFF_ELEMENT_CALL_FORK.md) §10. Delete each
> row once verified live.

The in-call participant grid is rendered **inside EC's app** — now editable source
(previously a prebuilt npm bundle we could only style around). Status of the items
from testing:

- **A5 — "Focus camera": ⚠️ FIXED in fork, awaiting verify (D2-3).** cinny now
  sends an `io.lotus.focus_participant` widget action that pins a participant in
  EC's layout (coexisting with / overriding the screenshare spotlight); the old
  `.click()`-the-tile DOM hack in `CallControl.ts` is deleted.
- **A6 — avatar decorations in-call: ⚠️ FIXED in fork, awaiting verify (D2-4).**
  cinny pushes `io.lotus.decorations` (per-user APNG URLs) and the fork renders
  them on EC's participant video-tile avatars — not just our pre-join lobby roster.
- **A7 — mic dead after EC's "Reconnect": ⚠️ FIXED in fork, awaiting verify
  (D2-1).** Denoise moved into EC's mic-capture/publish pipeline as a first-class
  LiveKit `TrackProcessor` (flag `lotusDenoiseSource=1`); EC re-runs it on every
  (re)publish, so reconnects keep denoise alive natively. The build-time
  `getUserMedia`/`index.html` injection (the root cause) is removed. **Highest
  blast radius — everyone's mic; verify D2-1 carefully.**

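
For orientation while verifying A5/A6, here is a minimal sketch of how the two custom actions might be wrapped in a Matrix Widget API `toWidget` request. Only the action names `io.lotus.focus_participant` and `io.lotus.decorations` come from this document; the `data` payload shapes, the `buildRequest` helper, the widget id, and the example URLs are hypothetical, and the real fork may structure them differently.

```typescript
// Action names are from this document; everything else here is illustrative.
type LotusAction = 'io.lotus.focus_participant' | 'io.lotus.decorations';

// Envelope fields follow the Matrix Widget API request shape.
interface ToWidgetRequest<T> {
  api: 'toWidget';
  widgetId: string;
  requestId: string;
  action: LotusAction;
  data: T;
}

let requestCounter = 0;

// Hypothetical helper: builds the envelope; transport (postMessage) is left out.
function buildRequest<T>(widgetId: string, action: LotusAction, data: T): ToWidgetRequest<T> {
  requestCounter += 1;
  return { api: 'toWidget', widgetId, requestId: `lotus-${requestCounter}`, action, data };
}

// Assumed payload: the user to pin in EC's layout.
const focusRequest = buildRequest('ec-widget', 'io.lotus.focus_participant', {
  userId: '@alice:example.org',
});

// Assumed payload: a per-user map of decoration (APNG) URLs.
const decorationsRequest = buildRequest('ec-widget', 'io.lotus.decorations', {
  decorations: { '@alice:example.org': 'https://cdn.example.org/decorations/frame.png' },
});
```
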
---

## 🔴 Open — Actionable

### 🧨 Encryption / E2EE — ⚠️ EXTREME COMPLEXITY · 🧠 PLANNING SESSION REQUIRED · 👤 SENIOR ENGINEER

> 🧰 **Investigation kit ready (2026-07):** [`LOTUS_E2EE_INVESTIGATION.md`](./LOTUS_E2EE_INVESTIGATION.md)
> has the per-KE capture runbook (console signatures, synapse-side queries, the
> KE-1→KE-2 causality decision tree, ranked remediations), and the client now
> ships a **Crypto Diagnostics** capture helper (Settings) — run it during the
> next affected call and download the report before starting any fix.

> **Observed live in prod 2026-06-30** on `chat.lotusguild.org` during a 2-person
> **Element Call** (E2EE enabled). These span **client rust-crypto (via
> `matrix-js-sdk@41.6.0-rc.0`) ↔ Synapse ↔ Element Call's MatrixRTC E2EE** and are
> very likely **interrelated** (see KE-1 → KE-2). Do **not** spot-fix — they need
> a dedicated cross-system planning session with the homeserver owner. Capture
> full client console + a synapse-side trace for the same call before starting.
> **None of these are caused by the EC fork work** (the issues reproduce on the
> old build; the local mic/denoise path is unrelated to key distribution).

- **KE-1 — One-time-key (OTK) upload conflict storm (CRITICAL, root-cause candidate).**
  `POST /_matrix/client/v3/keys/upload` returns `400 M_UNKNOWN: One time key
  signed_curve25519:AAAAAAAAAGQ already exists. Old key: {…} new key: {…}` —
  firing **continuously** (many/sec). The client repeatedly tries to publish an
  OTK at a key id the server already holds **with a different value**, i.e. the
  rust-crypto key store and Synapse have **diverged OTK state**. Impact: floods
  the crypto outgoing-request loop and is the prime suspect for the downstream
  missing-key failures (no fresh OTKs ⇒ no new Olm sessions ⇒ undecryptable
  to-device key events). _Investigate:_ device/key-store reset-or-restore
  mismatch, OTK id-counter desync, RC-SDK (`41.6.0-rc.0`) regression, or a
  Synapse OTK bug. Repro signature: grep console for `already exists`.
  **Extreme — planning session.**

- **KE-2 — Element Call media keys not arriving/decrypting → audio & video cut out (CRITICAL).**
  `MissingKey: missing key at index N for participant @user`, `skipping decryption
  due to missing key`, `MissingKey: key set not found for @user at index 0`, and
  rust-crypto `WARN … Received an unexpected encrypted to-device event …
  event_type="io.element.call.encryption_keys"`. EC distributes per-participant
  media keys as **encrypted to-device `io.element.call.encryption_keys`** events;
  these aren't being received/decrypted in order, so remote LiveKit audio/video
  can't be decrypted — **this is the "friend's audio cuts out occasionally"
  symptom.** Almost certainly downstream of **KE-1** (broken Olm sessions). Spans
  EC's MatrixRTC E2EE + rust-crypto to-device + Synapse. **Extreme — planning
  session.**

- **KE-3 — Timeline decryption error: missing `algorithm` field (HIGH).**
  `Error decrypting event (… type=m.room.encrypted …): DecryptionError[msg:
  missing field 'algorithm' at line 1 column 138 …]`. A malformed/legacy
  encrypted event (or a serialization mismatch in the RC SDK) that rust-crypto
  can't parse. Lower frequency than KE-1/2 but a distinct decode-path failure —
  capture the offending event id (`$SASBBzoqj…` seen) and inspect its raw content.

- **KE-4 — MatrixRTC delayed-event / membership timeouts (MEDIUM-HIGH, reliability).**
  `[MembershipManager] Network local timeout error while sending event, immediate
  retry … AbortError: Restart delayed event timed out before the HS responded`,
  with repeated `org.matrix.msc4157.update_delayed_event`. MSC4140/4157
  delayed-event reliability against `matrix.lotusguild.org` — can cause stale/ghost
  call membership and missed leave events. May be partly **homeserver
  responsiveness**; correlate with synapse latency/load. Include in the same
  planning session since it shares the call-reliability + HS-interaction surface.

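
When triaging a captured console log for the planning session, the signatures quoted in KE-1 to KE-4 can be counted mechanically. The following is a minimal sketch, not part of the Crypto Diagnostics helper: the function names and the exact regular expressions are assumptions derived from the excerpts above, and real log lines may be worded slightly differently.

```typescript
type KnownIssue = 'KE-1' | 'KE-2' | 'KE-3' | 'KE-4';

// Patterns are derived from the console excerpts quoted in the KE items (assumed stable).
const SIGNATURES: ReadonlyArray<readonly [KnownIssue, RegExp]> = [
  ['KE-1', /One time key \S+ already exists/],
  ['KE-2', /MissingKey:|skipping decryption due to missing key|io\.element\.call\.encryption_keys/],
  ['KE-3', /missing field [`'"]algorithm[`'"]/],
  ['KE-4', /Restart delayed event timed out|org\.matrix\.msc4157\.update_delayed_event/],
];

// Returns the first matching KE id for a console line, or null for unrelated lines.
function classifyLine(line: string): KnownIssue | null {
  for (const [id, pattern] of SIGNATURES) {
    if (pattern.test(line)) return id;
  }
  return null;
}

// Counts how often each signature appears in a captured log.
function tally(lines: Iterable<string>): Record<KnownIssue, number> {
  const counts: Record<KnownIssue, number> = { 'KE-1': 0, 'KE-2': 0, 'KE-3': 0, 'KE-4': 0 };
  for (const line of lines) {
    const id = classifyLine(line);
    if (id !== null) counts[id] += 1;
  }
  return counts;
}
```

Feeding the exported console text through `tally(text.split('\n'))` gives a rough per-issue frequency; comparing the KE-1 count against the KE-2 count over the same call is one cheap input to the KE-1 → KE-2 causality question, not proof of it.
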
### Security & Privacy

- **N97 — Access token stored in plaintext `localStorage`** (`state/sessions.ts`), vulnerable to XSS; device ID likewise. Architectural — needs a token-protection / session-storage redesign.
- **Session writes are non-atomic and not cross-tab synced** (`state/sessions.ts`) — risks inconsistent state / races across tabs.
- **Persisted PII without encryption:** user status message + expiry (`settings/account/Profile.tsx`), unsent composer drafts (`room/RoomInput.tsx`). Leak risk on shared devices.

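
To make the atomicity and cross-tab points concrete, here is a minimal sketch of a single-blob session write plus a pure decision function for the browser `storage` event. It is not the code in `state/sessions.ts`: the key name `lotus.session`, the `SessionBlob` fields, and all function names are hypothetical, and it does nothing about the plaintext-token problem in N97.

```typescript
// Minimal Storage-like surface so the sketch also runs outside a browser.
interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

interface SessionBlob {
  userId: string;
  deviceId: string;
  accessToken: string;
}

const SESSION_KEY = 'lotus.session'; // hypothetical key name

// One setItem call: other tabs read either the old blob or the new one,
// never a mix of fields written by separate calls.
function writeSession(store: KeyValueStore, session: SessionBlob): void {
  store.setItem(SESSION_KEY, JSON.stringify(session));
}

function readSession(store: KeyValueStore): SessionBlob | null {
  const raw = store.getItem(SESSION_KEY);
  if (raw === null) return null;
  try {
    return JSON.parse(raw) as SessionBlob;
  } catch {
    return null; // corrupt blob: treat as logged out
  }
}

type TabReaction = 'logout' | 'reload-session' | 'ignore';

// Maps the `key` / `newValue` of a StorageEvent to what this tab should do.
function reactToStorageEvent(key: string | null, newValue: string | null): TabReaction {
  if (key === null) return 'logout'; // storage.clear() reports key === null
  if (key !== SESSION_KEY) return 'ignore';
  return newValue === null ? 'logout' : 'reload-session';
}
```

In a browser this would be wired as `window.addEventListener('storage', (e) => reactToStorageEvent(e.key, e.newValue))`; the `storage` event only fires in tabs other than the one that made the change.
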
### PWA / Offline / Notifications

- **N107 — SW has no `push` handler** — Web Push delivery is entirely non-functional. Needs a `push` listener + a Matrix push-gateway integration.
- **No app-asset caching strategy** (`src/sw.ts`) — no offline capability.
- ~~**`manifest: false`** may block PWA install~~ — **verified OK (2026-06):** `index.html` links `/manifest.json`, which exists in `public/` and is copied to `dist/`; VitePWA intentionally doesn't generate one. Not a bug.

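
For N107, the shape of a `push` listener can be sketched as below. This is an assumption-laden outline, not a design: the `PushPayload` fields are hypothetical (the real payload depends on the push gateway chosen), `planNotification` is an illustrative helper, and the local `*Like` interfaces stand in for the service-worker types so the block stays self-contained. The browser calls used (`event.data.json()`, `event.waitUntil`, `registration.showNotification`) are the standard Service Worker and Push API ones.

```typescript
// Hypothetical payload; the real fields depend on the push-gateway integration.
interface PushPayload {
  room_id?: string;
  room_name?: string;
  sender_display_name?: string;
}

interface NotificationPlan {
  title: string;
  options: { body: string; tag?: string; data: { roomId?: string } };
}

// Pure mapping from payload to notification content, easy to unit-test.
function planNotification(payload: PushPayload): NotificationPlan {
  const title = payload.room_name ?? payload.sender_display_name ?? 'New message';
  const body = payload.sender_display_name
    ? `${payload.sender_display_name} sent a message`
    : 'You have a new message';
  return { title, options: { body, tag: payload.room_id, data: { roomId: payload.room_id } } };
}

// Minimal stand-ins for the service-worker globals used below.
interface PushEventLike {
  data: { json(): unknown } | null;
  waitUntil(promise: Promise<unknown>): void;
}
interface WorkerScopeLike {
  registration?: { showNotification(title: string, options?: object): Promise<void> };
  addEventListener?: (type: 'push', listener: (event: PushEventLike) => void) => void;
}

// Register only inside a service worker; in Node or a page this is a no-op.
const workerScope = globalThis as unknown as WorkerScopeLike;
const registration = workerScope.registration;
if (registration && typeof workerScope.addEventListener === 'function') {
  workerScope.addEventListener('push', (event) => {
    let payload: PushPayload = {};
    try {
      payload = event.data ? (event.data.json() as PushPayload) : {};
    } catch {
      // Non-JSON push body: fall back to the generic notification.
    }
    const { title, options } = planNotification(payload);
    event.waitUntil(registration.showNotification(title, options));
  });
}
```

Keeping the payload-to-notification mapping pure means it can join the existing `npm test` suite, while the listener itself still needs live verification alongside the `notificationclick` path from N105.
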
### Dependencies & Build

- ~~**`matrix-js-sdk` pinned to a Release Candidate**~~ — **done (2026-07):** moved to `41.7.0` stable (crypto-wasm 18.3.1 security bump). Deep-audit dep triage: all 16 npm advisories are dev-only/unreachable/dead-dep — zero shipped exposure; dead `dompurify` removed. `@atlaskit`/build-tool pins remain review-worthy but low priority.
- **Build-time overhead:** `lotusDenoise` does heavy sequential `fs` work in `closeBundle`; `viteStaticCopy` config is complex with redundant renames — could be streamlined.

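
One possible direction for the `closeBundle` overhead is to run independent copies concurrently instead of one after another. The sketch below assumes the work is plain file copying, which may not match what `lotusDenoise` actually does; `CopyJob`, `copyAll`, and `lotusDenoiseCopy` are hypothetical names, and only the `closeBundle` hook itself comes from the text above.

```typescript
import { copyFile, mkdir } from 'node:fs/promises';
import { dirname } from 'node:path';

interface CopyJob {
  from: string;
  to: string;
}

// Creates each distinct target directory once, then copies all files concurrently.
async function copyAll(jobs: readonly CopyJob[]): Promise<number> {
  const dirs = [...new Set(jobs.map((job) => dirname(job.to)))];
  await Promise.all(dirs.map((dir) => mkdir(dir, { recursive: true })));
  await Promise.all(jobs.map((job) => copyFile(job.from, job.to)));
  return jobs.length;
}

// Hypothetical plugin shape using the Rollup/Vite `closeBundle` hook.
function lotusDenoiseCopy(jobs: readonly CopyJob[]) {
  return {
    name: 'lotus-denoise-copy',
    async closeBundle(): Promise<void> {
      await copyAll(jobs);
    },
  };
}
```

Whether this helps depends on how many files are involved and whether later steps depend on earlier ones; for a handful of small files the sequential version may already be fast enough.
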
### Code Hygiene / DevEx

- **Automated test suite — 561+ tests across 65+ modules, a hard CI gate.** `npm test` runs Node's built-in runner via `tsx` (not vitest — Vite 8 is ahead of vitest's range) and **blocks the build job on failure**. Broad pure-logic coverage:
  - **utils:** common, regex, sanitize/XSS, time, matrix, matrix-uia, mimeTypes, sort, accentColor, findAndReplace, AsyncSearch, ASCIILexicalTable, keyboard, room, matrix-crypto, featureCheck, syntaxHighlight, imageCompression, user-agent, callSounds
  - **state:** settings, sessions, recentSearches, upload, typingMembers, lists, room-list, toast, scheduledMessages, backupRestore, callEmbed/callPreferences, spaceRooms, …
  - **plugins:** matrix-to, call/utils, via-servers, bad-words, recent-emoji, custom-emoji, markdown block/inline/utils
  - **OIDC:** cs-api, useParsedLoginFlows, oidcState
  - **other:** lotus/avatarDecorations, message-search, search filters
  - **Bugs caught:** prevention work has caught + fixed **4 real bugs** (`findAndReplace` infinite-loop; `getSettings` crash-on-load when storage is blocked; `isMacOS` never matching modern Macs; `isMLDenoiseSupported` throwing `ReferenceError` instead of returning false on browsers lacking the `AudioWorkletNode` binding).
  - **Next:** component/integration tests (the untestable-under-tsx DOM/React surface).
- **Extensive `as any` casts** across `src/` — gradual typing cleanup.
- **`types/matrix/` mirrors SDK types** instead of importing them — drift risk.
- ~~**Hardcoded CDN URL** should move to an env var~~ — **done:** `avatarDecorations.ts` already honors a `VITE_DECORATION_CDN` env override (lines 14-16); the in-repo literal is only the default. Nothing left.
- **`patch-folds.mjs` edits `node_modules` directly** — consider `patch-package`.
- **Infra docs:** `contrib/nginx` lacks security headers (HSTS/CSP) + uses rewrites over `try_files`; `contrib/caddy` has a placeholder path. CI/CD (`prod-deploy.yml`): sequential deploy, aggressive 1-min Netlify timeout, `package-manager-cache: false`.
- **README:** keep the fork-sync version + logo path current. (`CONTRIBUTING.md` is intentionally left as upstream Cinny's — not a Lotus concern.)
- **Architecture notes (low priority):** deep `features/` + `hooks/` nesting, many small coupled hooks, possible dead CSS/components, `SpacingVariant` / `DropTarget` recipe simplification.
- **Git workflow (forward-looking):** keep commits scoped — past monolithic "fix all bugs" commits and inconsistent prefixes hurt `git bisect`.

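
The `isMLDenoiseSupported` bug listed above is an instance of a general pattern: referencing a bare global identifier throws `ReferenceError` where the binding does not exist, while a property lookup on the global object just yields `undefined`. The sketch below illustrates that pattern only; it is not the shipped function, the name `supportsWorkletDenoise` and the injectable `scope` parameter are assumptions, and the real check may test more than `AudioWorkletNode`.

```typescript
interface AudioGlobals {
  AudioWorkletNode?: unknown;
}

// Property access on an object never throws for a missing key, so this
// degrades to `false` instead of raising ReferenceError.
function supportsWorkletDenoise(scope: AudioGlobals = globalThis as AudioGlobals): boolean {
  return typeof scope.AudioWorkletNode === 'function';
}

// Passing the scope explicitly keeps the check testable under Node's runner.
const withoutBinding = supportsWorkletDenoise({});
const withBinding = supportsWorkletDenoise({ AudioWorkletNode: class {} });
```

Taking the global scope as a parameter is a design choice for testability: the pure-logic suite can cover both branches without a browser.
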
### Big Projects

- ~~**#5 — Seasonal themes & chat-background redesign.**~~ **DONE (2026-06/07):** 11 seasonal/holiday overlays shipped and later toned down + given a settings preview grid; all 19 chat backgrounds redesigned (Carbon + Aurora kept per user preference), one design sprint each, GPU-friendly CSS with `prefers-reduced-motion` + pause toggle. Remaining polish rides normal bug flow, not a "big project."