Compare commits: 7c06b27c73...ebc782b16c (2 commits)

| Author | SHA1 | Date |
|---|---|---|
| | ebc782b16c | |
| | 7939dc92d4 | |
@@ -548,8 +548,19 @@ roster capped. Only `https`/`blob` URLs accepted for inject/decoration assets.
 
 ### 12.1 cinny host integration checklist (REQUIRED to light these up)
 
-The EC side is additive and dormant until cinny opts in. Host work needed (in
-`src/app/plugins/call/CallEmbed.ts` unless noted):
+> ✅ **STATUS (2026-06): COMPLETE.** All items below are shipped. call_state,
+> focus_participant, decorations, and transparent background are active; the
+> in-source denoise cutover is done (flag `lotusDenoiseSource=1`, **all four**
+> models in-source); and the two formerly-dormant capabilities now have cinny
+> UI — **soundboard** (`io.lotus.inject_audio`, P5-15) and **quality controls +
+> room permissions** (`io.lotus.set_quality` + `io.lotus.room_quality`, P5-31,
+> with server-side enforcement in `LotusGuild/matrix`). See `LOTUS_FEATURES.md`
+> → "Element Call — Self-Built Fork". The checklist is kept below as the record
+> of what was wired. (One open denoise item tracked separately: the "Series
+> Suppression" native-NS toggle is not wired to the real call path.)
+
+The EC side is additive and dormant until cinny opts in. Host work (in
+`src/app/plugins/call/CallEmbed.ts` unless noted) — **done**:
 
 > ⚠️ **CRITICAL TIMING (protocol audit F1):** only send `io.lotus.*` **toWidget**
 > actions (#3 focus, #6 decorations, #7 quality, audio-inject) **after** the call
@@ -559,16 +570,16 @@ The EC side is additive and dormant until cinny opts in. Host work needed (in
 > leaves the host's `transport.send` pending until the **10s timeout**. Queue and
 > flush on join, or no-op before join.
 >
-> Also: **F3** — the fork implements only `rnnoise`/`speex`; cinny's `dtln`/
-> `deepfilternet` selections silently fall back to rnnoise (now logged). Restrict
-> the embedded-call model picker to rnnoise/speex, or implement the others in
-> `lotusDenoiseProcessor.ts`. **F4** — cinny sends `lotusNativeNS`, which the
-> fork ignores; drop it or wire it in. **F7** — no widget _capability_ changes
-> needed; custom actions bypass capability checks.
+> Also: **F3 (RESOLVED)** — all four models (`rnnoise`/`speex`/`dtln`/
+> `deepfilternet`) are now implemented in-source in `lotusDenoiseProcessor.ts`;
+> the picker offers all four. **F4** — cinny no longer forwards a native-NS flag
+> in the `ml` branch (the "Series Suppression" toggle is currently a no-op in
+> real calls — open item). **F7** — no widget _capability_ changes needed;
+> custom actions bypass capability checks.
 
 1. **Set the URL flags** on the widget iframe params (the `URLSearchParams` in
    `CallEmbed`): `lotusCallState=1`, `lotusTransparent=1`/`lotusTheme=1`,
-   `lotusAudioInject=1` as desired. (Denoise already sets `lotusDenoise=ml` etc.)
+   `lotusAudioInject=1` as desired. (Denoise sets `lotusDenoiseSource=1` + `lotusModel`/`lotusGate`/`lotusGateThreshold` in the `ml` tier.)
 2. **Ack `io.lotus.call_state`**: add `listenAction('io.lotus.call_state', …)` —
    without a reply the fork's sends time out every 250ms. Feed the payload into
    `useCallSpeakers` and RETIRE its `contentDocument` DOM scrape.
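The F1 timing note in this hunk asks the host to queue `io.lotus.*` toWidget actions and flush them on join. A minimal sketch of that pattern follows, assuming a synchronous `send` callback that stands in for the real widget transport. `JoinGatedSender`, `post`, `markJoined`, and the payload shape are hypothetical names for illustration and do not come from cinny or the fork.

```typescript
// Hypothetical helper: buffers toWidget actions until the call has been joined.
type WidgetAction = { action: string; data: unknown };
type SendFn = (action: string, data: unknown) => void;

class JoinGatedSender {
  private joined = false;
  private queue: WidgetAction[] = [];
  private readonly send: SendFn;

  constructor(send: SendFn) {
    this.send = send;
  }

  // Before join: queue the action. After join: pass it straight through.
  post(action: string, data: unknown): void {
    if (this.joined) {
      this.send(action, data);
    } else {
      this.queue.push({ action, data });
    }
  }

  // Call once the host observes the join; flushes queued actions in FIFO order.
  markJoined(): void {
    this.joined = true;
    const pending = this.queue;
    this.queue = [];
    for (const { action, data } of pending) {
      this.send(action, data);
    }
  }
}

// Usage with a recording stand-in for the real transport.
const sent: string[] = [];
const sender = new JoinGatedSender((action) => {
  sent.push(action);
});
// The payload shape below is made up for the example.
sender.post('io.lotus.set_quality', { audioBitrate: 32000 });
const sentBeforeJoin = sent.length;
sender.markJoined();
```

Nothing reaches the transport before `markJoined()`, so a pre-join action cannot sit pending until the 10s timeout. The alternative the note mentions, a no-op before join, would drop the action instead of queueing it.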
Diffstat: +1 -1
@@ -560,7 +560,7 @@ Exhaustive, low-level implementation details for backlog items. Follow these pat
 
 > ⚠️ **[Gemini_Found — CORRECTED]** Gemini originally suggested using LiveKit's `LocalAudioTrack.replaceTrack()` to mix audio into the call stream. This is **not possible** from Lotus Chat's realm: Element Call runs in a **cross-origin iframe** controlled via `matrix-widget-api` (postMessage). LiveKit's JS SDK and its `LocalAudioTrack` live inside EC's sandboxed context — inaccessible from our code. This directly contradicts the confirmed constraint already listed in the Server Capabilities table: _"Cindy CANNOT inject audio into EC call stream — In-call soundboard must be redesigned as local-only."_ The soundboard must be a local-playback-only feature (output through the user's speakers, not mixed into the call audio stream).
 >
-> 🔱 **[EC-FORK — RESOLVED]** Both the original claim and the earlier "practical blocker still holds" correction are now **outdated**. EC is same-origin **and** we own the source, so we no longer reach into EC's module scope from cinny — instead the fork **exposes the inject point itself**: the `io.lotus.inject_audio` widget action (`LotusWidgetActions.InjectAudio`) publishes a clip as a separate LiveKit track from inside EC. A **real** in-call soundboard (mixed into the call, not local-only) is therefore unblocked; only the cinny-side UI remains (see P5-15 above). The capability ships dormant today.
+> 🔱 **[EC-FORK — RESOLVED]** Both the original claim and the earlier "practical blocker still holds" correction are now **outdated**. EC is same-origin **and** we own the source, so we no longer reach into EC's module scope from cinny — instead the fork **exposes the inject point itself**: the `io.lotus.inject_audio` widget action (`LotusWidgetActions.InjectAudio`) publishes a clip as a separate LiveKit track from inside EC. A **real** in-call soundboard (mixed into the call, not local-only) is therefore unblocked, and the cinny-side soundboard UI is now **built** (P5-15 above): uploadable clips played into the call via this action, stored in `io.lotus.soundboard` account data.
 
 ---
 
@@ -52,6 +52,9 @@ The Lotus Chat logo (`public/res/Lotus.png`) is a derivative work based on the o
 - AFK auto-mute: mic is automatically silenced after a configurable idle timeout (1–30 min); a toast confirms the action
 - Voice channel user limit: admins can cap how many people can be in a room's call — enforced server-side for every Matrix client (not just Lotus Chat); others see "Channel Full" until a spot opens
 - Custom join/leave sound effects when someone enters or leaves your call — choose Chime, Soft, Retro, or off
+- Soundboard: upload your own short audio clips (like custom emojis — they sync across your devices) and play them into a call so everyone hears them
+- Call quality settings: cap your microphone bitrate, screenshare bitrate, and screenshare framerate — handy on a slow connection (Settings → Calls)
+- Room call permissions: admins can turn off screen sharing or make a room audio-only (no cameras) — enforced server-side for every Matrix client, and it stops an in-progress share within seconds of being switched off
 
 ### Customization & Appearance
 
@@ -138,9 +138,9 @@ export class CallEmbed {
 themeKind: ElementCallThemeKind,
 denoiseMode: NoiseSuppressionMode = 'browser',
 denoiseModel: string = 'rnnoise',
-// [lotus] no longer used by the in-source denoise path; kept positionally
-// for callers. Prefixed with _ to satisfy no-unused-vars.
-_denoiseNativeNS: boolean = true,
+// [lotus] "Series suppression": also run EC's built-in WebRTC NS before the
+// in-source ML model (opt-in test aid for stacking browser NS + ML).
+denoiseNativeNS: boolean = false,
 denoiseGate: boolean = false,
 denoiseGateThreshold: number = -45,
 initialAudio = true,
@@ -166,10 +166,14 @@ export class CallEmbed {
 perParticipantE2EE: room.hasEncryptionStateEvent().toString(),
 lang: 'en-EN',
 theme: themeKind,
-// EC's built-in WebRTC suppressor: on only for 'browser' tier. For 'ml'
-// we disable it so EC captures a raw mic and the fork's in-source denoise
-// TrackProcessor (lotusDenoiseSource) handles the pipeline.
-noiseSuppression: (denoiseMode === 'browser').toString(),
+// EC's built-in WebRTC suppressor: on for the 'browser' tier, and for the
+// 'ml' tier only when "series suppression" is opted into (stack browser NS
+// before the fork's in-source ML model). Plain 'ml' keeps it OFF so the
+// fork's TrackProcessor (lotusDenoiseSource) gets a raw mic.
+noiseSuppression: (
+  denoiseMode === 'browser' ||
+  (denoiseMode === 'ml' && denoiseNativeNS)
+).toString(),
 audio: initialAudio.toString(),
 video: initialVideo.toString(),
 header: 'none',
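The new `noiseSuppression` expression in this hunk is easiest to check as a truth table. The sketch below isolates it in a standalone function. `nativeNsParam` is a hypothetical name, and the `'off'` tier is an assumption added only to show the remaining false branch, since the diff shows just the `'browser'` and `'ml'` values of `NoiseSuppressionMode`.

```typescript
// 'off' is a hypothetical third tier; only 'browser' and 'ml' appear in the diff.
type NoiseSuppressionMode = 'off' | 'browser' | 'ml';

// Mirrors the expression in the hunk: EC's built-in suppressor is on for the
// 'browser' tier, and for 'ml' only when series suppression is opted into.
function nativeNsParam(mode: NoiseSuppressionMode, nativeNS: boolean): string {
  return (mode === 'browser' || (mode === 'ml' && nativeNS)).toString();
}

// Print every combination of tier and series-suppression setting.
const modes: NoiseSuppressionMode[] = ['off', 'browser', 'ml'];
for (const mode of modes) {
  for (const nativeNS of [false, true]) {
    console.log(`${mode} nativeNS=${nativeNS} -> noiseSuppression=${nativeNsParam(mode, nativeNS)}`);
  }
}
```

The result is a string because it is set as a URL parameter, which is why the hunk calls `.toString()` on the boolean.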
@@ -236,9 +236,13 @@ const defaultSettings: Settings = {
   perMessageProfiles: false,
 
   cameraOnJoin: false,
+  // Tier default stays browser-native (known-good; best-perceived in testing so
+  // far). If a user opts into the ML tier, default to the highest-quality model.
   callNoiseSuppression: 'browser',
-  callDenoiseModel: 'rnnoise',
-  callDenoiseNativeNS: true,
+  callDenoiseModel: 'deepfilternet',
+  // "Series suppression" (stack the browser's native NS before the ML model) is
+  // off by default — best practice is a single NS stage; it's an opt-in test aid.
+  callDenoiseNativeNS: false,
   callDenoiseGate: false,
   callDenoiseGateThreshold: -45,
   pttMode: false,
@@ -1,18 +1,14 @@
 import { test, beforeEach, afterEach } from 'node:test';
 import assert from 'node:assert/strict';
 
-import {
-  DENOISE_MODELS,
-  ML_DENOISE_REQUIREMENTS,
-  isMLDenoiseSupported,
-} from './lotusDenoiseUtils';
+import { DENOISE_MODELS, ML_DENOISE_REQUIREMENTS, isMLDenoiseSupported } from './lotusDenoiseUtils';
 
 // ── Model catalog (data integrity) ──────────────────────────────────────────
 
-test('DENOISE_MODELS lists the four expected models in order', () => {
+test('DENOISE_MODELS lists the four models ordered best-quality (highest CPU) first', () => {
   assert.deepEqual(
     DENOISE_MODELS.map((m) => m.id),
-    ['rnnoise', 'speex', 'dtln', 'deepfilternet'],
+    ['deepfilternet', 'dtln', 'rnnoise', 'speex'],
   );
 });
 
@@ -1,5 +1,8 @@
 /**
- * Detection utilities for Lotus ML noise suppression (RNNoise).
+ * Detection utilities + model catalog for Lotus ML noise suppression
+ * (DeepFilterNet 3 / DTLN / RNNoise / Speex). The catalog is ordered by
+ * quality (and, correspondingly, CPU cost) — highest first — and drives the
+ * order of the model dropdown in settings.
  */
 
 import { DenoiseModelId } from '../state/settings';
@@ -14,42 +17,47 @@ export type DenoiseModel = {
   voiceQuality: 'Moderate' | 'High' | 'Very High';
 };
 
+// Ordered best-quality (highest CPU) first — this is the dropdown order.
 export const DENOISE_MODELS: DenoiseModel[] = [
   {
-    id: 'rnnoise',
-    name: 'RNNoise',
-    description: 'Lightweight hybrid model. Best for consistent noise like fans.',
-    cpuUsage: '< 5%',
-    binarySize: '< 1 MB',
-    transients: 'Good',
-    voiceQuality: 'High',
-  },
-  {
-    id: 'speex',
-    name: 'Speex (Legacy)',
-    description: 'Classic DSP noise suppressor. Minimal CPU, gentler on voice.',
-    cpuUsage: '< 2%',
-    binarySize: '< 1 MB',
-    transients: 'Poor',
-    voiceQuality: 'Moderate',
+    id: 'deepfilternet',
+    name: 'DeepFilterNet 3 (beta)',
+    description:
+      'Studio-grade deep-learning model (48 kHz fullband, ONNX). Best quality; highest CPU and a larger one-time download.',
+    cpuUsage: '25-50%',
+    binarySize: '~18 MB',
+    transients: 'Excellent',
+    voiceQuality: 'Very High',
   },
   {
     id: 'dtln',
     name: 'DTLN (beta)',
-    description: 'Deep-learning model (TFLite). Stronger on transient noise; higher CPU.',
+    description:
+      'Dual-signal deep-learning model (16 kHz). Strong on transient noise; moderate CPU.',
     cpuUsage: '10-20%',
     binarySize: '~4 MB',
     transients: 'Excellent',
     voiceQuality: 'High',
   },
   {
-    id: 'deepfilternet',
-    name: 'DeepFilterNet 3 (beta)',
-    description: 'Studio-grade deep-learning model (48 kHz, ONNX). Best quality; highest CPU.',
-    cpuUsage: '25-50%',
-    binarySize: '~18 MB',
-    transients: 'Excellent',
-    voiceQuality: 'Very High',
+    id: 'rnnoise',
+    name: 'RNNoise',
+    description:
+      'Lightweight hybrid model (48 kHz). Very low CPU; good for steady noise like fans, but can sound processed at full strength.',
+    cpuUsage: '< 5%',
+    binarySize: '< 1 MB',
+    transients: 'Good',
+    voiceQuality: 'Moderate',
+  },
+  {
+    id: 'speex',
+    name: 'Speex (Legacy)',
+    description:
+      'Classic DSP noise suppressor. Minimal CPU, gentlest on voice; weakest suppression.',
+    cpuUsage: '< 2%',
+    binarySize: '< 1 MB',
+    transients: 'Poor',
+    voiceQuality: 'Moderate',
   },
 ];
 
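The reordered catalog relies on a convention (best quality and highest CPU first) that the updated test pins only by id. As a sketch of how the CPU half of that convention could be checked from the data itself, the snippet below parses the upper bound out of each `cpuUsage` string. `CatalogEntry`, `cpuUpperBound`, and `isOrderedByCpuDescending` are hypothetical names, and `CATALOG` is a trimmed local copy of the new `DENOISE_MODELS` values, not the real export.

```typescript
// Local stand-ins; the real DenoiseModelId is imported from '../state/settings'.
type DenoiseModelId = 'deepfilternet' | 'dtln' | 'rnnoise' | 'speex';
interface CatalogEntry {
  id: DenoiseModelId;
  cpuUsage: string;
}

// Trimmed copy of the ids and cpuUsage strings from the new catalog.
const CATALOG: readonly CatalogEntry[] = [
  { id: 'deepfilternet', cpuUsage: '25-50%' },
  { id: 'dtln', cpuUsage: '10-20%' },
  { id: 'rnnoise', cpuUsage: '< 5%' },
  { id: 'speex', cpuUsage: '< 2%' },
];

// Largest number found in a string such as '25-50%' or '< 5%'.
function cpuUpperBound(cpuUsage: string): number {
  const numbers = cpuUsage.match(/\d+/g);
  if (!numbers) throw new Error(`no number in cpuUsage: ${cpuUsage}`);
  return Math.max(...numbers.map(Number));
}

// True when each entry's CPU upper bound is no higher than its predecessor's.
function isOrderedByCpuDescending(catalog: readonly CatalogEntry[]): boolean {
  return catalog.every(
    (entry, i) =>
      i === 0 || cpuUpperBound(catalog[i - 1].cpuUsage) >= cpuUpperBound(entry.cpuUsage),
  );
}

console.log(isOrderedByCpuDescending(CATALOG));
```

A check like this would catch a new entry added in the wrong position, but it depends on the free-form `cpuUsage` strings staying parseable, so it is a convenience rather than a replacement for the explicit id-order assertion.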
@@ -67,8 +75,14 @@ export const isMLDenoiseSupported = (): boolean => {
   // instead of returning false.
   const hasAudioWorklet = hasAudioContext && typeof AudioWorkletNode !== 'undefined';
   const hasGetUserMedia = !!(navigator.mediaDevices && navigator.mediaDevices.getUserMedia);
+  // Every ML model compiles WebAssembly (and DFN/DTLN load worklets via blob
+  // URLs). Under a strict CSP without `wasm-unsafe-eval` (e.g. some desktop/Tauri
+  // shells) WASM is unavailable, so gate on it — otherwise we'd offer ML and then
+  // silently fall back to the raw mic in-call.
+  const hasWasm =
+    typeof WebAssembly !== 'undefined' && typeof WebAssembly.instantiate === 'function';
 
-  return hasAudioWorklet && hasGetUserMedia;
+  return hasAudioWorklet && hasGetUserMedia && hasWasm;
 };
 
 /**
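The `hasWasm` gate in this hunk checks that the `WebAssembly` API exists. Whether a content security policy actually permits compilation may only become visible when a module is compiled, so a stricter probe is sketched below under that assumption. `canCompileWasm` is a hypothetical name and is not what the commit does. It compiles the smallest valid module (the 8-byte header) and treats any thrown error as "unavailable".

```typescript
// Hypothetical stricter probe: try to compile an empty WebAssembly module.
function canCompileWasm(): boolean {
  if (typeof WebAssembly === 'undefined') return false;
  try {
    // Smallest valid module: the "\0asm" magic bytes followed by version 1.
    const emptyModule = new Uint8Array([0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00]);
    // Synchronous compile; expected to throw when compilation is disallowed.
    new WebAssembly.Module(emptyModule);
    return true;
  } catch {
    return false;
  }
}

console.log(`WebAssembly compile available: ${canCompileWasm()}`);
```

The trade-off is a small synchronous compile on each call, so a host that adopted something like this would probably want to cache the result rather than probe on every settings render.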
@@ -77,6 +91,6 @@ export const isMLDenoiseSupported = (): boolean => {
 export const ML_DENOISE_REQUIREMENTS = [
   'Modern browser with Web Audio API support',
   'AudioWorklet support (Chrome 66+, Firefox 76+, Safari 14.1+)',
+  'WebAssembly (WASM) support',
   'Microphone access',
-  '48kHz AudioContext capability',
 ];