feat(calls): implement advanced multi-model ML noise suppression system
Implement a flexible, multi-model noise suppression pipeline for Element Call/LiveKit integration: - ML Engines: Added support for RNNoise, Speex, DTLN, and DeepFilterNet 3 models. - Pipeline Architecture: Implemented modular audio processing in lotus-denoise.js, supporting 'Series Suppression' (running browser-native NSNet2 before ML) and a hardware-style Noise Gate. - UI & UX Enhancements: - Settings UI: Added model comparison chart with CPU/Quality metadata. - Tuning: Added Live Microphone Meter for calibrating Noise Gate thresholds. - Reporting: Added LotusToast system to alert users when ML suppression fails or falls back to raw input. - Robustness & Quality: - Capture Fidelity: Removed forced 48kHz capture constraints to allow native-rate capture (solving static issues with high-end audio interfaces). - Performance: Added WASM SIMD detection with transparent fallback. - Capability Detection: Added browser feature detection to disable unsupported ML modes. - Build Integration: Updated Vite config to self-host all model WASM/tflite assets in /denoise/ directory.
This commit is contained in:
+28
-17
@@ -405,32 +405,43 @@ A local sound plays when another participant joins or leaves a call you're in.
|
||||
|
||||
Files: `src/app/utils/callSounds.ts`, `src/app/hooks/useCallJoinLeaveSounds.ts`
|
||||
|
||||
### Noise Suppression (3-Tier, incl. on-device ML) (P5-30)
|
||||
### Noise Suppression (Advanced Multi-Tier) (P5-30)
|
||||
|
||||
A three-way mic noise-suppression control in **Settings → General → Calls**:
|
||||
A comprehensive mic noise-suppression system in **Settings → General → Calls** designed for high-end hardware and detailed performance testing.
|
||||
|
||||
| Tier | What it does |
|
||||
| Tier | Description |
|
||||
| ------------------ | ----------------------------------------------------------------------------- |
|
||||
| **Off** | No suppression (`noiseSuppression=false` to Element Call). |
|
||||
| **Browser-native** | Element Call's built-in WebRTC suppressor (`noiseSuppression=true`). Default. |
|
||||
| **ML (beta)** | On-device RNNoise — Krisp-style removal of fans, keyboards, dogs, etc. |
|
||||
| **Off** | No suppression applied. |
|
||||
| **Browser-native** | Google NSNet2 (WebRTC built-in). Best general performance/CPU balance. |
|
||||
| **ML (Advanced)** | Custom ML pipeline supporting multiple models, series suppression, and gates. |
|
||||
|
||||
**Why a shim, not a fork:** Element Call captures the mic _inside_ its iframe and publishes to LiveKit; the host can't reach that track. LiveKit's Krisp filter is LiveKit-Cloud-only (we self-host the SFU), and EC's own RNNoise work (PR #3892) is unmerged. So the **ML tier** is delivered by injecting a same-origin pre-init script into the vendored EC `index.html` that monkeypatches `getUserMedia` and routes the captured mic through an RNNoise `AudioWorklet` (`@sapphi-red/web-noise-suppressor`) before LiveKit ever sees it — the same post-capture pipeline #3892 uses, executed from the realm we already control. Works on the self-hosted LiveKit SFU, survives EC version bumps, no EC fork/AGPL/rebase burden.
|
||||
**Advanced Features & Test Options:**
|
||||
- **Multiple ML Models:** Toggle between **RNNoise** (standard hybrid) and **Speex** (legacy DSP-based) to compare artifact levels and suppression strength.
|
||||
- **Series Suppression (Combination):** Optional toggle to run the browser's native stationary noise filter *before* the ML model. This allows testing the individual performance of the ML model vs the combined effectiveness at removing fan hum.
|
||||
- **Noise Gate:** Configurable hardware-style gate with a dB threshold. Hard-cuts all audio when input is below the threshold, ensuring absolute silence between sentences.
|
||||
- **Live Microphone Meter:** A real-time volume visualizer in the settings panel to help users accurately tune their Noise Gate threshold.
|
||||
- **High-Fidelity Capture:** Captures at hardware native rates (supporting high-end gear like **Scarlett Solo + PodMic**) and handles high-quality resampling via Web Audio to prevent the "static" artifacts caused by low-quality browser pre-resamplers.
|
||||
- **Performance:** Automatic WASM SIMD detection with transparent fallback to standard binaries.
|
||||
- **Support Detection:** UI now detects `AudioWorklet` / `AudioContext` support and disables ML options in unsupported environments.
|
||||
- **Status Reporting:** The ML shim notifies the host app via `postMessage`. If initialization fails, a system toast alerts the user of the fallback to the raw microphone.
|
||||
|
||||
**How it's wired:**
|
||||
**Open-Source Model Roadmap:**
|
||||
| Model | Transients (Clicks) | Voice Quality | CPU Usage (WASM) |
|
||||
| :--- | :--- | :--- | :--- |
|
||||
| **RNNoise** | Poor | Moderate | < 5% |
|
||||
| **DTLN** | Good | High | 10-20% |
|
||||
| **DeepFilterNet 3** | **Excellent** | **Very High** | 25-50%+ |
|
||||
|
||||
- `callNoiseSuppression` setting is `'off' | 'browser' | 'ml'` (legacy boolean migrates: `true`→`browser`, `false`→`off`)
|
||||
- `CallEmbed.getWidget()` maps the tier to the `noiseSuppression` URL param and appends `lotusDenoise=ml` for the ML tier (browser-native suppressor is disabled in ML mode so RNNoise owns suppression)
|
||||
- The `lotusDenoise` vite plugin copies the RNNoise worklet + wasm into `public/element-call/denoise/`, copies the shim, and injects `<script src="./lotus-denoise.js">` before EC's module entry
|
||||
- The shim keeps `echoCancellation`/`autoGainControl` on the raw capture and falls back to the raw mic if RNNoise setup fails, so calls never break
|
||||
|
||||
**Known beta caveat:** routing capture through WebAudio can weaken the browser's acoustic echo cancellation (AEC runs on the native capture track) — the same tradeoff EC's upstream feature makes; hence the "beta" label.
|
||||
> **Note:** DeepFilterNet 3 is planned for future inclusion in the desktop build where larger binaries and higher CPU overhead are more acceptable.
|
||||
|
||||
### Files
|
||||
|
||||
- `build/lotus-denoise.js` — injected RNNoise getUserMedia shim (classic script)
|
||||
- `vite.config.js` — `lotusDenoise()` plugin (asset copy + index.html injection)
|
||||
- `src/app/plugins/call/CallEmbed.ts` — tier → widget URL params
|
||||
- `build/lotus-denoise.js` — multi-model getUserMedia shim
|
||||
- `vite.config.js` — `lotusDenoise()` plugin (copies assets for RNNoise, Speex, and NoiseGate)
|
||||
- `src/app/plugins/call/CallEmbed.ts` — advanced tier → widget URL params
|
||||
- `src/app/utils/lotusDenoiseUtils.ts` — support detection and model comparison metadata
|
||||
- `src/app/features/settings/general/General.tsx` — advanced settings UI + mic meter
|
||||
|
||||
|
||||
### Call Button Scoping
|
||||
|
||||
|
||||
Reference in New Issue
Block a user