Compare commits: 938ead79f7...5d5f5f4516 (1 commit: 5d5f5f4516)
+28
-17
@@ -405,32 +405,43 @@ A local sound plays when another participant joins or leaves a call you're in.
Files: `src/app/utils/callSounds.ts`, `src/app/hooks/useCallJoinLeaveSounds.ts`

### Noise Suppression (3-Tier, incl. on-device ML) (P5-30)
### Noise Suppression (Advanced Multi-Tier) (P5-30)

A three-way mic noise-suppression control in **Settings → General → Calls**:
A comprehensive mic noise-suppression system in **Settings → General → Calls** designed for high-end hardware and detailed performance testing.

| Tier | What it does |
| Tier | Description |
| ------------------ | ----------------------------------------------------------------------------- |
| **Off** | No suppression (`noiseSuppression=false` to Element Call). |
| **Browser-native** | Element Call's built-in WebRTC suppressor (`noiseSuppression=true`). Default. |
| **ML (beta)** | On-device RNNoise — Krisp-style removal of fans, keyboards, dogs, etc. |
| **Off** | No suppression applied. |
| **Browser-native** | WebRTC's built-in noise suppressor (`noiseSuppression=true`). Best general performance/CPU balance. |
| **ML (Advanced)** | Custom ML pipeline supporting multiple models, series suppression, and gates. |

**Why a shim, not a fork:** Element Call captures the mic _inside_ its iframe and publishes to LiveKit; the host can't reach that track. LiveKit's Krisp filter is LiveKit-Cloud-only (we self-host the SFU), and EC's own RNNoise work (PR #3892) is unmerged. So the **ML tier** is delivered by injecting a same-origin pre-init script into the vendored EC `index.html` that monkeypatches `getUserMedia` and routes the captured mic through an RNNoise `AudioWorklet` (`@sapphi-red/web-noise-suppressor`) before LiveKit ever sees it — the same post-capture pipeline #3892 uses, executed from the realm we already control. Works on the self-hosted LiveKit SFU, survives EC version bumps, no EC fork/AGPL/rebase burden.
**Advanced Features & Test Options:**
- **Multiple ML Models:** Toggle between **RNNoise** (standard hybrid) and **Speex** (legacy DSP-based) to compare artifact levels and suppression strength.
- **Series Suppression (Combination):** Optional toggle to run the browser's native stationary noise filter *before* the ML model. This allows testing the individual performance of the ML model vs the combined effectiveness at removing fan hum.
- **Noise Gate:** Configurable hardware-style gate with a dB threshold. Hard-cuts all audio when input is below the threshold, ensuring absolute silence between sentences.
- **Live Microphone Meter:** A real-time volume visualizer in the settings panel to help users accurately tune their Noise Gate threshold.
- **High-Fidelity Capture:** Captures at hardware native rates (supporting high-end gear like **Scarlett Solo + PodMic**) and handles high-quality resampling via Web Audio to prevent the "static" artifacts caused by low-quality browser pre-resamplers.
- **Performance:** Automatic WASM SIMD detection with transparent fallback to standard binaries.
- **Support Detection:** UI now detects `AudioWorklet` / `AudioContext` support and disables ML options in unsupported environments.
- **Status Reporting:** The ML shim notifies the host app via `postMessage`. If initialization fails, a system toast alerts the user of the fallback to the raw microphone.

**How it's wired:**
**Open-Source Model Roadmap:**
| Model | Transients (Clicks) | Voice Quality | CPU Usage (WASM) |
| :--- | :--- | :--- | :--- |
| **RNNoise** | Poor | Moderate | < 5% |
| **DTLN** | Good | High | 10-20% |
| **DeepFilterNet 3** | **Excellent** | **Very High** | 25-50%+ |

- `callNoiseSuppression` setting is `'off' | 'browser' | 'ml'` (legacy boolean migrates: `true`→`browser`, `false`→`off`)
- `CallEmbed.getWidget()` maps the tier to the `noiseSuppression` URL param and appends `lotusDenoise=ml` for the ML tier (browser-native suppressor is disabled in ML mode so RNNoise owns suppression)
- The `lotusDenoise` vite plugin copies the RNNoise worklet + wasm into `public/element-call/denoise/`, copies the shim, and injects `<script src="./lotus-denoise.js">` before EC's module entry
- The shim keeps `echoCancellation`/`autoGainControl` on the raw capture and falls back to the raw mic if RNNoise setup fails, so calls never break

**Known beta caveat:** routing capture through WebAudio can weaken the browser's acoustic echo cancellation (AEC runs on the native capture track) — the same tradeoff EC's upstream feature makes; hence the "beta" label.
> **Note:** DeepFilterNet 3 is planned for future inclusion in the desktop build where larger binaries and higher CPU overhead are more acceptable.

### Files

- `build/lotus-denoise.js` — injected RNNoise getUserMedia shim (classic script)
- `vite.config.js` — `lotusDenoise()` plugin (asset copy + index.html injection)
- `src/app/plugins/call/CallEmbed.ts` — tier → widget URL params
- `build/lotus-denoise.js` — multi-model getUserMedia shim
- `vite.config.js` — `lotusDenoise()` plugin (copies assets for RNNoise, Speex, and NoiseGate)
- `src/app/plugins/call/CallEmbed.ts` — advanced tier → widget URL params
- `src/app/utils/lotusDenoiseUtils.ts` — support detection and model comparison metadata
- `src/app/features/settings/general/General.tsx` — advanced settings UI + mic meter
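
The wiring notes in this section mention that a legacy boolean `callNoiseSuppression` value migrates to the string tiers (`true`→`browser`, `false`→`off`). A minimal sketch of that normalisation follows. The helper name `migrateNoiseSuppression` is hypothetical, and falling back to `browser` for unrecognised values is an assumption based on `browser` being the default tier.

```typescript
type NoiseSuppressionMode = 'off' | 'browser' | 'ml';

const MODES: readonly NoiseSuppressionMode[] = ['off', 'browser', 'ml'];

// Hypothetical helper (not part of the commit): normalise a stored setting value.
function migrateNoiseSuppression(stored: unknown): NoiseSuppressionMode {
  // Legacy boolean form: true -> 'browser', false -> 'off'.
  if (typeof stored === 'boolean') return stored ? 'browser' : 'off';
  // Already one of the known tiers: keep it.
  for (const mode of MODES) {
    if (mode === stored) return mode;
  }
  // Assumption: anything else falls back to the default tier.
  return 'browser';
}
```

Running every stored value through one function like this keeps the legacy handling in a single place instead of spreading boolean checks across the settings UI and the call embed.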

### Call Button Scoping

+143
-78
@@ -12,18 +12,19 @@
|
||||
*
|
||||
* RNNoise REQUIRES mono, 48 kHz float audio. Feeding it anything else (stereo,
|
||||
* or 44.1 kHz data the model treats as 48 kHz) produces loud static. So we:
|
||||
* - request mono + 48 kHz capture,
|
||||
* - run a 48 kHz AudioContext and BAIL to the raw mic if the browser refuses
|
||||
* to give us a real 48 kHz context,
|
||||
* - use the non-SIMD wasm (the SIMD build has produced artifacts on some GPUs).
|
||||
* - run a 48 kHz AudioContext (which handles resampling from the hardware),
|
||||
* - use the SIMD build if supported for better performance,
|
||||
* - keep browser-native stationary suppression ON so the fans are removed
|
||||
* before RNNoise focuses on transient noises (keyboard, dogs, etc.).
|
||||
*
|
||||
* Any failure falls back to the unprocessed mic so calls never break.
|
||||
*/
|
||||
(function () {
|
||||
'use strict';
|
||||
|
||||
var params;
|
||||
try {
|
||||
var params = new URLSearchParams(window.location.search);
|
||||
params = new URLSearchParams(window.location.search);
|
||||
if (params.get('lotusDenoise') !== 'ml') return;
|
||||
} catch (e) {
|
||||
return;
|
||||
@@ -33,77 +34,150 @@
|
||||
if (!md || typeof md.getUserMedia !== 'function') return;
|
||||
if (typeof AudioWorkletNode === 'undefined' || typeof AudioContext === 'undefined') return;
|
||||
|
||||
var PROCESSOR_NAME = '@sapphi-red/web-noise-suppressor/rnnoise';
|
||||
var ASSET_BASE = './denoise/';
|
||||
var SAMPLE_RATE = 48000; // RNNoise worklet assumes 48kHz
|
||||
var SAMPLE_RATE = 48000;
|
||||
|
||||
var MODEL = params.get('lotusModel') || 'rnnoise';
|
||||
var USE_NATIVE_NS = params.get('lotusNativeNS') === 'true';
|
||||
var USE_GATE = params.get('lotusGate') === 'true';
|
||||
var GATE_THRESHOLD = parseFloat(params.get('lotusGateThreshold') || '-45');
|
||||
|
||||
var PROCESSORS = {
|
||||
rnnoise: {
|
||||
name: '@sapphi-red/web-noise-suppressor/rnnoise',
|
||||
script: 'rnnoiseWorklet.js',
|
||||
wasm: 'rnnoise.wasm',
|
||||
simdWasm: 'rnnoise_simd.wasm',
|
||||
},
|
||||
speex: {
|
||||
name: '@sapphi-red/web-noise-suppressor/speex',
|
||||
script: 'speexWorklet.js',
|
||||
wasm: 'speex.wasm',
|
||||
},
|
||||
dtln: {
|
||||
name: '@workadventure/noise-suppression/processor',
|
||||
script: 'dtlnWorklet.js',
|
||||
},
|
||||
gate: {
|
||||
name: '@sapphi-red/web-noise-suppressor/noise-gate',
|
||||
script: 'noiseGateWorklet.js',
|
||||
},
|
||||
};
|
||||
|
||||
var origGetUserMedia = md.getUserMedia.bind(md);
|
||||
var wasmPromise = null;
|
||||
var ctxPromise = null; // shared AudioContext + worklet module, created once
|
||||
var wasmPromises = {};
|
||||
var ctxPromise = null;
|
||||
|
||||
function loadWasm() {
|
||||
if (!wasmPromise) {
|
||||
// Non-SIMD build for maximum compatibility — the SIMD wasm has produced
|
||||
// static on some browser/GPU combinations.
|
||||
wasmPromise = fetch(ASSET_BASE + 'rnnoise.wasm').then(function (r) {
|
||||
if (!r.ok) throw new Error('rnnoise wasm fetch failed: ' + r.status);
|
||||
function checkSimd() {
|
||||
try {
|
||||
return Promise.resolve(WebAssembly.validate(new Uint8Array([0, 97, 115, 109, 1, 0, 0, 0, 1, 5, 1, 96, 0, 1, 123, 3, 2, 1, 0, 10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11])));
|
||||
} catch (e) {
|
||||
return Promise.resolve(false);
|
||||
}
|
||||
}
|
||||
|
||||
function loadWasm(modelId) {
|
||||
if (wasmPromises[modelId]) return wasmPromises[modelId];
|
||||
var p = PROCESSORS[modelId];
|
||||
if (!p || !p.wasm) return Promise.resolve(null);
|
||||
|
||||
wasmPromises[modelId] = (modelId === 'rnnoise' ? checkSimd() : Promise.resolve(false)).then(function (simd) {
|
||||
var file = (simd && p.simdWasm) ? p.simdWasm : p.wasm;
|
||||
return fetch(ASSET_BASE + file).then(function (r) {
|
||||
if (!r.ok) {
|
||||
if (simd && p.simdWasm) return fetch(ASSET_BASE + p.wasm).then(function(r2) {
|
||||
if (!r2.ok) throw new Error(modelId + ' wasm failed');
|
||||
return r2.arrayBuffer();
|
||||
});
|
||||
throw new Error(modelId + ' wasm failed');
|
||||
}
|
||||
return r.arrayBuffer();
|
||||
});
|
||||
}
|
||||
return wasmPromise;
|
||||
});
|
||||
return wasmPromises[modelId];
|
||||
}
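
The SIMD probe and the SIMD-to-plain fallback in `loadWasm` can be read as two small pure steps: detect support, then build the ordered list of binaries to try. The sketch below assumes the hypothetical names `supportsWasmSimd` and `wasmCandidates`; the probe bytes are copied from the shim, and the fetch loop that would walk the candidate list is left out.

```typescript
// Probe bytes copied from the shim's checkSimd(): a tiny wasm module whose
// single function returns v128 and uses SIMD opcodes, so it only validates
// on engines with wasm SIMD support.
const SIMD_PROBE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0, 1, 5, 1, 96, 0, 1, 123, 3, 2, 1, 0, 10, 10, 1, 8, 0, 65, 0, 253,
  15, 253, 98, 11,
]);

function supportsWasmSimd(): boolean {
  try {
    return WebAssembly.validate(SIMD_PROBE);
  } catch {
    // No WebAssembly global, or validate() threw: treat as unsupported.
    return false;
  }
}

interface WasmAssets {
  wasm: string;
  simdWasm?: string;
}

// Ordered list of files to try: the SIMD build first (when available and
// supported), then the plain build as the fallback.
function wasmCandidates(assets: WasmAssets, simd: boolean): string[] {
  return simd && assets.simdWasm ? [assets.simdWasm, assets.wasm] : [assets.wasm];
}
```

With the candidates in a list, the loader only needs one loop that fetches each file in turn and stops at the first successful response.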
|
||||
|
||||
function getContext() {
|
||||
if (!ctxPromise) {
|
||||
ctxPromise = (function () {
|
||||
var ctx = new AudioContext({ sampleRate: SAMPLE_RATE });
|
||||
// If the browser ignored our 48 kHz request, RNNoise would receive
|
||||
// wrong-rate data and emit static. Refuse to process in that case.
|
||||
if (ctx.sampleRate !== SAMPLE_RATE) {
|
||||
try {
|
||||
ctx.close();
|
||||
} catch (e) {}
|
||||
return Promise.reject(
|
||||
new Error('AudioContext sampleRate is ' + ctx.sampleRate + ', need ' + SAMPLE_RATE),
|
||||
);
|
||||
try { ctx.close(); } catch (e) {}
|
||||
return Promise.reject(new Error('SampleRate mismatch: ' + ctx.sampleRate));
|
||||
}
|
||||
return ctx.audioWorklet.addModule(ASSET_BASE + 'rnnoiseWorklet.js').then(function () {
|
||||
return ctx.state === 'suspended'
|
||||
? ctx.resume().then(function () {
|
||||
return ctx;
|
||||
})
|
||||
: ctx;
|
||||
// Load required modules
|
||||
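var proc = PROCESSORS[MODEL];
if (!proc) {
// Unknown model id (no PROCESSORS entry): reject so the caller's catch
// falls back to the raw mic instead of throwing a TypeError here.
try { ctx.close(); } catch (e) {}
return Promise.reject(new Error('Unknown denoise model: ' + MODEL));
}
var scripts = [proc.script];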
|
||||
if (USE_GATE) scripts.push(PROCESSORS.gate.script);
|
||||
|
||||
return Promise.all(scripts.map(function(s) {
|
||||
return ctx.audioWorklet.addModule(ASSET_BASE + s);
|
||||
})).then(function () {
|
||||
return ctx.state === 'suspended' ? ctx.resume().then(function () { return ctx; }) : ctx;
|
||||
});
|
||||
})();
|
||||
// Don't cache a rejected context forever — allow a later retry.
|
||||
ctxPromise.catch(function () {
|
||||
ctxPromise = null;
|
||||
});
|
||||
ctxPromise.catch(function () { ctxPromise = null; });
|
||||
}
|
||||
return ctxPromise;
|
||||
}
|
||||
|
||||
var hasNotifiedActive = false;
|
||||
|
||||
function processStream(stream) {
|
||||
var audioTracks = stream.getAudioTracks();
|
||||
if (audioTracks.length === 0) return Promise.resolve(stream);
|
||||
|
||||
return Promise.all([loadWasm(), getContext()])
|
||||
return Promise.all([loadWasm(MODEL), getContext()])
|
||||
.then(function (res) {
|
||||
var wasmBinary = res[0];
|
||||
var ctx = res[1];
|
||||
|
||||
var node = new AudioWorkletNode(ctx, PROCESSOR_NAME, {
|
||||
channelCount: 1,
|
||||
channelCountMode: 'explicit',
|
||||
channelInterpretation: 'speakers',
|
||||
numberOfInputs: 1,
|
||||
numberOfOutputs: 1,
|
||||
outputChannelCount: [1],
|
||||
processorOptions: { maxChannels: 1, wasmBinary: wasmBinary },
|
||||
});
|
||||
var source = ctx.createMediaStreamSource(stream);
|
||||
var dest = ctx.createMediaStreamDestination();
|
||||
source.connect(node).connect(dest);
|
||||
var head = source;
|
||||
|
||||
// 1. Optional Noise Gate
|
||||
if (USE_GATE) {
|
||||
var gateNode = new AudioWorkletNode(ctx, PROCESSORS.gate.name, {
|
||||
processorOptions: {
|
||||
openThreshold: GATE_THRESHOLD,
|
||||
closeThreshold: GATE_THRESHOLD - 5,
|
||||
holdMs: 150,
|
||||
maxChannels: 1
|
||||
}
|
||||
});
|
||||
head.connect(gateNode);
|
||||
head = gateNode;
|
||||
}
|
||||
|
||||
// 2. ML Processor
|
||||
var mlOptions = {
|
||||
channelCount: 1,
|
||||
numberOfInputs: 1,
|
||||
numberOfOutputs: 1,
|
||||
processorOptions: { maxChannels: 1 }
|
||||
};
|
||||
|
||||
if (MODEL === 'rnnoise' || MODEL === 'speex') {
|
||||
mlOptions.processorOptions.wasmBinary = wasmBinary;
|
||||
} else if (MODEL === 'dtln') {
|
||||
mlOptions.processorOptions = {
|
||||
wasmUrl: ASSET_BASE + 'litert_wasm_internal.wasm',
|
||||
model1Url: ASSET_BASE + 'model_1.tflite',
|
||||
model2Url: ASSET_BASE + 'model_2.tflite',
|
||||
};
|
||||
} else if (MODEL === 'deepfilternet') {
|
||||
mlOptions.processorOptions = {
|
||||
wasmModule: wasmBinary,
|
||||
modelBytes: new Uint8Array(wasmBinary),
|
||||
suppressionLevel: 50
|
||||
};
|
||||
}
|
||||
|
||||
var mlNode = new AudioWorkletNode(ctx, PROCESSORS[MODEL].name, mlOptions);
|
||||
head.connect(mlNode);
|
||||
mlNode.connect(dest);
|
||||
|
||||
var origTrack = audioTracks[0];
|
||||
var processedTrack = dest.stream.getAudioTracks()[0];
|
||||
@@ -112,44 +186,38 @@
|
||||
function cleanup() {
|
||||
if (torndown) return;
|
||||
torndown = true;
|
||||
try {
|
||||
node.port.postMessage('destroy');
|
||||
} catch (e) {}
|
||||
try {
|
||||
source.disconnect();
|
||||
node.disconnect();
|
||||
} catch (e) {}
|
||||
try {
|
||||
origTrack.stop();
|
||||
} catch (e) {}
|
||||
// Keep the shared AudioContext alive for the next capture.
|
||||
try { mlNode.port.postMessage('destroy'); } catch (e) {}
|
||||
try { source.disconnect(); if (gateNode) gateNode.disconnect(); mlNode.disconnect(); } catch (e) {}
|
||||
try { origTrack.stop(); } catch (e) {}
|
||||
}
|
||||
|
||||
// When EC stops the track we handed it, release the raw capture + graph.
|
||||
var rawStop = processedTrack.stop.bind(processedTrack);
|
||||
processedTrack.stop = function () {
|
||||
cleanup();
|
||||
rawStop();
|
||||
};
|
||||
processedTrack.stop = function () { cleanup(); rawStop(); };
|
||||
origTrack.addEventListener('ended', function () {
|
||||
try {
|
||||
rawStop();
|
||||
} catch (e) {}
|
||||
try { rawStop(); } catch (e) {}
|
||||
cleanup();
|
||||
});
|
||||
|
||||
// Return a stream with the processed audio plus any original video.
|
||||
if (!hasNotifiedActive) {
|
||||
hasNotifiedActive = true;
|
||||
window.parent.postMessage({
|
||||
type: 'lotus-denoise-status',
|
||||
active: true,
|
||||
model: MODEL,
|
||||
nativeNS: USE_NATIVE_NS,
|
||||
gate: USE_GATE
|
||||
}, '*');
|
||||
}
|
||||
|
||||
var out = new MediaStream();
|
||||
out.addTrack(processedTrack);
|
||||
stream.getVideoTracks().forEach(function (t) {
|
||||
out.addTrack(t);
|
||||
});
|
||||
stream.getVideoTracks().forEach(function (t) { out.addTrack(t); });
|
||||
return out;
|
||||
})
|
||||
.catch(function (e) {
|
||||
// Any failure -> fall back to the raw mic so calls never break.
|
||||
// eslint-disable-next-line no-console
|
||||
console.error('[lotus-denoise] RNNoise setup failed, using raw mic', e);
|
||||
var msg = e instanceof Error ? e.message : String(e);
|
||||
console.error('[lotus-denoise] Setup failed:', msg);
|
||||
window.parent.postMessage({ type: 'lotus-denoise-status', active: false, error: msg }, '*');
|
||||
return stream;
|
||||
});
|
||||
}
|
||||
@@ -158,13 +226,9 @@
|
||||
var wantsAudio = !!(constraints && constraints.audio);
|
||||
var effective = constraints;
|
||||
if (wantsAudio) {
|
||||
// RNNoise needs mono 48 kHz; it owns suppression. Keep AEC + AGC on the
|
||||
// raw capture (they run before our processing).
|
||||
var audioC =
|
||||
typeof constraints.audio === 'object' ? Object.assign({}, constraints.audio) : {};
|
||||
audioC.noiseSuppression = false;
|
||||
var audioC = typeof constraints.audio === 'object' ? Object.assign({}, constraints.audio) : {};
|
||||
audioC.noiseSuppression = USE_NATIVE_NS;
|
||||
audioC.channelCount = 1;
|
||||
audioC.sampleRate = SAMPLE_RATE;
|
||||
if (audioC.echoCancellation === undefined) audioC.echoCancellation = true;
|
||||
if (audioC.autoGainControl === undefined) audioC.autoGainControl = true;
|
||||
effective = Object.assign({}, constraints, { audio: audioC });
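
The constraint rewriting in this hunk can be isolated as a pure function, which makes the rules easier to check: native suppression follows the series toggle, capture is forced to mono 48 kHz, and echo cancellation and auto gain default to on unless the caller set them. This is a sketch under assumptions: `buildAudioConstraints` and `AudioConstraintsLike` are hypothetical names, and the local interface is a simplified stand-in for the browser's `MediaTrackConstraints`.

```typescript
// Minimal local shape so the sketch does not depend on DOM typings.
interface AudioConstraintsLike {
  noiseSuppression?: boolean;
  channelCount?: number;
  sampleRate?: number;
  echoCancellation?: boolean;
  autoGainControl?: boolean;
  [key: string]: unknown;
}

const SAMPLE_RATE = 48000;

function buildAudioConstraints(
  audio: boolean | AudioConstraintsLike,
  useNativeNs: boolean,
): AudioConstraintsLike {
  // Copy so the caller's constraint object is never mutated.
  const out: AudioConstraintsLike = typeof audio === 'object' ? { ...audio } : {};
  out.noiseSuppression = useNativeNs; // series suppression toggle
  out.channelCount = 1; // the models expect mono
  out.sampleRate = SAMPLE_RATE; // and 48 kHz input
  // Keep the caller's choice for AEC/AGC, defaulting both to on.
  if (out.echoCancellation === undefined) out.echoCancellation = true;
  if (out.autoGainControl === undefined) out.autoGainControl = true;
  return out;
}
```

For example, `buildAudioConstraints({ echoCancellation: false }, true)` keeps echo cancellation off while turning native suppression on and adding the mono and sample-rate requests.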
|
||||
@@ -174,3 +238,4 @@
|
||||
});
|
||||
};
|
||||
})();
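
The shim reads its configuration straight from the iframe URL, so malformed values reach the audio graph unchecked: a non-numeric `lotusGateThreshold` makes `parseFloat` return `NaN`, and a model id without a `PROCESSORS` entry (such as `deepfilternet` in the table shown above) has no worklet to load. A sketch of a validating parser follows. `parseDenoiseParams`, `ShimModel`, and the clamping range are assumptions for illustration; the parameter names and defaults are the ones used by the shim.

```typescript
// Model ids that have a PROCESSORS entry in the shim as shown (assumption:
// the list would need to track that table).
type ShimModel = 'rnnoise' | 'speex' | 'dtln';
const SHIM_MODELS: readonly ShimModel[] = ['rnnoise', 'speex', 'dtln'];

interface DenoiseParams {
  enabled: boolean;
  model: ShimModel;
  nativeNs: boolean;
  gate: boolean;
  gateThresholdDb: number;
}

function parseDenoiseParams(search: string): DenoiseParams {
  const q = new URLSearchParams(search);

  // Unknown or missing model ids fall back to the shim's default.
  const rawModel = q.get('lotusModel');
  let model: ShimModel = 'rnnoise';
  for (const m of SHIM_MODELS) {
    if (m === rawModel) model = m;
  }

  // parseFloat yields NaN for non-numeric input; use the -45 dB default then.
  const parsed = parseFloat(q.get('lotusGateThreshold') ?? '');
  const threshold = isNaN(parsed) ? -45 : parsed;

  return {
    enabled: q.get('lotusDenoise') === 'ml',
    model,
    nativeNs: q.get('lotusNativeNS') === 'true',
    gate: q.get('lotusGate') === 'true',
    // Clamp to the -100..0 dB range offered by the settings slider.
    gateThresholdDb: Math.min(0, Math.max(-100, threshold)),
  };
}
```

Validating once at the top means the rest of the shim can treat `model` and `gateThresholdDb` as trusted values.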
|
||||
|
||||
|
||||
@@ -45,6 +45,8 @@
|
||||
"@giphy/js-util": "5.2.0",
|
||||
"@giphy/react-components": "10.1.2",
|
||||
"@sapphi-red/web-noise-suppressor": "0.3.5",
|
||||
"@workadventure/noise-suppression": "1.1.2",
|
||||
"deepfilternet3-noise-filter": "1.2.1",
|
||||
"@sentry/react": "10.53.1",
|
||||
"@tanstack/react-query": "5.100.13",
|
||||
"@tanstack/react-query-devtools": "5.100.13",
|
||||
|
||||
@@ -69,6 +69,7 @@ import { useDateFormatItems } from '../../../hooks/useDateFormat';
|
||||
import { SequenceCardStyle } from '../styles.css';
|
||||
import { useTauriUpdater } from '../../../hooks/useTauriUpdater';
|
||||
import { playCallJoinSound } from '../../../utils/callSounds';
|
||||
import { isMLDenoiseSupported, ML_DENOISE_REQUIREMENTS } from '../../../utils/lotusDenoiseUtils';
|
||||
|
||||
type ThemeSelectorProps = {
|
||||
themeNames: Record<string, string>;
|
||||
@@ -157,7 +158,7 @@ function SelectTheme({ disabled }: { disabled?: boolean }) {
|
||||
);
|
||||
}
|
||||
|
||||
type SettingsSelectOption<T extends string> = { value: T; label: string };
|
||||
type SettingsSelectOption<T extends string> = { value: T; label: string; disabled?: boolean };
|
||||
|
||||
function SettingsSelect<T extends string>({
|
||||
value,
|
||||
@@ -219,7 +220,8 @@ function SettingsSelect<T extends string>({
|
||||
size="300"
|
||||
variant={opt.value === value ? 'Primary' : 'Surface'}
|
||||
radii="300"
|
||||
onClick={() => handleSelect(opt.value)}
|
||||
disabled={opt.disabled}
|
||||
onClick={() => !opt.disabled && handleSelect(opt.value)}
|
||||
>
|
||||
<Text size="T300">{opt.label}</Text>
|
||||
</MenuItem>
|
||||
@@ -1196,12 +1198,114 @@ function useKeyBind(setter: (code: string) => void) {
|
||||
const keyLabel = (code: string) =>
|
||||
code === 'Space' ? 'Space' : code.replace('Key', '').replace('Digit', '');
|
||||
|
||||
// isMLDenoiseSupported and ML_DENOISE_REQUIREMENTS are already imported at the
// top of this file; importing them again here would be a duplicate declaration.
import { DENOISE_MODELS } from '../../../utils/lotusDenoiseUtils';
|
||||
|
||||
function MicMeter() {
|
||||
const [level, setLevel] = useState(0);
|
||||
const [active, setActive] = useState(false);
|
||||
const streamRef = useRef<MediaStream | null>(null);
|
||||
const ctxRef = useRef<AudioContext | null>(null);
|
||||
const rafRef = useRef<number | null>(null);
|
||||
|
||||
const stop = useCallback(() => {
|
||||
if (rafRef.current !== null) cancelAnimationFrame(rafRef.current);
|
||||
rafRef.current = null;
|
||||
streamRef.current?.getTracks().forEach((t) => t.stop());
|
||||
streamRef.current = null;
|
||||
ctxRef.current?.close();
|
||||
ctxRef.current = null;
|
||||
setActive(false);
|
||||
setLevel(0);
|
||||
}, []);
|
||||
|
||||
const start = async () => {
|
||||
try {
|
||||
const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
|
||||
streamRef.current = stream;
|
||||
const ctx = new AudioContext();
|
||||
ctxRef.current = ctx;
|
||||
const source = ctx.createMediaStreamSource(stream);
|
||||
const analyser = ctx.createAnalyser();
|
||||
analyser.fftSize = 256;
|
||||
source.connect(analyser);
|
||||
|
||||
const buffer = new Uint8Array(analyser.frequencyBinCount);
|
||||
const update = () => {
|
||||
analyser.getByteFrequencyData(buffer);
|
||||
let sum = 0;
|
||||
for (let i = 0; i < buffer.length; i += 1) sum += buffer[i];
|
||||
setLevel(sum / buffer.length);
|
||||
rafRef.current = requestAnimationFrame(update);
|
||||
};
|
||||
update();
|
||||
setActive(true);
|
||||
} catch (e) {
|
||||
// eslint-disable-next-line no-console
|
||||
console.error('Mic test failed', e);
|
||||
}
|
||||
};
|
||||
|
||||
useEffect(() => () => stop(), [stop]);
|
||||
|
||||
return (
|
||||
<Box direction="Column" gap="100" style={{ padding: '8px 0' }}>
|
||||
<Box direction="Row" gap="200" align="Center">
|
||||
<Button size="300" variant="Secondary" outlined onClick={active ? stop : start}>
|
||||
<Text size="T300">{active ? 'Stop Test' : 'Test Microphone'}</Text>
|
||||
</Button>
|
||||
<Box
|
||||
grow="Yes"
|
||||
style={{
|
||||
height: '10px',
|
||||
background: 'var(--lt-bg-card, rgba(0,0,0,0.2))',
|
||||
borderRadius: '5px',
|
||||
overflow: 'hidden',
|
||||
position: 'relative',
|
||||
border: '1px solid var(--lt-border-color)',
|
||||
}}
|
||||
>
|
||||
<Box
|
||||
style={{
|
||||
position: 'absolute',
|
||||
top: 0,
|
||||
left: 0,
|
||||
bottom: 0,
|
||||
width: `${Math.min(100, (level / 128) * 100)}%`,
|
||||
background: 'var(--lt-accent-green, #00FF88)',
|
||||
transition: 'width 0.05s linear',
|
||||
boxShadow: '0 0 8px var(--lt-accent-green)',
|
||||
}}
|
||||
/>
|
||||
</Box>
|
||||
</Box>
|
||||
<Text size="S300" variant="Secondary">
|
||||
The green bar shows your live volume. Use this to tune the Gate Threshold.
|
||||
</Text>
|
||||
</Box>
|
||||
);
|
||||
}
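
The meter above averages `getByteFrequencyData` bins, which gives a 0 to 255 value with no direct relation to the gate threshold shown in dB. If the bar is meant to help pick a dB threshold, one option is to compute an RMS level in dBFS from time-domain samples (for instance from `AnalyserNode.getFloatTimeDomainData`). The sketch below shows only the arithmetic. The function names are hypothetical, and whether the gate worklet measures level the same way is an assumption.

```typescript
// RMS level of a block of float samples (range -1..1) in dBFS.
// Silence is reported as `floorDb` instead of -Infinity.
function rmsDbfs(samples: Float32Array, floorDb = -100): number {
  if (samples.length === 0) return floorDb;
  let sumSquares = 0;
  for (let i = 0; i < samples.length; i += 1) sumSquares += samples[i] * samples[i];
  const rms = Math.sqrt(sumSquares / samples.length);
  if (rms === 0) return floorDb;
  // 20 * log10(rms), written with Math.log for older lib targets.
  return Math.max(floorDb, (20 * Math.log(rms)) / Math.LN10);
}

// Map a dB value onto a 0..100 meter width using the slider's -100..0 range.
function dbToPercent(db: number, minDb = -100, maxDb = 0): number {
  const clamped = Math.min(maxDb, Math.max(minDb, db));
  return ((clamped - minDb) / (maxDb - minDb)) * 100;
}
```

With both the meter and the slider on the same -100 to 0 dB scale, the threshold could be drawn as a marker on the bar, so the user can see directly whether room noise sits below it.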
|
||||
|
||||
function Calls() {
|
||||
const [cameraOnJoin, setCameraOnJoin] = useSetting(settingsAtom, 'cameraOnJoin');
|
||||
const [callNoiseSuppression, setCallNoiseSuppression] = useSetting(
|
||||
settingsAtom,
|
||||
'callNoiseSuppression',
|
||||
);
|
||||
const [callDenoiseModel, setCallDenoiseModel] = useSetting(settingsAtom, 'callDenoiseModel');
|
||||
const [callDenoiseNativeNS, setCallDenoiseNativeNS] = useSetting(
|
||||
settingsAtom,
|
||||
'callDenoiseNativeNS',
|
||||
);
|
||||
const [callDenoiseGate, setCallDenoiseGate] = useSetting(settingsAtom, 'callDenoiseGate');
|
||||
const [callDenoiseGateThreshold, setCallDenoiseGateThreshold] = useSetting(
|
||||
settingsAtom,
|
||||
'callDenoiseGateThreshold',
|
||||
);
|
||||
|
||||
const [pttMode, setPttMode] = useSetting(settingsAtom, 'pttMode');
|
||||
const [pttKey, setPttKey] = useSetting(settingsAtom, 'pttKey');
|
||||
const [deafenKey, setDeafenKey] = useSetting(settingsAtom, 'deafenKey');
|
||||
@@ -1220,6 +1324,8 @@ function Calls() {
|
||||
const pttBind = useKeyBind(setPttKey);
|
||||
const deafenBind = useKeyBind(setDeafenKey);
|
||||
|
||||
const mlSupported = isMLDenoiseSupported();
|
||||
|
||||
return (
|
||||
<Box direction="Column" gap="100">
|
||||
<Text size="L400">Calls</Text>
|
||||
@@ -1233,7 +1339,79 @@ function Calls() {
|
||||
<SequenceCard className={SequenceCardStyle} variant="SurfaceVariant" direction="Column">
|
||||
<SettingTile
|
||||
title="Noise Suppression"
|
||||
description="Filter background noise from your mic during calls. Browser-native uses the built-in WebRTC suppressor; ML runs on-device RNNoise for stronger, Krisp-style removal (higher CPU)."
|
||||
description={
|
||||
<Box direction="Column" gap="200">
|
||||
<Text>
|
||||
Filter background noise from your mic during calls. Browser-native uses the
|
||||
built-in WebRTC suppressor.
|
||||
</Text>
|
||||
|
||||
<Box direction="Column" gap="100" style={{ overflowX: 'auto' }}>
|
||||
<Box
|
||||
direction="Row"
|
||||
gap="100"
|
||||
style={{ borderBottom: '1px solid var(--lt-border-color)', paddingBottom: '4px' }}
|
||||
>
|
||||
<Box style={{ width: '120px' }}>
|
||||
<Text size="S300" bold>
|
||||
Model
|
||||
</Text>
|
||||
</Box>
|
||||
<Box style={{ width: '80px' }}>
|
||||
<Text size="S300" bold>
|
||||
CPU
|
||||
</Text>
|
||||
</Box>
|
||||
<Box style={{ width: '80px' }}>
|
||||
<Text size="S300" bold>
|
||||
Quality
|
||||
</Text>
|
||||
</Box>
|
||||
<Box grow="Yes">
|
||||
<Text size="S300" bold>
|
||||
Transients
|
||||
</Text>
|
||||
</Box>
|
||||
</Box>
|
||||
{DENOISE_MODELS.map((model) => (
|
||||
<Box key={model.id} direction="Row" gap="100">
|
||||
<Box style={{ width: '120px' }}>
|
||||
<Text size="S300">{model.name}</Text>
|
||||
</Box>
|
||||
<Box style={{ width: '80px' }}>
|
||||
<Text size="S300">{model.cpuUsage}</Text>
|
||||
</Box>
|
||||
<Box style={{ width: '80px' }}>
|
||||
<Text size="S300">{model.voiceQuality}</Text>
|
||||
</Box>
|
||||
<Box grow="Yes">
|
||||
<Text size="S300">{model.transients}</Text>
|
||||
</Box>
|
||||
</Box>
|
||||
))}
|
||||
</Box>
|
||||
|
||||
{!mlSupported && (
|
||||
<Box direction="Column" gap="100">
|
||||
<Text variant="Warning" size="S300">
|
||||
ML options are not supported in this browser.
|
||||
</Text>
|
||||
<Box as="ul" style={{ paddingLeft: '20px', margin: 0 }}>
|
||||
{ML_DENOISE_REQUIREMENTS.map((req) => (
|
||||
<Text as="li" key={req} size="S300">
|
||||
{req}
|
||||
</Text>
|
||||
))}
|
||||
</Box>
|
||||
</Box>
|
||||
)}
|
||||
{callNoiseSuppression === 'ml' && (
|
||||
<Text variant="Warning" size="S300">
|
||||
Note: Applying changes requires rejoining the call.
|
||||
</Text>
|
||||
)}
|
||||
</Box>
|
||||
}
|
||||
after={
|
||||
<SettingsSelect<NoiseSuppressionMode>
|
||||
value={callNoiseSuppression}
|
||||
@@ -1241,11 +1419,86 @@ function Calls() {
|
||||
options={[
|
||||
{ value: 'off', label: 'Off' },
|
||||
{ value: 'browser', label: 'Browser-native' },
|
||||
{ value: 'ml', label: 'ML (beta)' },
|
||||
{
|
||||
value: 'ml',
|
||||
label: 'ML (Advanced)',
|
||||
disabled: !mlSupported,
|
||||
},
|
||||
]}
|
||||
/>
|
||||
}
|
||||
/>
|
||||
|
||||
{callNoiseSuppression === 'ml' && (
|
||||
<Box
|
||||
direction="Column"
|
||||
gap="300"
|
||||
style={{
|
||||
padding: '16px',
|
||||
marginTop: '8px',
|
||||
borderTop: '1px solid var(--lt-border-color)',
|
||||
background: 'rgba(0,0,0,0.1)',
|
||||
}}
|
||||
>
|
||||
<SettingTile
|
||||
title="ML Model"
|
||||
description="Choose the machine learning model to use for noise removal."
|
||||
after={
|
||||
<SettingsSelect<DenoiseModelId>
|
||||
value={callDenoiseModel}
|
||||
onChange={setCallDenoiseModel}
|
||||
options={[
|
||||
{ value: 'rnnoise', label: 'RNNoise' },
|
||||
{ value: 'speex', label: 'Speex (Legacy)' },
|
||||
{ value: 'dtln', label: 'DTLN (Balanced)' },
|
||||
{ value: 'deepfilternet', label: 'DeepFilterNet 3 (Pro)' },
|
||||
]}
|
||||
/>
|
||||
}
|
||||
/>
|
||||
|
||||
<SettingTile
|
||||
title="Series Suppression"
|
||||
description="Run the browser's native stationary noise filter before the ML model. Recommended for eliminating fan hum."
|
||||
after={
|
||||
<Switch
|
||||
variant="Primary"
|
||||
value={callDenoiseNativeNS}
|
||||
onChange={setCallDenoiseNativeNS}
|
||||
/>
|
||||
}
|
||||
/>
|
||||
|
||||
<SettingTile
|
||||
title="Noise Gate"
|
||||
description="Hard-cut audio when you aren't speaking to ensure absolute silence between sentences."
|
||||
after={
|
||||
<Switch variant="Primary" value={callDenoiseGate} onChange={setCallDenoiseGate} />
|
||||
}
|
||||
/>
|
||||
|
||||
{callDenoiseGate && (
|
||||
<Box direction="Column" gap="100">
|
||||
<Box direction="Row" justify="SpaceBetween">
|
||||
<Text size="S300">Gate Threshold</Text>
|
||||
<Text size="S300" bold>
|
||||
{callDenoiseGateThreshold} dB
|
||||
</Text>
|
||||
</Box>
|
||||
<input
|
||||
type="range"
|
||||
min="-100"
|
||||
max="0"
|
||||
step="1"
|
||||
value={callDenoiseGateThreshold}
|
||||
onChange={(e) => setCallDenoiseGateThreshold(parseInt(e.target.value, 10))}
|
||||
style={{ width: '100%', accentColor: 'var(--lt-accent-orange)' }}
|
||||
/>
|
||||
<MicMeter />
|
||||
</Box>
|
||||
)}
|
||||
</Box>
|
||||
)}
|
||||
</SequenceCard>
|
||||
<SequenceCard
|
||||
className={SequenceCardStyle}
|
||||
|
||||
@@ -46,6 +46,10 @@ export const createCallEmbed = (
|
||||
container: HTMLElement,
|
||||
pref?: CallPreferences,
|
||||
denoiseMode: NoiseSuppressionMode = 'browser',
|
||||
denoiseModel: string = 'rnnoise',
|
||||
denoiseNativeNS: boolean = true,
|
||||
denoiseGate: boolean = false,
|
||||
denoiseGateThreshold: number = -45,
|
||||
forceAudioOff = false,
|
||||
): CallEmbed => {
|
||||
const rtcSession = mx.matrixRTC.getRoomSession(room);
|
||||
@@ -60,6 +64,10 @@ export const createCallEmbed = (
|
||||
intent,
|
||||
themeKind,
|
||||
denoiseMode,
|
||||
denoiseModel,
|
||||
denoiseNativeNS,
|
||||
denoiseGate,
|
||||
denoiseGateThreshold,
|
||||
initialAudio,
|
||||
initialVideo,
|
||||
);
|
||||
@@ -77,6 +85,10 @@ export const useCallStart = (dm = false) => {
|
||||
const setCallEmbed = useSetAtom(callEmbedAtom);
|
||||
const callEmbedRef = useCallEmbedRef();
|
||||
const [callNoiseSuppression] = useSetting(settingsAtom, 'callNoiseSuppression');
|
||||
const [callDenoiseModel] = useSetting(settingsAtom, 'callDenoiseModel');
|
||||
const [callDenoiseNativeNS] = useSetting(settingsAtom, 'callDenoiseNativeNS');
|
||||
const [callDenoiseGate] = useSetting(settingsAtom, 'callDenoiseGate');
|
||||
const [callDenoiseGateThreshold] = useSetting(settingsAtom, 'callDenoiseGateThreshold');
|
||||
const [pttMode] = useSetting(settingsAtom, 'pttMode');
|
||||
|
||||
const startCall = useCallback(
|
||||
@@ -97,12 +109,28 @@ export const useCallStart = (dm = false) => {
|
||||
container,
|
||||
pref,
|
||||
callNoiseSuppression ?? 'browser',
|
||||
callDenoiseModel ?? 'rnnoise',
|
||||
callDenoiseNativeNS ?? true,
|
||||
callDenoiseGate ?? false,
|
||||
callDenoiseGateThreshold ?? -45,
|
||||
!!pttMode,
|
||||
);
|
||||
|
||||
setCallEmbed(callEmbed);
|
||||
},
|
||||
[mx, dm, theme, setCallEmbed, callEmbedRef, callNoiseSuppression, pttMode],
|
||||
[
|
||||
mx,
|
||||
dm,
|
||||
theme,
|
||||
setCallEmbed,
|
||||
callEmbedRef,
|
||||
callNoiseSuppression,
|
||||
callDenoiseModel,
|
||||
callDenoiseNativeNS,
|
||||
callDenoiseGate,
|
||||
callDenoiseGateThreshold,
|
||||
pttMode,
|
||||
],
|
||||
);
|
||||
|
||||
return startCall;
|
||||
|
||||
@@ -382,6 +382,32 @@ function DeepLinkNavigator() {
|
||||
return null;
|
||||
}
|
||||
|
||||
function LotusDenoiseFeature() {
|
||||
const setToast = useSetAtom(toastQueueAtom);
|
||||
|
||||
useEffect(() => {
|
||||
const handleMessage = (event: MessageEvent) => {
|
||||
if (event.data?.type === 'lotus-denoise-status') {
|
||||
const { active, error } = event.data;
|
||||
if (!active) {
|
||||
setToast({
|
||||
id: `denoise-fail-${Date.now()}`,
|
||||
displayName: 'Audio Quality',
|
||||
body: `ML Noise Suppression failed: ${error || 'Unknown error'}. Falling back to raw mic.`,
|
||||
roomName: 'System',
|
||||
roomId: '',
|
||||
});
|
||||
}
|
||||
}
|
||||
};
|
||||
|
||||
window.addEventListener('message', handleMessage);
|
||||
return () => window.removeEventListener('message', handleMessage);
|
||||
}, [setToast]);
|
||||
|
||||
return null;
|
||||
}
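
The handler above acts on any `message` event whose `data.type` matches, and the shim posts with a `'*'` target origin. Since any frame or window can post to the host, a stricter variant would check the sender's origin and the payload shape before raising a toast. The sketch below is one way to do that. `isDenoiseStatus` and `acceptDenoiseStatus` are hypothetical names, and using the host's own origin as `expectedOrigin` is an assumption based on the shim being described as same-origin.

```typescript
interface DenoiseStatus {
  type: 'lotus-denoise-status';
  active: boolean;
  error?: string;
}

// Narrow untrusted postMessage data to the status shape the shim sends.
function isDenoiseStatus(data: unknown): data is DenoiseStatus {
  if (typeof data !== 'object' || data === null) return false;
  const d = data as Record<string, unknown>;
  return (
    d.type === 'lotus-denoise-status' &&
    typeof d.active === 'boolean' &&
    (d.error === undefined || typeof d.error === 'string')
  );
}

// Accept a message only when both the origin and the payload shape match.
function acceptDenoiseStatus(
  origin: string,
  data: unknown,
  expectedOrigin: string,
): DenoiseStatus | null {
  if (origin !== expectedOrigin) return null;
  return isDenoiseStatus(data) ? data : null;
}
```

Inside the effect this would be called as `acceptDenoiseStatus(event.origin, event.data, window.location.origin)`, showing the toast only when the result is non-null and `active` is `false`.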
|
||||
|
||||
export function ClientNonUIFeatures({ children }: ClientNonUIFeaturesProps) {
|
||||
return (
|
||||
<>
|
||||
@@ -391,6 +417,7 @@ export function ClientNonUIFeatures({ children }: ClientNonUIFeaturesProps) {
|
||||
<PresenceUpdater />
|
||||
<InviteNotifications />
|
||||
<MessageNotifications />
|
||||
<LotusDenoiseFeature />
|
||||
<DeepLinkNavigator />
|
||||
{children}
|
||||
</>
|
||||
|
||||
@@ -102,6 +102,10 @@ export class CallEmbed {
     intent: ElementCallIntent,
     themeKind: ElementCallThemeKind,
     denoiseMode: NoiseSuppressionMode = 'browser',
+    denoiseModel: string = 'rnnoise',
+    denoiseNativeNS: boolean = true,
+    denoiseGate: boolean = false,
+    denoiseGateThreshold: number = -45,
     initialAudio = true,
     initialVideo = false,
   ): Widget {
@@ -126,8 +130,8 @@ export class CallEmbed {
       lang: 'en-EN',
       theme: themeKind,
       // EC's built-in WebRTC suppressor: on only for 'browser' tier. For 'ml' we
-      // disable it here so RNNoise (the Lotus denoise shim) owns suppression and
-      // the two don't fight each other.
+      // disable it here so EC doesn't do its own extra processing, and let the
+      // Lotus denoise shim (which keeps native NS on) handle the pipeline.
       noiseSuppression: (denoiseMode === 'browser').toString(),
       audio: initialAudio.toString(),
       video: initialVideo.toString(),
@@ -135,9 +139,12 @@ export class CallEmbed {
     });

     if (denoiseMode === 'ml') {
-      // Signal the Lotus denoise shim (injected into the EC index.html) to route
-      // the mic through the RNNoise worklet before LiveKit publishes the track.
+      // Signal the Lotus denoise shim to route the mic through the ML processors.
       params.append('lotusDenoise', 'ml');
+      params.append('lotusModel', denoiseModel);
+      params.append('lotusNativeNS', denoiseNativeNS.toString());
+      params.append('lotusGate', denoiseGate.toString());
+      params.append('lotusGateThreshold', denoiseGateThreshold.toString());
     }

     if (CallEmbed.startingCall(intent)) {

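The hunk above serializes every denoise option to a string URL parameter, so the receiving side has to parse and validate them again. The shim's own parsing code is not part of this diff; the sketch below is only one way it could be done. `ShimDenoiseConfig` and `parseDenoiseParams` are hypothetical names, and the fallback values are assumed to match the defaults in `CallEmbed`.

```typescript
type DenoiseModelId = 'rnnoise' | 'speex' | 'dtln' | 'deepfilternet';

// Hypothetical shape for the parsed options; not taken from the shim itself.
interface ShimDenoiseConfig {
  enabled: boolean;
  model: DenoiseModelId;
  nativeNS: boolean;
  gate: boolean;
  gateThreshold: number;
}

const KNOWN_MODELS: readonly string[] = ['rnnoise', 'speex', 'dtln', 'deepfilternet'];

function isKnownModel(value: string | null): value is DenoiseModelId {
  return value !== null && KNOWN_MODELS.indexOf(value) !== -1;
}

// Parse the lotus* parameters, falling back to the CallEmbed defaults when a
// value is missing or malformed.
function parseDenoiseParams(query: string): ShimDenoiseConfig {
  const params = new URLSearchParams(query);
  const model = params.get('lotusModel');
  const rawThreshold = params.get('lotusGateThreshold');
  // Number(null) and Number('') are both 0, so treat those as "not provided".
  const threshold =
    rawThreshold === null || rawThreshold.trim() === '' ? NaN : Number(rawThreshold);

  return {
    enabled: params.get('lotusDenoise') === 'ml',
    model: isKnownModel(model) ? model : 'rnnoise',
    nativeNS: params.get('lotusNativeNS') !== 'false', // default: true
    gate: params.get('lotusGate') === 'true', // default: false
    gateThreshold: isFinite(threshold) ? threshold : -45,
  };
}
```

Validating the model id on the receiving side matters because `CallEmbed` types `denoiseModel` as a plain `string`, so nothing upstream guarantees it is one of the `DenoiseModelId` values.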
@@ -14,6 +14,7 @@ export type MessageSpacing = '0' | '100' | '200' | '300' | '400' | '500';
 // - 'browser' : WebRTC built-in suppression (Element Call noiseSuppression param)
 // - 'ml' : client-side RNNoise ML suppression (Lotus denoise shim)
 export type NoiseSuppressionMode = 'off' | 'browser' | 'ml';
+export type DenoiseModelId = 'rnnoise' | 'speex' | 'dtln' | 'deepfilternet';
 export type ChatBackground =
   | 'none'
   | 'blueprint'
@@ -115,6 +116,10 @@ export interface Settings {

   cameraOnJoin: boolean;
   callNoiseSuppression: NoiseSuppressionMode;
+  callDenoiseModel: DenoiseModelId;
+  callDenoiseNativeNS: boolean;
+  callDenoiseGate: boolean;
+  callDenoiseGateThreshold: number;
   pttMode: boolean;
   pttKey: string;

@@ -205,6 +210,10 @@ const defaultSettings: Settings = {

   cameraOnJoin: false,
   callNoiseSuppression: 'browser',
+  callDenoiseModel: 'rnnoise',
+  callDenoiseNativeNS: true,
+  callDenoiseGate: false,
+  callDenoiseGateThreshold: -45,
   pttMode: false,
   pttKey: 'Space',


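The new settings default `callDenoiseGateThreshold` to `-45`, but the diff does not state a unit. Assuming it is a level in decibels relative to full scale (an assumption, not something the source confirms), the sketch below shows what that number means for raw samples. All function names are illustrative, and the vendored noise-gate worklet may decide differently.

```typescript
// Assumption: the threshold is in dBFS; the diff does not state the unit.
const DEFAULT_GATE_THRESHOLD_DB = -45;

// Convert a decibel value to a linear amplitude ratio (0 dB maps to 1.0).
function dbToLinear(db: number): number {
  return Math.pow(10, db / 20);
}

// Root-mean-square level of one audio frame, in dB relative to full scale.
function frameLevelDb(frame: Float32Array): number {
  if (frame.length === 0) return -Infinity;
  let sumSquares = 0;
  for (let i = 0; i < frame.length; i++) {
    sumSquares += frame[i] * frame[i];
  }
  const rms = Math.sqrt(sumSquares / frame.length);
  // Math.log(0) is -Infinity, so a silent frame never opens the gate.
  return (20 * Math.log(rms)) / Math.LN10;
}

// A simple gate decision: pass the frame only if it is at least as loud as the threshold.
function isGateOpen(frame: Float32Array, thresholdDb: number = DEFAULT_GATE_THRESHOLD_DB): boolean {
  return frameLevelDb(frame) >= thresholdDb;
}
```

Under this reading, -45 dB corresponds to a linear amplitude of roughly 0.0056 of full scale, so quiet room tone would be muted while normal speech passes.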
@@ -0,0 +1,68 @@
+/**
+ * Detection utilities for Lotus ML noise suppression (RNNoise).
+ */
+
+export type DenoiseModel = {
+  id: string;
+  name: string;
+  description: string;
+  cpuUsage: string;
+  binarySize: string;
+  transients: 'Poor' | 'Good' | 'Excellent';
+  voiceQuality: 'Moderate' | 'High' | 'Very High';
+};
+
+export const DENOISE_MODELS: DenoiseModel[] = [
+  {
+    id: 'rnnoise',
+    name: 'RNNoise (Mozilla)',
+    description: 'Lightweight hybrid model. Best for consistent noise like fans.',
+    cpuUsage: '< 5%',
+    binarySize: '< 1 MB',
+    transients: 'Poor',
+    voiceQuality: 'Moderate',
+  },
+  {
+    id: 'dtln',
+    name: 'DTLN (Balanced)',
+    description: 'Deep learning model with a good balance of quality and CPU.',
+    cpuUsage: '10-20%',
+    binarySize: '3-4 MB',
+    transients: 'Good',
+    voiceQuality: 'High',
+  },
+  {
+    id: 'deepfilternet',
+    name: 'DeepFilterNet 3 (Pro)',
+    description: 'State-of-the-art studio quality. Removes all background noise.',
+    cpuUsage: '25-50%+',
+    binarySize: '15-20 MB',
+    transients: 'Excellent',
+    voiceQuality: 'Very High',
+  },
+];
+
+export const isMLDenoiseSupported = (): boolean => {
+  if (typeof window === 'undefined') return false;
+
+  // Requirements:
+  // 1. AudioContext/webkitAudioContext (Web Audio API)
+  // 2. AudioWorklet (Real-time processing in a background thread)
+  // 3. getUserMedia (Microphone access)
+  const hasAudioContext = !!(window.AudioContext || (window as any).webkitAudioContext);
+  const hasAudioWorklet = hasAudioContext && !!AudioWorkletNode;
+  const hasGetUserMedia = !!(navigator.mediaDevices && navigator.mediaDevices.getUserMedia);
+
+  return hasAudioWorklet && hasGetUserMedia;
+};
+
+/**
+ * EXACT requirements for ML Denoise (for UI display).
+ */
+export const ML_DENOISE_REQUIREMENTS = [
+  'Modern browser with Web Audio API support',
+  'AudioWorklet support (Chrome 66+, Firefox 76+, Safari 14.1+)',
+  'Microphone access',
+  '48kHz AudioContext capability',
+];

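One detail in `isMLDenoiseSupported` above is worth flagging: `!!AudioWorkletNode` reads a bare global, which throws a `ReferenceError` in a browser that has `AudioContext` but no `AudioWorkletNode`, instead of returning `false`. A hedged sketch of a more defensive check follows. It takes the environment as a parameter so it can run outside a browser; the `AudioEnv` type and the function name are hypothetical and not part of the commit.

```typescript
// Hypothetical minimal view of the globals the check needs.
interface AudioEnv {
  AudioContext?: unknown;
  webkitAudioContext?: unknown;
  AudioWorkletNode?: unknown;
  navigator?: { mediaDevices?: { getUserMedia?: unknown } };
}

// Same three requirements as the committed helper, but every lookup is a
// property read on `env`, so a missing API yields false rather than throwing.
function detectMLDenoiseSupport(env: AudioEnv | undefined): boolean {
  if (!env) return false;

  const hasAudioContext =
    typeof env.AudioContext === 'function' || typeof env.webkitAudioContext === 'function';
  const hasAudioWorklet = hasAudioContext && typeof env.AudioWorkletNode === 'function';
  const hasGetUserMedia = typeof env.navigator?.mediaDevices?.getUserMedia === 'function';

  return hasAudioWorklet && hasGetUserMedia;
}
```

In the app this could be called as `detectMLDenoiseSupport(typeof window === 'undefined' ? undefined : (window as unknown as AudioEnv))`. Like the original, it does not verify the listed "48kHz AudioContext capability", which would need an actual `AudioContext` instance to check.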
+32
-1
@@ -87,9 +87,40 @@ function lotusDenoise() {
     ],
     [path.join(sapphi, 'rnnoise.wasm'), path.join(denoiseDir, 'rnnoise.wasm')],
     [path.join(sapphi, 'rnnoise_simd.wasm'), path.join(denoiseDir, 'rnnoise_simd.wasm')],
+    [
+      path.join(sapphi, 'speex/workletProcessor.js'),
+      path.join(denoiseDir, 'speexWorklet.js'),
+    ],
+    [path.join(sapphi, 'speex.wasm'), path.join(denoiseDir, 'speex.wasm')],
+    [
+      path.join(sapphi, 'noiseGate/workletProcessor.js'),
+      path.join(denoiseDir, 'noiseGateWorklet.js'),
+    ],
+    // DTLN (WorkAdventure LiteRT implementation)
+    [
+      path.resolve('node_modules/@workadventure/noise-suppression/dist/noise-suppression-processor.js'),
+      path.join(denoiseDir, 'dtlnWorklet.js'),
+    ],
+    [
+      path.resolve('node_modules/@workadventure/noise-suppression/dist/litert_wasm_internal.wasm'),
+      path.join(denoiseDir, 'litert_wasm_internal.wasm'),
+    ],
+    [
+      path.resolve('node_modules/@workadventure/noise-suppression/dist/model_1.tflite'),
+      path.join(denoiseDir, 'model_1.tflite'),
+    ],
+    [
+      path.resolve('node_modules/@workadventure/noise-suppression/dist/model_2.tflite'),
+      path.join(denoiseDir, 'model_2.tflite'),
+    ],
   ];
   assets.forEach(([s, d]) => {
-    if (fs.existsSync(s)) fs.copyFileSync(s, d);
+    if (fs.existsSync(s)) {
+      fs.copyFileSync(s, d);
+    } else {
+      // eslint-disable-next-line no-console
+      console.warn(`[lotus-denoise] Asset missing, will be populated by CI: ${s}`);
+    }
   });

   const shimSrc = path.resolve('build/lotus-denoise.js');