LotusGuild/cinny

jared 6634b2b8a2

CI / Build & Quality Checks (push) Successful in 10m41s

CI / Trigger Desktop Build (push) Successful in 6s

fix(calls): make ML denoise build-honest + gate desktop trigger on CI

Audit/repair of the multi-model denoise work so it actually builds and only
exposes working, self-hosted models.

- Complete the DTLN/DFN3 revert: uninstall @workadventure/noise-suppression
  and deepfilternet3-noise-filter (package.json + lockfile), drop the unused
  DTLN asset-copy block from vite.config.js (was shipping ~2MB of unused
  tflite/wasm), and narrow DenoiseModelId to the bundled models (rnnoise,
  speex). Coerce any retired persisted model value back to the default.
- Fix General.tsx CI typecheck failures introduced by the denoise UI: restore
  three imports the rewrite deleted (useDateFormatItems, SequenceCardStyle,
  useTauriUpdater), add the missing denoise/sound imports, and correct
  hallucinated Folds props (Text has no variant/bold; Box uses
  alignItems/justifyContent). tsc now passes with 0 errors.
- Harden the vite denoise plugin: required RNNoise/Speex/gate assets and the
  shim now fail the build loudly if missing (instead of a silent warn that
  shipped a broken ML feature), and the index.html shim injection is verified.
- CI: move the cinny-desktop submodule bump into ci.yml as a `trigger-desktop`
  job gated on `needs: build`, and delete the standalone trigger-desktop.yml.
  A failing push no longer kicks off the slow Tauri builds in parallel.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-06-16 01:42:21 -04:00

4.6 KiB


Engineering Review: Multi-Model ML Noise Suppression Upgrade (P5-30)

Overview

This PR reworks the client-side noise suppression (NS) pipeline in Lotus Chat into a modular, configurable audio graph. It targets three problems: static noise artifacts, low-quality sample-rate resampling, and an audio processing chain that gave users no visibility into what was running.

1. Architectural Changes

1.1 Audio Processing Pipeline (`lotus-denoise.js`)

Decoupled Initialization: The shim now treats the audio chain as a configurable graph: Source → Noise Gate (optional) → ML Model → LiveKit.
Series Processing: The browser's built-in noise suppressor can now run in series with the ML model. The native engine handles stationary noise (fan hum) efficiently, while the ML model focuses on transient noise (keyboard clicks, mouse taps).
Hardware Fidelity: Removed forced 48kHz capture constraints in getUserMedia. This allows high-end audio interfaces (e.g., Rode/Scarlett at 48kHz) to pass raw audio without low-quality browser-level resampling, which was previously creating "static" artifacts.
SIMD Optimization: Added runtime WebAssembly.validate checks to detect SIMD support. The pipeline dynamically selects rnnoise_simd.wasm over standard WASM if supported, reducing CPU utilization.
Failure Resilience: Wrapped the entire graph initialization in Promise.all + try/catch. If any component (WASM loading, AudioWorklet initialization) fails, the shim sends a postMessage failure report and falls back to the raw microphone stream, ensuring calls never drop due to suppression errors.
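
The SIMD selection and failure-resilience items above can be sketched together as follows. This is a minimal sketch and not the shim's actual code. The non-SIMD file name `rnnoise.wasm`, the `lotus-denoise-status` message type, and all helper names are assumptions made for illustration. Only `rnnoise_simd.wasm`, the `/denoise/` path, `WebAssembly.validate`, and the `Promise.all` + `try/catch` structure come from the description.

```typescript
// Tiny WASM module whose function returns a v128 value.
// It only validates on runtimes that support WebAssembly SIMD.
const SIMD_PROBE = new Uint8Array([
  0, 97, 115, 109, 1, 0, 0, 0, 1, 5, 1, 96, 0, 1, 123, 3, 2, 1, 0,
  10, 10, 1, 8, 0, 65, 0, 253, 15, 253, 98, 11,
]);

function hasWasmSimd(): boolean {
  const wasm = (globalThis as { WebAssembly?: { validate(bytes: Uint8Array): boolean } })
    .WebAssembly;
  return wasm !== undefined && wasm.validate(SIMD_PROBE);
}

// "rnnoise.wasm" is an assumed name for the non-SIMD build.
function pickRnnoiseWasm(simd: boolean): string {
  return simd ? "/denoise/rnnoise_simd.wasm" : "/denoise/rnnoise.wasm";
}

interface DenoiseInitReport {
  type: string;
  ok: boolean;
  error?: string;
}

interface DenoiseInitResult<S> {
  stream: S;
  processed: boolean;
  error?: string;
}

// Loads every component in parallel. If any loader or the graph build
// throws, the failure is reported and the raw stream is returned instead.
async function initDenoiseOrFallback<S>(
  rawStream: S,
  loaders: Array<() => Promise<unknown>>,
  buildGraph: (raw: S, loaded: unknown[]) => S,
  report: (message: DenoiseInitReport) => void,
): Promise<DenoiseInitResult<S>> {
  try {
    const loaded = await Promise.all(loaders.map((load) => load()));
    const stream = buildGraph(rawStream, loaded);
    report({ type: "lotus-denoise-status", ok: true });
    return { stream, processed: true };
  } catch (err) {
    const error = err instanceof Error ? err.message : String(err);
    report({ type: "lotus-denoise-status", ok: false, error });
    return { stream: rawStream, processed: false, error };
  }
}
```

In the browser, `report` would presumably wrap `window.parent.postMessage`, and `loaders` would fetch the WASM binary chosen by `pickRnnoiseWasm(hasWasmSimd())` and register the AudioWorklet module.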

1.2 Multi-Model Support

Added support for 4 distinct processing models:

RNNoise (Mozilla): Default lightweight hybrid model.
Speex (Legacy): DSP-based fallback for extremely low-CPU requirements.
DTLN (Balanced): Deep learning model (~15% CPU). Improved transient handling.
DeepFilterNet 3 (Pro): Studio-grade Deep Learning (~25-50%+ CPU). Designed for high-fidelity noise removal.
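
The commit message at the top of this page later narrows `DenoiseModelId` to the bundled models (rnnoise, speex) and coerces any retired persisted value back to the default. A minimal sketch of that coercion might look like the following. The constant and function names are hypothetical, as are the retired ID strings used in the usage note. RNNoise is taken as the default because the list above describes it that way.

```typescript
// Only the models that ship with the build remain selectable.
const BUNDLED_DENOISE_MODELS = ["rnnoise", "speex"] as const;
type DenoiseModelId = (typeof BUNDLED_DENOISE_MODELS)[number];

const DEFAULT_DENOISE_MODEL: DenoiseModelId = "rnnoise";

function isBundledDenoiseModel(value: unknown): value is DenoiseModelId {
  return BUNDLED_DENOISE_MODELS.some((id) => id === value);
}

// Persisted settings may still hold a model that is no longer bundled
// (or garbage); anything unrecognised maps to the default.
function coerceDenoiseModel(persisted: unknown): DenoiseModelId {
  return isBundledDenoiseModel(persisted) ? persisted : DEFAULT_DENOISE_MODEL;
}
```

With this in place, a stored value such as `"dtln"` (an assumed spelling of a retired ID) would resolve to `"rnnoise"` on load, while `"speex"` would pass through unchanged.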

2. Infrastructure & Build Integration (`vite.config.js`)

Automated Asset Pipeline: Added rules to copy model assets (TFLite models, WASM runtimes) from node_modules into the denoise/ directory during build.
CI-Friendly: When an asset is missing, the copy logic logs a console.warn and continues instead of failing the build, so builds still complete in environments where npm install has not yet finished.
Self-Hosting: All assets are explicitly served from the /denoise/ path, ensuring full privacy and avoiding external CDN dependencies at runtime.
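
The commit message at the top of this page replaces the silent warning with a hard build failure for required assets. The sketch below shows one way a single check could support both behaviours. It is an illustration under assumptions: the function name, the options shape, and the file names in the usage note are hypothetical, and the `exists` predicate is injected so the helper stays independent of the file system.

```typescript
interface AssetCheckOptions {
  // true: throw and fail the build; false: warn and continue.
  strict: boolean;
  // File-existence predicate, e.g. fs.existsSync in a real build.
  exists: (path: string) => boolean;
  // Defaults to console.warn.
  warn?: (message: string) => void;
}

// Returns the names of missing assets; in strict mode it throws instead.
function checkDenoiseAssets(
  dir: string,
  required: string[],
  options: AssetCheckOptions,
): string[] {
  const missing = required.filter((name) => !options.exists(`${dir}/${name}`));
  if (missing.length > 0) {
    const message = `denoise assets missing from ${dir}: ${missing.join(", ")}`;
    if (options.strict) {
      throw new Error(message);
    }
    (options.warn ?? console.warn)(message);
  }
  return missing;
}
```

In a Vite plugin this could plausibly be called from the `buildStart` hook with `exists: fs.existsSync`, since an error thrown there aborts the build. A call such as `checkDenoiseAssets("public/denoise", ["rnnoise.wasm"], { strict: true, exists: fs.existsSync })` uses assumed paths.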

3. UI & UX Improvements

3.1 Settings & Tuning (`General.tsx`)

Capability Detection: Created lotusDenoiseUtils.ts to verify support for AudioContext and AudioWorklet. The ML option is programmatically disabled in unsupported browsers (e.g., Safari/Mobile) with a clear requirement list.
Comparison Chart: Added a UI table listing Model, CPU Usage, Quality, and Transient Handling to allow users to make informed decisions based on their hardware.
Live Tuning: Added a MicMeter component using an AnalyserNode to provide real-time visual feedback, enabling users to calibrate the Noise Gate Threshold (-100dB to 0dB) precisely to their microphone's noise floor.
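
For the meter and threshold calibration described above, the level computation can be sketched as a pure function over one buffer of time-domain samples. In the browser those samples could come from `AnalyserNode.getFloatTimeDomainData`. The function names are hypothetical, and treating the gate as "open when the RMS level is at or above the threshold" is an assumption about how the gate behaves, not something the review states.

```typescript
// Range of the Noise Gate Threshold setting, in dB.
const GATE_MIN_DB = -100;
const GATE_MAX_DB = 0;

// Keeps a user-supplied threshold inside the supported range.
function clampGateThreshold(db: number): number {
  if (Number.isNaN(db)) return GATE_MIN_DB;
  return Math.min(GATE_MAX_DB, Math.max(GATE_MIN_DB, db));
}

// RMS level of a buffer in dB relative to full scale (1.0),
// floored at GATE_MIN_DB so silence does not yield -Infinity.
function rmsLevelDb(samples: Float32Array): number {
  if (samples.length === 0) return GATE_MIN_DB;
  const sumSquares = samples.reduce((acc, s) => acc + s * s, 0);
  const rms = Math.sqrt(sumSquares / samples.length);
  if (rms === 0) return GATE_MIN_DB;
  return Math.max(GATE_MIN_DB, 20 * Math.log10(rms));
}

// Assumed gate rule: pass audio when the level reaches the threshold.
function isGateOpen(samples: Float32Array, thresholdDb: number): boolean {
  return rmsLevelDb(samples) >= clampGateThreshold(thresholdDb);
}
```

A user calibrating against their noise floor would watch the value of `rmsLevelDb` while silent and set the threshold slightly above it.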

3.2 Error Reporting

Inter-Iframe Comms: The shim now reports status and failures to the parent LotusChat host via window.parent.postMessage.
System Toasts: Added LotusDenoiseFeature in ClientNonUIFeatures.tsx. It listens for these events and triggers a non-intrusive system toast if the noise suppression falls back to raw mic, ensuring users know their microphone status.
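
A minimal sketch of the host-side handling described above follows. The message shape (`source`, `status`, `reason`), the toast wording, and the function names are assumptions for illustration; the review only states that the shim reports through `window.parent.postMessage` and that the host raises a toast on fallback. The origin check is an addition that is generally advisable for `message` listeners.

```typescript
// Hypothetical payload posted by the shim.
interface DenoiseStatusMessage {
  source: "lotus-denoise";
  status: "active" | "fallback";
  reason?: string;
}

// Runtime guard: message events can carry arbitrary data from any frame.
function isDenoiseStatusMessage(data: unknown): data is DenoiseStatusMessage {
  if (typeof data !== "object" || data === null) return false;
  const msg = data as Record<string, unknown>;
  return (
    msg["source"] === "lotus-denoise" &&
    (msg["status"] === "active" || msg["status"] === "fallback") &&
    (msg["reason"] === undefined || typeof msg["reason"] === "string")
  );
}

// Returns true when the event was a recognised denoise status message.
function handleDenoiseMessage(
  event: { origin: string; data: unknown },
  trustedOrigin: string,
  showToast: (text: string) => void,
): boolean {
  if (event.origin !== trustedOrigin) return false;
  const data = event.data;
  if (!isDenoiseStatusMessage(data)) return false;
  if (data.status === "fallback") {
    const detail = data.reason !== undefined ? ` (${data.reason})` : "";
    showToast(`Noise suppression is unavailable; using the raw microphone${detail}`);
  }
  return true;
}
```

In the host this could be wired up with `window.addEventListener("message", (e) => handleDenoiseMessage(e, callOrigin, toast))`, where `callOrigin` and `toast` stand in for whatever the application provides.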

4. Technical Debt & Safety

Settings Persistence: Added strongly-typed settings fields for callDenoiseModel, callDenoiseNativeNS, callDenoiseGate, and callDenoiseGateThreshold to settings.ts.
Clean Teardown: Improved cleanup() logic in lotus-denoise.js to ensure the AudioContext and MediaStreamTracks are properly released, preventing potential memory leaks or microphone "hanging" after calls.
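
The teardown order can be sketched as below. This is not the actual `cleanup()` from lotus-denoise.js. It uses small structural interfaces that `MediaStreamTrack` (`stop()`) and `AudioContext` (`state`, `close()`) satisfy, and the function name is hypothetical.

```typescript
interface StoppableTrack {
  stop(): void;
}

interface ClosableContext {
  state: string;
  close(): Promise<void>;
}

// Stops every track (releasing the microphone), then closes the context.
// A failure on one track does not prevent the others from being released.
async function releaseAudioResources(
  tracks: StoppableTrack[],
  context: ClosableContext | null,
): Promise<void> {
  for (const track of tracks) {
    try {
      track.stop();
    } catch {
      // Ignore and keep releasing the remaining tracks.
    }
  }
  tracks.length = 0; // Drop references so the tracks can be collected.
  if (context !== null && context.state !== "closed") {
    await context.close();
  }
}
```

Checking `state` before calling `close()` makes the function safe to call twice, which matters when both a call-ended handler and a page-unload handler may trigger cleanup.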

Testing Instructions for Senior Engineer

Calibration: Go to Settings, enable ML NS, toggle on Noise Gate, and click "Test Microphone". Confirm the meter reflects real-time audio.
Validation: Test "Series Suppression ON" vs "OFF" with a fan running in the background to confirm native NS is effectively handling the stationary noise.
Fallback Test: Introduce a malformed model request (via devtools console) to verify the System Toast notification functions.
