# Ideas / Future Improvements
## nnnoiseless (RNNoise) Denoiser Improvements
### VAD-gated passthrough
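`process_frame()` returns an `f32` voice-activity (VAD) probability in the range 0.0 to 1.0, which is currently ignored. Use it to bypass the denoiser when the probability is low and pass the original frame through instead. This keeps the model from suppressing non-speech audio (hold music, DTMF tones, IVR prompts). The frame still has to go through `process_frame()` to obtain the probability, so the gate chooses between the original and the denoised samples rather than saving CPU.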
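
A minimal sketch of the gate, assuming the caller already holds both the original frame and the denoised frame along with the VAD probability for that frame. The function name `select_output` and the idea of a single fixed threshold are illustrative assumptions, not part of nnnoiseless; a real threshold would need tuning against call recordings.

```rust
/// Copies either the denoised or the original samples into `out`,
/// depending on the VAD probability reported for this frame.
/// Returns `true` when the denoised samples were used.
///
/// `threshold` is a hypothetical tuning knob, not an nnnoiseless parameter.
fn select_output(
    vad_prob: f32,
    threshold: f32,
    original: &[f32],
    denoised: &[f32],
    out: &mut [f32],
) -> bool {
    assert_eq!(original.len(), denoised.len());
    assert_eq!(original.len(), out.len());

    // A NaN probability compares false, so it falls through to passthrough.
    let use_denoised = vad_prob >= threshold;
    let src = if use_denoised { denoised } else { original };
    out.copy_from_slice(src);
    use_denoised
}
```

A hard switch like this can produce audible steps when the decision flips between frames, so a short hold time or crossfade may be worth adding on top.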
### Pre-warm denoiser on session creation
The first `process_frame()` call on a fresh `DenoiseState` produces fade-in artifacts, which is documented behavior. During `TranscodeState::new()`, feed one silent 480-sample frame through the denoiser and discard the output, so that the first real audio frame is processed with a warmed-up RNN state.
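
The warm-up step could look roughly like the sketch below. To keep it self-contained, the denoiser is represented by a hypothetical `FrameDenoiser` trait whose method mirrors the shape of `process_frame()` (output buffer, input buffer, returned VAD probability). The trait, the `prewarm` helper, and the `frames` parameter are assumptions for illustration only.

```rust
/// Frame length in samples, as stated above (10 ms at 48 kHz).
const FRAME_SIZE: usize = 480;

/// Hypothetical stand-in for the real denoiser type; only the call
/// shape (output, input, returned VAD probability) matters here.
trait FrameDenoiser {
    fn process_frame(&mut self, output: &mut [f32], input: &[f32]) -> f32;
}

/// Runs `frames` silent frames through the denoiser and throws the
/// results away, so later calls start from a warmed-up state.
fn prewarm<D: FrameDenoiser>(denoiser: &mut D, frames: usize) {
    let silence = [0.0f32; FRAME_SIZE];
    let mut scratch = [0.0f32; FRAME_SIZE];
    for _ in 0..frames {
        // Both the output samples and the VAD probability are discarded.
        let _ = denoiser.process_frame(&mut scratch, &silence);
    }
}
```

Calling `prewarm(&mut denoiser, 1)` from the session constructor matches the single silent frame described above. Whether more than one frame gives an audible benefit would need to be checked by listening.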
### Custom telephony-trained RNNoise model
nnnoiseless can load custom `.rnn` model files via `RnnModel::from_bytes()` or `RnnModel::from_static_bytes()`. The default model is trained on general audio, so a model trained specifically on telephony noise profiles (codec artifacts, line noise, echo residual) would likely perform better on call audio. Models from https://github.com/GregorR/rnnoise-models can be converted with `train/convert_rnnoise.py`.
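
If a custom model is adopted, the session should probably still start when the model file is missing or malformed. The sketch below shows one way to structure that fallback without depending on the crate. `load_model_or_default` is a hypothetical helper, and it assumes the parser signals failure by returning `None`. Check the actual `RnnModel::from_bytes()` signature before wiring it in.

```rust
use std::fs;
use std::path::Path;

/// Reads the file at `path` and passes its bytes to `parse`.
/// Falls back to `default` when the file cannot be read or the
/// parser rejects the bytes.
///
/// `parse` and `default` are supplied by the caller; nothing here
/// is specific to nnnoiseless.
fn load_model_or_default<M>(
    path: &Path,
    parse: impl FnOnce(&[u8]) -> Option<M>,
    default: impl FnOnce() -> M,
) -> M {
    match fs::read(path) {
        Ok(bytes) => parse(&bytes).unwrap_or_else(default),
        Err(_) => default(),
    }
}
```

In practice `parse` would wrap the model loader and `default` would construct the built-in model. Logging which branch was taken makes it easier to notice a bad deployment of the model file.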