@serve.zone/siprouter
A production-grade SIP B2BUA + WebRTC bridge built with TypeScript and Rust. Routes calls between SIP providers, SIP hardware devices, and browser softphones — with real-time codec transcoding, adaptive jitter buffering, ML noise suppression, neural TTS, voicemail, IVR menus, and a slick web dashboard.
Issue Reporting and Security
For reporting bugs, issues, or security vulnerabilities, please visit community.foss.global/. This is the central community hub for all issue reporting. Developers who sign and comply with our contribution agreement and go through identification can also get a code.foss.global/ account to submit Pull Requests directly.
🔥 What It Does
siprouter sits between your SIP trunk providers and your endpoints — hardware phones, ATAs, browser softphones — and handles everything in between:
- 📞 SIP B2BUA — Terminates and re-originates calls with full RFC 3261 dialog state management, digest auth, and SDP negotiation
- 🌐 WebRTC Bridge — Browser-based softphone with bidirectional Opus audio to the SIP network
- 🎛️ Multi-Provider Trunking — Register with multiple SIP providers simultaneously (sipgate, easybell, etc.) with automatic failover
- 🎧 48kHz f32 Audio Engine — High-fidelity internal audio bus at 48kHz/32-bit float with native Opus float encode/decode, FFT-based resampling, and per-leg ML noise suppression
- 🔀 N-Leg Mix-Minus Mixer — Conference-grade mixing with dynamic leg add/remove, transfer, and per-source audio separation
- 🎯 Adaptive Jitter Buffer — Per-leg jitter buffering with sequence-based reordering, adaptive depth (2–6 frames, i.e. 40–120ms, starting at 60ms), Opus PLC for lost packets, and hold/resume detection
- 📧 Voicemail — Configurable voicemail boxes with TTS greetings, recording, and web playback
- 🔢 IVR Menus — DTMF-navigable interactive voice response with nested menus, routing actions, and custom prompts
- 🗣️ Neural TTS — Kokoro-powered announcements and greetings with 25+ voice presets, backed by espeak-ng fallback
- 🎙️ Call Recording — Per-source separated WAV recording at 48kHz via tool legs
- 🖥️ Web Dashboard — Real-time SPA with 9 views: overview, live calls, browser phone, routing, voicemail, IVR, contacts, providers, and streaming logs
🏗️ Architecture
┌─────────────────────────────────────┐
│          Browser Softphone          │
│  (WebRTC via WebSocket signaling)   │
└──────────────┬──────────────────────┘
               │ Opus/WebRTC
               ▼
┌──────────────────────────────────────┐
│              siprouter               │
│                                      │
│  TypeScript Control Plane            │
│  ┌────────────────────────────────┐  │
│  │  Config · WebRTC Signaling     │  │
│  │  REST API · Web Dashboard      │  │
│  │  Voicebox Manager · TTS Cache  │  │
│  └────────────┬───────────────────┘  │
│         JSON-over-stdio IPC          │
│  ┌────────────┴───────────────────┐  │
│  │ Rust proxy-engine (data plane) │  │
│  │                                │  │
│  │  SIP Stack · Dialog SM · Auth  │  │
│  │  Call Manager · N-Leg Mixer    │  │
│  │  48kHz f32 Bus · Jitter Buffer │  │
│  │  Codec Engine · RTP Port Pool  │  │
│  │  WebRTC Engine · Kokoro TTS    │  │
│  │  Voicemail · IVR · Recording   │  │
│  └────┬──────────────────┬────────┘  │
└───────┤──────────────────┤───────────┘
        │                  │
 ┌──────┴──────┐    ┌──────┴──────┐
 │ SIP Devices │    │  SIP Trunk  │
 │ (HT801 etc) │    │  Providers  │
 └─────────────┘    └─────────────┘
🧠 Key Design Decisions
- Hub Model — Every call is a hub with N legs. Each leg is a SipLeg (device/provider) or WebRtcLeg (browser). Legs can be dynamically added, removed, or transferred without tearing down the call.
- Rust Data Plane — All SIP protocol handling, codec transcoding, mixing, and RTP I/O run in native Rust for real-time performance. TypeScript handles config, signaling, REST API, and dashboard.
- 48kHz f32 Internal Bus — Audio is processed at maximum quality internally. Encoding/decoding to wire format (G.722, PCMU, Opus) happens solely at the leg boundary.
- Per-Session Codec Isolation — Each call leg gets its own encoder/decoder/resampler/denoiser state — no cross-call corruption.
- SDP Codec Negotiation — Outbound encoding uses the codec actually negotiated in SDP answers, not just the first offered codec.
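The hub model can be pictured as a small data structure. The following TypeScript sketch is illustrative only: the real logic lives in the Rust proxy-engine, and `CallHubSketch`, `HubLeg`, and every method name here are hypothetical. It also assumes that "transfer" means moving a leg from one hub to another.

```typescript
// Hypothetical sketch of the hub model: one call, N legs.
// CallHubSketch, HubLeg and all method names are illustrative assumptions.
type HubLeg =
  | { legId: string; legKind: "sip"; peer: "device" | "provider" }
  | { legId: string; legKind: "webrtc" };

class CallHubSketch {
  readonly callId: string;
  private readonly legs = new Map<string, HubLeg>();

  constructor(callId: string) {
    this.callId = callId;
  }

  // Adding a leg leaves the legs that are already connected untouched.
  addLeg(leg: HubLeg): void {
    if (this.legs.has(leg.legId)) {
      throw new Error(`leg ${leg.legId} is already part of call ${this.callId}`);
    }
    this.legs.set(leg.legId, leg);
  }

  // Returns true if the leg existed and was removed.
  removeLeg(legId: string): boolean {
    return this.legs.delete(legId);
  }

  // Transfer (assumed semantics): detach the leg here, attach it to another hub.
  transferLeg(legId: string, target: CallHubSketch): boolean {
    const leg = this.legs.get(legId);
    if (leg === undefined || target.legs.has(legId)) return false;
    this.legs.delete(legId);
    target.legs.set(legId, leg);
    return true;
  }

  legIds(): string[] {
    return Array.from(this.legs.keys());
  }
}
```

Because legs are keyed independently, adding or removing one leg never requires rebuilding the others, which is the property the hub model is after.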
🚀 Getting Started
Prerequisites
- Node.js ≥ 20 with tsx globally available
- pnpm for package management
- Rust toolchain (for building the proxy engine)
- espeak-ng (optional, for TTS fallback)
Install & Build
# Install dependencies (run from the cloned repository root)
pnpm install
# Build the Rust proxy-engine binary
pnpm run buildRust
# Bundle the web frontend
pnpm run bundle
Configuration
Create .nogit/config.json:
{
  "proxy": {
    "lanIp": "192.168.1.100",                       // Your server's LAN IP
    "lanPort": 5070,                                // SIP signaling port
    "publicIpSeed": "stun.example.com",             // STUN server for public IP discovery
    "rtpPortRange": { "min": 20000, "max": 20200 }, // RTP port pool (even ports)
    "webUiPort": 3060                               // Dashboard + REST API port
  },
  "providers": [
    {
      "id": "my-trunk",
      "displayName": "My SIP Provider",
      "domain": "sip.provider.com",
      "outboundProxy": { "address": "sip.provider.com", "port": 5060 },
      "username": "user",
      "password": "pass",
      "codecs": [9, 0, 8, 101],                     // G.722, PCMU, PCMA, telephone-event
      "registerIntervalSec": 300
    }
  ],
  "devices": [
    {
      "id": "desk-phone",
      "displayName": "Desk Phone",
      "expectedAddress": "192.168.1.50",
      "extension": "100"
    }
  ],
  "routing": {
    "routes": [
      {
        "id": "inbound-default",
        "name": "Ring all devices",
        "priority": 100,
        "direction": "inbound",
        "match": {},
        "action": {
          "targets": ["desk-phone"],
          "ringBrowsers": true,
          "voicemailBox": "main",
          "noAnswerTimeout": 25
        }
      },
      {
        "id": "outbound-default",
        "name": "Route via trunk",
        "priority": 100,
        "direction": "outbound",
        "match": {},
        "action": { "provider": "my-trunk" }
      }
    ]
  },
  "voiceboxes": [
    {
      "id": "main",
      "enabled": true,
      "greetingText": "Please leave a message after the beep.",
      "greetingVoice": "af_bella",
      "noAnswerTimeoutSec": 25,
      "maxRecordingSec": 120,
      "maxMessages": 50
    }
  ],
  "contacts": [
    { "id": "1", "name": "Alice", "number": "+491234567890", "starred": true }
  ]
}
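Routes in the example refer to devices, voiceboxes, and providers by id, so a typo in one id can silently break routing. The sketch below shows one way to cross-check those references before starting. It is not the project's loader: the `SiprouterConfigSketch` type only covers the fields used here (inferred from the example), `findConfigProblems` is a hypothetical helper, and it assumes that `targets` entries are device ids.

```typescript
// Partial config shape inferred from the example above (assumption).
interface SiprouterConfigSketch {
  proxy: { rtpPortRange: { min: number; max: number } };
  providers: { id: string }[];
  devices: { id: string }[];
  voiceboxes: { id: string }[];
  routing: {
    routes: {
      id: string;
      action: { targets?: string[]; voicemailBox?: string; provider?: string };
    }[];
  };
}

// Hypothetical helper: returns human-readable problems, empty if none found.
function findConfigProblems(cfg: SiprouterConfigSketch): string[] {
  const problems: string[] = [];
  const { min, max } = cfg.proxy.rtpPortRange;
  if (min >= max) {
    problems.push(`rtpPortRange: min ${min} must be below max ${max}`);
  }

  const deviceIds = new Set(cfg.devices.map((d) => d.id));
  const providerIds = new Set(cfg.providers.map((p) => p.id));
  const voiceboxIds = new Set(cfg.voiceboxes.map((v) => v.id));

  for (const route of cfg.routing.routes) {
    // Assumption: every target names a configured device.
    for (const target of route.action.targets ?? []) {
      if (!deviceIds.has(target)) {
        problems.push(`route ${route.id}: unknown device "${target}"`);
      }
    }
    const box = route.action.voicemailBox;
    if (box !== undefined && !voiceboxIds.has(box)) {
      problems.push(`route ${route.id}: unknown voicebox "${box}"`);
    }
    const provider = route.action.provider;
    if (provider !== undefined && !providerIds.has(provider)) {
      problems.push(`route ${route.id}: unknown provider "${provider}"`);
    }
  }
  return problems;
}
```

For the example configuration above, this check would return an empty list, since "desk-phone", "main", and "my-trunk" are all defined.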
TTS Setup (Optional)
For neural announcements and voicemail greetings, download the Kokoro TTS model:
mkdir -p .nogit/tts
curl -L -o .nogit/tts/kokoro-v1.0.onnx \
https://github.com/mzdk100/kokoro/releases/download/V1.0/kokoro-v1.0.onnx
curl -L -o .nogit/tts/voices.bin \
https://github.com/mzdk100/kokoro/releases/download/V1.0/voices.bin
Without the model files, TTS falls back to espeak-ng. Without either, announcements are skipped — everything else works fine.
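The fallback order described above can be written down as a small decision function. This is a sketch under assumptions, not the actual selection code: `chooseTtsEngine` and `TtsAvailability` are hypothetical names, and it assumes Kokoro needs both downloaded files to be present.

```typescript
type TtsEngineChoice = "kokoro" | "espeak-ng" | "none";

// Hypothetical availability flags; how they are detected is left open.
interface TtsAvailability {
  kokoroModelPresent: boolean;  // .nogit/tts/kokoro-v1.0.onnx exists
  kokoroVoicesPresent: boolean; // .nogit/tts/voices.bin exists
  espeakInstalled: boolean;     // espeak-ng is installed on the host
}

function chooseTtsEngine(avail: TtsAvailability): TtsEngineChoice {
  // Assumption: Kokoro is only usable when both files are available.
  if (avail.kokoroModelPresent && avail.kokoroVoicesPresent) return "kokoro";
  if (avail.espeakInstalled) return "espeak-ng";
  // Neither engine available: announcements are skipped, calls still work.
  return "none";
}
```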
Run
pnpm start
The SIP proxy starts on the configured port and the web dashboard is available at https://<your-ip>:3060.
HTTPS (Optional)
Place cert.pem and key.pem in .nogit/ for TLS on the dashboard.
📂 Project Structure
siprouter/
├── ts/                        # TypeScript control plane
│   ├── sipproxy.ts            # Main entry — bootstraps everything
│   ├── config.ts              # Config loader & validation
│   ├── proxybridge.ts         # Rust proxy-engine IPC bridge (smartrust)
│   ├── frontend.ts            # Web dashboard HTTP/WS server + REST API
│   ├── webrtcbridge.ts        # WebRTC signaling layer
│   ├── registrar.ts           # Browser softphone registration
│   ├── announcement.ts        # TTS announcement generator (espeak-ng / Kokoro)
│   ├── voicebox.ts            # Voicemail box management
│   └── call/
│       └── prompt-cache.ts    # Named audio prompt WAV management
│
├── ts_web/                    # Web frontend (Lit-based SPA)
│   ├── elements/              # Web components (9 dashboard views)
│   └── state/                 # App state, WebRTC client, notifications
│
├── rust/                      # Rust workspace (the data plane)
│   └── crates/
│       ├── codec-lib/         # Audio codec library (Opus/G.722/PCMU/PCMA)
│       ├── sip-proto/         # Zero-dependency SIP protocol library
│       └── proxy-engine/      # Main binary — SIP engine + mixer + RTP
│
├── html/                      # Static HTML shell
├── .nogit/                    # Secrets, config, TTS models (gitignored)
└── dist_rust/                 # Compiled Rust binary (gitignored)
🎧 Audio Engine (Rust)
The proxy-engine binary handles all real-time audio processing with a 48kHz f32 internal bus — encoding and decoding happen only at leg boundaries.
Supported Codecs
| Codec | PT | Native Rate | Use Case |
|---|---|---|---|
| Opus | 111 | 48 kHz | WebRTC browsers (native float encode/decode — zero i16 quantization) |
| G.722 | 9 | 16 kHz | HD SIP devices & providers |
| PCMU (G.711 µ-law) | 0 | 8 kHz | Legacy SIP |
| PCMA (G.711 A-law) | 8 | 8 kHz | Legacy SIP |
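To illustrate how the payload types in this table relate to SDP negotiation, here is a minimal sketch that picks the outbound codec from an SDP answer. It is not the engine's parser: `CODEC_TABLE` and `pickAnswerCodec` are hypothetical names, only the `m=audio` line is inspected, and it assumes the first supported payload type listed in the answer is the negotiated one. Opus uses a dynamic payload type, so 111 is simply the value from the table.

```typescript
interface CodecInfoSketch {
  codec: string;
  nativeRateHz: number;
}

// Payload types and native rates copied from the table above.
const CODEC_TABLE = new Map<number, CodecInfoSketch>([
  [111, { codec: "Opus", nativeRateHz: 48000 }],
  [9, { codec: "G.722", nativeRateHz: 16000 }],
  [0, { codec: "PCMU", nativeRateHz: 8000 }],
  [8, { codec: "PCMA", nativeRateHz: 8000 }],
]);

// Returns the first payload type of the answer's audio line that appears in
// CODEC_TABLE, or undefined if there is no audio line or no supported codec.
function pickAnswerCodec(sdpAnswer: string): number | undefined {
  for (const line of sdpAnswer.split(/\r?\n/)) {
    if (!line.startsWith("m=audio ")) continue;
    // Layout: m=audio <port> <proto> <fmt> <fmt> ...
    const formats = line.trim().split(/\s+/).slice(3);
    for (const fmt of formats) {
      if (!/^\d+$/.test(fmt)) continue;
      const pt = Number(fmt);
      // 101 (telephone-event) is skipped because it is not in the table.
      if (CODEC_TABLE.has(pt)) return pt;
    }
    return undefined;
  }
  return undefined;
}
```

For an answer containing `m=audio 20000 RTP/AVP 0 101`, the sketch selects payload type 0 (PCMU) even if G.722 was offered first.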
Audio Pipeline
Inbound: Wire RTP → Jitter Buffer → Decode → Resample to 48kHz → Denoise (RNNoise) → Mix Bus
Outbound: Mix Bus → Mix-Minus → Resample to codec rate → Encode → Wire RTP
- Adaptive jitter buffer — per-leg BTreeMap-based buffer keyed by RTP sequence number. Delivers exactly one frame per 20ms mixer tick in sequence order. Adaptive target depth starts at 3 frames (60ms) and adjusts between 2–6 frames based on observed network jitter. Handles hold/resume by detecting large forward sequence jumps and resetting cleanly.
- Packet loss concealment (PLC) — on missing packets, Opus legs invoke the decoder's built-in PLC (decode(None)) to synthesize a smooth fill frame. Non-Opus legs (G.722, PCMU) apply exponential fade (0.85×) toward silence to avoid hard discontinuities.
- FFT-based resampling via rubato — high-quality sinc interpolation with canonical 20ms chunk sizes to ensure consistent resampler state across frames, preventing filter discontinuities
- ML noise suppression via nnnoiseless (RNNoise) — per-leg inbound denoising with SIMD acceleration (AVX/SSE). Skipped for WebRTC legs (browsers already denoise via getUserMedia)
- Mix-minus mixing — each participant hears everyone except themselves, accumulated in f64 precision
- RFC 3550 compliant header parsing — properly handles CSRC lists and header extensions
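Mix-minus is easiest to see in code. The sketch below is a TypeScript illustration of the technique, not the Rust mixer itself: `mixMinusFrame` is a hypothetical name, and it assumes every leg delivers a frame of exactly `frameLen` samples. The full mix is accumulated in a `Float64Array` (f64), and each leg then receives that mix minus its own samples.

```typescript
// Mix-minus for one frame: each leg receives the sum of all *other* legs.
// Assumes every input frame holds exactly frameLen samples.
function mixMinusFrame(
  inputs: Map<string, Float32Array>,
  frameLen: number,
): Map<string, Float32Array> {
  // Accumulate the full mix in f64 to limit rounding error.
  const total = new Float64Array(frameLen);
  inputs.forEach((frame) => {
    for (let i = 0; i < frameLen; i++) total[i] += frame[i];
  });

  // Each leg's output is the full mix minus its own contribution.
  const outputs = new Map<string, Float32Array>();
  inputs.forEach((frame, legId) => {
    const out = new Float32Array(frameLen);
    for (let i = 0; i < frameLen; i++) out[i] = total[i] - frame[i];
    outputs.set(legId, out);
  });
  return outputs;
}
```

At 48kHz with a 20ms tick, `frameLen` would be 960 samples. The sketch does not clip or limit the result; a production mixer would need to handle overload.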
🗣️ Neural TTS
Announcements and voicemail greetings are synthesized using Kokoro TTS — an 82M parameter neural model running via ONNX Runtime directly in the Rust process:
- 24 kHz, 16-bit mono output
- 25+ voice presets — American/British, male/female (e.g., af_bella, am_adam, bf_emma, bm_george)
- ~800ms synthesis time for a 3-second phrase
- Lazy-loaded on first use — no startup cost if TTS is unused
- Falls back to espeak-ng if the ONNX model is not available
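The preset names follow a visible pattern: the first letter encodes the accent and the second the gender. As a small illustration, the following sketch decodes the four prefixes named above. `describeVoicePreset` is a hypothetical helper, and it deliberately rejects anything outside the American/British, female/male combinations listed here.

```typescript
interface VoicePresetInfo {
  accent: "American" | "British";
  gender: "female" | "male";
  voiceName: string;
}

// Decodes names such as "af_bella"; returns undefined for other patterns.
function describeVoicePreset(preset: string): VoicePresetInfo | undefined {
  const match = /^([ab])([fm])_([a-z]+)$/.exec(preset);
  if (match === null) return undefined;
  return {
    accent: match[1] === "a" ? "American" : "British",
    gender: match[2] === "f" ? "female" : "male",
    voiceName: match[3],
  };
}
```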
📧 Voicemail
- Configurable voicemail boxes with custom TTS greetings (text + voice) or uploaded WAV
- Automatic routing on no-answer timeout (configurable, default 25s)
- Recording with configurable max duration (default 120s) and message count limit (default 50)
- Unheard message tracking for MWI (message waiting indication)
- Web dashboard playback and management
- WAV storage in .nogit/voicemail/
🔢 IVR (Interactive Voice Response)
- DTMF-navigable menus with configurable entries
- Actions: route to extension, route to voicemail, transfer, submenu, hangup, repeat prompt
- Custom TTS prompts per menu
- Nested menu support
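A DTMF menu of this kind boils down to a lookup from digit to action, with submenus changing the current menu. The sketch below models that with the six actions listed above. All names (`IvrAction`, `IvrMenuSketch`, `stepIvr`) and field layouts are hypothetical and do not reflect the actual config schema; it also assumes that an unmapped digit simply replays the current prompt.

```typescript
type IvrAction =
  | { kind: "route-extension"; extension: string }
  | { kind: "route-voicemail"; boxId: string }
  | { kind: "transfer"; destination: string }
  | { kind: "submenu"; menuId: string }
  | { kind: "hangup" }
  | { kind: "repeat" };

interface IvrMenuSketch {
  promptText: string; // text for the TTS prompt of this menu
  entries: { [digit: string]: IvrAction | undefined };
}

type IvrStep =
  | { next: "menu"; menuId: string }     // play this menu's prompt, keep collecting
  | { next: "done"; action: IvrAction }; // leave the IVR with a final action

function stepIvr(
  menus: Map<string, IvrMenuSketch>,
  currentMenuId: string,
  digit: string,
): IvrStep {
  const menu = menus.get(currentMenuId);
  if (menu === undefined) throw new Error(`unknown menu "${currentMenuId}"`);
  const action = menu.entries[digit];
  // Assumption: an unmapped digit behaves like "repeat prompt".
  if (action === undefined || action.kind === "repeat") {
    return { next: "menu", menuId: currentMenuId };
  }
  if (action.kind === "submenu") {
    return { next: "menu", menuId: action.menuId };
  }
  return { next: "done", action };
}
```

Nested menus fall out naturally: a `submenu` action just names another entry in the same map.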
🌐 Web Dashboard & REST API
Dashboard Views
| View | Description |
|---|---|
| 📊 Overview | Stats tiles — uptime, providers, devices, active calls |
| 📞 Calls | Active calls with leg details, codec info, add/remove legs, transfer, hangup |
| ☎️ Phone | Browser softphone — mic/speaker selection, audio meters, dial pad, incoming call popup |
| 🔀 Routes | Routing rule management — match/action model with priority |
| 📧 Voicemail | Voicemail box management + message playback |
| 🔢 IVR | IVR menu builder — DTMF entries, TTS prompts, nested menus |
| 👤 Contacts | Contact management with click-to-call |
| 🔌 Providers | SIP trunk configuration and registration status |
| 📋 Log | Live streaming log viewer |
REST API
| Endpoint | Method | Description |
|---|---|---|
| /api/status | GET | Full system status (providers, devices, calls, history) |
| /api/call | POST | Originate a call |
| /api/hangup | POST | Hang up a call |
| /api/call/:id/addleg | POST | Add a device leg to an active call |
| /api/call/:id/addexternal | POST | Add an external participant via provider |
| /api/call/:id/removeleg | POST | Remove a leg from a call |
| /api/transfer | POST | Transfer a call |
| /api/config | GET | Read current configuration |
| /api/config | POST | Update configuration (hot-reload) |
| /api/voicemail/:box | GET | List voicemail messages |
| /api/voicemail/:box/unheard | GET | Get unheard message count |
| /api/voicemail/:box/:id/audio | GET | Stream voicemail audio |
| /api/voicemail/:box/:id/heard | POST | Mark a voicemail message as heard |
| /api/voicemail/:box/:id | DELETE | Delete a voicemail message |
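As a usage illustration, the voicemail endpoints can be wrapped in a few small helpers. This is a sketch, not an official client: `voicemailEndpoints`, `sendApiRequest`, and `FetchLike` are hypothetical names, request and response bodies are not documented here and are therefore left out, and the sketch assumes the API needs no authentication.

```typescript
interface ApiRequestSketch {
  method: "GET" | "POST" | "DELETE";
  path: string;
}

const enc = encodeURIComponent;

// Method + path pairs for the voicemail rows of the table above.
const voicemailEndpoints = {
  list: (box: string): ApiRequestSketch =>
    ({ method: "GET", path: `/api/voicemail/${enc(box)}` }),
  unheard: (box: string): ApiRequestSketch =>
    ({ method: "GET", path: `/api/voicemail/${enc(box)}/unheard` }),
  audio: (box: string, id: string): ApiRequestSketch =>
    ({ method: "GET", path: `/api/voicemail/${enc(box)}/${enc(id)}/audio` }),
  markHeard: (box: string, id: string): ApiRequestSketch =>
    ({ method: "POST", path: `/api/voicemail/${enc(box)}/${enc(id)}/heard` }),
  remove: (box: string, id: string): ApiRequestSketch =>
    ({ method: "DELETE", path: `/api/voicemail/${enc(box)}/${enc(id)}` }),
};

// Minimal structural type so the sketch does not depend on DOM typings.
type FetchLike = (
  url: string,
  init: { method: string },
) => Promise<{ ok: boolean; status: number }>;

// Sends the request and returns the HTTP status; throws on non-2xx replies.
async function sendApiRequest(
  baseUrl: string,
  req: ApiRequestSketch,
  fetchFn: FetchLike,
): Promise<number> {
  const url = baseUrl.replace(/\/+$/, "") + req.path;
  const res = await fetchFn(url, { method: req.method });
  if (!res.ok) {
    throw new Error(`${req.method} ${req.path} failed with HTTP ${res.status}`);
  }
  return res.status;
}
```

In Node 18+ or a browser, the global `fetch` can be passed as `fetchFn`, for example `sendApiRequest(base, voicemailEndpoints.markHeard("main", id), fetch)`.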
WebSocket Events
Connect to /ws for real-time push:
{ "type": "status", "data": { ... } } // Full status snapshot (1s interval)
{ "type": "log", "data": { "message": "..." } } // Log lines in real-time
{ "type": "call-update", "data": { ... } } // Call state change notification
{ "type": "webrtc-answer", "data": { ... } } // WebRTC SDP answer for browser calls
{ "type": "webrtc-error", "data": { ... } } // WebRTC signaling error
Browser → server signaling:
{ "type": "webrtc-offer", "data": { ... } } // Browser sends SDP offer
{ "type": "webrtc-accept", "data": { ... } } // Browser accepts incoming call
{ "type": "webrtc-ice", "data": { ... } } // ICE candidate exchange
{ "type": "webrtc-hangup", "data": { ... } } // Browser hangs up
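Every message shares the same `{ type, data }` envelope, so a client can validate the envelope before dispatching on `type`. The sketch below does this for the server push messages listed above. `parseServerPush` and the type names are hypothetical, and `data` stays `unknown` because the payload shapes are elided in this document.

```typescript
type ServerPushType =
  | "status"
  | "log"
  | "call-update"
  | "webrtc-answer"
  | "webrtc-error";

interface ServerPush {
  type: ServerPushType;
  data: unknown; // payload shape is not specified here
}

const SERVER_PUSH_TYPES: readonly string[] = [
  "status",
  "log",
  "call-update",
  "webrtc-answer",
  "webrtc-error",
];

// Parses one WebSocket text frame; returns undefined for anything that is
// not valid JSON or does not carry one of the known server push types.
function parseServerPush(raw: string): ServerPush | undefined {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return undefined;
  }
  if (typeof parsed !== "object" || parsed === null) return undefined;
  const msg = parsed as { type?: unknown; data?: unknown };
  if (typeof msg.type !== "string") return undefined;
  if (SERVER_PUSH_TYPES.indexOf(msg.type) < 0) return undefined;
  return { type: msg.type as ServerPushType, data: msg.data };
}
```

A client would call this in its `message` handler and switch on the returned `type`, ignoring frames for which it returns `undefined`.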
🔌 Ports
| Port | Protocol | Purpose |
|---|---|---|
| 5070 (configurable) | UDP | SIP signaling |
| 20000–20200 (configurable) | UDP | RTP media (even ports, per-call allocation) |
| 3060 (configurable) | TCP | Web dashboard + WebSocket + REST API |
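The "even ports" note reflects the common RTP convention of using an even port for media, with the next odd port traditionally reserved for RTCP. A pool with that behaviour can be sketched as follows. `EvenPortPool` is a hypothetical class, not the engine's RTP port pool, and it assumes the configured range is inclusive on both ends.

```typescript
// Hands out even ports from an inclusive [min, max] range (assumption).
class EvenPortPool {
  private readonly min: number;
  private readonly max: number;
  private readonly free: number[] = [];

  constructor(min: number, max: number) {
    this.min = min;
    this.max = max;
    const first = min % 2 === 0 ? min : min + 1;
    for (let port = first; port <= max; port += 2) this.free.push(port);
  }

  // Returns the lowest free port, or undefined when the pool is exhausted.
  allocate(): number | undefined {
    return this.free.shift();
  }

  // Returns a port to the pool; odd, out-of-range, or duplicate ports are ignored.
  release(port: number): void {
    const valid = port % 2 === 0 && port >= this.min && port <= this.max;
    if (valid && this.free.indexOf(port) < 0) this.free.push(port);
  }

  availableCount(): number {
    return this.free.length;
  }
}
```

Under the inclusive-range assumption, the default 20000–20200 range yields 101 even ports.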
🛠️ Development
# Start in dev mode
pnpm start
# Build Rust proxy-engine
pnpm run buildRust
# Bundle web frontend
pnpm run bundle
# Build + bundle + restart background server
pnpm run restartBackground
License and Legal Information
This repository contains open-source code licensed under the MIT License. A copy of the license can be found in the LICENSE file.
Please note: The MIT License does not grant permission to use the trade names, trademarks, service marks, or product names of the project, except as required for reasonable and customary use in describing the origin of the work and reproducing the content of the NOTICE file.
Trademarks
This project is owned and maintained by Task Venture Capital GmbH. The names and logos associated with Task Venture Capital GmbH and any related products or services are trademarks of Task Venture Capital GmbH or third parties, and are not included within the scope of the MIT license granted herein.
Use of these trademarks must comply with Task Venture Capital GmbH's Trademark Guidelines or the guidelines of the respective third-party owners, and any usage must be approved in writing. Third-party trademarks used herein are the property of their respective owners and used only in a descriptive manner, e.g. for an implementation of an API or similar.
Company Information
Task Venture Capital GmbH
Registered at District Court Bremen HRB 35230 HB, Germany
For any legal inquiries or further information, please contact us via email at hello@task.vc.
By using this repository, you acknowledge that you have read this section, agree to comply with its terms, and understand that the licensing of the code does not imply endorsement by Task Venture Capital GmbH of any derivative works.