A production-grade SIP B2BUA + WebRTC bridge built with TypeScript and Rust. Routes calls between SIP providers, SIP hardware devices, and browser softphones — with real-time codec transcoding, adaptive jitter buffering, ML noise suppression, neural TTS, voicemail, IVR menus, and a slick web dashboard.

Issue Reporting and Security

For reporting bugs, issues, or security vulnerabilities, please visit community.foss.global/. This is the central community hub for all issue reporting. Developers who sign and comply with our contribution agreement and go through identification can also get a code.foss.global/ account to submit Pull Requests directly.

🔥 What It Does

siprouter sits between your SIP trunk providers and your endpoints — hardware phones, ATAs, browser softphones — and handles everything in between:

📞 SIP B2BUA — Terminates and re-originates calls with full RFC 3261 dialog state management, digest auth, and SDP negotiation
🌐 WebRTC Bridge — Browser-based softphone with bidirectional Opus audio to the SIP network
🎛️ Multi-Provider Trunking — Register with multiple SIP providers simultaneously (sipgate, easybell, etc.) with automatic failover
🎧 48kHz f32 Audio Engine — High-fidelity internal audio bus at 48kHz/32-bit float with native Opus float encode/decode, FFT-based resampling, and per-leg ML noise suppression
🔀 N-Leg Mix-Minus Mixer — Conference-grade mixing with dynamic leg add/remove, transfer, and per-source audio separation
🎯 Adaptive Jitter Buffer — Per-leg jitter buffering with sequence-based reordering, adaptive depth (40–120ms, i.e. 2–6 frames of 20ms), Opus PLC for lost packets, and hold/resume detection
📧 Voicemail — Configurable voicemail boxes with TTS greetings, recording, and web playback
🔢 IVR Menus — DTMF-navigable interactive voice response with nested menus, routing actions, and custom prompts
🗣️ Neural TTS — Kokoro-powered greetings and IVR prompts with 25+ voice presets
🎙️ Call Recording — Per-source separated WAV recording at 48kHz via tool legs
🖥️ Web Dashboard — Real-time SPA with 9 views: overview, live calls, browser phone, routing, voicemail, IVR, contacts, providers, and streaming logs

🏗️ Architecture

flowchart TB
    Browser["🌐 Browser Softphone<br/>(WebRTC via WebSocket signaling)"]
    Devices["📞 SIP Devices<br/>(HT801, desk phones, ATAs)"]
    Trunks["☎️ SIP Trunk Providers<br/>(sipgate, easybell, …)"]

    subgraph Router["siprouter"]
        direction TB
        subgraph TS["TypeScript Control Plane"]
            TSBits["Config · WebRTC Signaling<br/>REST API · Web Dashboard<br/>Voicebox Manager · TTS Cache"]
        end
        subgraph Rust["Rust proxy-engine (data plane)"]
            RustBits["SIP Stack · Dialog SM · Auth<br/>Call Manager · N-Leg Mixer<br/>48kHz f32 Bus · Jitter Buffer<br/>Codec Engine · RTP Port Pool<br/>WebRTC Engine · Kokoro TTS<br/>Voicemail · IVR · Recording"]
        end
        TS <-->|"JSON-over-stdio IPC"| Rust
    end

    Browser <-->|"Opus / WebRTC"| TS
    Rust <-->|"SIP / RTP"| Devices
    Rust <-->|"SIP / RTP"| Trunks
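
The diagram labels the link between the two processes as "JSON-over-stdio IPC". As a rough illustration of how such a channel can be framed, here is a minimal sketch that assumes newline-delimited JSON (one message per line). The actual framing used by the bridge is not documented here, and `LineDecoder`, `encodeMessage`, and the message fields are hypothetical names.

```typescript
// Hypothetical sketch: newline-delimited JSON framing over a text stream.
// The real wire format of the proxy-engine bridge may differ.
type IpcMessage = { [key: string]: unknown };

class LineDecoder {
  private buffer = "";

  // Feed a chunk as read from the child's stdout; returns every complete message.
  push(chunk: string): IpcMessage[] {
    this.buffer += chunk;
    const messages: IpcMessage[] = [];
    let newline = this.buffer.indexOf("\n");
    while (newline !== -1) {
      const line = this.buffer.slice(0, newline).trim();
      this.buffer = this.buffer.slice(newline + 1);
      if (line.length > 0) {
        messages.push(JSON.parse(line) as IpcMessage);
      }
      newline = this.buffer.indexOf("\n");
    }
    return messages;
  }
}

// Encode one message for writing to the child's stdin.
function encodeMessage(message: IpcMessage): string {
  return JSON.stringify(message) + "\n";
}
```

Buffering partial lines matters because a pipe read can end in the middle of a message; the decoder only parses once a full line has arrived.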

🧠 Key Design Decisions

Hub Model — Every call is a hub with N legs. Each leg is a SipLeg (device/provider) or WebRtcLeg (browser). Legs can be dynamically added, removed, or transferred without tearing down the call.
Rust Data Plane — All SIP protocol handling, codec transcoding, mixing, and RTP I/O runs in native Rust for real-time performance. TypeScript handles config, signaling, REST API, and dashboard.
48kHz f32 Internal Bus — Audio is processed at maximum quality internally. Encoding/decoding to wire format (G.722, PCMU, PCMA, Opus) happens solely at the leg boundary.
Per-Session Codec Isolation — Each call leg gets its own encoder/decoder/resampler/denoiser state — no cross-call corruption.
SDP Codec Negotiation — Outbound encoding uses the codec actually negotiated in SDP answers, not just the first offered codec.
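
To make the last point concrete, here is a minimal sketch of picking the outbound codec from the SDP answer rather than from the offer. It assumes the payload types from the Supported Codecs table and only inspects the `m=audio` line; the function names are hypothetical and the real engine (written in Rust) may resolve codecs differently, for example via `a=rtpmap` attributes.

```typescript
// Static payload types from the Supported Codecs table. Opus (111) is a
// dynamic type, so a real implementation would confirm it via a=rtpmap.
const AUDIO_CODECS: { [pt: number]: string } = {
  0: "PCMU",
  8: "PCMA",
  9: "G.722",
  111: "Opus",
};

// Extract the payload types listed on the first m=audio line of an SDP body.
function audioPayloadTypes(sdp: string): number[] {
  const lines = sdp.split(/\r?\n/);
  for (const line of lines) {
    if (line.indexOf("m=audio ") === 0) {
      // Format: m=audio <port> <proto> <fmt> <fmt> ...
      return line
        .trim()
        .split(/\s+/)
        .slice(3)
        .map((fmt) => parseInt(fmt, 10))
        .filter((pt) => !isNaN(pt));
    }
  }
  return [];
}

// Pick the first audio codec the remote side listed in its answer,
// skipping non-audio formats such as telephone-event (101).
function negotiatedCodec(answerSdp: string): number | undefined {
  for (const pt of audioPayloadTypes(answerSdp)) {
    if (AUDIO_CODECS[pt] !== undefined) return pt;
  }
  return undefined;
}
```

With an offer of `9 0 8 101` and an answer whose media line reads `m=audio 40000 RTP/AVP 0 101`, this selects PCMU (0) even though G.722 (9) was offered first.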

📲 WebRTC Browser Call Flow

Browser calls are set up in a strict three-step dance — the WebRTC leg cannot be attached at call-creation time because the browser's session ID is only known once the SDP offer arrives:

sequenceDiagram
    participant B as Browser
    participant TS as TypeScript (sipproxy.ts)
    participant R as Rust proxy-engine
    participant P as SIP Provider

    B->>TS: POST /api/call
    TS->>R: make_call (pending call, no WebRTC leg yet)
    R-->>TS: call_created
    TS-->>B: webrtc-incoming (callId)

    B->>TS: webrtc-offer (sessionId, SDP)
    TS->>R: handle_webrtc_offer
    R-->>TS: webrtc-answer (SDP)
    TS-->>B: webrtc-answer
    Note over R: Standalone WebRTC session<br/>(not yet attached to call)

    B->>TS: webrtc_link (callId + sessionId)
    TS->>R: link session → call
    R->>R: wire WebRTC leg through mixer
    R->>P: SIP INVITE
    P-->>R: 200 OK + SDP
    R-->>TS: call_answered
    Note over B,P: Bidirectional Opus ↔ codec-transcoded<br/>audio flows through the mixer
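
The ordering constraint in the diagram can be sketched as a small piece of bookkeeping: a call and a WebRTC session are created independently and only joined by the link step. The class and method names below are hypothetical and do not mirror the real TypeScript or Rust code; the sketch only shows why linking has to wait for both earlier steps.

```typescript
// Hypothetical bookkeeping for the three steps in the diagram above.
// Class and method names are illustrative and do not mirror the real code.
class BrowserCallLinker {
  private pendingCalls = new Set<string>();       // step 1: call exists, no WebRTC leg
  private standaloneSessions = new Set<string>(); // step 2: offer answered, unattached
  private linked = new Map<string, string>();     // step 3: callId -> sessionId

  createCall(callId: string): void {
    this.pendingCalls.add(callId);
  }

  acceptOffer(sessionId: string): void {
    this.standaloneSessions.add(sessionId);
  }

  // Linking only succeeds once both earlier steps have happened.
  link(callId: string, sessionId: string): boolean {
    if (!this.pendingCalls.has(callId) || !this.standaloneSessions.has(sessionId)) {
      return false;
    }
    this.pendingCalls.delete(callId);
    this.standaloneSessions.delete(sessionId);
    this.linked.set(callId, sessionId);
    return true;
  }

  sessionFor(callId: string): string | undefined {
    return this.linked.get(callId);
  }
}
```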

🚀 Getting Started

Prerequisites

Node.js ≥ 20 with tsx globally available
pnpm for package management
Rust toolchain (for building the proxy engine)

Install & Build

# Install dependencies (run from the root of the cloned repository)
pnpm install

# Build the Rust proxy-engine binary
pnpm run buildRust

# Bundle the web frontend
pnpm run bundle

Configuration

Create .nogit/config.json (the // comments below annotate the example; strip them if the config loader expects strict JSON):

{
  "proxy": {
    "lanIp": "192.168.1.100",          // Your server's LAN IP
    "lanPort": 5070,                    // SIP signaling port
    "publicIpSeed": "stun.example.com", // STUN server for public IP discovery
    "rtpPortRange": { "min": 20000, "max": 20200 }, // RTP port pool (even ports)
    "webUiPort": 3060                   // Dashboard + REST API port
  },
  "providers": [
    {
      "id": "my-trunk",
      "displayName": "My SIP Provider",
      "domain": "sip.provider.com",
      "outboundProxy": { "address": "sip.provider.com", "port": 5060 },
      "username": "user",
      "password": "pass",
      "codecs": [9, 0, 8, 101],        // G.722, PCMU, PCMA, telephone-event
      "registerIntervalSec": 300
    }
  ],
  "devices": [
    {
      "id": "desk-phone",
      "displayName": "Desk Phone",
      "expectedAddress": "192.168.1.50",
      "extension": "100"
    }
  ],
  "routing": {
    "routes": [
      {
        "id": "inbound-main-did",
        "name": "Main DID",
        "priority": 200,
        "enabled": true,
        "match": {
          "direction": "inbound",
          "sourceProvider": "my-trunk",
          "numberPattern": "+49421219694"
        },
        "action": {
          "targets": ["desk-phone"],
          "ringBrowsers": true,
          "voicemailBox": "main"
        }
      },
      {
        "id": "inbound-support-did",
        "name": "Support DID",
        "priority": 190,
        "enabled": true,
        "match": {
          "direction": "inbound",
          "sourceProvider": "my-trunk",
          "numberPattern": "+49421219695"
        },
        "action": {
          "ivrMenuId": "support-menu"
        }
      },
      {
        "id": "outbound-default",
        "name": "Route via trunk",
        "priority": 100,
        "enabled": true,
        "match": { "direction": "outbound" },
        "action": { "provider": "my-trunk" }
      }
    ]
  },
  "voiceboxes": [
    {
      "id": "main",
      "enabled": true,
      "greetingText": "Please leave a message after the beep.",
      "greetingVoice": "af_bella",
      "noAnswerTimeoutSec": 25,
      "maxRecordingSec": 120,
      "maxMessages": 50
    }
  ],
  "contacts": [
    { "id": "1", "name": "Alice", "number": "+491234567890", "starred": true }
  ]
}

Inbound number ownership is explicit: add one inbound route per DID (or DID prefix) and scope it with sourceProvider when a provider delivers multiple external numbers.
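
As an illustration of the match/action model, the sketch below selects a route for a call from a list shaped like the `routing.routes` entries above. It rests on two assumptions that the configuration example does not confirm: that a higher `priority` value wins, and that `numberPattern` is matched as a literal prefix. The `CallInfo` type and both function names are hypothetical.

```typescript
type Direction = "inbound" | "outbound";

interface Route {
  id: string;
  priority: number;
  enabled: boolean;
  match: { direction: Direction; sourceProvider?: string; numberPattern?: string };
}

// Hypothetical description of the call being routed.
interface CallInfo {
  direction: Direction;
  provider?: string;
  number: string;
}

function routeMatches(route: Route, call: CallInfo): boolean {
  const m = route.match;
  if (!route.enabled || m.direction !== call.direction) return false;
  if (m.sourceProvider !== undefined && m.sourceProvider !== call.provider) return false;
  // Assumption: the pattern is a literal prefix of the dialed or called number.
  if (m.numberPattern !== undefined && call.number.indexOf(m.numberPattern) !== 0) return false;
  return true;
}

// Assumption: among matching routes, the highest priority value wins.
function selectRoute(routes: Route[], call: CallInfo): Route | undefined {
  const candidates = routes.filter((r) => routeMatches(r, call));
  candidates.sort((a, b) => b.priority - a.priority);
  return candidates[0];
}
```

Scoping inbound routes with `sourceProvider` keeps a DID delivered by one trunk from accidentally matching a route meant for another.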

TTS Setup (Optional)

For neural voicemail greetings and IVR prompts, download the Kokoro TTS model:

mkdir -p .nogit/tts
curl -L -o .nogit/tts/kokoro-v1.0.onnx \
  https://github.com/mzdk100/kokoro/releases/download/V1.0/kokoro-v1.0.onnx
curl -L -o .nogit/tts/voices.bin \
  https://github.com/mzdk100/kokoro/releases/download/V1.0/voices.bin

Without the model files, TTS prompts (IVR menus, voicemail greetings) are skipped — everything else works fine.

Run

pnpm start

The SIP proxy starts on the configured port and the web dashboard is available at http://<your-ip>:3060 (or https://<your-ip>:3060 once the certificate files described in the next section are in place).

HTTPS (Optional)

Place cert.pem and key.pem in .nogit/ for TLS on the dashboard.

📂 Project Structure

siprouter/
├── ts/                            # TypeScript control plane
│   ├── sipproxy.ts                # Main entry — bootstraps everything
│   ├── config.ts                  # Config loader & validation
│   ├── proxybridge.ts             # Rust proxy-engine IPC bridge (smartrust)
│   ├── frontend.ts                # Web dashboard HTTP/WS server + REST API
│   ├── webrtcbridge.ts            # WebRTC signaling layer
│   ├── registrar.ts               # Browser softphone registration
│   ├── voicebox.ts                # Voicemail box management
│   └── call/
│       └── prompt-cache.ts        # Named audio prompt WAV management
│
├── ts_web/                        # Web frontend (Lit-based SPA)
│   ├── elements/                  # Web components (9 dashboard views)
│   └── state/                     # App state, WebRTC client, notifications
│
├── rust/                          # Rust workspace (the data plane)
│   └── crates/
│       ├── codec-lib/             # Audio codec library (Opus/G.722/PCMU/PCMA)
│       ├── sip-proto/             # Zero-dependency SIP protocol library
│       └── proxy-engine/          # Main binary — SIP engine + mixer + RTP
│
├── html/                          # Static HTML shell
├── .nogit/                        # Secrets, config, TTS models (gitignored)
└── dist_rust/                     # Compiled Rust binary (gitignored)

🎧 Audio Engine (Rust)

The proxy-engine binary handles all real-time audio processing on a 48kHz f32 internal bus; encoding and decoding happen only at leg boundaries.

Supported Codecs

Codec	PT	Native Rate	Use Case
Opus	111	48 kHz	WebRTC browsers (native float encode/decode — zero i16 quantization)
G.722	9	16 kHz	HD SIP devices & providers
PCMU (G.711 µ-law)	0	8 kHz	Legacy SIP
PCMA (G.711 A-law)	8	8 kHz	Legacy SIP
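
To show what "encoding at the leg boundary" means for the simplest codec in the table, here is a sketch of converting one f32 bus sample to a G.711 µ-law (PCMU) byte using the standard segment-based algorithm. It is written in TypeScript for illustration only; the real codec library is Rust, and the sketch skips the 48 kHz to 8 kHz resampling that would precede it. The function names are hypothetical.

```typescript
const MULAW_BIAS = 0x84;  // 132, added before the segment search
const MULAW_CLIP = 32635; // largest magnitude that still fits after adding the bias

// Encode one signed 16-bit PCM sample as a G.711 mu-law byte.
function linearToMulaw(pcm: number): number {
  const sign = pcm < 0 ? 0x80 : 0x00;
  const magnitude = Math.min(Math.abs(pcm), MULAW_CLIP) + MULAW_BIAS;

  // Find the segment (exponent): position of the highest set bit, from bit 14 down.
  let exponent = 7;
  let mask = 0x4000;
  while (exponent > 0 && (magnitude & mask) === 0) {
    exponent--;
    mask >>= 1;
  }

  const mantissa = (magnitude >> (exponent + 3)) & 0x0f;
  // Mu-law bytes are transmitted bit-inverted.
  return ~(sign | (exponent << 4) | mantissa) & 0xff;
}

// Convert a float sample in [-1, 1] from the internal bus to a mu-law byte.
function floatToMulaw(sample: number): number {
  const clamped = Math.max(-1, Math.min(1, sample));
  return linearToMulaw(Math.round(clamped * 32767));
}
```

Digital silence maps to `0xFF`, which is why an idle PCMU stream is a run of `0xFF` bytes rather than zeros.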

Audio Pipeline

flowchart LR
    subgraph Inbound["Inbound path (per leg)"]
        direction LR
        IN_RTP["Wire RTP"] --> IN_JB["Jitter Buffer"] --> IN_DEC["Decode"] --> IN_RS["Resample → 48 kHz"] --> IN_DN["Denoise (RNNoise)"] --> IN_BUS["Mix Bus"]
    end

    subgraph Outbound["Outbound path (per leg)"]
        direction LR
        OUT_BUS["Mix Bus"] --> OUT_MM["Mix-Minus"] --> OUT_RS["Resample → codec rate"] --> OUT_ENC["Encode"] --> OUT_RTP["Wire RTP"]
    end
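
The Mix-Minus stage in the outbound path can be sketched as follows: sum all legs once, then give each leg the total minus its own contribution. This is a TypeScript illustration of the technique only; the actual mixer is Rust, and the function name and the use of a `Map` keyed by leg id are assumptions. All input frames are assumed to contain exactly `frameLen` samples.

```typescript
// Mix-minus for one frame: every leg receives the sum of all other legs.
function mixMinus(
  inputs: Map<string, Float32Array>,
  frameLen: number
): Map<string, Float32Array> {
  // Accumulate the full mix once in 64-bit floats.
  const total = new Float64Array(frameLen);
  inputs.forEach((frame) => {
    for (let i = 0; i < frameLen; i++) total[i] += frame[i];
  });

  const outputs = new Map<string, Float32Array>();
  inputs.forEach((frame, legId) => {
    const out = new Float32Array(frameLen);
    for (let i = 0; i < frameLen; i++) {
      // Subtract the leg's own contribution, then clamp to the [-1, 1] sample range.
      const sample = total[i] - frame[i];
      out[i] = Math.max(-1, Math.min(1, sample));
    }
    outputs.set(legId, out);
  });
  return outputs;
}
```

Summing once and subtracting per leg keeps the cost linear in the number of legs, instead of re-summing the other legs for every participant.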

Adaptive jitter buffer — per-leg BTreeMap-based buffer keyed by RTP sequence number. Delivers exactly one frame per 20ms mixer tick in sequence order. Adaptive target depth starts at 3 frames (60ms) and adjusts between 2–6 frames based on observed network jitter. Handles hold/resume by detecting large forward sequence jumps and resetting cleanly.
Packet loss concealment (PLC) — on missing packets, Opus legs invoke the decoder's built-in PLC (decode(None)) to synthesize a smooth fill frame. Non-Opus legs (G.722, PCMU, PCMA) apply exponential fade (0.85×) toward silence to avoid hard discontinuities.
FFT-based resampling via rubato — high-quality sinc interpolation with canonical 20ms chunk sizes to ensure consistent resampler state across frames, preventing filter discontinuities
ML noise suppression via nnnoiseless (RNNoise) — per-leg inbound denoising with SIMD acceleration (AVX/SSE). Skipped for WebRTC legs (browsers already denoise via getUserMedia)
Mix-minus mixing — each participant hears everyone except themselves, accumulated in f64 precision
RFC 3550 compliant header parsing — properly handles CSRC lists and header extensions
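
The jitter buffer behaviour described in the first item can be sketched as below. This is a simplified TypeScript illustration, not the Rust implementation: it uses a fixed target depth instead of adapting between 2 and 6 frames, ignores 16-bit sequence wraparound, and the class name, the `maxForwardJump` threshold, and its default of 1000 are assumptions. A `null` result from `pop()` is the point where a caller would run packet loss concealment.

```typescript
class JitterBuffer {
  private frames = new Map<number, Uint8Array>();
  private nextSeq: number | null = null;
  private primed = false;
  private targetDepth: number;
  private maxForwardJump: number;

  constructor(targetDepth: number = 3, maxForwardJump: number = 1000) {
    this.targetDepth = targetDepth;
    this.maxForwardJump = maxForwardJump;
  }

  push(seq: number, payload: Uint8Array): void {
    // A large forward jump is treated as hold/resume: start over cleanly.
    if (this.nextSeq !== null && seq - this.nextSeq > this.maxForwardJump) {
      this.frames.clear();
      this.nextSeq = null;
      this.primed = false;
    }
    // While still filling, the earliest sequence seen becomes the start point.
    if (this.nextSeq === null || (!this.primed && seq < this.nextSeq)) {
      this.nextSeq = seq;
    }
    if (seq < this.nextSeq) return; // arrived too late to be played
    this.frames.set(seq, payload);
  }

  // Called once per 20 ms mixer tick; returns at most one frame, in sequence order.
  pop(): Uint8Array | null {
    if (this.nextSeq === null) return null;
    if (!this.primed) {
      if (this.frames.size < this.targetDepth) return null; // still filling
      this.primed = true;
    }
    const frame = this.frames.get(this.nextSeq);
    if (frame === undefined) {
      // Later frames are queued, so this one is lost: skip it.
      // With an empty buffer (underrun) the position is kept instead.
      if (this.frames.size > 0) this.nextSeq++;
      return null;
    }
    this.frames.delete(this.nextSeq);
    this.nextSeq++;
    return frame;
  }
}
```

Packets pushed as 10, 12, 11 come back out as 10, 11, 12, one per tick, once the target depth has been reached.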

🗣️ Neural TTS

Voicemail greetings and IVR prompts are synthesized using Kokoro TTS, an 82M-parameter neural model running via ONNX Runtime directly in the Rust process:

24 kHz, 16-bit mono output
25+ voice presets — American/British, male/female (e.g., af_bella, am_adam, bf_emma, bm_george)
~800ms synthesis time for a 3-second phrase
Lazy-loaded on first use — no startup cost if TTS is unused

📧 Voicemail

Configurable voicemail boxes with custom TTS greetings (text + voice) or uploaded WAV
Automatic routing on no-answer timeout (configurable, default 25s)
Recording with configurable max duration (default 120s) and message count limit (default 50)
Unheard message tracking for MWI (message waiting indication)
Web dashboard playback and management
WAV storage in .nogit/voicemail/

🔢 IVR (Interactive Voice Response)

DTMF-navigable menus with configurable entries
Actions: route to extension, route to voicemail, transfer, submenu, hangup, repeat prompt
Custom TTS prompts per menu
Nested menu support
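
A minimal sketch of how DTMF navigation through nested menus could be modelled, based on the action list above. The type and field names (`IvrAction`, `entries`, `promptText`, and so on) are hypothetical and are not taken from the project's configuration schema; unknown digits are assumed to be ignored.

```typescript
// Hypothetical action shapes covering the actions listed above.
type IvrAction =
  | { type: "route-extension"; extension: string }
  | { type: "route-voicemail"; boxId: string }
  | { type: "transfer"; number: string }
  | { type: "submenu"; menuId: string }
  | { type: "repeat" }
  | { type: "hangup" };

interface IvrMenu {
  id: string;
  promptText: string;
  entries: { [digit: string]: IvrAction };
}

interface IvrResult {
  menuId: string;     // menu the caller is in when navigation stops
  action?: IvrAction; // terminal action, if one was reached
}

// Walk a sequence of DTMF digits starting from the root menu.
function navigate(menus: { [id: string]: IvrMenu }, rootId: string, digits: string): IvrResult {
  let menuId = rootId;
  for (const digit of digits.split("")) {
    const menu = menus[menuId];
    const action = menu ? menu.entries[digit] : undefined;
    // Unknown digits and "repeat" keep the caller in the current menu.
    if (action === undefined || action.type === "repeat") continue;
    if (action.type === "submenu") {
      menuId = action.menuId;
      continue;
    }
    return { menuId, action };
  }
  return { menuId };
}
```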

🌐 Web Dashboard & REST API

Dashboard Views

View	Description
📊 Overview	Stats tiles — uptime, providers, devices, active calls
📞 Calls	Active calls with leg details, codec info, add/remove legs, transfer, hangup
☎️ Phone	Browser softphone — mic/speaker selection, audio meters, dial pad, incoming call popup
🔀 Routes	Routing rule management — match/action model with priority
📧 Voicemail	Voicemail box management + message playback
🔢 IVR	IVR menu builder — DTMF entries, TTS prompts, nested menus
👤 Contacts	Contact management with click-to-call
🔌 Providers	SIP trunk configuration and registration status
📋 Log	Live streaming log viewer

REST API

Endpoint	Method	Description
`/api/status`	GET	Full system status (providers, devices, calls, history)
`/api/call`	POST	Originate a call
`/api/hangup`	POST	Hang up a call
`/api/call/:id/addleg`	POST	Add a device leg to an active call
`/api/call/:id/addexternal`	POST	Add an external participant via provider
`/api/call/:id/removeleg`	POST	Remove a leg from a call
`/api/transfer`	POST	Transfer a call
`/api/config`	GET	Read current configuration
`/api/config`	POST	Update configuration (hot-reload)
`/api/voicemail/:box`	GET	List voicemail messages
`/api/voicemail/:box/unheard`	GET	Get unheard message count
`/api/voicemail/:box/:id/audio`	GET	Stream voicemail audio
`/api/voicemail/:box/:id/heard`	POST	Mark a voicemail message as heard
`/api/voicemail/:box/:id`	DELETE	Delete a voicemail message
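
As a usage illustration for the voicemail endpoints in the table, the sketch below builds the method and path for each operation and sends it through an injected fetch-like function. Only the paths and methods come from the table; the helper names, the `HttpFn` type, and the absence of request bodies or authentication are assumptions.

```typescript
type VoicemailOp = "list" | "unheard" | "audio" | "heard" | "delete";

interface RequestSpec {
  method: "GET" | "POST" | "DELETE";
  path: string;
}

// Map an operation to the method and path listed in the REST API table.
function voicemailRequest(op: VoicemailOp, box: string, messageId?: string): RequestSpec {
  const base = "/api/voicemail/" + encodeURIComponent(box);
  if (op === "list") return { method: "GET", path: base };
  if (op === "unheard") return { method: "GET", path: base + "/unheard" };
  if (messageId === undefined) throw new Error(op + " requires a message id");
  const item = base + "/" + encodeURIComponent(messageId);
  if (op === "audio") return { method: "GET", path: item + "/audio" };
  if (op === "heard") return { method: "POST", path: item + "/heard" };
  return { method: "DELETE", path: item };
}

// Minimal fetch-like signature so the sketch needs no DOM or Node typings.
type HttpFn = (
  url: string,
  init: { method: string }
) => Promise<{ ok: boolean; status: number }>;

// Send a request; resolves to true when the server reports success.
function send(baseUrl: string, spec: RequestSpec, http: HttpFn): Promise<boolean> {
  return http(baseUrl + spec.path, { method: spec.method }).then((res) => res.ok);
}
```

In an environment that provides a global `fetch`, it could likely be passed as the `http` argument, for example `send("http://192.168.1.100:3060", voicemailRequest("heard", "main", "42"), fetch)`, where the message id `42` is made up.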

WebSocket Events

Connect to /ws for real-time push:

{ "type": "status", "data": { ... } }           // Full status snapshot (1s interval)
{ "type": "log", "data": { "message": "..." } } // Log lines in real-time
{ "type": "call-update", "data": { ... } }      // Call state change notification
{ "type": "webrtc-answer", "data": { ... } }    // WebRTC SDP answer for browser calls
{ "type": "webrtc-error", "data": { ... } }     // WebRTC signaling error

Browser → server signaling:

{ "type": "webrtc-offer", "data": { ... } }     // Browser sends SDP offer
{ "type": "webrtc-accept", "data": { ... } }    // Browser accepts incoming call
{ "type": "webrtc-ice", "data": { ... } }       // ICE candidate exchange
{ "type": "webrtc-hangup", "data": { ... } }    // Browser hangs up
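
The two message lists above share a `type` plus `data` envelope, so a client can validate and dispatch on `type`. The sketch below parses server events and encodes browser-to-server signals. The `type` strings are taken from the lists above; the function names are hypothetical and the contents of `data` are left opaque because their shape is not specified here.

```typescript
type ServerEventType = "status" | "log" | "call-update" | "webrtc-answer" | "webrtc-error";
type ClientSignalType = "webrtc-offer" | "webrtc-accept" | "webrtc-ice" | "webrtc-hangup";

interface ServerEvent {
  type: ServerEventType;
  data: unknown;
}

const SERVER_EVENT_TYPES: ServerEventType[] = [
  "status",
  "log",
  "call-update",
  "webrtc-answer",
  "webrtc-error",
];

// Parse a raw WebSocket text frame; returns null for anything unrecognized.
function parseServerEvent(raw: string): ServerEvent | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch (err) {
    return null;
  }
  if (typeof parsed !== "object" || parsed === null) return null;
  const candidate = parsed as { type?: unknown; data?: unknown };
  for (const known of SERVER_EVENT_TYPES) {
    if (candidate.type === known) return { type: known, data: candidate.data };
  }
  return null;
}

// Encode a browser-to-server signaling message in the same envelope.
function encodeSignal(type: ClientSignalType, data: unknown): string {
  return JSON.stringify({ type, data });
}
```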

🔌 Ports

Port	Protocol	Purpose
5070 (configurable)	UDP	SIP signaling
20000–20200 (configurable)	UDP	RTP media (even ports, per-call allocation)
3060 (configurable)	TCP	Web dashboard + WebSocket + REST API
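
The "even ports, per-call allocation" note for the RTP range can be illustrated with a small pool. This is a sketch under assumptions: the range is treated as inclusive at both ends, ports are handed out in ascending order, and the class name is hypothetical. The actual pool lives in the Rust engine and may behave differently.

```typescript
class RtpPortPool {
  private free: number[] = [];
  private inUse = new Set<number>();

  constructor(min: number, max: number) {
    // Start at the first even port in the range and step by two.
    const first = min % 2 === 0 ? min : min + 1;
    for (let port = first; port <= max; port += 2) {
      this.free.push(port);
    }
  }

  // Hand out the next free even port, or undefined when the pool is exhausted.
  allocate(): number | undefined {
    const port = this.free.shift();
    if (port !== undefined) this.inUse.add(port);
    return port;
  }

  // Return a port to the pool; ports that were never allocated are ignored.
  release(port: number): void {
    if (this.inUse.delete(port)) this.free.push(port);
  }

  available(): number {
    return this.free.length;
  }
}
```

Under these assumptions the default range 20000 to 20200 yields 101 even ports, which bounds the number of RTP legs that can be active at once.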

🛠️ Development

# Start in dev mode
pnpm start

# Build Rust proxy-engine
pnpm run buildRust

# Bundle web frontend
pnpm run bundle

# Build + bundle + restart background server
pnpm run restartBackground

License and Legal Information

This repository contains open-source code licensed under the MIT License. A copy of the license can be found in the LICENSE file.

Please note: The MIT License does not grant permission to use the trade names, trademarks, service marks, or product names of the project, except as required for reasonable and customary use in describing the origin of the work and reproducing the content of the NOTICE file.

Trademarks

This project is owned and maintained by Task Venture Capital GmbH. The names and logos associated with Task Venture Capital GmbH and any related products or services are trademarks of Task Venture Capital GmbH or third parties, and are not included within the scope of the MIT license granted herein.

Use of these trademarks must comply with Task Venture Capital GmbH's Trademark Guidelines or the guidelines of the respective third-party owners, and any usage must be approved in writing. Third-party trademarks used herein are the property of their respective owners and used only in a descriptive manner, e.g. for an implementation of an API or similar.


@serve.zone/siprouter