Compare commits

24 Commits

Author SHA1 Message Date
982f648928 v4.6.1 2026-03-16 22:46:51 +00:00
3a2a060a85 fix(remoteingress-core): avoid spurious tunnel disconnect events and increase control channel capacity 2026-03-16 22:46:51 +00:00
e0c469147e v4.6.0 2026-03-16 19:37:06 +00:00
0fdcdf566e feat(remoteingress-core): add adaptive per-stream flow control based on active stream counts 2026-03-16 19:37:06 +00:00
a808d4c9de v4.5.12 2026-03-16 17:39:25 +00:00
f8a0171ef3 fix(remoteingress-core): improve tunnel liveness handling and enable TCP keepalive for accepted client sockets 2026-03-16 17:39:25 +00:00
1d59a48648 v4.5.11 2026-03-16 13:55:02 +00:00
af2ec11a2d fix(repo): no changes to commit 2026-03-16 13:55:02 +00:00
b6e66a7fa6 v4.5.10 2026-03-16 13:48:35 +00:00
1391b39601 fix(remoteingress-core): guard zero-window reads to avoid false EOF handling on stalled streams 2026-03-16 13:48:35 +00:00
e813c2f044 v4.5.9 2026-03-16 11:29:38 +00:00
0b8c1f0b57 fix(remoteingress-core): delay stream close until downstream response draining finishes to prevent truncated transfers 2026-03-16 11:29:38 +00:00
a63dbf2502 v4.5.8 2026-03-16 10:51:59 +00:00
4b95a3c999 fix(remoteingress-core): ensure upstream writes cancel promptly and reliably deliver CLOSE_BACK frames 2026-03-16 10:51:59 +00:00
51ab32f6c3 v4.5.7 2026-03-16 09:44:31 +00:00
ed52520d50 fix(remoteingress-core): improve tunnel reconnect and frame write efficiency 2026-03-16 09:44:31 +00:00
a08011d2da v4.5.6 2026-03-16 09:36:03 +00:00
679b247c8a fix(remoteingress-core): disable Nagle's algorithm on edge, hub, and upstream TCP sockets to reduce control-frame latency 2026-03-16 09:36:03 +00:00
32f9845495 v4.5.5 2026-03-16 09:02:02 +00:00
c0e1daa0e4 fix(remoteingress-core): wait for hub-to-client draining before cleanup and reliably send close frames 2026-03-16 09:02:02 +00:00
fd511c8a5c v4.5.4 2026-03-15 21:06:44 +00:00
c490e35a8f fix(remoteingress-core): preserve stream close ordering and add flow-control stall timeouts 2026-03-15 21:06:44 +00:00
579e553da0 v4.5.3 2026-03-15 19:26:39 +00:00
a8ee0b33d7 fix(remoteingress-core): prioritize control frames over data in edge and hub tunnel writers 2026-03-15 19:26:39 +00:00
8 changed files with 310 additions and 67 deletions

View File

@@ -1,5 +1,81 @@
# Changelog
## 2026-03-16 - 4.6.1 - fix(remoteingress-core)
avoid spurious tunnel disconnect events and increase control channel capacity
- Emit TunnelDisconnected only after an established connection is actually lost, preventing false disconnect events during failed reconnect attempts.
- Increase edge and hub control-channel buffer sizes from 64 to 256 to better prioritize control frames under load.
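
The first bullet can be illustrated with a small model of the reconnect loop. This is only a sketch: the `Attempt` enum and `disconnect_events` function are hypothetical names used for illustration, and the real loop tracks a `was_connected` flag per connection cycle rather than a list of outcomes.

```rust
/// Hypothetical outcome of one pass through the reconnect loop.
#[derive(Clone, Copy)]
enum Attempt {
    /// TCP/TLS/handshake failed, so the tunnel was never established.
    FailedBeforeHandshake,
    /// The tunnel was established and later dropped.
    ConnectedThenLost,
}

/// Counts how many disconnect events would be emitted for a series of
/// attempts when the event is gated on `was_connected`.
fn disconnect_events(attempts: &[Attempt]) -> usize {
    let mut events = 0;
    for attempt in attempts {
        let was_connected = matches!(attempt, Attempt::ConnectedThenLost);
        if was_connected {
            events += 1;
        }
    }
    events
}
```

With one real disconnect followed by two failed reconnects, this model reports a single event instead of three.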
## 2026-03-16 - 4.6.0 - feat(remoteingress-core)
add adaptive per-stream flow control based on active stream counts
- Track active stream counts on edge and hub connections to size per-stream flow control windows dynamically.
- Cap WINDOW_UPDATE increments and read sizes to the adaptive window so bandwidth is shared more evenly across concurrent streams.
- Apply the adaptive logic to both upload and download paths on edge and hub stream handlers.
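
A rough sketch of how such an adaptive window could work is given here. The body of `compute_window_for_stream_count` is not part of this diff, so the budget, the clamp bounds, and the division rule are assumptions made for illustration only. The `next_window_update` helper is a hypothetical wrapper around the threshold and capping arithmetic that does appear in the stream handlers.

```rust
// Assumed values; the real constants live in `remoteingress_protocol`
// and are not shown in this diff.
const TOTAL_WINDOW_BUDGET: u32 = 32 * 1024 * 1024;
const MIN_STREAM_WINDOW: u32 = 64 * 1024;
const MAX_STREAM_WINDOW: u32 = 4 * 1024 * 1024;

/// Assumed shape: split a fixed budget across active streams, then clamp.
fn compute_window_for_stream_count(active_streams: u32) -> u32 {
    let per_stream = TOTAL_WINDOW_BUDGET / active_streams.max(1);
    per_stream.clamp(MIN_STREAM_WINDOW, MAX_STREAM_WINDOW)
}

/// Returns `Some((increment, carried_over))` once at least half the adaptive
/// window has been consumed, capping the increment to the adaptive window.
fn next_window_update(consumed_since_update: u32, adaptive_window: u32) -> Option<(u32, u32)> {
    let threshold = adaptive_window / 2;
    if consumed_since_update < threshold {
        return None;
    }
    let increment = consumed_since_update.min(adaptive_window);
    Some((increment, consumed_since_update - increment))
}
```

Because the increment is capped, any consumption beyond the adaptive window is carried over to the next update rather than granted at once.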
## 2026-03-16 - 4.5.12 - fix(remoteingress-core)
improve tunnel liveness handling and enable TCP keepalive for accepted client sockets
- Avoid disconnecting edges when PING or PONG frames cannot be queued because the control channel is temporarily full.
- Enable TCP_NODELAY and TCP keepalive on accepted client connections to help detect stale or dropped clients.
## 2026-03-16 - 4.5.11 - fix(repo)
no changes to commit
## 2026-03-16 - 4.5.10 - fix(remoteingress-core)
guard zero-window reads to avoid false EOF handling on stalled streams
- Prevent upload and download loops from calling read on an empty buffer when flow-control window remains at 0 after stall timeout
- Log a warning and close the affected stream instead of misinterpreting Ok(0) as end-of-file
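
The false EOF can be reproduced with the standard library alone. The sketch below uses the synchronous `std::io::Read` trait rather than the async reader in the tunnel code, and the helper names `read_capped` and `read_guarded` are hypothetical.

```rust
use std::io::Read;

/// Unguarded read: caps the read size to the flow-control window.
/// With `window == 0` the slice is empty and `read` returns `Ok(0)`.
fn read_capped(src: &mut impl Read, buf: &mut [u8], window: usize) -> std::io::Result<usize> {
    let max_read = window.min(buf.len());
    src.read(&mut buf[..max_read])
}

/// Guarded read: `Ok(None)` means "window closed", which stays distinct
/// from `Ok(Some(0))`, the genuine end-of-file case.
fn read_guarded(
    src: &mut impl Read,
    buf: &mut [u8],
    window: usize,
) -> std::io::Result<Option<usize>> {
    if window == 0 {
        return Ok(None);
    }
    read_capped(src, buf, window).map(Some)
}
```

The unguarded variant returns 0 even though the source still has data pending, which is exactly the value a caller would normally treat as EOF.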
## 2026-03-16 - 4.5.9 - fix(remoteingress-core)
delay stream close until downstream response draining finishes to prevent truncated transfers
- Waits for the hub-to-client download task to finish before sending the stream CLOSE frame
- Prevents upstream reads from being cancelled mid-response during asymmetric transfers such as git fetch
- Retains the existing timeout so stalled downloads still clean up safely
## 2026-03-16 - 4.5.8 - fix(remoteingress-core)
ensure upstream writes cancel promptly and reliably deliver CLOSE_BACK frames
- listen for stream cancellation while waiting on upstream write timeouts so FRAME_CLOSE does not block for up to 60 seconds
- replace try_send with send().await when emitting CLOSE_BACK frames to avoid silently dropping close notifications when the data channel is full
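
The difference between the two send styles can be shown with a bounded channel from the standard library. This is a synchronous analogue, not the tokio code from the diff: `sync_channel` stands in for the bounded tokio channel, a blocking `send` stands in for `send().await`, and both function names are hypothetical.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};
use std::thread;

/// Fills a one-slot channel, then tries to enqueue a close marker with
/// `try_send`. Returns true if the close marker was rejected.
fn close_dropped_with_try_send() -> bool {
    let (tx, _rx) = sync_channel::<&'static str>(1);
    tx.try_send("DATA").unwrap();
    matches!(tx.try_send("CLOSE_BACK"), Err(TrySendError::Full(_)))
}

/// Same channel size, but with a blocking `send`: the sender waits for the
/// consumer to make room, so the close marker is delivered in order.
fn frames_delivered_with_blocking_send() -> Vec<&'static str> {
    let (tx, rx) = sync_channel::<&'static str>(1);
    let consumer = thread::spawn(move || rx.iter().collect::<Vec<_>>());
    tx.send("DATA").unwrap();
    tx.send("CLOSE_BACK").unwrap();
    drop(tx); // closing the channel ends the consumer's iterator
    consumer.join().unwrap()
}
```

If the `try_send` result is ignored, as with `let _ = ...`, the rejected close notification is lost without any trace.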
## 2026-03-16 - 4.5.7 - fix(remoteingress-core)
improve tunnel reconnect and frame write efficiency
- Reuse the TLS connector across edge reconnections to preserve session resumption state and reduce reconnect latency.
- Buffer hub and edge frame writes to coalesce small control and data frames into fewer TLS records and syscalls while still flushing each frame promptly.
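
The effect of buffering with a flush per frame can be sketched with `std::io::BufWriter`. This uses the synchronous standard-library writer instead of `tokio::io::BufWriter`, and `CountingSink` is a hypothetical test double that stands in for the TLS write half.

```rust
use std::io::{self, BufWriter, Write};

/// Test double that records how many write calls reach the underlying sink.
struct CountingSink {
    writes: usize,
    bytes: Vec<u8>,
}

impl Write for CountingSink {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        self.writes += 1;
        self.bytes.extend_from_slice(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

/// Writes a header and a payload as two separate calls, then flushes once.
/// Returns how many writes reached the sink and how many bytes it received.
fn write_frame_buffered(header: &[u8], payload: &[u8]) -> io::Result<(usize, usize)> {
    let sink = CountingSink { writes: 0, bytes: Vec::new() };
    let mut writer = BufWriter::with_capacity(65536, sink);
    writer.write_all(header)?;
    writer.write_all(payload)?;
    writer.flush()?; // one flush per frame keeps latency low
    let sink = writer.get_ref();
    Ok((sink.writes, sink.bytes.len()))
}
```

Two small writes reach the sink as a single write, which is the coalescing the changelog entry refers to.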
## 2026-03-16 - 4.5.6 - fix(remoteingress-core)
disable Nagle's algorithm on edge, hub, and upstream TCP sockets to reduce control-frame latency
- Enable TCP_NODELAY on the edge connection to the hub for faster PING/PONG and WINDOW_UPDATE delivery
- Apply TCP_NODELAY on accepted hub streams before TLS handling
- Enable TCP_NODELAY on SmartProxy upstream connections before sending the PROXY header
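
For reference, this is roughly what enabling TCP_NODELAY looks like on both ends of a connection. The sketch uses blocking `std::net` sockets on a loopback address instead of the tokio sockets in the diff, and the function name is hypothetical.

```rust
use std::net::{TcpListener, TcpStream};

/// Opens a loopback connection, disables Nagle's algorithm on both the
/// outbound and the accepted socket, and reports the resulting settings.
fn connect_with_nodelay() -> std::io::Result<(bool, bool)> {
    let listener = TcpListener::bind("127.0.0.1:0")?;
    let addr = listener.local_addr()?;

    // Outbound side: set the option right after connecting.
    let outbound = TcpStream::connect(addr)?;
    outbound.set_nodelay(true)?;

    // Accepted side: set the option before any further protocol handling.
    let (accepted, _peer) = listener.accept()?;
    accepted.set_nodelay(true)?;

    Ok((outbound.nodelay()?, accepted.nodelay()?))
}
```

The option is per socket, which is why the change applies it separately to the edge, hub, and upstream connections.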
## 2026-03-16 - 4.5.5 - fix(remoteingress-core)
wait for hub-to-client draining before cleanup and reliably send close frames
- switch CLOSE frame delivery on the data channel from try_send to send().await to avoid dropping it when the channel is full
- delay stream cleanup until the hub-to-client task finishes or times out so large downstream responses continue after upload EOF
- add a bounded 5-minute wait for download draining to prevent premature termination of asymmetric transfers such as git fetch
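
The bounded wait can be modelled with a thread and a completion channel. This is a simplified synchronous analogue of awaiting the download task under a timeout: the function name is hypothetical, and millisecond durations stand in for the 5-minute bound.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Simulates a download that takes `download_ms` and waits at most
/// `max_wait_ms` for it before the stream is closed. Returns true if the
/// download drained before the deadline.
fn drained_before_close(download_ms: u64, max_wait_ms: u64) -> bool {
    let (done_tx, done_rx) = mpsc::channel::<()>();
    thread::spawn(move || {
        thread::sleep(Duration::from_millis(download_ms));
        // The receiver may already be gone if the wait timed out.
        let _ = done_tx.send(());
    });
    // Upload has hit EOF at this point; wait for the download, bounded.
    done_rx.recv_timeout(Duration::from_millis(max_wait_ms)).is_ok()
}
```

Either outcome leads to cleanup, so a stalled download cannot hold the stream open indefinitely.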
## 2026-03-15 - 4.5.4 - fix(remoteingress-core)
preserve stream close ordering and add flow-control stall timeouts
- Send CLOSE and CLOSE_BACK frames on the data channel so they arrive after the final stream data frames.
- Log and abort stalled upload and download paths when flow-control windows stay empty for 120 seconds.
- Apply a 60-second timeout when writing buffered stream data to the upstream connection to prevent hung streams.
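
A stall timeout on an empty flow-control window can be sketched with a mutex and condition variable. This is a synchronous analogue under stated assumptions: `SendWindow`, `add_credit`, and `wait_for_window` are hypothetical names, and the real code uses an atomic counter with a tokio `Notify` and a 120-second sleep.

```rust
use std::sync::{Condvar, Mutex};
use std::time::Duration;

/// Hypothetical send window: available bytes plus a change notification.
struct SendWindow {
    bytes: Mutex<u32>,
    changed: Condvar,
}

/// Adds credit to the window, as a WINDOW_UPDATE frame would.
fn add_credit(window: &SendWindow, n: u32) {
    *window.bytes.lock().unwrap() += n;
    window.changed.notify_all();
}

/// Blocks until the window is non-empty or the stall timeout elapses.
/// Returns `Some(available)` on success and `None` on a stall.
fn wait_for_window(window: &SendWindow, stall_timeout: Duration) -> Option<u32> {
    let guard = window.bytes.lock().unwrap();
    let (guard, result) = window
        .changed
        .wait_timeout_while(guard, stall_timeout, |w| *w == 0)
        .unwrap();
    if result.timed_out() {
        None
    } else {
        Some(*guard)
    }
}
```

Returning `None` on a stall gives the caller an explicit signal to log and abort the stream rather than wait forever.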
## 2026-03-15 - 4.5.3 - fix(remoteingress-core)
prioritize control frames over data in edge and hub tunnel writers
- Split tunnel/frame writers into separate control and data channels in edge and hub
- Use biased select loops so PING, PONG, WINDOW_UPDATE, OPEN, and CLOSE frames are sent before data frames
- Route stream data through dedicated data channels while keeping OPEN, CLOSE, and flow-control updates on control channels to prevent keepalive starvation under load
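
The priority rule can be illustrated without an async runtime. The sketch below is a synchronous analogue of a biased select: it always checks the control channel before the data channel. The function names and frame labels are hypothetical, and unlike the real writer loop it does not stop when a channel closes.

```rust
use std::sync::mpsc::{channel, Receiver};

/// Picks the next frame to write, always preferring the control channel.
fn next_frame(
    ctrl_rx: &Receiver<&'static str>,
    data_rx: &Receiver<&'static str>,
) -> Option<&'static str> {
    if let Ok(frame) = ctrl_rx.try_recv() {
        return Some(frame);
    }
    data_rx.try_recv().ok()
}

/// Queues data frames first, then control frames, and returns the order in
/// which the writer would emit them.
fn drain_order() -> Vec<&'static str> {
    let (ctrl_tx, ctrl_rx) = channel();
    let (data_tx, data_rx) = channel();
    data_tx.send("DATA 1").unwrap();
    data_tx.send("DATA 2").unwrap();
    ctrl_tx.send("PONG").unwrap();
    ctrl_tx.send("WINDOW_UPDATE").unwrap();

    let mut order = Vec::new();
    while let Some(frame) = next_frame(&ctrl_rx, &data_rx) {
        order.push(frame);
    }
    order
}
```

Even though the data frames were queued first, the control frames are written ahead of them, which is what keeps keepalives from starving under load.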
## 2026-03-15 - 4.5.2 - fix(remoteingress-core)
improve stream flow control retries and increase channel buffer capacity

View File

@@ -1,6 +1,6 @@
 {
 "name": "@serve.zone/remoteingress",
-"version": "4.5.2",
+"version": "4.6.1",
 "private": false,
 "description": "Edge ingress tunnel for DcRouter - accepts incoming TCP connections at network edge and tunnels them to DcRouter SmartProxy preserving client IP via PROXY protocol v1.",
 "main": "dist_ts/index.js",

13
rust/Cargo.lock generated
View File

@@ -558,6 +558,7 @@ dependencies = [
 "rustls-pemfile",
 "serde",
 "serde_json",
+"socket2 0.5.10",
 "tokio",
 "tokio-rustls",
 "tokio-util",
@@ -701,6 +702,16 @@ version = "1.15.1"
 source = "registry+https://github.com/rust-lang/crates.io-index"
 checksum = "67b1b7a3b5fe4f1376887184045fcf45c69e92af734b7aaddc05fb777b6fbd03"
+[[package]]
+name = "socket2"
+version = "0.5.10"
+source = "registry+https://github.com/rust-lang/crates.io-index"
+checksum = "e22376abed350d73dd1cd119b57ffccad95b4e585a7cda43e286245ce23c0678"
+dependencies = [
+"libc",
+"windows-sys 0.52.0",
+]
 [[package]]
 name = "socket2"
 version = "0.6.2"
@@ -765,7 +776,7 @@ dependencies = [
 "parking_lot",
 "pin-project-lite",
 "signal-hook-registry",
-"socket2",
+"socket2 0.6.2",
 "tokio-macros",
 "windows-sys 0.61.2",
 ]

View File

@@ -14,3 +14,4 @@ serde_json = "1"
 log = "0.4"
 rustls-pemfile = "2"
 tokio-util = "0.7"
+socket2 = "0.5"

View File

@@ -194,6 +194,14 @@ async fn edge_main_loop(
 let mut backoff_ms: u64 = 1000;
 let max_backoff_ms: u64 = 30000;
+// Build TLS config ONCE outside the reconnect loop — preserves session
+// cache across reconnections for TLS session resumption (saves 1 RTT).
+let tls_config = rustls::ClientConfig::builder()
+.dangerous()
+.with_custom_certificate_verifier(Arc::new(NoCertVerifier))
+.with_no_client_auth();
+let connector = TlsConnector::from(Arc::new(tls_config));
 loop {
 // Create a per-connection child token
 let connection_token = cancel_token.child_token();
@@ -209,6 +217,7 @@ async fn edge_main_loop(
 &listen_ports,
 &mut shutdown_rx,
 &connection_token,
+&connector,
 )
 .await;
@@ -223,7 +232,11 @@ async fn edge_main_loop(
 }
 *connected.write().await = false;
-let _ = event_tx.try_send(EdgeEvent::TunnelDisconnected);
+// Only emit disconnect event on actual disconnection, not on failed reconnects.
+// Failed reconnects never reach line 335 (handshake success), so was_connected is false.
+if was_connected {
+let _ = event_tx.try_send(EdgeEvent::TunnelDisconnected);
+}
 active_streams.store(0, Ordering::Relaxed);
 // Reset stream ID counter for next connection cycle
 next_stream_id.store(1, Ordering::Relaxed);
@@ -259,18 +272,16 @@ async fn connect_to_hub_and_run(
 listen_ports: &Arc<RwLock<Vec<u16>>>,
 shutdown_rx: &mut mpsc::Receiver<()>,
 connection_token: &CancellationToken,
+connector: &TlsConnector,
 ) -> EdgeLoopResult {
-// Build TLS connector that skips cert verification (auth is via secret)
-let tls_config = rustls::ClientConfig::builder()
-.dangerous()
-.with_custom_certificate_verifier(Arc::new(NoCertVerifier))
-.with_no_client_auth();
-let connector = TlsConnector::from(Arc::new(tls_config));
 let addr = format!("{}:{}", config.hub_host, config.hub_port);
 let tcp = match TcpStream::connect(&addr).await {
-Ok(s) => s,
+Ok(s) => {
+// Disable Nagle's algorithm for low-latency control frames (PING/PONG, WINDOW_UPDATE)
+let _ = s.set_nodelay(true);
+s
+}
 Err(e) => {
 log::error!("Failed to connect to hub at {}: {}", addr, e);
 return EdgeLoopResult::Reconnect;
@@ -366,18 +377,34 @@ async fn connect_to_hub_and_run(
 let client_writers: Arc<Mutex<HashMap<u32, EdgeStreamState>>> =
 Arc::new(Mutex::new(HashMap::new()));
-// A5: Channel-based tunnel writer replaces Arc<Mutex<WriteHalf>>
-let (tunnel_writer_tx, mut tunnel_writer_rx) = mpsc::channel::<Vec<u8>>(4096);
+// QoS dual-channel tunnel writer: control frames (PONG/WINDOW_UPDATE/CLOSE/OPEN)
+// have priority over data frames (DATA). Prevents PING starvation under load.
+let (tunnel_ctrl_tx, mut tunnel_ctrl_rx) = mpsc::channel::<Vec<u8>>(256);
+let (tunnel_data_tx, mut tunnel_data_rx) = mpsc::channel::<Vec<u8>>(4096);
+// Legacy alias — control channel for PONG, CLOSE, WINDOW_UPDATE, OPEN
+let tunnel_writer_tx = tunnel_ctrl_tx.clone();
 let tw_token = connection_token.clone();
 let tunnel_writer_handle = tokio::spawn(async move {
+// BufWriter coalesces small writes (frame headers, control frames) into fewer
+// TLS records and syscalls. Flushed after each frame to avoid holding data.
+let mut writer = tokio::io::BufWriter::with_capacity(65536, write_half);
 loop {
 tokio::select! {
-data = tunnel_writer_rx.recv() => {
+biased; // control frames always take priority over data
+ctrl = tunnel_ctrl_rx.recv() => {
+match ctrl {
+Some(frame_data) => {
+if writer.write_all(&frame_data).await.is_err() { break; }
+if writer.flush().await.is_err() { break; }
+}
+None => break,
+}
+}
+data = tunnel_data_rx.recv() => {
 match data {
 Some(frame_data) => {
-if write_half.write_all(&frame_data).await.is_err() {
-break;
-}
+if writer.write_all(&frame_data).await.is_err() { break; }
+if writer.flush().await.is_err() { break; }
 }
 None => break,
 }
@@ -393,6 +420,7 @@ async fn connect_to_hub_and_run(
 &handshake.listen_ports,
 &mut port_listeners,
 &tunnel_writer_tx,
+&tunnel_data_tx,
 &client_writers,
 active_streams,
 next_stream_id,
@@ -458,6 +486,7 @@ async fn connect_to_hub_and_run(
 &update.listen_ports,
 &mut port_listeners,
 &tunnel_writer_tx,
+&tunnel_data_tx,
 &client_writers,
 active_streams,
 next_stream_id,
@@ -469,8 +498,10 @@ async fn connect_to_hub_and_run(
 FRAME_PING => {
 let pong_frame = encode_frame(0, FRAME_PONG, &[]);
 if tunnel_writer_tx.try_send(pong_frame).is_err() {
-log::warn!("Failed to send PONG, writer channel full/closed");
-break EdgeLoopResult::Reconnect;
+// Control channel full (WINDOW_UPDATE burst from many streams).
+// DON'T disconnect — the 45s liveness timeout gives margin
+// for the channel to drain and the next PONG to succeed.
+log::warn!("PONG send failed, control channel full — skipping this cycle");
 }
 log::trace!("Received PING from hub, sent PONG");
 }
@@ -519,7 +550,8 @@ async fn connect_to_hub_and_run(
 fn apply_port_config(
 new_ports: &[u16],
 port_listeners: &mut HashMap<u16, JoinHandle<()>>,
-tunnel_writer_tx: &mpsc::Sender<Vec<u8>>,
+tunnel_ctrl_tx: &mpsc::Sender<Vec<u8>>,
+tunnel_data_tx: &mpsc::Sender<Vec<u8>>,
 client_writers: &Arc<Mutex<HashMap<u32, EdgeStreamState>>>,
 active_streams: &Arc<AtomicU32>,
 next_stream_id: &Arc<AtomicU32>,
@@ -539,7 +571,8 @@ fn apply_port_config(
 // Add new ports
 for &port in new_set.difference(&old_set) {
-let tunnel_writer_tx = tunnel_writer_tx.clone();
+let tunnel_ctrl_tx = tunnel_ctrl_tx.clone();
+let tunnel_data_tx = tunnel_data_tx.clone();
 let client_writers = client_writers.clone();
 let active_streams = active_streams.clone();
 let next_stream_id = next_stream_id.clone();
@@ -561,8 +594,18 @@ fn apply_port_config(
 accept_result = listener.accept() => {
 match accept_result {
 Ok((client_stream, client_addr)) => {
+// TCP keepalive detects dead clients that disappear without FIN.
+// Without this, zombie streams accumulate and never get cleaned up.
+let _ = client_stream.set_nodelay(true);
+let ka = socket2::TcpKeepalive::new()
+.with_time(Duration::from_secs(60));
+#[cfg(target_os = "linux")]
+let ka = ka.with_interval(Duration::from_secs(60));
+let _ = socket2::SockRef::from(&client_stream).set_tcp_keepalive(&ka);
 let stream_id = next_stream_id.fetch_add(1, Ordering::Relaxed);
-let tunnel_writer_tx = tunnel_writer_tx.clone();
+let tunnel_ctrl_tx = tunnel_ctrl_tx.clone();
+let tunnel_data_tx = tunnel_data_tx.clone();
 let client_writers = client_writers.clone();
 let active_streams = active_streams.clone();
 let edge_id = edge_id.clone();
@@ -577,9 +620,11 @@ fn apply_port_config(
 stream_id,
 port,
 &edge_id,
-tunnel_writer_tx,
+tunnel_ctrl_tx,
+tunnel_data_tx,
 client_writers,
 client_token,
+Arc::clone(&active_streams),
 )
 .await;
 active_streams.fetch_sub(1, Ordering::Relaxed);
@@ -607,9 +652,11 @@ async fn handle_client_connection(
 stream_id: u32,
 dest_port: u16,
 edge_id: &str,
-tunnel_writer_tx: mpsc::Sender<Vec<u8>>,
+tunnel_ctrl_tx: mpsc::Sender<Vec<u8>>,
+tunnel_data_tx: mpsc::Sender<Vec<u8>>,
 client_writers: Arc<Mutex<HashMap<u32, EdgeStreamState>>>,
 client_token: CancellationToken,
+active_streams: Arc<AtomicU32>,
 ) {
 let client_ip = client_addr.ip().to_string();
 let client_port = client_addr.port();
@@ -617,10 +664,10 @@ async fn handle_client_connection(
 // Determine edge IP (use 0.0.0.0 as placeholder — hub doesn't use it for routing)
 let edge_ip = "0.0.0.0";
-// Send OPEN frame with PROXY v1 header via writer channel
+// Send OPEN frame with PROXY v1 header via control channel
 let proxy_header = build_proxy_v1_header(&client_ip, edge_ip, client_port, dest_port);
 let open_frame = encode_frame(stream_id, FRAME_OPEN, proxy_header.as_bytes());
-if tunnel_writer_tx.send(open_frame).await.is_err() {
+if tunnel_ctrl_tx.send(open_frame).await.is_err() {
 return;
 }
@@ -642,8 +689,9 @@ async fn handle_client_connection(
 // Task: hub -> client (download direction)
 // After writing to client TCP, send WINDOW_UPDATE to hub so it can send more
 let hub_to_client_token = client_token.clone();
-let wu_tx = tunnel_writer_tx.clone();
-let hub_to_client = tokio::spawn(async move {
+let wu_tx = tunnel_ctrl_tx.clone();
+let active_streams_h2c = Arc::clone(&active_streams);
+let mut hub_to_client = tokio::spawn(async move {
 let mut consumed_since_update: u32 = 0;
 loop {
 tokio::select! {
@@ -654,12 +702,20 @@ async fn handle_client_connection(
 if client_write.write_all(&data).await.is_err() {
 break;
 }
-// Track consumption for flow control
+// Track consumption for adaptive flow control.
+// The increment is capped to the adaptive window so the sender's
+// effective window shrinks to match current demand (fewer streams
+// = larger window, more streams = smaller window per stream).
 consumed_since_update += len;
-if consumed_since_update >= WINDOW_UPDATE_THRESHOLD {
-let frame = encode_window_update(stream_id, FRAME_WINDOW_UPDATE, consumed_since_update);
+let adaptive_window = remoteingress_protocol::compute_window_for_stream_count(
+active_streams_h2c.load(Ordering::Relaxed),
+);
+let threshold = adaptive_window / 2;
+if consumed_since_update >= threshold {
+let increment = consumed_since_update.min(adaptive_window);
+let frame = encode_window_update(stream_id, FRAME_WINDOW_UPDATE, increment);
 if wu_tx.try_send(frame).is_ok() {
-consumed_since_update = 0;
+consumed_since_update -= increment;
 }
 // If try_send fails, keep accumulating — retry on next threshold
 }
@@ -681,20 +737,35 @@ async fn handle_client_connection(
 // Task: client -> hub (upload direction) with per-stream flow control
 let mut buf = vec![0u8; 32768];
 loop {
-// Wait for send window to have capacity
+// Wait for send window to have capacity (with stall timeout)
 loop {
 let w = send_window.load(Ordering::Acquire);
 if w > 0 { break; }
 tokio::select! {
 _ = window_notify.notified() => continue,
 _ = client_token.cancelled() => break,
+_ = tokio::time::sleep(Duration::from_secs(120)) => {
+log::warn!("Stream {} upload stalled (window empty for 120s)", stream_id);
+break;
+}
 }
 }
 if client_token.is_cancelled() { break; }
-// Limit read size to available window
+// Limit read size to available window.
+// IMPORTANT: if window is 0 (stall timeout fired), we must NOT
+// read into an empty buffer — read(&mut buf[..0]) returns Ok(0)
+// which would be falsely interpreted as EOF.
 let w = send_window.load(Ordering::Acquire) as usize;
-let max_read = w.min(buf.len());
+if w == 0 {
+log::warn!("Stream {} upload: window still 0 after stall timeout, closing", stream_id);
+break;
+}
+// Adaptive: cap read to current per-stream target window
+let adaptive_cap = remoteingress_protocol::compute_window_for_stream_count(
+active_streams.load(Ordering::Relaxed),
+) as usize;
+let max_read = w.min(buf.len()).min(adaptive_cap);
 tokio::select! {
 read_result = client_read.read(&mut buf[..max_read]) => {
@@ -703,8 +774,8 @@ async fn handle_client_connection(
 Ok(n) => {
 send_window.fetch_sub(n as u32, Ordering::Release);
 let data_frame = encode_frame(stream_id, FRAME_DATA, &buf[..n]);
-if tunnel_writer_tx.send(data_frame).await.is_err() {
-log::warn!("Stream {} tunnel writer closed, closing", stream_id);
+if tunnel_data_tx.send(data_frame).await.is_err() {
+log::warn!("Stream {} data channel closed, closing", stream_id);
 break;
 }
 }
@@ -715,18 +786,29 @@ async fn handle_client_connection(
 }
 }
-// Send CLOSE frame (only if not cancelled)
+// Wait for the download task (hub → client) to finish BEFORE sending CLOSE.
+// Upload EOF (client done sending) does NOT mean the response is done.
+// For asymmetric transfers like git fetch (small request, large response),
+// the response is still streaming when the upload finishes.
+// Sending CLOSE before the response finishes would cause the hub to cancel
+// the upstream reader mid-response, truncating the data.
+let _ = tokio::time::timeout(
+Duration::from_secs(300), // 5 min max wait for download to finish
+&mut hub_to_client,
+).await;
+// NOW send CLOSE — the response has been fully delivered (or timed out).
 if !client_token.is_cancelled() {
 let close_frame = encode_frame(stream_id, FRAME_CLOSE, &[]);
-let _ = tunnel_writer_tx.try_send(close_frame);
+let _ = tunnel_data_tx.send(close_frame).await;
 }
-// Cleanup
+// Clean up
 {
 let mut writers = client_writers.lock().await;
 writers.remove(&stream_id);
 }
-hub_to_client.abort();
+hub_to_client.abort(); // No-op if already finished; safety net if timeout fired
 let _ = edge_id; // used for logging context
 }

View File

@@ -298,6 +298,8 @@ async fn handle_edge_connection(
 edge_token: CancellationToken,
 peer_addr: String,
 ) -> Result<(), Box<dyn std::error::Error + Send + Sync>> {
+// Disable Nagle's algorithm for low-latency control frames (PING/PONG, WINDOW_UPDATE)
+stream.set_nodelay(true)?;
 let tls_stream = acceptor.accept(stream).await?;
 let (read_half, mut write_half) = tokio::io::split(tls_stream);
 let mut buf_reader = BufReader::new(read_half);
@@ -371,19 +373,37 @@ async fn handle_edge_connection(
 );
 }
-// A5: Channel-based writer replaces Arc<Mutex<WriteHalf>>
-// All frame writes go through this channel → dedicated writer task serializes them
-let (frame_writer_tx, mut frame_writer_rx) = mpsc::channel::<Vec<u8>>(4096);
+// Per-edge active stream counter for adaptive flow control
+let edge_stream_count = Arc::new(AtomicU32::new(0));
+// QoS dual-channel tunnel writer: control frames (PING/PONG/WINDOW_UPDATE/CLOSE)
+// have priority over data frames (DATA_BACK). This prevents PING starvation under load.
+let (ctrl_tx, mut ctrl_rx) = mpsc::channel::<Vec<u8>>(256);
+let (data_tx, mut data_rx) = mpsc::channel::<Vec<u8>>(4096);
+// Legacy alias for code that sends both control and data (will be migrated)
+let frame_writer_tx = ctrl_tx.clone();
 let writer_token = edge_token.clone();
 let writer_handle = tokio::spawn(async move {
+// BufWriter coalesces small writes (frame headers, control frames) into fewer
+// TLS records and syscalls. Flushed after each frame to avoid holding data.
+let mut writer = tokio::io::BufWriter::with_capacity(65536, write_half);
 loop {
 tokio::select! {
-data = frame_writer_rx.recv() => {
+biased; // control frames always take priority over data
+ctrl = ctrl_rx.recv() => {
+match ctrl {
+Some(frame_data) => {
+if writer.write_all(&frame_data).await.is_err() { break; }
+if writer.flush().await.is_err() { break; }
+}
+None => break,
+}
+}
+data = data_rx.recv() => {
 match data {
 Some(frame_data) => {
-if write_half.write_all(&frame_data).await.is_err() {
-break;
-}
+if writer.write_all(&frame_data).await.is_err() { break; }
+if writer.flush().await.is_err() { break; }
 }
 None => break,
 }
@@ -467,7 +487,8 @@ async fn handle_edge_connection(
 let edge_id_clone = edge_id.clone();
 let event_tx_clone = event_tx.clone();
 let streams_clone = streams.clone();
-let writer_tx = frame_writer_tx.clone();
+let writer_tx = ctrl_tx.clone(); // control: CLOSE_BACK, WINDOW_UPDATE_BACK
+let data_writer_tx = data_tx.clone(); // data: DATA_BACK
 let target = target_host.clone();
 let stream_token = edge_token.child_token();
@@ -491,8 +512,10 @@ async fn handle_edge_connection(
 }
 // Spawn task: connect to SmartProxy, send PROXY header, pipe data
+let stream_counter = Arc::clone(&edge_stream_count);
 tokio::spawn(async move {
 let _permit = permit; // hold semaphore permit until stream completes
+stream_counter.fetch_add(1, Ordering::Relaxed);
 let result = async {
 // A2: Connect to SmartProxy with timeout
@@ -505,6 +528,7 @@ async fn handle_edge_connection(
 format!("connect to SmartProxy {}:{} timed out (10s)", target, dest_port).into()
 })??;
+upstream.set_nodelay(true)?;
 upstream.write_all(proxy_header.as_bytes()).await?;
 let (mut up_read, mut up_write) =
@@ -514,6 +538,7 @@ async fn handle_edge_connection(
 // After writing to upstream, send WINDOW_UPDATE_BACK to edge
 let writer_token = stream_token.clone();
 let wub_tx = writer_tx.clone();
+let stream_counter_w = Arc::clone(&stream_counter);
 let writer_for_edge_data = tokio::spawn(async move {
 let mut consumed_since_update: u32 = 0;
 loop {
@@ -522,15 +547,35 @@ async fn handle_edge_connection(
 match data {
 Some(data) => {
 let len = data.len() as u32;
-if up_write.write_all(&data).await.is_err() {
-break;
+// Check cancellation alongside the write so we respond
+// promptly to FRAME_CLOSE instead of blocking up to 60s.
+let write_result = tokio::select! {
+r = tokio::time::timeout(
+Duration::from_secs(60),
+up_write.write_all(&data),
+) => r,
+_ = writer_token.cancelled() => break,
+};
+match write_result {
+Ok(Ok(())) => {}
+Ok(Err(_)) => break,
+Err(_) => {
+log::warn!("Stream {} write to upstream timed out (60s)", stream_id);
+break;
+}
 }
-// Track consumption for flow control
+// Track consumption for adaptive flow control.
+// Increment capped to adaptive window to limit per-stream in-flight data.
 consumed_since_update += len;
-if consumed_since_update >= WINDOW_UPDATE_THRESHOLD {
-let frame = encode_window_update(stream_id, FRAME_WINDOW_UPDATE_BACK, consumed_since_update);
+let adaptive_window = remoteingress_protocol::compute_window_for_stream_count(
+stream_counter_w.load(Ordering::Relaxed),
+);
+let threshold = adaptive_window / 2;
+if consumed_since_update >= threshold {
+let increment = consumed_since_update.min(adaptive_window);
+let frame = encode_window_update(stream_id, FRAME_WINDOW_UPDATE_BACK, increment);
 if wub_tx.try_send(frame).is_ok() {
-consumed_since_update = 0;
+consumed_since_update -= increment;
 }
 // If try_send fails, keep accumulating — retry on next threshold
 }
@@ -553,20 +598,35 @@ async fn handle_edge_connection(
// with per-stream flow control (check send_window before reading) // with per-stream flow control (check send_window before reading)
let mut buf = vec![0u8; 32768]; let mut buf = vec![0u8; 32768];
loop { loop {
// Wait for send window to have capacity // Wait for send window to have capacity (with stall timeout)
loop { loop {
let w = send_window.load(Ordering::Acquire); let w = send_window.load(Ordering::Acquire);
if w > 0 { break; } if w > 0 { break; }
tokio::select! { tokio::select! {
_ = window_notify.notified() => continue, _ = window_notify.notified() => continue,
_ = stream_token.cancelled() => break, _ = stream_token.cancelled() => break,
_ = tokio::time::sleep(Duration::from_secs(120)) => {
log::warn!("Stream {} download stalled (window empty for 120s)", stream_id);
break;
}
} }
} }
if stream_token.is_cancelled() { break; } if stream_token.is_cancelled() { break; }
// Limit read size to available window // Limit read size to available window.
// IMPORTANT: if window is 0 (stall timeout fired), we must NOT
// read into an empty buffer — read(&mut buf[..0]) returns Ok(0)
// which would be falsely interpreted as EOF.
let w = send_window.load(Ordering::Acquire) as usize; let w = send_window.load(Ordering::Acquire) as usize;
let max_read = w.min(buf.len()); if w == 0 {
log::warn!("Stream {} download: window still 0 after stall timeout, closing", stream_id);
break;
}
// Adaptive: cap read to current per-stream target window
let adaptive_cap = remoteingress_protocol::compute_window_for_stream_count(
stream_counter.load(Ordering::Relaxed),
) as usize;
let max_read = w.min(buf.len()).min(adaptive_cap);
tokio::select! { tokio::select! {
read_result = up_read.read(&mut buf[..max_read]) => { read_result = up_read.read(&mut buf[..max_read]) => {
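
The false-EOF hazard described in the comment above can be reproduced with the synchronous `std::io::Read` trait. This is an illustrative sketch only: the hunk uses an async reader, and `read_capped` and `read_with_window` are hypothetical helper names that do not exist in the crate.

```rust
use std::io::Read;

/// Unguarded read: with `max_read == 0` the destination slice is empty,
/// so `read` returns Ok(0) even though the source still has data.
pub fn read_capped<R: Read>(
    src: &mut R,
    buf: &mut [u8],
    max_read: usize,
) -> std::io::Result<usize> {
    src.read(&mut buf[..max_read])
}

/// Guarded read: a zero window is reported as `None`, which the caller
/// cannot confuse with `Some(0)` (a genuine end of stream).
pub fn read_with_window<R: Read>(
    src: &mut R,
    buf: &mut [u8],
    window: usize,
) -> std::io::Result<Option<usize>> {
    if window == 0 {
        return Ok(None);
    }
    let max_read = window.min(buf.len());
    src.read(&mut buf[..max_read]).map(Some)
}
```

Separating "no window" from "zero bytes read" at the type level is one way to make the guard hard to forget; the hunk achieves the same effect with an explicit `if w == 0` check before slicing the buffer.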
@@ -576,8 +636,8 @@ async fn handle_edge_connection(
send_window.fetch_sub(n as u32, Ordering::Release); send_window.fetch_sub(n as u32, Ordering::Release);
let frame = let frame =
encode_frame(stream_id, FRAME_DATA_BACK, &buf[..n]); encode_frame(stream_id, FRAME_DATA_BACK, &buf[..n]);
if writer_tx.send(frame).await.is_err() { if data_writer_tx.send(frame).await.is_err() {
log::warn!("Stream {} writer channel closed, closing", stream_id); log::warn!("Stream {} data channel closed, closing", stream_id);
break; break;
} }
} }
@@ -588,10 +648,11 @@ async fn handle_edge_connection(
} }
} }
// Send CLOSE_BACK to edge (only if not cancelled) // Send CLOSE_BACK via DATA channel (must arrive AFTER last DATA_BACK).
// Use send().await to guarantee delivery (try_send silently drops if full).
if !stream_token.is_cancelled() { if !stream_token.is_cancelled() {
let close_frame = encode_frame(stream_id, FRAME_CLOSE_BACK, &[]); let close_frame = encode_frame(stream_id, FRAME_CLOSE_BACK, &[]);
let _ = writer_tx.try_send(close_frame); let _ = data_writer_tx.send(close_frame).await;
} }
writer_for_edge_data.abort(); writer_for_edge_data.abort();
@@ -601,10 +662,11 @@ async fn handle_edge_connection(
if let Err(e) = result { if let Err(e) = result {
log::error!("Stream {} error: {}", stream_id, e); log::error!("Stream {} error: {}", stream_id, e);
// Send CLOSE_BACK on error (only if not cancelled) // Send CLOSE_BACK via DATA channel on error (must arrive after any DATA_BACK).
// Use send().await to guarantee delivery.
if !stream_token.is_cancelled() { if !stream_token.is_cancelled() {
let close_frame = encode_frame(stream_id, FRAME_CLOSE_BACK, &[]); let close_frame = encode_frame(stream_id, FRAME_CLOSE_BACK, &[]);
let _ = writer_tx.try_send(close_frame); let _ = data_writer_tx.send(close_frame).await;
} }
} }
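
The difference between the old `try_send` and the new awaited `send` for CLOSE_BACK can be illustrated with a bounded channel from the standard library. This is an analogy under stated assumptions, not the crate's code: it uses `std::sync::mpsc::sync_channel` and a thread in place of the async channel and writer task, and `close_delivery_demo` is a hypothetical name.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};
use std::thread;

/// Returns (did the non-blocking send deliver CLOSE_BACK?, frames received
/// when the blocking send is used afterwards).
pub fn close_delivery_demo() -> (bool, Vec<&'static str>) {
    // Capacity-1 channel that already holds a data frame, so it is full.
    let (tx, rx) = sync_channel::<&'static str>(1);
    tx.send("DATA_BACK").unwrap();

    // Non-blocking send fails immediately on a full channel. Ignoring the
    // result, as `let _ = ...try_send(..)` does, silently drops the frame.
    let try_ok = match tx.try_send("CLOSE_BACK") {
        Ok(()) => true,
        Err(TrySendError::Full(_)) | Err(TrySendError::Disconnected(_)) => false,
    };

    // Blocking send waits for capacity, so the close frame is queued
    // behind the data frame and arrives after it.
    let consumer = thread::spawn(move || rx.iter().collect::<Vec<_>>());
    tx.send("CLOSE_BACK").unwrap();
    drop(tx); // Closing the sender lets the consumer's iterator finish.
    (try_ok, consumer.join().unwrap())
}
```

Sending the close frame on the same bounded queue as the data frames is also what preserves ordering: the consumer sees the data frame first and the close frame second.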
@@ -619,6 +681,7 @@ async fn handle_edge_connection(
stream_id, stream_id,
}); });
} }
stream_counter.fetch_sub(1, Ordering::Relaxed);
}); });
} }
FRAME_DATA => { FRAME_DATA => {
@@ -680,8 +743,9 @@ async fn handle_edge_connection(
_ = ping_ticker.tick() => { _ = ping_ticker.tick() => {
let ping_frame = encode_frame(0, FRAME_PING, &[]); let ping_frame = encode_frame(0, FRAME_PING, &[]);
if frame_writer_tx.try_send(ping_frame).is_err() { if frame_writer_tx.try_send(ping_frame).is_err() {
log::warn!("Failed to send PING to edge {}, writer channel full/closed", edge_id); // Control channel full — skip this PING cycle.
break; // The 45s liveness timeout gives margin for the channel to drain.
log::warn!("PING send to edge {} failed, control channel full — skipping", edge_id);
} }
log::trace!("Sent PING to edge {}", edge_id); log::trace!("Sent PING to edge {}", edge_id);
} }


@@ -32,6 +32,15 @@ pub fn encode_window_update(stream_id: u32, frame_type: u8, increment: u32) -> V
encode_frame(stream_id, frame_type, &increment.to_be_bytes()) encode_frame(stream_id, frame_type, &increment.to_be_bytes())
} }
/// Compute the target per-stream window size based on the number of active streams.
/// Total memory budget is ~32MB shared across all streams. As more streams are active,
/// each gets a smaller window. This adapts to current demand — few streams get high
/// throughput, many streams save memory and reduce control frame pressure.
pub fn compute_window_for_stream_count(active: u32) -> u32 {
let per_stream = (32 * 1024 * 1024u64) / (active.max(1) as u64);
per_stream.clamp(64 * 1024, INITIAL_STREAM_WINDOW as u64) as u32
}
/// Decode a WINDOW_UPDATE payload into a byte increment. Returns None if payload is malformed. /// Decode a WINDOW_UPDATE payload into a byte increment. Returns None if payload is malformed.
pub fn decode_window_update(payload: &[u8]) -> Option<u32> { pub fn decode_window_update(payload: &[u8]) -> Option<u32> {
if payload.len() != 4 { if payload.len() != 4 {
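
To make the scaling of `compute_window_for_stream_count` concrete, the sketch below reproduces its arithmetic with a placeholder upper bound. The value of `INITIAL_STREAM_WINDOW` is not visible in this diff, so `ASSUMED_INITIAL_STREAM_WINDOW` (4 MiB) and the function name `sketch_window_for_stream_count` are assumptions made purely for illustration.

```rust
/// ASSUMPTION: placeholder for the crate's INITIAL_STREAM_WINDOW, whose real
/// value is not shown in this diff.
pub const ASSUMED_INITIAL_STREAM_WINDOW: u32 = 4 * 1024 * 1024;

/// Same arithmetic as the function added in the hunk above, using the
/// assumed upper bound: a 32 MiB budget split across active streams,
/// clamped to the range [64 KiB, upper bound].
pub fn sketch_window_for_stream_count(active: u32) -> u32 {
    let per_stream = (32 * 1024 * 1024u64) / (active.max(1) as u64);
    per_stream.clamp(64 * 1024, ASSUMED_INITIAL_STREAM_WINDOW as u64) as u32
}
```

Under that assumed bound, up to 8 streams each receive the full 4 MiB, 16 streams receive 2 MiB each, and from 512 streams onward every stream sits at the 64 KiB floor. Past that point the floor takes precedence over the budget, so the combined windows can exceed 32 MiB.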


@@ -3,6 +3,6 @@
*/ */
export const commitinfo = { export const commitinfo = {
name: '@serve.zone/remoteingress', name: '@serve.zone/remoteingress',
version: '4.5.2', version: '4.6.1',
description: 'Edge ingress tunnel for DcRouter - accepts incoming TCP connections at network edge and tunnels them to DcRouter SmartProxy preserving client IP via PROXY protocol v1.' description: 'Edge ingress tunnel for DcRouter - accepts incoming TCP connections at network edge and tunnels them to DcRouter SmartProxy preserving client IP via PROXY protocol v1.'
} }