The Inbox Filled Up, but Nobody Woke Up - The Day a Reverse Proxy Ate the SSE Stream
Messages were arriving just fine. The receiving agent's inbox had the new message. But the agent itself never woke up. The bug report was short: "The inbox has messages, but notifications don't arrive." The culprit was not the messaging code, Postgres, or tmux. It was the reverse proxy in front of the server, holding the SSE stream in a buffer.
This was hard to spot because the failure was partial. "Delivery works" was true. "Real-time wake is broken" was also true. When both statements are true, message persistence and wake delivery are probably using different paths.
Terms First
SSE, or Server-Sent Events, is an HTTP streaming mechanism where the server continuously sends events to a client over a long-lived response. The response uses text/event-stream, and each message is written as event: and data: blocks. In browsers, EventSource is the common API. On the server, the important part is flushing bytes as soon as events exist.
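As a rough illustration of the wire format described above, the sketch below builds one event block by hand. `formatSseEvent` is a hypothetical helper, not part of RelayRoom or any library, and the JSON payload shape is assumed for the example.

```typescript
// Hypothetical helper: serialize one SSE event block.
// Each line of the payload gets its own "data:" field, and a blank line
// terminates the event, as the text/event-stream format requires.
function formatSseEvent(event: string, data: string): string {
  const dataLines = data.split("\n").map((line) => `data: ${line}`);
  return [`event: ${event}`, ...dataLines, "", ""].join("\n");
}

// A keep-alive ping and a message event, as they would appear on the wire.
const ping = formatSseEvent("ping", "{}");
const message = formatSseEvent("message", '{"thread":"example"}');
```

The trailing blank line is what tells the client an event is complete. A proxy that holds those bytes back delays the whole event, not just part of it.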
Reverse proxy response buffering means the proxy reads the upstream response into its own buffers before sending it to the client. That is useful for normal HTML, JSON, and image responses. It is harmful for SSE because the point of SSE is immediate delivery. If the proxy waits for enough bytes, or for the response to finish, the client may never see the events in time.
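The effect can be shown with a toy model. This is a deliberately simplified sketch, not nginx's actual buffering algorithm: `createProxyModel` and its parameters are invented for illustration, and string length stands in for byte count.

```typescript
// Toy model of a proxy that either forwards chunks at once or holds them
// until a size threshold is reached or the upstream response ends.
interface ProxyModel {
  onUpstreamChunk(chunk: string): string[]; // what the client receives now
  onUpstreamEnd(): string[];                // what is released at end of response
}

function createProxyModel(bufferSize: number, buffering: boolean): ProxyModel {
  let held: string[] = [];
  let heldSize = 0;

  const drain = (): string[] => {
    const out = held;
    held = [];
    heldSize = 0;
    return out;
  };

  return {
    onUpstreamChunk(chunk) {
      if (!buffering) return [chunk]; // streaming mode: pass through immediately
      held.push(chunk);
      heldSize += chunk.length;
      return heldSize >= bufferSize ? drain() : [];
    },
    onUpstreamEnd: drain,
  };
}

const smallEvent = "event: ping\ndata: {}\n\n";
const buffered = createProxyModel(4096, true);
const passthrough = createProxyModel(4096, false);
const seenThroughBuffer = buffered.onUpstreamChunk(smallEvent);  // nothing yet
const seenDirectly = passthrough.onUpstreamChunk(smallEvent);    // the event
```

In this model a small event only reaches the client when the threshold fills or the response ends, and an SSE response is not supposed to end.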
X-Accel-Buffering is a response header understood by nginx-style proxies. nginx documents that upstream responses can enable or disable buffering with X-Accel-Buffering: yes|no. That lets the application mark only the SSE response as non-buffered, without turning off buffering for the entire proxy.
RelayRoom Stores Messages and Delivers Wakes Differently
RelayRoom messaging has two layers.
- Message persistence: an agent calls MCP send or reply, and the server writes the thread and message to the database. This is a normal one-shot HTTP request.
- Wake delivery: when a message targets a part, the server publishes a wake event, moves it through the Postgres LISTEN/NOTIFY real-time bus, and sends it over /api/sse. The local pager daemon listens to that stream and wakes the tmux session with tmux send-keys.
So "the inbox has it, but wake did not fire" points away from storage and toward the SSE delivery path. The message can be safely stored while the live event is trapped somewhere between server and client.
The Symptom: POST Worked, the Stream Stalled
The failure appeared right after putting a self-hosted hub behind a custom domain managed by Nginx Proxy Manager. The main agent could send a message to another part. The recipient's inbox showed it. The recipient's tmux session stayed quiet.
At first, domain routing and auth looked suspicious, especially because a Host allowlist 403 was happening around the same time. But a Host rejection would break ordinary requests too. Here, ordinary requests survived. The long-lived response was the suspicious part.
Two Connections Settled It
The diagnosis took three checks.
- Confirm the server actually created a wake event. A wake_event row with suppressed=false means budget or suppression policy was not the issue.
- Open the same /api/sse URL through the proxy domain.
- Open the same /api/sse URL directly against the origin server.
```sh
# Through the proxy: zero bytes for 20 seconds.
curl -N -H 'accept: text/event-stream' \
  'https://<your-host>/api/sse?code=...&part=...'

# Direct to origin: event: message / event: ping blocks arrive immediately.
curl -N 'http://localhost:48801/api/sse?code=...&part=...'
```

curl -N matters because it disables curl's own output buffering. Without it, the client can make a healthy stream look stuck.
The direct connection streamed immediately. The proxied connection produced nothing. That split the problem cleanly: the server was fine, the proxy path was not.
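The three checks amount to a small decision procedure. The sketch below only encodes that reasoning; the `Observation` type and `locateFault` function are hypothetical names, and the booleans are filled in by hand from the database row and the two curl runs.

```typescript
// Hypothetical summary of what the three checks observed.
type Observation = {
  wakeEventCreated: boolean; // a wake_event row exists for the message
  suppressed: boolean;       // value of that row's suppressed flag
  originStreams: boolean;    // curl -N direct to origin shows events
  proxyStreams: boolean;     // curl -N through the proxy shows events
};

// Walk the delivery path from the server outward and name the first hop
// where the wake event goes missing.
function locateFault(o: Observation): string {
  if (!o.wakeEventCreated) return "server: no wake event was published";
  if (o.suppressed) return "policy: the wake was suppressed";
  if (!o.originStreams) return "server: the SSE endpoint is not streaming";
  if (!o.proxyStreams) return "proxy: the stream is held between origin and client";
  return "client: the stream arrives, so check the pager";
}

// The incident described here: event created, origin streams, proxy silent.
const incident = locateFault({
  wakeEventCreated: true,
  suppressed: false,
  originStreams: true,
  proxyStreams: false,
});
```

Ordering the checks from the server outward means the first failing hop is the one to fix.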
Headers are also worth checking:
```sh
curl -i -N 'https://<your-host>/api/sse?code=...&part=...'
```

For SSE, expect Content-Type: text/event-stream and Cache-Control: no-cache. Check X-Accel-Buffering: no against the origin rather than the proxy domain, because nginx by default does not pass X-Accel-* headers on to the client. If headers arrive but events are delayed or batched, response buffering, compression, caching, or an intermediate CDN is still in play.
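The header check can also be scripted. The following is a minimal sketch meant for headers captured from the origin response; `checkSseHeaders` is a hypothetical helper, and treating any Content-Encoding as a warning is an assumption based on the compression concern above.

```typescript
// Hypothetical helper: list the SSE-related problems in a set of response
// headers captured from the origin (for example, copied from curl -i output).
function checkSseHeaders(headers: Record<string, string>): string[] {
  // Header names are case-insensitive, so normalize before comparing.
  const lower: Record<string, string> = {};
  for (const name of Object.keys(headers)) {
    lower[name.toLowerCase()] = headers[name].trim().toLowerCase();
  }

  const problems: string[] = [];
  if ((lower["content-type"] ?? "").indexOf("text/event-stream") !== 0) {
    problems.push("Content-Type is not text/event-stream");
  }
  if ((lower["cache-control"] ?? "").indexOf("no-cache") === -1) {
    problems.push("Cache-Control does not include no-cache");
  }
  if (lower["x-accel-buffering"] !== "no") {
    problems.push("X-Accel-Buffering is not set to no");
  }
  if (lower["content-encoding"]) {
    problems.push("response is compressed, which can batch events");
  }
  return problems;
}

const healthy = checkSseHeaders({
  "Content-Type": "text/event-stream",
  "Cache-Control": "no-cache",
  "X-Accel-Buffering": "no",
});
const unmarked = checkSseHeaders({ "Content-Type": "application/json" });
```

An empty list only says the stream is marked correctly. It does not prove that events flow, so the curl -N comparison is still the deciding test.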
The Cause: To the Proxy, This Looked Like a Normal Response
nginx normally reads upstream responses into memory buffers, and sometimes temporary files, before sending them downstream. That is sensible for ordinary requests. It can free the upstream server quickly and absorb slow clients.
SSE changes the contract. The response is intentionally long-lived. Waiting for a "complete" response means waiting forever. RelayRoom had emitted wake events, but the pager never saw them because the proxy held the bytes.
The Fix: Mark the Stream and Document the Proxy Setting
The product fix is to set the right headers on the SSE response.
```ts
c.header("Content-Type", "text/event-stream");
c.header("Cache-Control", "no-cache");
c.header("X-Accel-Buffering", "no");
return streamSSE(c, async (stream) => {
  // write event: ping / event: message blocks and flush them
});
```

That keeps the blast radius small. Buffering stays enabled for normal responses, while /api/sse is explicitly marked as a stream.
Some proxies ignore upstream buffering hints or require route-level configuration. For nginx or Nginx Proxy Manager, the operational fallback is:
```nginx
location /api/sse {
    proxy_pass http://relayroom-server:48801;
    proxy_http_version 1.1;
    proxy_buffering off;
    proxy_cache off;
}
```

The same principle applies to Traefik, Caddy, Cloudflare Tunnel, ALB, and similar layers: test the streaming endpoint itself, not just ordinary API requests.
Regression Checklist
- Test /api/sse as a stream with curl -N; do not rely on a health check.
- Compare proxy vs origin whenever streaming behaves differently from normal POSTs.
- Send Content-Type: text/event-stream, Cache-Control: no-cache, and X-Accel-Buffering: no.
- Document proxy_buffering off; for nginx-style deployments.
- Track "message stored" and "pager received event" as separate facts.
Takeaway
Partial bugs usually mean two things you assumed shared a path do not actually share it. RelayRoom's message storage path and wake delivery path split at the transport layer. The proxy treated one-shot POSTs and long-lived SSE responses differently.
If a self-hosted product uses SSE, log streaming, AI token streaming, or real-time notifications, reverse proxy compatibility is part of the product. The application should mark streams clearly, and the docs should explain what to do when a proxy ignores those hints.
The diagnostic rule is simple: compare the proxy path with the origin path. Two curl -N commands beat a long guessing session.
References
- MDN, Using server-sent events
- nginx, proxy_buffering and X-Accel-Buffering
- RelayRoom, Self-hosting networking notes
- RelayRoom, Adapter and pager architecture