SSE Streaming Proxy + Token Metering
Source: Custom
Topics: Server-Sent Events, http.Flusher, streaming, context cancellation, token metering
Problem
Stream chunks to a client as Server-Sent Events, flushing each one immediately, while
metering how many tokens were sent — the core of an LLM/AI gateway that proxies a model's
token stream and bills per token.
Requirements:
CountTokens(s string) int — count whitespace-delimited tokens in a chunk.
StreamSSE(w http.ResponseWriter, r *http.Request, chunks <-chan string) (tokens int, err error):
- Set SSE headers (Content-Type: text/event-stream, Cache-Control: no-cache, Connection: keep-alive).
- For each chunk, write an SSE event (data: <chunk>\n\n) and flush it so the client sees it immediately.
- Count tokens across all chunks; return the running total.
- Stop and return ctx.Err(), along with the tokens counted so far, if the client disconnects (r.Context() is cancelled).
- Return nil with the total when the chunks channel is closed.
- Return ErrNoFlush if the ResponseWriter doesn't support flushing.
var ErrNoFlush = errors.New("streamproxy: ResponseWriter does not support flushing")
func CountTokens(s string) int
func StreamSSE(w http.ResponseWriter, r *http.Request, chunks <-chan string) (int, error)
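
One possible shape for the two functions is sketched below. It is not a reference solution. It assumes strings.Fields is an acceptable definition of "whitespace-delimited", it returns write errors to the caller (the requirements do not mention them), and the demo function and the /stream path are hypothetical names used only to exercise the code.

```go
package main

import (
	"errors"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

var ErrNoFlush = errors.New("streamproxy: ResponseWriter does not support flushing")

// CountTokens counts whitespace-delimited tokens. strings.Fields splits on
// runs of Unicode whitespace and never returns empty fields.
func CountTokens(s string) int {
	return len(strings.Fields(s))
}

// StreamSSE writes each chunk as an SSE event, flushes it, and meters tokens.
func StreamSSE(w http.ResponseWriter, r *http.Request, chunks <-chan string) (int, error) {
	flusher, ok := w.(http.Flusher)
	if !ok {
		return 0, ErrNoFlush
	}

	h := w.Header()
	h.Set("Content-Type", "text/event-stream")
	h.Set("Cache-Control", "no-cache")
	h.Set("Connection", "keep-alive")

	ctx := r.Context()
	tokens := 0
	for {
		select {
		case <-ctx.Done():
			// Client went away: report what was metered so far.
			return tokens, ctx.Err()
		case chunk, open := <-chunks:
			if !open {
				return tokens, nil
			}
			// Assumption: a failed write ends the stream with that error.
			if _, err := fmt.Fprintf(w, "data: %s\n\n", chunk); err != nil {
				return tokens, err
			}
			flusher.Flush()
			tokens += CountTokens(chunk)
		}
	}
}

// demo is a hypothetical harness: it feeds two chunks through StreamSSE using
// an httptest recorder and returns the body a client would have received.
func demo() (body string, tokens int, err error) {
	chunks := make(chan string, 2)
	chunks <- "hello world"
	chunks <- "again"
	close(chunks)

	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/stream", nil)
	tokens, err = StreamSSE(rec, req, chunks)
	return rec.Body.String(), tokens, err
}
```

When a chunk and a cancellation are both ready, select picks one at random, so a stricter variant might check ctx.Err() before each write.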
Key concepts
- SSE framing: each event is data: <payload>\n\n over a long-lived text/event-stream response. It is simpler than WebSocket, one-directional, and what LLM APIs use for token streaming.
- http.Flusher: without Flush(), Go buffers the response, so the client may see nothing until the buffer fills or the handler returns, which is fatal for streaming. Flushing after each chunk is what makes it "live".
- Client-disconnect detection: r.Context() is cancelled when the client goes away. A streaming proxy must stop pulling from (and paying for) the upstream the moment that happens.
- Token metering: counting as you stream (not at the end) means you still bill correctly when
a stream is cancelled halfway through.
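
The flusher and disconnect concepts above can be exercised without a real network connection. The sketch below shows a few test fixtures that might help. plainWriter, newPlainWriter, canFlush, recorderFlushes, cancelledRequest, and isCancelled are hypothetical helper names, not part of the challenge. An already-cancelled context stands in for a disconnected client.

```go
package main

import (
	"context"
	"errors"
	"net/http"
	"net/http/httptest"
)

// plainWriter is a hypothetical ResponseWriter that deliberately has no Flush
// method, so the http.Flusher type assertion fails on it.
type plainWriter struct{ header http.Header }

func (p *plainWriter) Header() http.Header         { return p.header }
func (p *plainWriter) Write(b []byte) (int, error) { return len(b), nil }
func (p *plainWriter) WriteHeader(statusCode int)  {}

func newPlainWriter() http.ResponseWriter {
	return &plainWriter{header: make(http.Header)}
}

// canFlush reports whether w supports http.Flusher; this is the check that
// would decide whether to return ErrNoFlush.
func canFlush(w http.ResponseWriter) bool {
	_, ok := w.(http.Flusher)
	return ok
}

// recorderFlushes checks that httptest.ResponseRecorder supports flushing and
// records the flush in its Flushed field.
func recorderFlushes() bool {
	rec := httptest.NewRecorder()
	if !canFlush(rec) {
		return false
	}
	rec.Flush()
	return rec.Flushed
}

// cancelledRequest returns a request whose context is already cancelled,
// simulating a client that has disconnected.
func cancelledRequest() *http.Request {
	ctx, cancel := context.WithCancel(context.Background())
	cancel()
	return httptest.NewRequest(http.MethodGet, "/stream", nil).WithContext(ctx)
}

// isCancelled reports whether the request context ended with context.Canceled.
func isCancelled(r *http.Request) bool {
	return errors.Is(r.Context().Err(), context.Canceled)
}
```

To test mid-stream cancellation instead, call cancel after the first chunk has been consumed and check that the returned count covers only the chunks written before that point.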
Run
go test -v -race ./challenges/networking/streaming-proxy/