Streaming
BVE Gateway supports streaming responses via Server-Sent Events (SSE) for /v1/chat/completions.
How streaming works
When `"stream": true` is set in the request body, the Fuelix response is piped directly to the client without buffering. BVE Gateway:
- Does not call `response.json()` or `response.text()` on streaming responses
- Does not buffer the response body
- Preserves the upstream `Content-Type` (typically `text/event-stream`)
- Preserves the upstream status code
- Preserves SSE format exactly as Fuelix sends it
The gateway adds only the `X-BVE-Latency` header; every other response header is forwarded from upstream only if it appears on the safe header allowlist.
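The pass-through behaviour described in this section can be sketched with the standard Fetch API `Response` type. This is a minimal illustration, not the gateway's actual code: the function name `pipeUpstream`, the `startedAt` parameter, and the assumption that `X-BVE-Latency` carries elapsed milliseconds are all hypothetical.

```typescript
// Hypothetical sketch: forward an upstream streaming response without buffering.
// `pipeUpstream` and `startedAt` are illustrative names, not part of BVE Gateway.
function pipeUpstream(upstream: Response, startedAt: number): Response {
  const headers = new Headers();

  // Preserve the upstream Content-Type (typically text/event-stream).
  const contentType = upstream.headers.get('content-type');
  if (contentType !== null) {
    headers.set('content-type', contentType);
  }

  // Assumption: the latency header reports elapsed milliseconds.
  headers.set('X-BVE-Latency', String(Date.now() - startedAt));

  // Hand the body stream over untouched: no json(), no text(), no buffering.
  // The upstream status code is copied as-is.
  return new Response(upstream.body, {
    status: upstream.status,
    headers,
  });
}
```

Because the body is passed along as a stream and never read, each SSE chunk would reach the client as soon as the upstream emits it. For brevity this sketch copies only `Content-Type`; the other allowlisted headers are left out.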
SSE format
Each streamed chunk is a data-only SSE event:

```
data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1716288000,"model":"gpt-4o","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc","object":"chat.completion.chunk","created":1716288000,"model":"gpt-4o","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
```

cURL example
```bash
curl https://api.bve.me/v1/chat/completions \
  -H "Authorization: Bearer sk-bve-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "messages": [{ "role": "user", "content": "Count from 1 to 5." }],
    "stream": true
  }'
```

OpenAI SDK streaming
```ts
import OpenAI from 'openai';

// Point the SDK at BVE Gateway instead of the default OpenAI endpoint.
const client = new OpenAI({
  apiKey: 'sk-bve-YOUR_KEY',
  baseURL: 'https://api.bve.me/v1',
});

const stream = await client.chat.completions.stream({
  model: 'gpt-4o',
  messages: [{ role: 'user', content: 'Count from 1 to 5.' }],
});

// Print each content delta as soon as it arrives.
for await (const chunk of stream) {
  const text = chunk.choices[0]?.delta?.content ?? '';
  process.stdout.write(text);
}
```

Safe headers on streaming responses
The same header allowlist applies to streaming responses:
| Header | Forwarded |
|---|---|
| `content-type` | Yes |
| `cache-control` | Yes |
| `x-request-id` | Yes |
| `x-quota-allowed` | Yes |
| `x-quota-available` | Yes |
| `x-quota-reset` | Yes |
| `X-BVE-Latency` | Added by gateway |
| All others | Stripped |
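
The allowlist in the table can be expressed as a small filter over the upstream headers. The sketch below is illustrative only: `SAFE_HEADERS`, `filterHeaders`, and the `latencyMs` parameter are hypothetical names, and treating `X-BVE-Latency` as a millisecond value is an assumption rather than documented behaviour.

```typescript
// Allowlist copied from the table; names are kept in lowercase.
const SAFE_HEADERS: readonly string[] = [
  'content-type',
  'cache-control',
  'x-request-id',
  'x-quota-allowed',
  'x-quota-available',
  'x-quota-reset',
];

// Hypothetical helper: build the client-facing headers from the upstream ones.
function filterHeaders(upstream: Headers, latencyMs: number): Headers {
  const out = new Headers();
  for (const name of SAFE_HEADERS) {
    // Headers.get() matches names case-insensitively.
    const value = upstream.get(name);
    if (value !== null) {
      out.set(name, value);
    }
  }
  // Assumption: the gateway-added header carries latency in milliseconds.
  out.set('X-BVE-Latency', String(latencyMs));
  return out;
}
```

Any header not named in `SAFE_HEADERS` is never copied, which corresponds to the "All others: Stripped" row.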