
Responses API

POST https://api.bve.me/v1/responses

Requires Authorization: Bearer sk-bve-YOUR_KEY.

The Responses API is OpenAI’s newer, stateful generation interface. BVE Gateway proxies this endpoint directly to Fuelix.

{
  "model": "gpt-4o",
  "input": "What is 2 + 2?",
  "max_output_tokens": 100
}
| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | GPT model ID (e.g. gpt-4o, gpt-4.1) |
| input | string or array | Yes | Text prompt or message array |
| max_output_tokens | integer | No | Max tokens (minimum 16) |
| temperature | number | No | Sampling temperature |
| top_p | number | No | Nucleus sampling |
| stream | boolean | No | Enable SSE streaming |
| tools | array | No | Tool definitions |
| tool_choice | string or object | No | Tool selection strategy |
| instructions | string | No | System instructions |
| previous_response_id | string | No | For multi-turn conversations |
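As an illustration of the field rules above, here is a minimal sketch of a client-side check that runs before a request is sent. `buildResponsesRequest` and `OPTIONAL_FIELDS` are hypothetical names, not part of BVE Gateway or the OpenAI SDK, and the sketch assumes the field list above is complete, so anything else is dropped.

```javascript
// Hypothetical helper: builds a request body from the documented fields
// and rejects obviously invalid input before it reaches the gateway.
const OPTIONAL_FIELDS = [
  'max_output_tokens', 'temperature', 'top_p', 'stream',
  'tools', 'tool_choice', 'instructions', 'previous_response_id',
];

function buildResponsesRequest(params) {
  // model and input are the two required fields.
  if (typeof params.model !== 'string' || params.model === '') {
    throw new TypeError('model is required and must be a non-empty string');
  }
  if (typeof params.input !== 'string' && !Array.isArray(params.input)) {
    throw new TypeError('input is required and must be a string or an array');
  }
  // max_output_tokens is optional, but must be an integer of at least 16.
  const max = params.max_output_tokens;
  if (max !== undefined && (!Number.isInteger(max) || max < 16)) {
    throw new RangeError('max_output_tokens must be an integer >= 16');
  }
  // Copy only the documented fields; unknown keys are dropped in this sketch.
  const body = { model: params.model, input: params.input };
  for (const key of OPTIONAL_FIELDS) {
    if (params[key] !== undefined) body[key] = params[key];
  }
  return body;
}
```

Checking locally is only a convenience. The gateway and the upstream provider remain the authority on what is accepted.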
{
  "id": "resp_abc123",
  "object": "response",
  "created_at": 1716288000,
  "model": "gpt-4o-2024-11-20",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [
        { "type": "output_text", "text": "4" }
      ]
    }
  ],
  "usage": {
    "input_tokens": 7,
    "output_tokens": 1,
    "total_tokens": 8
  }
}
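When calling the endpoint over raw HTTP, the generated text sits inside the nested `output` array. The following is a minimal sketch of pulling it out, based only on the response shape shown above. `extractOutputText` is a hypothetical helper name, and the sketch ignores any output item that is not a `message`.

```javascript
// Hypothetical helper: concatenates every output_text part found in
// the message items of a Responses API response object.
function extractOutputText(response) {
  const parts = [];
  for (const item of response.output ?? []) {
    if (item.type !== 'message') continue; // skip non-message items
    for (const part of item.content ?? []) {
      if (part.type === 'output_text') parts.push(part.text);
    }
  }
  return parts.join('');
}

// The sample response from above, trimmed to the fields the helper reads.
const sampleResponse = {
  id: 'resp_abc123',
  output: [
    {
      type: 'message',
      role: 'assistant',
      content: [{ type: 'output_text', text: '4' }],
    },
  ],
};

console.log(extractOutputText(sampleResponse)); // prints "4"
```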
curl https://api.bve.me/v1/responses \
  -H "Authorization: Bearer sk-bve-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "input": "What is 2 + 2?",
    "max_output_tokens": 100
  }'
# First turn
curl https://api.bve.me/v1/responses \
  -H "Authorization: Bearer sk-bve-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "input": "My name is Alice.",
    "max_output_tokens": 100
  }'

# Second turn (reference the previous response)
curl https://api.bve.me/v1/responses \
  -H "Authorization: Bearer sk-bve-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-4o",
    "input": "What is my name?",
    "previous_response_id": "resp_abc123",
    "max_output_tokens": 100
  }'
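In practice the `previous_response_id` for the second turn comes from the `id` field of the first response rather than being hard-coded. The two curl calls above could be chained roughly as follows. `createResponse` and `twoTurns` are hypothetical helper names, and the `fetchImpl` parameter is an assumption added so the sketch can be exercised without network access. It defaults to the global `fetch` (available in Node 18+ and browsers).

```javascript
// Hypothetical helper: POSTs one turn to the gateway and returns the parsed JSON.
async function createResponse(body, apiKey, fetchImpl = fetch) {
  const res = await fetchImpl('https://api.bve.me/v1/responses', {
    method: 'POST',
    headers: {
      Authorization: `Bearer ${apiKey}`,
      'Content-Type': 'application/json',
    },
    body: JSON.stringify(body),
  });
  if (!res.ok) throw new Error(`Request failed with HTTP ${res.status}`);
  return res.json();
}

// Runs the two-turn conversation from the curl example and returns the second response.
async function twoTurns(apiKey, fetchImpl = fetch) {
  const first = await createResponse(
    { model: 'gpt-4o', input: 'My name is Alice.', max_output_tokens: 100 },
    apiKey,
    fetchImpl,
  );
  // Link the second turn to the first through the returned response ID.
  return createResponse(
    {
      model: 'gpt-4o',
      input: 'What is my name?',
      previous_response_id: first.id,
      max_output_tokens: 100,
    },
    apiKey,
    fetchImpl,
  );
}
```

Calling `twoTurns('sk-bve-YOUR_KEY')` issues both requests in order.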
import OpenAI from 'openai';

const client = new OpenAI({
  apiKey: 'sk-bve-YOUR_KEY',
  baseURL: 'https://api.bve.me/v1',
});

const response = await client.responses.create({
  model: 'gpt-4o',
  input: 'What is 2 + 2?',
});

console.log(response.output_text);
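With `stream: true`, the SDK returns an async iterable of events instead of a single response object. The sketch below accumulates the streamed text. It assumes the gateway passes through OpenAI's event format unchanged, in which text arrives as `response.output_text.delta` events carrying a `delta` string. `collectStreamText` is a hypothetical helper name.

```javascript
// Hypothetical helper: concatenates the text deltas from a stream of
// Responses API events and ignores every other event type.
async function collectStreamText(events) {
  let text = '';
  for await (const event of events) {
    if (event.type === 'response.output_text.delta') {
      text += event.delta;
    }
  }
  return text;
}

// Usage with the SDK client from the example above (requires network access):
// const stream = await client.responses.create({
//   model: 'gpt-4o',
//   input: 'What is 2 + 2?',
//   stream: true,
// });
// console.log(await collectStreamText(stream));
```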
  • Only GPT models are supported upstream (e.g. gpt-4o, gpt-4.1, gpt-5, o3, o4-mini).
  • For Claude or Gemini models, use /v1/chat/completions or /v1/messages instead.
  • max_output_tokens must be at least 16.
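A client that handles several model families may want to pick the endpoint from the model ID. The sketch below is one rough way to do that. The prefix pattern is an assumption inferred from the example IDs listed above (`gpt-…` and `o<digit>…`), not a documented gateway rule. `endpointForModel` is a hypothetical helper, and the fallback to /v1/chat/completions is only one of the two alternatives mentioned above.

```javascript
// Assumption: models served by /v1/responses have IDs starting with
// "gpt-" or "o" followed by a digit (e.g. gpt-4o, gpt-5, o3, o4-mini).
const RESPONSES_MODEL_PATTERN = /^(gpt-|o\d)/;

// Hypothetical helper: returns the gateway path to use for a given model ID.
function endpointForModel(model) {
  return RESPONSES_MODEL_PATTERN.test(model)
    ? '/v1/responses'
    : '/v1/chat/completions';
}

console.log(endpointForModel('gpt-4o'));  // prints "/v1/responses"
console.log(endpointForModel('o4-mini')); // prints "/v1/responses"
```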