Audio
BVE Gateway supports two audio endpoints: speech synthesis (TTS) and audio transcription (Whisper).
Text-to-Speech
POST https://api.bve.me/v1/audio/speech

Requires Authorization: Bearer sk-bve-YOUR_KEY.

Converts text to audio. Returns a binary audio stream (audio/mpeg).
Request body

    {
      "model": "tts-1",
      "input": "Hello, world!",
      "voice": "alloy"
    }

| Field | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | tts-1 or tts-1-hd |
| input | string | Yes | Text to synthesize (max 4096 characters) |
| voice | string | Yes | Voice ID: alloy, echo, fable, onyx, nova, shimmer |
| response_format | string | No | Audio format: mp3 (default), opus, aac, flac |
| speed | number | No | Speed multiplier 0.25–4.0 (default 1.0) |
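
The constraints in the table above can be checked on the client before a request is sent. The following is a minimal sketch, not part of the gateway or the OpenAI SDK: `validateSpeechRequest` is a hypothetical helper, the non-empty check on `input` is an assumption, and JavaScript's `length` counts UTF-16 code units, which may not match how the gateway counts characters.

```javascript
// Allowed values copied from the request-body table above.
const TTS_MODELS = ['tts-1', 'tts-1-hd'];
const TTS_VOICES = ['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer'];
const TTS_FORMATS = ['mp3', 'opus', 'aac', 'flac'];
const MAX_INPUT_CHARS = 4096;

// Hypothetical helper: returns a list of problems found in the body.
// An empty list means the body satisfies the documented constraints.
function validateSpeechRequest(body) {
  const errors = [];
  if (!TTS_MODELS.includes(body.model)) {
    errors.push(`model must be one of: ${TTS_MODELS.join(', ')}`);
  }
  if (typeof body.input !== 'string' || body.input.length === 0) {
    // Assumption: an empty input is treated as invalid.
    errors.push('input must be a non-empty string');
  } else if (body.input.length > MAX_INPUT_CHARS) {
    errors.push(`input exceeds ${MAX_INPUT_CHARS} characters`);
  }
  if (!TTS_VOICES.includes(body.voice)) {
    errors.push(`voice must be one of: ${TTS_VOICES.join(', ')}`);
  }
  // Optional fields are only checked when present.
  if (body.response_format !== undefined && !TTS_FORMATS.includes(body.response_format)) {
    errors.push(`response_format must be one of: ${TTS_FORMATS.join(', ')}`);
  }
  if (body.speed !== undefined &&
      (!Number.isFinite(body.speed) || body.speed < 0.25 || body.speed > 4.0)) {
    errors.push('speed must be a number between 0.25 and 4.0');
  }
  return errors;
}
```

Calling `validateSpeechRequest({ model: 'tts-1', input: 'Hello', voice: 'alloy' })` returns an empty array, so the body can be passed on unchanged.
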
cURL example

    curl https://api.bve.me/v1/audio/speech \
      -H "Authorization: Bearer sk-bve-YOUR_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "tts-1",
        "input": "Hello from BVE Gateway!",
        "voice": "alloy"
      }' \
      --output speech.mp3

OpenAI SDK

    import OpenAI from 'openai';
    import fs from 'fs';

    const client = new OpenAI({
      apiKey: 'sk-bve-YOUR_KEY',
      baseURL: 'https://api.bve.me/v1',
    });

    const mp3 = await client.audio.speech.create({
      model: 'tts-1',
      voice: 'alloy',
      input: 'Hello from BVE Gateway!',
    });

    const buffer = Buffer.from(await mp3.arrayBuffer());
    fs.writeFileSync('speech.mp3', buffer);

- Response is a binary audio stream, not JSON.
- tts-1-hd produces higher-quality audio at higher latency and cost.
- The endpoint proxies directly to Fuelix without buffering.
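
Because `input` is limited to 4096 characters, longer text has to be split across several requests. The sketch below shows one possible way to do that; `splitForSpeech` is a hypothetical helper, and breaking at sentence ends or spaces is an illustrative choice, not something the gateway requires.

```javascript
// Hypothetical helper: split text into chunks of at most maxChars characters,
// preferring to break after sentence-ending punctuation, then at a space.
function splitForSpeech(text, maxChars = 4096) {
  const chunks = [];
  let rest = text.trim();
  while (rest.length > maxChars) {
    const head = rest.slice(0, maxChars);
    // Look for the last sentence end inside the current window.
    let cut = Math.max(
      head.lastIndexOf('. '),
      head.lastIndexOf('! '),
      head.lastIndexOf('? ')
    );
    if (cut !== -1) {
      cut += 1; // keep the punctuation mark with the chunk
    } else {
      cut = head.lastIndexOf(' '); // fall back to the last space
    }
    if (cut <= 0) {
      cut = maxChars; // no natural break: hard split at the limit
    }
    chunks.push(rest.slice(0, cut).trim());
    rest = rest.slice(cut).trim();
  }
  if (rest.length > 0) {
    chunks.push(rest);
  }
  return chunks;
}
```

Each chunk can then be sent as the `input` of its own `client.audio.speech.create` call. Joining the resulting audio segments into one file is a separate problem that this sketch does not address.
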
Audio Transcriptions
POST https://api.bve.me/v1/audio/transcriptions

Requires Authorization: Bearer sk-bve-YOUR_KEY.

Transcribes audio to text using Whisper. Accepts multipart/form-data.
Request (multipart/form-data)

| Field | Type | Required | Description |
|---|---|---|---|
| file | binary | Yes | Audio file (mp3, mp4, mpeg, mpga, m4a, wav, webm) |
| model | string | Yes | whisper-1, gpt-4o-transcribe, gpt-4o-transcribe-diarize |
| language | string | No | ISO-639-1 language code (e.g. en) |
| prompt | string | No | Optional context/style prompt |
| response_format | string | No | json (default), text, srt, verbose_json, vtt |
| temperature | number | No | Sampling temperature 0–1 |
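
The field constraints above can also be checked before uploading. This is a minimal sketch under stated assumptions: `validateTranscriptionRequest` is a hypothetical helper, the file type is inferred from the file name extension only, and the `language` check tests the two-letter shape of an ISO-639-1 code rather than whether the code actually exists.

```javascript
// Allowed values copied from the request table above.
const AUDIO_EXTENSIONS = ['mp3', 'mp4', 'mpeg', 'mpga', 'm4a', 'wav', 'webm'];
const TRANSCRIPTION_MODELS = ['whisper-1', 'gpt-4o-transcribe', 'gpt-4o-transcribe-diarize'];
const TRANSCRIPTION_FORMATS = ['json', 'text', 'srt', 'verbose_json', 'vtt'];

// Hypothetical helper: returns a list of problems; empty means the fields look valid.
function validateTranscriptionRequest({ fileName, model, language, response_format, temperature }) {
  const errors = [];
  // Assumption: the extension of the file name reflects the audio format.
  const extension = String(fileName ?? '').split('.').pop().toLowerCase();
  if (!AUDIO_EXTENSIONS.includes(extension)) {
    errors.push(`file must be one of: ${AUDIO_EXTENSIONS.join(', ')}`);
  }
  if (!TRANSCRIPTION_MODELS.includes(model)) {
    errors.push(`model must be one of: ${TRANSCRIPTION_MODELS.join(', ')}`);
  }
  // Shape check only: two lowercase letters, as in "en".
  if (language !== undefined && !/^[a-z]{2}$/.test(language)) {
    errors.push('language must be a two-letter ISO-639-1 code');
  }
  if (response_format !== undefined && !TRANSCRIPTION_FORMATS.includes(response_format)) {
    errors.push(`response_format must be one of: ${TRANSCRIPTION_FORMATS.join(', ')}`);
  }
  if (temperature !== undefined &&
      (!Number.isFinite(temperature) || temperature < 0 || temperature > 1)) {
    errors.push('temperature must be a number between 0 and 1');
  }
  return errors;
}
```

For example, `validateTranscriptionRequest({ fileName: 'audio.mp3', model: 'whisper-1' })` returns an empty array.
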
cURL example

    curl https://api.bve.me/v1/audio/transcriptions \
      -H "Authorization: Bearer sk-bve-YOUR_KEY" \
      -F file="@audio.mp3" \
      -F model="whisper-1"

Response:

    {
      "text": "Hello, this is the transcribed text."
    }

OpenAI SDK

    import OpenAI from 'openai';
    import fs from 'fs';

    const client = new OpenAI({
      apiKey: 'sk-bve-YOUR_KEY',
      baseURL: 'https://api.bve.me/v1',
    });

    const transcription = await client.audio.transcriptions.create({
      file: fs.createReadStream('audio.mp3'),
      model: 'whisper-1',
    });

    console.log(transcription.text);

Available transcription models

| Model | Description |
|---|---|
| whisper-1 | Standard Whisper transcription |
| gpt-4o-transcribe | GPT-4o-based transcription |
| gpt-4o-transcribe-diarize | GPT-4o transcription with speaker diarization |

- Audio translations (POST /v1/audio/translations) are not supported by Fuelix and return 404.
- Maximum file size is 25 MB (OpenAI limit; Fuelix may enforce a lower limit).
- The multipart request is forwarded directly, with no re-encoding.
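
Since the request is forwarded without re-encoding, an oversized file is only rejected after it has been uploaded. A size check before the upload avoids that round trip. The sketch below is illustrative: `checkUploadSize` is a hypothetical helper, and reading 25 MB as 25 × 1024 × 1024 bytes is an assumption, so pass a smaller `limitBytes` if Fuelix enforces a lower limit.

```javascript
// 25 MB per the note above. Assumption: 1 MB is taken as 1024 * 1024 bytes.
const MAX_UPLOAD_BYTES = 25 * 1024 * 1024;

// Hypothetical helper: reports whether a file of the given size fits the limit
// and, if it does not, by how many bytes it exceeds it.
function checkUploadSize(sizeInBytes, limitBytes = MAX_UPLOAD_BYTES) {
  if (!Number.isInteger(sizeInBytes) || sizeInBytes < 0) {
    throw new RangeError('sizeInBytes must be a non-negative integer');
  }
  return {
    ok: sizeInBytes <= limitBytes,
    overBy: Math.max(0, sizeInBytes - limitBytes),
  };
}
```

In Node.js the size can be read with `fs.statSync('audio.mp3').size` and passed to `checkUploadSize` before the transcription request is made.
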