Audio

BVE Gateway supports two audio endpoints: speech synthesis (TTS) and audio transcription (Whisper).

POST https://api.bve.me/v1/audio/speech

Requires Authorization: Bearer sk-bve-YOUR_KEY.

Converts text to audio. Returns a binary audio stream (audio/mpeg for the default mp3 format).

{
  "model": "tts-1",
  "input": "Hello, world!",
  "voice": "alloy"
}

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | Yes | tts-1 or tts-1-hd |
| input | string | Yes | Text to synthesize (max 4096 characters) |
| voice | string | Yes | Voice ID: alloy, echo, fable, onyx, nova, shimmer |
| response_format | string | No | Audio format: mp3 (default), opus, aac, flac |
| speed | number | No | Speed multiplier 0.25–4.0 (default 1.0) |

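
The limits in the table above can be checked locally before a request is sent. The following is a minimal sketch, not part of the gateway or the OpenAI SDK: `validateSpeechRequest` is a hypothetical helper, the non-empty check on `input` is an assumption, and the gateway's own validation may differ.

```javascript
// Values copied from the parameter table; this is only a local pre-check.
const SPEECH_MODELS = ['tts-1', 'tts-1-hd'];
const SPEECH_VOICES = ['alloy', 'echo', 'fable', 'onyx', 'nova', 'shimmer'];
const SPEECH_FORMATS = ['mp3', 'opus', 'aac', 'flac'];
// Note: String.length counts UTF-16 code units, which may differ from
// how the upstream service counts characters.
const MAX_INPUT_CHARS = 4096;

// Hypothetical helper: returns a list of problems, empty when the body looks valid.
function validateSpeechRequest(body) {
  const errors = [];
  if (!SPEECH_MODELS.includes(body.model)) {
    errors.push('model must be tts-1 or tts-1-hd');
  }
  if (typeof body.input !== 'string' || body.input.length === 0) {
    // Assumption: an empty input is treated as invalid.
    errors.push('input must be a non-empty string');
  } else if (body.input.length > MAX_INPUT_CHARS) {
    errors.push(`input exceeds ${MAX_INPUT_CHARS} characters`);
  }
  if (!SPEECH_VOICES.includes(body.voice)) {
    errors.push('voice is not a known voice ID');
  }
  if (body.response_format !== undefined && !SPEECH_FORMATS.includes(body.response_format)) {
    errors.push('response_format must be mp3, opus, aac, or flac');
  }
  if (body.speed !== undefined &&
      (typeof body.speed !== 'number' || body.speed < 0.25 || body.speed > 4.0)) {
    errors.push('speed must be a number between 0.25 and 4.0');
  }
  return errors;
}

console.log(validateSpeechRequest({ model: 'tts-1', input: 'Hello', voice: 'alloy' }));
console.log(validateSpeechRequest({ model: 'tts-1', input: 'Hello', voice: 'alloy', speed: 5 }));
```

Returning a list of messages rather than throwing on the first problem lets a caller report every issue at once.
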
curl https://api.bve.me/v1/audio/speech \
  -H "Authorization: Bearer sk-bve-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "tts-1",
    "input": "Hello from BVE Gateway!",
    "voice": "alloy"
  }' \
  --output speech.mp3

import OpenAI from 'openai';
import fs from 'fs';

// Point the OpenAI SDK at the gateway instead of the default OpenAI endpoint.
const client = new OpenAI({
  apiKey: 'sk-bve-YOUR_KEY',
  baseURL: 'https://api.bve.me/v1',
});

const mp3 = await client.audio.speech.create({
  model: 'tts-1',
  voice: 'alloy',
  input: 'Hello from BVE Gateway!',
});

// The response body is binary audio, so write the bytes to disk unchanged.
const buffer = Buffer.from(await mp3.arrayBuffer());
fs.writeFileSync('speech.mp3', buffer);

  • The response is a binary audio stream, not JSON.
  • tts-1-hd produces higher-quality audio at higher latency and cost.
  • The endpoint proxies directly to Fuelix without buffering.
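
Because the response is a binary stream, a client can write it to disk chunk by chunk instead of holding the whole file in memory. The sketch below assumes Node 18 or later (global `fetch`); `buildSpeechRequest` and `saveSpeech` are hypothetical helper names, and the shape of error responses is an assumption, so the error body is read as plain text.

```javascript
const BASE_URL = 'https://api.bve.me/v1';

// File extension for each response_format value listed in the parameter table.
const EXTENSIONS = { mp3: 'mp3', opus: 'opus', aac: 'aac', flac: 'flac' };

// Builds the fetch() arguments without sending anything.
function buildSpeechRequest(apiKey, body) {
  return {
    url: `${BASE_URL}/audio/speech`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
      },
      body: JSON.stringify(body),
    },
  };
}

// Hypothetical helper: streams the audio to <basename>.<ext> and returns the path.
async function saveSpeech(apiKey, body, basename) {
  const { createWriteStream } = await import('node:fs');
  const { Readable } = await import('node:stream');
  const { pipeline } = await import('node:stream/promises');

  const { url, init } = buildSpeechRequest(apiKey, body);
  const res = await fetch(url, init);
  if (!res.ok) {
    // Assumption: the error body is readable as text (it may be JSON).
    throw new Error(`Speech request failed: ${res.status} ${await res.text()}`);
  }
  const ext = EXTENSIONS[body.response_format ?? 'mp3'] ?? 'bin';
  const path = `${basename}.${ext}`;
  // Pipe the web stream into the file as chunks arrive.
  await pipeline(Readable.fromWeb(res.body), createWriteStream(path));
  return path;
}

// Usage (needs a valid key and network access):
// await saveSpeech('sk-bve-YOUR_KEY',
//   { model: 'tts-1-hd', input: 'Hello!', voice: 'nova', response_format: 'flac' },
//   'speech');
```

Keeping the request construction separate from the network call makes the headers and body easy to inspect without contacting the gateway.
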

POST https://api.bve.me/v1/audio/transcriptions

Requires Authorization: Bearer sk-bve-YOUR_KEY.

Transcribes audio to text using Whisper or one of the GPT-4o transcription models. Accepts multipart/form-data.

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| file | binary | Yes | Audio file (mp3, mp4, mpeg, mpga, m4a, wav, webm) |
| model | string | Yes | whisper-1, gpt-4o-transcribe, gpt-4o-transcribe-diarize |
| language | string | No | ISO-639-1 language code (e.g. en) |
| prompt | string | No | Optional context/style prompt |
| response_format | string | No | json (default), text, srt, verbose_json, vtt |
| temperature | number | No | Sampling temperature 0–1 |

curl https://api.bve.me/v1/audio/transcriptions \
  -H "Authorization: Bearer sk-bve-YOUR_KEY" \
  -F file="@audio.mp3" \
  -F model="whisper-1"

Response:

{
  "text": "Hello, this is the transcribed text."
}

import OpenAI from 'openai';
import fs from 'fs';

// Point the OpenAI SDK at the gateway instead of the default OpenAI endpoint.
const client = new OpenAI({
  apiKey: 'sk-bve-YOUR_KEY',
  baseURL: 'https://api.bve.me/v1',
});

// The SDK builds the multipart/form-data request from the file stream.
const transcription = await client.audio.transcriptions.create({
  file: fs.createReadStream('audio.mp3'),
  model: 'whisper-1',
});

console.log(transcription.text);
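
The same request can be made without the SDK, which makes the multipart fields and the optional parameters explicit. This is a sketch under stated assumptions: it needs Node 18 or later (global `fetch`, `FormData`, and `Blob`), `buildTranscriptionForm` and `transcribe` are hypothetical helper names, and the split between JSON and plain-text responses per `response_format` is assumed rather than documented here.

```javascript
// Hypothetical helper: assembles the multipart body from bytes already in memory.
function buildTranscriptionForm(bytes, filename, options = {}) {
  const form = new FormData();
  form.append('file', new Blob([bytes]), filename);
  form.append('model', options.model ?? 'whisper-1');
  // Optional fields are sent only when the caller sets them.
  for (const key of ['language', 'prompt', 'response_format', 'temperature']) {
    if (options[key] !== undefined) form.append(key, String(options[key]));
  }
  return form;
}

// Hypothetical helper: posts the form and returns parsed JSON or raw text.
async function transcribe(apiKey, bytes, filename, options = {}) {
  const res = await fetch('https://api.bve.me/v1/audio/transcriptions', {
    method: 'POST',
    // No Content-Type header: fetch adds it with the multipart boundary.
    headers: { Authorization: `Bearer ${apiKey}` },
    body: buildTranscriptionForm(bytes, filename, options),
  });
  if (!res.ok) throw new Error(`Transcription failed: ${res.status}`);
  // Assumption: json and verbose_json return JSON; text, srt, and vtt return plain text.
  const format = options.response_format ?? 'json';
  return format === 'json' || format === 'verbose_json' ? res.json() : res.text();
}

// Usage (needs a valid key and network access):
// const fs = await import('node:fs/promises');
// const srt = await transcribe('sk-bve-YOUR_KEY', await fs.readFile('audio.mp3'),
//   'audio.mp3', { language: 'en', response_format: 'srt' });
```
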

| Model | Description |
| --- | --- |
| whisper-1 | Standard Whisper transcription |
| gpt-4o-transcribe | GPT-4o-based transcription |
| gpt-4o-transcribe-diarize | GPT-4o transcription with speaker diarization |

  • Audio translations (POST /v1/audio/translations) are not supported by Fuelix and return 404.
  • The maximum file size is 25 MB (the OpenAI limit; Fuelix may enforce a lower one).
  • The multipart request is forwarded directly, with no re-encoding.
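
Since the request is forwarded as-is, an oversized or unsupported file is only rejected after the upload. A client-side check can catch both cases earlier. The sketch below uses a hypothetical `checkAudioFile` helper; it reads "25 MB" as 25 × 1024 × 1024 bytes, which is an assumption, and it inspects only the file extension, not the actual content.

```javascript
// Extensions copied from the file field description in the parameter table.
const ALLOWED_EXTENSIONS = ['mp3', 'mp4', 'mpeg', 'mpga', 'm4a', 'wav', 'webm'];
// Assumption: 25 MB means 25 * 1024 * 1024 bytes; the upstream limit may be lower.
const MAX_BYTES = 25 * 1024 * 1024;

// Hypothetical helper: returns { ok: true } or { ok: false, reason }.
function checkAudioFile(filename, sizeBytes) {
  const dot = filename.lastIndexOf('.');
  const ext = dot === -1 ? '' : filename.slice(dot + 1).toLowerCase();
  if (!ALLOWED_EXTENSIONS.includes(ext)) {
    return { ok: false, reason: `unsupported extension "${ext}"` };
  }
  if (sizeBytes > MAX_BYTES) {
    return { ok: false, reason: `file is ${sizeBytes} bytes, limit is ${MAX_BYTES}` };
  }
  return { ok: true };
}

console.log(checkAudioFile('meeting.wav', 10_000_000));
console.log(checkAudioFile('meeting.ogg', 10_000_000));
```
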