Changelog

Confirmed surface (live probe against api.fuelix.ai/v1):

  • 103 models available including GPT-5.4, Claude Sonnet 4.6, Gemini 3.x, Llama 4, imagen-4
  • All previously documented endpoints remain supported
  • POST /responses confirmed working (requires max_output_tokens >= 16)
  • Anthropic Messages API (POST /messages) confirmed returning Anthropic-format responses
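A minimal sketch of a /responses request body that respects the max_output_tokens floor noted above. The helper name buildResponsesRequest and the model id "gpt-5.4" are illustrative assumptions, and the field names (model, input, max_output_tokens) assume the gateway mirrors the OpenAI Responses API shape, which this changelog does not spell out.

```typescript
// Minimum accepted for POST /responses, per the probe note above.
const MIN_OUTPUT_TOKENS = 16;

interface ResponsesRequest {
  model: string;
  input: string;
  max_output_tokens: number;
}

// Build a request body, raising max_output_tokens to the floor when the
// caller asks for less. Non-positive or fractional values are rejected.
function buildResponsesRequest(
  model: string,
  input: string,
  maxOutputTokens: number = MIN_OUTPUT_TOKENS,
): ResponsesRequest {
  if (!Number.isInteger(maxOutputTokens) || maxOutputTokens < 1) {
    throw new RangeError("max_output_tokens must be a positive integer");
  }
  return {
    model,
    input,
    max_output_tokens: Math.max(maxOutputTokens, MIN_OUTPUT_TOKENS),
  };
}

// "gpt-5.4" is a placeholder id; check GET /models for the real one.
const exampleResponsesBody = buildResponsesRequest("gpt-5.4", "Say hello.", 8);
// exampleResponsesBody.max_output_tokens is 16, not 8
```

Clamping on the client avoids a round trip that would otherwise be rejected; a stricter caller could throw instead of silently raising the value.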

Docs added:

  • Audio (TTS + transcriptions)
  • Images (generations + edits)
  • Files API
  • Assistants API (assistants + threads + messages + runs)
  • Vector Stores
  • Anthropic Messages API
  • Responses API
  • Deployment Notes

Discovery: Live probing of https://api.fuelix.ai/v1 via scripts/fuelix-mega-discovery.ts.

Supported endpoints confirmed:

| Endpoint | Notes |
| --- | --- |
| GET /models | Full model list |
| GET /models/:id | Individual model details |
| POST /chat/completions | All models; streaming SSE supported |
| POST /embeddings | float and base64 encoding, dimensions param |
| POST /audio/speech | TTS; binary audio stream |
| POST /audio/transcriptions | Whisper; multipart/form-data |
| POST /images/generations | imagen-3, imagen-3-fast (not dall-e-3) |
| POST /images/edits | multipart/form-data |
| GET/POST/DELETE /files* | Shared upstream account scope |
| GET/POST/DELETE /assistants* | Assistants v2; shared scope |
| GET/POST/DELETE /threads* | Full thread/message/run tree; shared scope |
| GET/POST/DELETE /vector_stores* | Vector knowledge bases; shared scope |
| POST /responses | GPT models only; max_output_tokens >= 16 |
| POST /messages | Anthropic Messages API; Anthropic-format response |
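For the streaming support noted on POST /chat/completions, here is a rough sketch of pulling text deltas out of an SSE chunk. It assumes the gateway emits the OpenAI-style stream format (`data: {json}` lines ending with `data: [DONE]`), which the probe notes do not confirm in detail, and it assumes each chunk contains whole lines. The function name parseSseDeltas is hypothetical.

```typescript
// Extract content deltas from a block of SSE text.
// Assumption: OpenAI-style chat completion chunks, one JSON object per
// "data:" line, terminated by a literal "data: [DONE]" line.
function parseSseDeltas(chunk: string): { deltas: string[]; done: boolean } {
  const deltas: string[] = [];
  let done = false;
  for (const rawLine of chunk.split("\n")) {
    const line = rawLine.trim();
    if (!line.startsWith("data:")) continue; // skip blank lines and comments
    const payload = line.slice("data:".length).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    const parsed = JSON.parse(payload);
    const text = parsed.choices?.[0]?.delta?.content;
    if (typeof text === "string") deltas.push(text);
  }
  return { deltas, done };
}

const sampleStream =
  'data: {"choices":[{"delta":{"content":"Hel"}}]}\n\n' +
  'data: {"choices":[{"delta":{"content":"lo"}}]}\n\n' +
  "data: [DONE]\n\n";

const sampleParsed = parseSseDeltas(sampleStream);
// sampleParsed.deltas.join("") is "Hello" and sampleParsed.done is true
```

A production reader would also buffer partial lines across network chunks; that is left out here to keep the sketch short.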

Emulated:

| Endpoint | Notes |
| --- | --- |
| POST /completions | Emulated via /chat/completions; streaming rejected with 400 |
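One way the /completions emulation could work, shown as a sketch rather than the gateway's actual code: the legacy prompt is wrapped in a single user message, and a streaming request is refused with a 400. The type names, the toChatRequest helper, the error message, and the choice of forwarded fields are all assumptions.

```typescript
interface LegacyCompletionRequest {
  model: string;
  prompt: string;
  max_tokens?: number;
  temperature?: number;
  stream?: boolean;
}

interface ChatCompletionRequest {
  model: string;
  messages: { role: "user"; content: string }[];
  max_tokens?: number;
  temperature?: number;
}

type Translation =
  | { ok: true; chat: ChatCompletionRequest }
  | { ok: false; httpStatus: 400; message: string };

// Translate a legacy completions body into a chat completions body.
// Streaming is rejected up front, matching the "rejected with 400" note.
function toChatRequest(req: LegacyCompletionRequest): Translation {
  if (req.stream === true) {
    return {
      ok: false,
      httpStatus: 400,
      message: "streaming is not supported on /completions", // hypothetical wording
    };
  }
  const chat: ChatCompletionRequest = {
    model: req.model,
    messages: [{ role: "user", content: req.prompt }],
  };
  // Forward only the optional fields the caller actually set.
  if (req.max_tokens !== undefined) chat.max_tokens = req.max_tokens;
  if (req.temperature !== undefined) chat.temperature = req.temperature;
  return { ok: true, chat };
}
```

The response would need the reverse mapping (chat message content back into a legacy `text` field); that half is omitted because the changelog does not describe it.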

Unsupported (returns 404):

  • POST /audio/translations
  • POST /images/variations
  • POST /moderations
  • GET /batches
  • GET /fine_tuning/jobs
  • POST /rerank
  • GET /usage
  • GET /realtime
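Clients can use the list above to fail fast instead of waiting for a 404. The sketch below is a client-side guard only; the names UNSUPPORTED_ROUTES and isUnsupported are illustrative, and it matches just the exact method and path that were probed, since the behavior of other methods on these paths is not stated here.

```typescript
// Method + path pairs reported as returning 404 in the probe above.
const UNSUPPORTED_ROUTES: ReadonlySet<string> = new Set([
  "POST /audio/translations",
  "POST /images/variations",
  "POST /moderations",
  "GET /batches",
  "GET /fine_tuning/jobs",
  "POST /rerank",
  "GET /usage",
  "GET /realtime",
]);

// True when the exact method/path pair is on the unsupported list.
function isUnsupported(method: string, path: string): boolean {
  return UNSUPPORTED_ROUTES.has(`${method.toUpperCase()} ${path}`);
}

const moderationsBlocked = isUnsupported("post", "/moderations"); // true
const chatBlocked = isUnsupported("POST", "/chat/completions"); // false
```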

Chat completion accepted params:

temperature, top_p, stop, presence_penalty, frequency_penalty, user, seed, n, response_format (text/json_object/json_schema), logprobs, top_logprobs, logit_bias, tools, tool_choice, stream, stream_options, reasoning_effort, thinking, service_tier, store, modalities, max_tokens, max_completion_tokens
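The accepted list lends itself to a simple client-side filter. The sketch below splits a bag of optional parameters into accepted and dropped keys; pickAcceptedParams is a hypothetical helper, and whether the gateway itself rejects or silently ignores unlisted parameters is not stated in this changelog. Required fields such as model and messages are assumed to be handled separately.

```typescript
// Optional chat completion parameters listed as accepted above.
const ACCEPTED_CHAT_PARAMS = [
  "temperature", "top_p", "stop", "presence_penalty", "frequency_penalty",
  "user", "seed", "n", "response_format", "logprobs", "top_logprobs",
  "logit_bias", "tools", "tool_choice", "stream", "stream_options",
  "reasoning_effort", "thinking", "service_tier", "store", "modalities",
  "max_tokens", "max_completion_tokens",
] as const;

const ACCEPTED_CHAT_PARAM_SET = new Set<string>(ACCEPTED_CHAT_PARAMS);

// Keep listed parameters; report everything else so the caller can log it.
function pickAcceptedParams(options: Record<string, unknown>): {
  accepted: Record<string, unknown>;
  dropped: string[];
} {
  const accepted: Record<string, unknown> = {};
  const dropped: string[] = [];
  for (const key of Object.keys(options)) {
    if (ACCEPTED_CHAT_PARAM_SET.has(key)) {
      accepted[key] = options[key];
    } else {
      dropped.push(key);
    }
  }
  return { accepted, dropped };
}

const filteredOptions = pickAcceptedParams({ temperature: 0.2, top_k: 40, seed: 7 });
// filteredOptions.accepted keeps temperature and seed; filteredOptions.dropped is ["top_k"]
```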

Architecture implemented:

  • Cloudflare Workers (Hono routing)
  • Cloudflare D1 (SQLite) for API key storage and usage tracking
  • Cloudflare Durable Objects (ApiKeyLimiter) for per-key rate limiting
  • Cloudflare Queues (bve-gateway-events) for async audit events
  • SHA-256 + pepper key hashing (raw keys never stored)
  • Header allowlist filtering (internal Fuelix headers stripped)
  • Body size limit (10 MB)
  • JSONDecodeError normalization (Fuelix 500 → gateway 400)
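Three of the items above (header allowlist, body size limit, JSONDecodeError normalization) can be illustrated with small pure functions. This is a sketch under stated assumptions, not the gateway's code: the header names in the allowlist are invented for illustration, 10 MB is taken as 10 * 1024 * 1024 bytes, and detecting the upstream error by a substring match on the body is a guess at one possible approach.

```typescript
// Hypothetical allowlist; the changelog does not name the real headers.
const RESPONSE_HEADER_ALLOWLIST = new Set(["content-type", "cache-control", "x-request-id"]);

// Copy only allowlisted headers, lowercasing names along the way.
function filterHeaders(upstream: Record<string, string>): Record<string, string> {
  const out: Record<string, string> = {};
  for (const key of Object.keys(upstream)) {
    const value = upstream[key];
    const lower = key.toLowerCase();
    if (value !== undefined && RESPONSE_HEADER_ALLOWLIST.has(lower)) {
      out[lower] = value;
    }
  }
  return out;
}

// Assumption: "10 MB" means binary megabytes.
const MAX_BODY_BYTES = 10 * 1024 * 1024;

// True when a declared request size is over the limit.
function exceedsBodyLimit(contentLength: number): boolean {
  return contentLength > MAX_BODY_BYTES;
}

// Map an upstream 500 caused by a JSON decode failure to a client-facing 400.
// Assumption: the upstream body mentions "JSONDecodeError" verbatim.
function normalizeUpstreamStatus(upstreamStatus: number, upstreamBody: string): number {
  if (upstreamStatus === 500 && upstreamBody.includes("JSONDecodeError")) {
    return 400;
  }
  return upstreamStatus;
}

const filteredHeaders = filterHeaders({
  "Content-Type": "application/json",
  "X-Fuelix-Internal": "abc", // invented name standing in for an internal header
});
// filteredHeaders contains only "content-type"
```

Treating the decode failure as a 400 tells callers the fault lies in their request body rather than in the service, which is presumably why the normalization was added.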