{"openapi": "3.1.0", "info": {"title": "inference.club API", "version": "1.0.0", "description": "The inference.club API is **OpenAI-compatible**. Point any OpenAI client at\nthe base URL below and it just works \u2014 no other changes.\n\n```python\nfrom openai import OpenAI\nclient = OpenAI(base_url=\"https://api.inference.club/v1\", api_key=\"<your-api-key>\")\n```\n\nRequests are proxied to a member's home GPU that serves the requested model.\nGet an API key from **Dashboard \u2192 Settings \u2192 Token**.\n\nEach model on `GET /models` reports a `service_type` (`llm`, `stt`,\n`image`, or `tts`) plus its modalities, and requests only route to a\nmatching service \u2014 a transcription never lands on a chat model, an image\nrequest never on a text model.\n\n> **Try it out** makes real calls against a live provider's GPU using your\n> API key (click **Authorize**). They count against your rate limit.\n", "contact": {"name": "inference.club", "url": "https://inference.club"}}, "servers": [{"url": "https://api.inference.club/v1", "description": "Production"}], "security": [{"bearerAuth": []}], "tags": [{"name": "Models", "description": "Discover the models you can use."}, {"name": "Chat", "description": "Text generation (LLM) \u2014 OpenAI chat & completions."}, {"name": "Audio", "description": "Speech-to-text (STT) and text-to-speech (TTS)."}, {"name": "Images", "description": "Image generation and editing."}], "paths": {"/models": {"get": {"tags": ["Models"], "summary": "List available models", "description": "Every model you can reach \u2014 your own providers' models plus shared\nservices on the network you have access to. 
OpenAI-compatible, with\nextra capability fields (`input_modalities`, `output_modalities`,\n`supported_features`, `service_type`).\n", "operationId": "listModels", "responses": {"200": {"description": "The list of models.", "headers": {"X-RateLimit-Limit": {"$ref": "#/components/headers/RateLimitLimit"}, "X-RateLimit-Remaining": {"$ref": "#/components/headers/RateLimitRemaining"}}, "content": {"application/json": {"schema": {"type": "object", "properties": {"object": {"type": "string", "example": "list"}, "data": {"type": "array", "items": {"$ref": "#/components/schemas/Model"}}}}, "example": {"object": "list", "data": [{"id": "qwen/qwen3-30b-a3b", "object": "model", "owned_by": "brian-home", "service_type": "llm", "input_modalities": ["text"], "output_modalities": ["text"], "supported_features": ["reasoning"], "context_length": 32768}, {"id": "qwen/qwen3-asr-1.7b", "object": "model", "owned_by": "brian-home", "service_type": "stt", "input_modalities": ["audio"], "output_modalities": ["text"]}]}}}}, "401": {"$ref": "#/components/responses/Unauthorized"}}}}, "/chat/completions": {"post": {"tags": ["Chat"], "summary": "Create a chat completion", "description": "OpenAI chat completions. Set `stream: true` for a Server-Sent Events\nstream of delta chunks ending with `data: [DONE]`. 
Models that support\nvision/audio input accept multimodal `content` parts.\n", "operationId": "createChatCompletion", "requestBody": {"required": true, "content": {"application/json": {"schema": {"$ref": "#/components/schemas/ChatCompletionRequest"}, "example": {"model": "qwen/qwen3-30b-a3b", "messages": [{"role": "system", "content": "You are concise."}, {"role": "user", "content": "Explain inference.club in one sentence."}], "stream": false}}}}, "responses": {"200": {"description": "A chat completion (or an SSE stream when `stream: true`).", "headers": {"X-RateLimit-Limit": {"$ref": "#/components/headers/RateLimitLimit"}, "X-RateLimit-Remaining": {"$ref": "#/components/headers/RateLimitRemaining"}}, "content": {"application/json": {"schema": {"$ref": "#/components/schemas/ChatCompletion"}}, "text/event-stream": {"schema": {"type": "string", "description": "SSE stream of chat.completion.chunk objects."}}}}, "401": {"$ref": "#/components/responses/Unauthorized"}, "404": {"$ref": "#/components/responses/NoProvider"}, "413": {"$ref": "#/components/responses/TooLarge"}, "502": {"$ref": "#/components/responses/UpstreamError"}}}}, "/completions": {"post": {"tags": ["Chat"], "summary": "Create a completion (legacy)", "description": "OpenAI legacy text completions. 
Prefer `/chat/completions`.", "operationId": "createCompletion", "requestBody": {"required": true, "content": {"application/json": {"schema": {"type": "object", "required": ["model", "prompt"], "properties": {"model": {"type": "string"}, "prompt": {"type": "string"}, "stream": {"type": "boolean", "default": false}, "max_tokens": {"type": "integer"}, "temperature": {"type": "number"}}, "additionalProperties": true}, "example": {"model": "qwen/qwen3-30b-a3b", "prompt": "Once upon a time"}}}}, "responses": {"200": {"description": "A text completion.", "content": {"application/json": {"schema": {"type": "object"}}}}, "401": {"$ref": "#/components/responses/Unauthorized"}, "404": {"$ref": "#/components/responses/NoProvider"}}}}, "/audio/transcriptions": {"post": {"tags": ["Audio"], "summary": "Transcribe audio to text", "description": "Speech-to-text. `multipart/form-data` with an audio `file`. Request\n`response_format=verbose_json` with `timestamp_granularities[]` for\nword/segment timings **when the model supports it** (see the model's\n`supported_features`); otherwise it's transparently downgraded to plain\ntext. Audio is metered by duration (`usage.seconds`).\n", "operationId": "createTranscription", "requestBody": {"required": true, "content": {"multipart/form-data": {"schema": {"type": "object", "required": ["file", "model"], "properties": {"file": {"type": "string", "format": "binary", "description": "The audio file (wav, mp3, m4a, flac, ogg, webm). Up to 25 MB."}, "model": {"type": "string", "description": "An `stt` model id from /models."}, "language": {"type": "string", "description": "Optional ISO-639-1 hint, e.g. 
`en`."}, "prompt": {"type": "string", "description": "Optional decoding hint (names, jargon)."}, "response_format": {"type": "string", "enum": ["json", "text", "verbose_json"], "default": "json"}, "timestamp_granularities[]": {"type": "array", "items": {"type": "string", "enum": ["word", "segment"]}}}}}}}, "responses": {"200": {"description": "The transcription.", "content": {"application/json": {"schema": {"$ref": "#/components/schemas/Transcription"}, "example": {"text": "Hey, this is a demo of the new model.", "usage": {"type": "duration", "seconds": 10}}}}}, "400": {"$ref": "#/components/responses/BadRequest"}, "401": {"$ref": "#/components/responses/Unauthorized"}, "404": {"$ref": "#/components/responses/NoProvider"}, "413": {"$ref": "#/components/responses/TooLarge"}, "415": {"$ref": "#/components/responses/UnsupportedMedia"}}}}, "/audio/speech": {"post": {"tags": ["Audio"], "summary": "Synthesize speech from text", "description": "Text-to-speech. Returns the **raw audio** (WAV by default, or Opus),\nexactly like OpenAI. A copy is stored on inference.club for your\nhistory. Metered by audio duration.\n", "operationId": "createSpeech", "requestBody": {"required": true, "content": {"application/json": {"schema": {"type": "object", "required": ["model", "input"], "properties": {"model": {"type": "string", "description": "A `tts` model id from /models."}, "input": {"type": "string", "description": "The text to synthesize."}, "voice": {"type": "string", "description": "A voice name (see /audio/voices)."}, "response_format": {"type": "string", "enum": ["wav", "opus"], "default": "wav"}, "language": {"type": "string", "description": "Optional language hint, e.g. 
en-US."}}}, "example": {"model": "magpie-tts-multilingual", "input": "Hello from inference club", "voice": "Magpie-Multilingual.EN-US.Mia"}}}}, "responses": {"200": {"description": "The synthesized audio bytes.", "content": {"audio/wav": {"schema": {"type": "string", "format": "binary"}}, "audio/ogg": {"schema": {"type": "string", "format": "binary"}}}}, "400": {"$ref": "#/components/responses/BadRequest"}, "401": {"$ref": "#/components/responses/Unauthorized"}, "404": {"$ref": "#/components/responses/NoProvider"}, "413": {"$ref": "#/components/responses/TooLarge"}}}}, "/audio/voices": {"get": {"tags": ["Audio"], "summary": "List a TTS model's voices", "description": "The voices a text-to-speech model offers. An inference.club extension\n(not part of OpenAI's API).\n", "operationId": "listVoices", "parameters": [{"name": "model", "in": "query", "required": true, "schema": {"type": "string"}, "description": "A `tts` model id."}], "responses": {"200": {"description": "The available voices.", "content": {"application/json": {"schema": {"type": "object", "properties": {"voices": {"type": "array", "items": {"type": "string"}}}}, "example": {"voices": ["Magpie-Multilingual.EN-US.Mia", "Magpie-Multilingual.EN-US.Jason"]}}}}, "401": {"$ref": "#/components/responses/Unauthorized"}, "404": {"$ref": "#/components/responses/NoProvider"}}}}, "/images/generations": {"post": {"tags": ["Images"], "summary": "Generate images from a prompt", "description": "Text-to-image. The image is stored on inference.club; by default the\nresponse returns a `url` you can use directly in an `<img>` tag. Set\n`response_format: b64_json` to also get the bytes inline. 
Metered by\nimage count.\n", "operationId": "createImage", "requestBody": {"required": true, "content": {"application/json": {"schema": {"type": "object", "required": ["model", "prompt"], "properties": {"model": {"type": "string", "description": "An `image` model id from /models."}, "prompt": {"type": "string"}, "n": {"type": "integer", "default": 1, "description": "Number of images (clamped server-side)."}, "size": {"type": "string", "example": "1024x1024"}, "response_format": {"type": "string", "enum": ["url", "b64_json"], "default": "url"}}}, "example": {"model": "flux-2-klein", "prompt": "a watercolor fox in a misty forest", "size": "1024x1024"}}}}, "responses": {"200": {"description": "The generated image(s).", "content": {"application/json": {"schema": {"$ref": "#/components/schemas/ImageResponse"}, "example": {"created": 1780266332, "data": [{"url": "https://api.inference.club/api/inference/assets/42/"}]}}}}, "400": {"$ref": "#/components/responses/BadRequest"}, "401": {"$ref": "#/components/responses/Unauthorized"}, "404": {"$ref": "#/components/responses/NoProvider"}}}}, "/images/edits": {"post": {"tags": ["Images"], "summary": "Edit an image with a prompt", "description": "Image + prompt \u2192 edited image. `multipart/form-data`. Same response\nshape as generations.\n", "operationId": "createImageEdit", "requestBody": {"required": true, "content": {"multipart/form-data": {"schema": {"type": "object", "required": ["image", "prompt", "model"], "properties": {"image": {"type": "string", "format": "binary", "description": "Source image (png, jpeg, webp). 
Up to 25 MB."}, "prompt": {"type": "string"}, "model": {"type": "string"}, "mask": {"type": "string", "format": "binary", "description": "Optional transparency mask."}, "n": {"type": "integer"}, "size": {"type": "string"}, "response_format": {"type": "string", "enum": ["url", "b64_json"], "default": "url"}}}}}}, "responses": {"200": {"description": "The edited image(s).", "content": {"application/json": {"schema": {"$ref": "#/components/schemas/ImageResponse"}}}}, "400": {"$ref": "#/components/responses/BadRequest"}, "401": {"$ref": "#/components/responses/Unauthorized"}, "404": {"$ref": "#/components/responses/NoProvider"}, "415": {"$ref": "#/components/responses/UnsupportedMedia"}}}}}, "components": {"securitySchemes": {"bearerAuth": {"type": "http", "scheme": "bearer", "description": "Your API key from Dashboard \u2192 Settings \u2192 Token, sent as `Authorization: Bearer <key>`."}}, "headers": {"RateLimitLimit": {"description": "Requests allowed in the current window.", "schema": {"type": "integer"}}, "RateLimitRemaining": {"description": "Requests remaining in the current window.", "schema": {"type": "integer"}}}, "responses": {"Unauthorized": {"description": "Missing or invalid API key.", "content": {"application/json": {"schema": {"$ref": "#/components/schemas/Error"}, "example": {"error": {"message": "Authentication credentials were not provided.", "type": "not_authenticated"}}}}}, "NoProvider": {"description": "No online provider serves the requested model for you.", "content": {"application/json": {"schema": {"$ref": "#/components/schemas/Error"}, "example": {"error": {"message": "No online provider serving model 'x' for this user.", "type": "no_provider"}}}}}, "BadRequest": {"description": "The request was malformed (e.g. 
missing a required field).", "content": {"application/json": {"schema": {"$ref": "#/components/schemas/Error"}}}}, "TooLarge": {"description": "The request exceeded a size limit.", "content": {"application/json": {"schema": {"$ref": "#/components/schemas/Error"}}}}, "UnsupportedMedia": {"description": "The uploaded file type isn't supported.", "content": {"application/json": {"schema": {"$ref": "#/components/schemas/Error"}}}}, "UpstreamError": {"description": "The provider's local server failed or didn't respond.", "content": {"application/json": {"schema": {"$ref": "#/components/schemas/Error"}}}}}, "schemas": {"Error": {"type": "object", "properties": {"error": {"type": "object", "properties": {"message": {"type": "string"}, "type": {"type": "string"}}}}}, "Model": {"type": "object", "properties": {"id": {"type": "string"}, "object": {"type": "string", "example": "model"}, "created": {"type": "integer"}, "owned_by": {"type": "string"}, "service_type": {"type": "string", "enum": ["llm", "stt", "image", "tts"], "description": "Which surface this model serves."}, "input_modalities": {"type": "array", "items": {"type": "string"}, "example": ["text"]}, "output_modalities": {"type": "array", "items": {"type": "string"}, "example": ["text"]}, "supported_features": {"type": "array", "items": {"type": "string"}, "example": ["reasoning"]}, "context_length": {"type": ["integer", "null"]}}}, "ChatMessage": {"type": "object", "required": ["role", "content"], "properties": {"role": {"type": "string", "enum": ["system", "user", "assistant"]}, "content": {"description": "A string, or an array of multimodal parts for vision/audio models.", "oneOf": [{"type": "string"}, {"type": "array", "items": {"type": "object"}}]}}}, "ChatCompletionRequest": {"type": "object", "required": ["model", "messages"], "properties": {"model": {"type": "string"}, "messages": {"type": "array", "items": {"$ref": "#/components/schemas/ChatMessage"}}, "stream": {"type": "boolean", "default": 
false}, "temperature": {"type": "number", "default": 0.7}, "top_p": {"type": "number"}, "max_tokens": {"type": "integer"}, "frequency_penalty": {"type": "number"}, "presence_penalty": {"type": "number"}, "stop": {"oneOf": [{"type": "string"}, {"type": "array", "items": {"type": "string"}}]}}, "additionalProperties": true}, "ChatCompletion": {"type": "object", "properties": {"id": {"type": "string"}, "object": {"type": "string", "example": "chat.completion"}, "model": {"type": "string"}, "choices": {"type": "array", "items": {"type": "object", "properties": {"index": {"type": "integer"}, "message": {"$ref": "#/components/schemas/ChatMessage"}, "finish_reason": {"type": "string"}}}}, "usage": {"type": "object", "properties": {"prompt_tokens": {"type": "integer"}, "completion_tokens": {"type": "integer"}, "total_tokens": {"type": "integer"}}}}}, "Transcription": {"type": "object", "properties": {"text": {"type": "string"}, "language": {"type": "string"}, "duration": {"type": "number"}, "words": {"type": "array", "items": {"type": "object", "properties": {"word": {"type": "string"}, "start": {"type": "number"}, "end": {"type": "number"}}}}, "segments": {"type": "array", "items": {"type": "object"}}, "usage": {"type": "object", "properties": {"type": {"type": "string", "example": "duration"}, "seconds": {"type": "number"}}}}}, "ImageResponse": {"type": "object", "properties": {"created": {"type": "integer"}, "data": {"type": "array", "items": {"type": "object", "properties": {"url": {"type": "string", "description": "inference.club asset URL (default)."}, "b64_json": {"type": "string", "description": "Base64 image bytes (when response_format=b64_json)."}, "revised_prompt": {"type": "string"}}}}}}}}}