Inference
Multi-provider chat completions API with OpenAI-compatible interface, billed from Conway credits.
Chat Completions
POST /v1/chat/completions
Multi-provider chat completions endpoint with an OpenAI-compatible interface. Requests are routed to the appropriate provider (OpenAI, Anthropic, Google, Moonshot, or Qwen) based on model name and billed from your Conway credits.
All responses are returned in OpenAI-compatible format regardless of the upstream provider.
Supports streaming via Server-Sent Events (SSE).
Prerequisites
- Authenticated with API key or JWT
- Minimum credit balance of 10 cents
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | Model name (e.g. gpt-5.2, claude-sonnet-4.5, gemini-2.5-pro, kimi-k2.5) |
| messages | array | Yes | Array of message objects ({ role, content }) |
| stream | boolean | No | Enable SSE streaming (default: false) |
| temperature | number | No | Sampling temperature |
| max_tokens | number | No | Maximum tokens to generate |
All other OpenAI-compatible parameters (tools, tool_choice, top_p, stop, etc.) are forwarded or translated as needed.
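The same request can be assembled from any HTTP client. A minimal sketch using only the Python standard library, with the endpoint and key placeholder taken from the curl examples on this page:

```python
import json
import urllib.request

API_URL = "https://inference.conway.tech/v1/chat/completions"
API_KEY = "cnwy_k_your-api-key"  # placeholder, as in the curl examples below

def build_request(model, messages, **options):
    """Assemble an OpenAI-compatible chat completion request.

    Extra keyword arguments (temperature, max_tokens, stream, ...) are
    forwarded verbatim in the JSON body.
    """
    payload = {"model": model, "messages": messages, **options}
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {API_KEY}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

# To actually send it (requires a valid key and credits):
# with urllib.request.urlopen(build_request("gpt-5.2",
#         [{"role": "user", "content": "Hello"}])) as resp:
#     print(json.load(resp)["choices"][0]["message"]["content"])
```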
Example — OpenAI
```bash
curl -X POST https://inference.conway.tech/v1/chat/completions \
  -H "Authorization: Bearer cnwy_k_your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.2",
    "messages": [
      { "role": "user", "content": "Hello" }
    ]
  }'
```

Example — Anthropic
```bash
curl -X POST https://inference.conway.tech/v1/chat/completions \
  -H "Authorization: Bearer cnwy_k_your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4.5",
    "messages": [
      { "role": "user", "content": "Hello" }
    ],
    "max_tokens": 100
  }'
```

Example — Google Gemini
```bash
curl -X POST https://inference.conway.tech/v1/chat/completions \
  -H "Authorization: Bearer cnwy_k_your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-pro",
    "messages": [
      { "role": "user", "content": "Hello" }
    ]
  }'
```

Example — Kimi
```bash
curl -X POST https://inference.conway.tech/v1/chat/completions \
  -H "Authorization: Bearer cnwy_k_your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.5",
    "messages": [
      { "role": "user", "content": "Hello" }
    ]
  }'
```

Response Format
All providers return responses in OpenAI-compatible format:
```json
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4o-mini-2024-07-18",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello! How can I help?" },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 8,
    "completion_tokens": 7,
    "total_tokens": 15
  }
}
```

Streaming
Add `"stream": true` to the request body. The response is then a Server-Sent Events stream of `data:` lines, terminated by `data: [DONE]`. Streaming is supported for all providers.
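Consuming the stream amounts to reading `data:` lines until the `[DONE]` sentinel. A minimal parsing sketch, assuming the standard OpenAI streaming chunk shape (`choices[0].delta.content`):

```python
import json

def collect_stream(lines):
    """Accumulate assistant text from SSE 'data:' lines until [DONE]."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines between events
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```

Feed it the decoded lines of the HTTP response body (however your client exposes them) to reassemble the full completion text.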
Billing
Each request is billed based on token usage and the selected model's pricing.
- Credits are deducted after the response completes
- Transactions appear in your credit history as type `inference`
Errors
| Status | Description |
|---|---|
| 400 | Missing model or messages |
| 401 | Invalid or missing authentication |
| 402 | Insufficient credits (minimum 10 cents required) |
| 503 | Inference proxy not configured (missing API key for requested provider) |
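When scripting against the endpoint, these statuses call for different reactions; a 402, for example, means the account needs topping up rather than a retry. A small sketch mirroring the table above (the suggested actions are illustrative, not documented behavior):

```python
def classify_error(status):
    """Map a documented error status to a suggested client action."""
    actions = {
        400: "fix the request body (model and messages are required)",
        401: "check the API key or JWT",
        402: "top up credits (minimum balance is 10 cents)",
        503: "requested provider is not configured; try a different model",
    }
    return actions.get(status, "unexpected status; inspect the response body")
```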
Supported Models
OpenAI
gpt-5.2, gpt-5.2-codex, gpt-5-mini, gpt-5-nano
Anthropic
claude-opus-4.6, claude-opus-4.5, claude-sonnet-4.5, claude-haiku-4.5
Google Gemini
gemini-2.5-pro, gemini-2.5-flash, gemini-3-pro, gemini-3-flash
Moonshot (Kimi)
kimi-k2.5
Qwen
qwen3-coder
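Routing by model name, as described at the top of this page, can be sketched as a prefix lookup over the lists above. The prefix rule here is an assumption for illustration; the service's actual routing logic is not specified in this document:

```python
def route(model):
    """Guess the upstream provider from the model name prefix (illustrative)."""
    prefixes = {
        "gpt-": "openai",
        "claude-": "anthropic",
        "gemini-": "google",
        "kimi-": "moonshot",
        "qwen": "qwen",
    }
    for prefix, provider in prefixes.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"unknown model: {model}")
```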