Conway

Inference

Multi-provider chat completions API with OpenAI-compatible interface, billed from Conway credits.

Chat Completions

POST /v1/chat/completions

Multi-provider chat completions endpoint with an OpenAI-compatible interface. Requests are routed to the appropriate provider (OpenAI, Anthropic, Google, Moonshot, or Qwen) based on model name and billed from your Conway credits with a 1.3x markup on token cost.

All responses are returned in OpenAI-compatible format regardless of the upstream provider.

Supports streaming via Server-Sent Events (SSE).

Prerequisites

  • Authenticated with API key or JWT
  • Minimum credit balance of 10 cents

Request Body

Parameter   | Type    | Required | Description
model       | string  | Yes      | Model name (e.g. gpt-5.2, claude-sonnet-4.5, gemini-2.5-pro, kimi-k2.5)
messages    | array   | Yes      | Array of message objects ({ role, content })
stream      | boolean | No       | Enable SSE streaming (default: false)
temperature | number  | No       | Sampling temperature
max_tokens  | number  | No       | Maximum tokens to generate

All other OpenAI-compatible parameters (tools, tool_choice, top_p, stop, etc.) are forwarded or translated as needed.
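Because the interface is OpenAI-compatible, a request body can be assembled exactly as for the OpenAI API and sent to the endpoint shown in the curl examples. A minimal Python sketch, using only the standard library (the URL and key format are taken from the examples above):

```python
import json
import urllib.request

API_URL = "https://inference.conway.tech/v1/chat/completions"  # from the curl examples

def build_chat_request(model, messages, **params):
    """Assemble an OpenAI-compatible request body; extra parameters
    (temperature, max_tokens, tools, ...) are passed through as-is."""
    body = {"model": model, "messages": messages}
    body.update(params)
    return body

def chat(api_key, model, messages, **params):
    """Send a non-streaming chat completion request and return the parsed JSON."""
    req = urllib.request.Request(
        API_URL,
        data=json.dumps(build_chat_request(model, messages, **params)).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())

# Building a payload does not require a network call:
payload = build_chat_request(
    "gpt-5.2",
    [{"role": "user", "content": "Hello"}],
    temperature=0.2,
)
```

Existing OpenAI SDK clients should also work by overriding their base URL to https://inference.conway.tech/v1.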

Example — OpenAI

curl -X POST https://inference.conway.tech/v1/chat/completions \
  -H "Authorization: Bearer cnwy_k_your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.2",
    "messages": [
      { "role": "user", "content": "Hello" }
    ]
  }'

Example — Anthropic

curl -X POST https://inference.conway.tech/v1/chat/completions \
  -H "Authorization: Bearer cnwy_k_your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4.5",
    "messages": [
      { "role": "user", "content": "Hello" }
    ],
    "max_tokens": 100
  }'

Example — Google Gemini

curl -X POST https://inference.conway.tech/v1/chat/completions \
  -H "Authorization: Bearer cnwy_k_your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gemini-2.5-pro",
    "messages": [
      { "role": "user", "content": "Hello" }
    ]
  }'

Example — Kimi

curl -X POST https://inference.conway.tech/v1/chat/completions \
  -H "Authorization: Bearer cnwy_k_your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "kimi-k2.5",
    "messages": [
      { "role": "user", "content": "Hello" }
    ]
  }'

Response Format

All providers return responses in OpenAI-compatible format:

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "model": "gpt-4o-mini-2024-07-18",
  "choices": [
    {
      "index": 0,
      "message": { "role": "assistant", "content": "Hello! How can I help?" },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 8,
    "completion_tokens": 7,
    "total_tokens": 15
  }
}

Streaming

Add "stream": true to the request body. The response is then a stream of "data:" lines in SSE format, terminated by "data: [DONE]". Streaming is supported for all providers.
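Consuming the stream amounts to reading "data:" lines, stopping at "[DONE]", and concatenating the content deltas. A sketch of that parsing loop, shown here against canned lines rather than a live connection:

```python
import json

def iter_sse_content(lines):
    """Yield assistant text deltas from an SSE chat-completions stream.

    `lines` is any iterable of decoded text lines, e.g.:
        data: {"choices":[{"delta":{"content":"Hi"}}]}
        data: [DONE]
    """
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue  # skip blank keep-alive lines between events
        data = line[len("data: "):]
        if data == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(data)
        delta = chunk["choices"][0].get("delta", {})
        if "content" in delta:
            yield delta["content"]

# Canned example stream:
stream_lines = [
    'data: {"choices":[{"delta":{"content":"Hel"}}]}',
    "",
    'data: {"choices":[{"delta":{"content":"lo"}}]}',
    "data: [DONE]",
]
text = "".join(iter_sse_content(stream_lines))  # → "Hello"
```

With a live request, the same generator can be fed the response body line by line.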

Billing

Each request is billed based on token usage:

charged_cents = ceil(token_cost_usd * 100 * 1.3)
  • Token cost is computed from per-model pricing (input + output tokens)
  • A 1.3x markup is applied
  • Credits are deducted after the response completes
  • Transactions appear in your credit history as type inference
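The formula above can be checked with a short sketch. Note the per-model prices used here are hypothetical placeholders, not Conway's actual rates:

```python
import math

def charged_cents(prompt_tokens, completion_tokens,
                  input_price_per_1m_usd, output_price_per_1m_usd):
    """Apply the documented billing rule: token cost at per-model
    pricing (input + output), times a 1.3x markup, rounded up to whole cents."""
    token_cost_usd = (prompt_tokens / 1_000_000 * input_price_per_1m_usd
                      + completion_tokens / 1_000_000 * output_price_per_1m_usd)
    return math.ceil(token_cost_usd * 100 * 1.3)

# Hypothetical pricing: $3 per 1M input tokens, $15 per 1M output tokens.
# 10k input + 2k output = $0.06 token cost → 7.8 marked-up cents → 8 cents charged.
cents = charged_cents(10_000, 2_000, 3.0, 15.0)  # → 8
```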

Errors

Status | Description
400    | Missing model or messages
401    | Invalid or missing authentication
402    | Insufficient credits (minimum 10 cents required)
503    | Inference proxy not configured (missing API key for requested provider)
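A client can branch on these statuses. A minimal sketch; the message strings paraphrase the table and are not the exact server response bodies:

```python
def classify_error(status):
    """Return (message, retryable) for the documented error statuses.

    None of the documented errors are fixed by retrying alone: 400-402
    need a request or account change, and 503 is a server-side
    configuration gap. Undocumented 5xx statuses are treated as retryable.
    """
    errors = {
        400: ("Missing model or messages in the request body", False),
        401: ("Invalid or missing authentication", False),
        402: ("Insufficient credits: minimum balance is 10 cents", False),
        503: ("Inference proxy not configured for this provider", False),
    }
    if status in errors:
        return errors[status]
    return (f"Unexpected status {status}", status >= 500)

msg, retryable = classify_error(402)
```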

Supported Models

OpenAI

  • gpt-5.2, gpt-5.2-codex
  • gpt-5-mini, gpt-5-nano

Anthropic

  • claude-opus-4.6, claude-opus-4.5
  • claude-sonnet-4.5
  • claude-haiku-4.5

Google Gemini

  • gemini-2.5-pro, gemini-2.5-flash
  • gemini-3-pro, gemini-3-flash

Moonshot (Kimi)

  • kimi-k2.5

Qwen

  • qwen3-coder
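As noted in the overview, requests are routed to a provider based on the model name. A purely illustrative prefix-based sketch of how the list above might map to providers — not Conway's actual routing logic:

```python
# Illustrative only: prefix → provider mapping inferred from the
# supported-model list above; the service's real routing may differ.
PROVIDER_PREFIXES = {
    "gpt-": "openai",
    "claude-": "anthropic",
    "gemini-": "google",
    "kimi-": "moonshot",
    "qwen": "qwen",
}

def provider_for(model):
    """Return the provider a model name would route to, by prefix."""
    for prefix, provider in PROVIDER_PREFIXES.items():
        if model.startswith(prefix):
            return provider
    raise ValueError(f"Unknown model: {model}")

routed = provider_for("claude-sonnet-4.5")  # → "anthropic"
```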