Chat Completions

POST /v1/chat/completions

Send a list of conversation messages to a model and get a generated reply. This is CrossModel's core endpoint — it supports multi-turn chat, streaming, image input, and function/tool calling.

This endpoint is fully compatible with OpenAI Chat Completions. If you already have a project on the OpenAI SDK, set base_url to https://api.crossmodel.ai/v1 and set model to a CrossModel model ID; nothing else needs to change.

Endpoint

POST https://api.crossmodel.ai/v1/chat/completions
Authorization: Bearer cm-YOUR_KEY
Content-Type: application/json

Every request must carry a Bearer API key in the Authorization header. Create and manage keys on the console's API Keys page.
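As a rough illustration of the headers above, the following Python sketch builds an authenticated request with only the standard library. The helper names `build_request` and `send` are hypothetical, not part of any CrossModel SDK, and the 60-second timeout is an arbitrary assumption.

```python
import json
import urllib.request

API_URL = "https://api.crossmodel.ai/v1/chat/completions"


def build_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the Bearer key."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send(payload: dict, api_key: str, timeout: float = 60.0) -> dict:
    """Send the request and decode the JSON body of a successful response."""
    request = build_request(payload, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

In practice you would probably read the key from an environment variable such as CROSSMODEL_API_KEY rather than hard-coding it.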

Create chat completion

curl https://api.crossmodel.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $CROSSMODEL_API_KEY" \
  -d '{
    "model": "deepseek/deepseek-v4-pro",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'

Request parameters

Parameter	Type	Required	Notes
`model`	string	Yes	The model ID to use, e.g. `deepseek/deepseek-v4-pro`. Call `/v1/models` for the full list.
`messages`	array	Yes	The conversation so far, in chronological order, with at least one entry. See Message object below.
`stream`	boolean	No	Whether to stream the result. When `true`, the reply is pushed as Server-Sent Events. Default `false`.
`stream_options`	object	No	Extra streaming options; supports `include_usage`. Only applies when `stream` is `true`.
`max_completion_tokens`	integer	No	Max tokens to generate for this reply. Takes precedence over `max_tokens` when both are present.
`max_tokens`	integer	No	Max tokens to generate for this reply. Legacy parameter — new code should use `max_completion_tokens`.
`temperature`	number	No	Sampling temperature, typically `0`–`2`. Higher is more random and varied; lower is more deterministic.
`top_p`	number	No	Nucleus-sampling threshold; the model samples only from the smallest set of most likely candidates whose cumulative probability reaches `top_p`. Adjust either `temperature` or `top_p`, usually not both.
`stop`	string or string[]	No	One or more stop sequences. Generation halts at the first match, and the sequence itself isn't included in the output.
`tools`	array	No	The tools the model may call. See Function tools below.
`tool_choice`	string or object	No	Controls whether and how the model calls tools: `auto`, `none`, `required`, or an object naming a specific function.
`reasoning_effort`	string	No	How much reasoning the model spends before answering: `none`, `minimal`, `low`, `medium`, `high`, `xhigh` — other values return an error. The effect depends on whether the model is a reasoning model.
`safety_identifier`	string	No	A stable identifier for your end user, for safety and abuse detection. Pass a hashed user ID — not a raw email or similar.
`user`	string	No	Legacy alias for `safety_identifier`. When both are present, `safety_identifier` wins.
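The parameters in this table could be assembled on the client along these lines. This is a minimal sketch: `build_payload` is a hypothetical helper, and the checks only mirror the rules stated above (at least one message, a fixed set of `reasoning_effort` values). The server remains the source of truth for validation.

```python
from typing import Any, Optional

# Allowed values as listed in the parameter table.
REASONING_EFFORTS = {"none", "minimal", "low", "medium", "high", "xhigh"}


def build_payload(
    model: str,
    messages: list,
    *,
    max_completion_tokens: Optional[int] = None,
    temperature: Optional[float] = None,
    stop: Any = None,
    reasoning_effort: Optional[str] = None,
    stream: bool = False,
    include_usage: bool = False,
) -> dict:
    """Assemble a request body, leaving out every optional field that is unset."""
    if not messages:
        raise ValueError("messages must contain at least one entry")
    if reasoning_effort is not None and reasoning_effort not in REASONING_EFFORTS:
        raise ValueError(f"unsupported reasoning_effort: {reasoning_effort!r}")

    payload = {"model": model, "messages": messages}
    optional = {
        "max_completion_tokens": max_completion_tokens,
        "temperature": temperature,
        "stop": stop,
        "reasoning_effort": reasoning_effort,
    }
    payload.update({key: value for key, value in optional.items() if value is not None})

    if stream:
        payload["stream"] = True
        # stream_options only applies to streaming requests.
        if include_usage:
            payload["stream_options"] = {"include_usage": True}
    return payload
```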

Extended parameters

Beyond the above, you can pass the following common parameters. Whether they take effect depends on the model — not every model supports them.

Parameter	Notes
`presence_penalty`, `frequency_penalty`	Sampling controls for repetition and topic drift.
`response_format`	Constrains the output format, e.g. JSON mode or structured output.
`parallel_tool_calls`	Whether the model may issue multiple tool calls in parallel within one reply.
`n`	Generate multiple candidate replies for the same messages.
`logprobs`, `top_logprobs`	Return log-probability information for generated tokens.
`modalities`, `audio`	Request non-text output modalities, e.g. audio. Only for models that support it.
`prediction`	Provide known output to help capable models generate faster.
`seed`	Fix the random seed for more reproducible results on supported models.
`metadata`, `store`, `service_tier`, `verbosity`, `web_search_options`	Extended capabilities specific to individual models.

Message object

messages is a chronological array with at least one message.

Field	Type	Required	Notes
`role`	string	Yes	The sender's role: `system`, `developer`, `user`, `assistant`, or `tool`. Other values return an error.
`content`	string, array, or null	No	The message content. Use a string for plain text; use a content-block array for multimodal content (see Content blocks); an `assistant` message that only issues tool calls may be `null`.
`name`	string	No	The sender's name, to distinguish participants within the same role.
`tool_calls`	array	No	Tool calls issued by an `assistant` message.
`tool_call_id`	string	Conditional	Required when `role` is `tool`; points at the corresponding tool call in the previous `assistant` message, linking the result to the call.

What each role is for:

system / developer: set the model's overall behavior and instructions.
user: the end user's input.
assistant: the model's reply — text and/or tool calls.
tool: the result of running a tool, used together with tool_call_id.
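To make the role rules concrete, here is a small multi-turn conversation together with a client-side check. `validate_messages` is a hypothetical helper that mirrors only two documented rules (the allowed roles, and `tool_call_id` on `tool` messages); it is a sketch, not a replacement for server-side validation.

```python
ALLOWED_ROLES = {"system", "developer", "user", "assistant", "tool"}

# A multi-turn conversation in chronological order.
conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What is the capital of France?"},
    {"role": "assistant", "content": "Paris."},
    {"role": "user", "content": "And of Italy?"},
]


def validate_messages(messages: list) -> list:
    """Return a list of problems; an empty list means the checks passed."""
    problems = []
    if not messages:
        problems.append("messages must contain at least one entry")
    for index, message in enumerate(messages):
        role = message.get("role")
        if role not in ALLOWED_ROLES:
            problems.append(f"messages[{index}]: unsupported role {role!r}")
        if role == "tool" and not message.get("tool_call_id"):
            problems.append(f"messages[{index}]: tool message needs tool_call_id")
    return problems
```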

Content blocks

When content is an array, each element is a content block with a type field:

type	Fields	Notes
`text`	`text`	A piece of text input.
`image_url`	`image_url.url`, `image_url.detail`	Image input. `url` can be a public image URL or a `data:image/...;base64,...` data URL; `detail` controls parsing fidelity.
`input_audio`	`input_audio.data`, `input_audio.format`	Audio input. `data` is base64-encoded audio; `format` is the audio format, e.g. `wav`, `mp3`.
`video_url`	`video_url.url`	Video input. `url` is a model-reachable video URL; supported on some models only.
`video`	`video`	Video key-frame input, usually an array of image URLs; supported on some models only.
`file`	`file`	File input; supported on some models only.
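As an illustration of the `text` and `image_url` blocks, the sketch below wraps raw image bytes in a base64 data URL and pairs them with a text prompt. The helper names are hypothetical, and `detail` is passed through untouched because its accepted values are not listed here.

```python
import base64
from typing import Optional


def image_block(image_bytes: bytes, mime_type: str = "image/png",
                detail: Optional[str] = None) -> dict:
    """Encode raw image bytes as a data URL inside an image_url content block."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    image_url = {"url": f"data:{mime_type};base64,{encoded}"}
    if detail is not None:
        image_url["detail"] = detail
    return {"type": "image_url", "image_url": image_url}


def user_message_with_image(text: str, image_bytes: bytes,
                            mime_type: str = "image/png") -> dict:
    """Build a user message whose content is a text block plus an image block."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": text},
            image_block(image_bytes, mime_type),
        ],
    }
```

For a publicly hosted image, the data URL can simply be replaced with the image's URL in `image_url.url`.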

Function tools

With the tools field you declare functions the model can call. The model returns a structured call request when needed; your code runs it and passes the result back via a role: "tool" message.

{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {
          "type": "object",
          "properties": {
            "city": { "type": "string" }
          },
          "required": ["city"]
        },
        "strict": true
      }
    }
  ],
  "tool_choice": "auto"
}

Field	Type	Required	Notes
`tools[].type`	string	Yes	Tool type; currently `function`.
`tools[].function.name`	string	Yes	The function name the model uses to reference the tool.
`tools[].function.description`	string	No	What the function does. A clear description helps the model decide when to call it.
`tools[].function.parameters`	object	No	JSON Schema for the function's parameters. Omit for no parameters.
`tools[].function.strict`	boolean	No	Whether to require the model to generate arguments strictly matching the schema.
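The round trip described in this section might look like the sketch below. It assumes each `tool_calls` entry follows the OpenAI-style shape (an `id` plus `function.name` and `function.arguments` as a JSON string); that shape is not spelled out on this page, so check a real response before relying on it. `get_weather`, `TOOL_REGISTRY`, and `run_tool_calls` are hypothetical names.

```python
import json


def get_weather(city: str) -> dict:
    # Hypothetical stand-in for a real weather lookup.
    return {"city": city, "forecast": "sunny"}


# Maps the declared function name to the local implementation.
TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Run every requested tool call and return the matching role "tool" messages."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        function = call["function"]
        # Assumption: arguments arrive as a JSON-encoded string.
        arguments = json.loads(function["arguments"] or "{}")
        output = TOOL_REGISTRY[function["name"]](**arguments)
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],  # links the result back to the call
            "content": json.dumps(output),
        })
    return results
```

To continue the conversation, append the assistant message and then the returned tool messages to `messages` and send the request again.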

Example request

curl https://api.crossmodel.ai/v1/chat/completions \
  -H "Authorization: Bearer cm-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v4-pro",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Hello!" }
    ],
    "temperature": 0.7,
    "max_completion_tokens": 256
  }'

Response

A non-streaming request returns a chat.completion object.

{
  "id": "chatcmpl_abc123",
  "object": "chat.completion",
  "created": 1720000000,
  "model": "deepseek/deepseek-v4-pro",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 18,
    "total_tokens": 42,
    "prompt_tokens_details": {
      "cached_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0
    }
  }
}

Field	Type	Notes
`id`	string	A unique ID for this reply.
`object`	string	Always `chat.completion`.
`created`	integer	When the reply was created, as a Unix timestamp (seconds).
`model`	string	The model ID used for this request.
`choices`	array	The generated replies. There is one unless the request asks for more candidates with `n`.
`usage`	object	Token usage for this request, used for billing.

The choice object

Field	Type	Notes
`index`	integer	The reply's position in the `choices` array.
`message`	object	The generated `assistant` message.
`finish_reason`	string	Why it stopped: `stop` (normal), `length` (hit the token cap), `tool_calls` (switched to a tool call), `content_filter` (blocked by content filtering).

The assistant message

Field	Type	Notes
`role`	string	Always `assistant`.
`content`	string or null	The generated text. `null` if the model only issued tool calls.
`tool_calls`	array	The function tools the model requested.

The usage object

Field	Type	Notes
`prompt_tokens`	integer	Input tokens.
`completion_tokens`	integer	Output tokens.
`total_tokens`	integer	Input plus output tokens.
`prompt_tokens_details.cached_tokens`	integer	Input tokens that hit the cache.
`completion_tokens_details.reasoning_tokens`	integer	Tokens spent on reasoning.

Every response carries an x-request-id header. Including it when you report a problem helps us track it down fast.
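A minimal sketch of reading the fields described above from a decoded chat.completion object. `summarize_completion` is a hypothetical helper; it looks only at the first choice and treats the nested usage details as optional.

```python
def summarize_completion(completion: dict) -> dict:
    """Pull the reply text, stop reason, and token counts out of a response."""
    choice = completion["choices"][0]
    usage = completion.get("usage") or {}
    prompt_details = usage.get("prompt_tokens_details") or {}
    completion_details = usage.get("completion_tokens_details") or {}
    return {
        "text": choice["message"].get("content"),  # None when only tool calls were issued
        "finish_reason": choice["finish_reason"],
        "truncated": choice["finish_reason"] == "length",  # hit the token cap
        "total_tokens": usage.get("total_tokens"),
        "cached_tokens": prompt_details.get("cached_tokens", 0),
        "reasoning_tokens": completion_details.get("reasoning_tokens", 0),
    }
```

Checking `truncated` is a cheap way to notice replies cut off by `max_completion_tokens`.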

Streaming response

Set stream to true and the endpoint returns text/event-stream. The reply is split into chunks: each event starts with data: and carries a chat.completion.chunk object, and the stream ends with data: [DONE].

curl https://api.crossmodel.ai/v1/chat/completions \
  -H "Authorization: Bearer cm-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v4-pro",
    "messages": [{ "role": "user", "content": "Hi" }],
    "stream": true
  }'

data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1720000000,"model":"deepseek/deepseek-v4-pro","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1720000000,"model":"deepseek/deepseek-v4-pro","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1720000000,"model":"deepseek/deepseek-v4-pro","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]

Each chunk's delta carries the newly added content; concatenate every delta.content to rebuild the full reply. With stream_options.include_usage set to true, a final chunk with empty choices and a usage field is pushed before the end, giving you the total usage for the request.
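The reassembly described above can be sketched as follows. The code assumes each event arrives as a single data: line, as in the sample stream; `iter_chunks` and `collect_stream` are hypothetical helper names, and `lines` can be any iterable of decoded text lines, for example lines read from the HTTP response.

```python
import json
from typing import Iterable, Iterator, Optional, Tuple


def iter_chunks(lines: Iterable[str]) -> Iterator[dict]:
    """Yield each chat.completion.chunk object until data: [DONE] is seen."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and anything that is not a data line
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            return
        yield json.loads(data)


def collect_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Concatenate every delta.content and capture the usage chunk, if any."""
    parts = []
    usage = None
    for chunk in iter_chunks(lines):
        if chunk.get("usage"):
            usage = chunk["usage"]  # sent when stream_options.include_usage is true
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
    return "".join(parts), usage
```

A chat UI would normally print each piece as it arrives instead of waiting for the joined string; the accumulation logic stays the same.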

Errors

On error, the endpoint returns the matching HTTP status code with this shared error shape:

{
  "error": {
    "message": "Missing Bearer token in Authorization header.",
    "type": "authentication_error",
    "param": null,
    "code": "missing_api_key"
  }
}

Field	Notes
`message`	A human-readable description.
`type`	The broad error category.
`param`	The offending parameter, or `null` when there isn't a specific one.
`code`	A specific error code to branch on in code.

HTTP status	`type`	Common `code`	Notes
`400`	`invalid_request_error`	`invalid_json`, `invalid_messages`, `unsupported_message_role`, `unsupported_parameter`, `tool_call_id_mismatch`	Malformed body JSON, message structure, or parameters.
`401`	`authentication_error`	`missing_api_key`, `invalid_api_key`	API key missing or invalid.
`402`	`billing_error`	`insufficient_balance`	Account out of balance.
`404`	`invalid_request_error`	`model_not_found`	The requested model doesn't exist or is currently unavailable.
`429`	`rate_limit_error`	`rate_limit_exceeded`, `provider_rate_limited`	Too many requests; the request was rate limited.
`500`	`api_error`	`internal_error`	A CrossModel internal error.
`502`	`provider_error`	`provider_error`	The model service returned an error or an abnormal response.
`503`	`api_error`	`model_unavailable`	Model temporarily unavailable; retry shortly.
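One way to act on the error shape and status codes above is sketched below. Treating 429, 500, 502, and 503 as retryable is an assumption of this example (the table only says to retry 503 shortly), and the backoff numbers are arbitrary. `CrossModelError`, `parse_error`, and `backoff_delays` are hypothetical names.

```python
import json

# Assumption: these statuses are worth retrying; tune to your own needs.
RETRYABLE_STATUSES = {429, 500, 502, 503}


class CrossModelError(Exception):
    """Carries the fields of the shared error shape plus the HTTP status."""

    def __init__(self, status, message, type_=None, code=None, param=None):
        super().__init__(f"HTTP {status} {code or 'unknown'}: {message}")
        self.status = status
        self.message = message
        self.type = type_
        self.code = code
        self.param = param

    @property
    def retryable(self) -> bool:
        return self.status in RETRYABLE_STATUSES


def parse_error(status: int, body: str) -> CrossModelError:
    """Turn an error response body into an exception, tolerating non-JSON bodies."""
    try:
        parsed = json.loads(body)
    except json.JSONDecodeError:
        parsed = None
    error = parsed.get("error") if isinstance(parsed, dict) else None
    if not isinstance(error, dict):
        return CrossModelError(status, body or "empty error body")
    return CrossModelError(
        status,
        error.get("message", ""),
        error.get("type"),
        error.get("code"),
        error.get("param"),
    )


def backoff_delays(max_retries: int = 4, base: float = 0.5, cap: float = 8.0) -> list:
    """Exponential backoff schedule in seconds, capped at `cap`."""
    return [min(cap, base * 2 ** attempt) for attempt in range(max_retries)]
```

A caller could branch on `err.code` for specific handling, and sleep through `backoff_delays()` between attempts only while `err.retryable` is true.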