Chat Completions
POST /v1/chat/completions
Send a list of conversation messages to a model and get a generated reply. This is CrossModel's core endpoint — it supports multi-turn chat, streaming, image input, and function/tool calling.
This endpoint is fully compatible with OpenAI Chat Completions. If you already have a project on the OpenAI SDK, point base_url at CrossModel, authenticate with your CrossModel API key, and set model to a CrossModel model ID; nothing else needs to change.
Endpoint
POST https://api.crossmodel.ai/v1/chat/completions
Authorization: Bearer cm-YOUR_KEY
Content-Type: application/json
Every request must carry a Bearer API key in the Authorization header. Create and manage keys on the console's API Keys page.
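The request setup above can be sketched in Python with only the standard library. This is a minimal illustration, not an official client: the helper name `build_request` is hypothetical, and the request is only constructed here, not sent.

```python
import json
import urllib.request

API_URL = "https://api.crossmodel.ai/v1/chat/completions"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a POST request for the chat completions endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            # Every request carries the API key as a Bearer token.
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


request = build_request(
    "cm-YOUR_KEY",  # placeholder key, as in the examples on this page
    {
        "model": "deepseek/deepseek-v4-pro",
        "messages": [{"role": "user", "content": "Hello!"}],
    },
)
# To send it, pass the request to urllib.request.urlopen(request).
```

In real code the key would normally come from configuration or an environment variable rather than a literal.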
curl https://api.crossmodel.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $CROSSMODEL_API_KEY" \
  -d '{
    "model": "deepseek/deepseek-v4-pro",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
Request parameters
| Parameter | Type | Required | Notes |
|---|---|---|---|
| model | string | Yes | The model ID to use, e.g. deepseek/deepseek-v4-pro. Call /v1/models for the full list. |
| messages | array | Yes | The conversation so far, in chronological order, with at least one entry. See Message object below. |
| stream | boolean | No | Whether to stream the result. When true, the reply is pushed as Server-Sent Events. Default false. |
| stream_options | object | No | Extra streaming options; supports include_usage. Only applies when stream is true. |
| max_completion_tokens | integer | No | Max tokens to generate for this reply. Takes precedence over max_tokens when both are present. |
| max_tokens | integer | No | Max tokens to generate for this reply. Legacy parameter; new code should use max_completion_tokens. |
| temperature | number | No | Sampling temperature, typically 0–2. Higher is more random and varied; lower is more deterministic. |
| top_p | number | No | Nucleus-sampling threshold; the model samples only from candidates whose cumulative probability reaches top_p. Usually adjust either temperature or top_p, not both. |
| stop | string or string[] | No | One or more stop sequences. Generation halts at the first match, and the sequence itself isn't included in the output. |
| tools | array | No | The tools the model may call. See Function tools below. |
| tool_choice | string or object | No | Controls whether and how the model calls tools: auto, none, required, or an object naming a specific function. |
| reasoning_effort | string | No | How much reasoning the model spends before answering: none, minimal, low, medium, high, xhigh. Other values return an error. The effect depends on whether the model is a reasoning model. |
| safety_identifier | string | No | A stable identifier for your end user, for safety and abuse detection. Pass a hashed user ID, not a raw email or similar. |
| user | string | No | Legacy alias for safety_identifier. When both are present, safety_identifier wins. |
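As a rough illustration of the rules in this table, the sketch below assembles a request body and applies three of them on the client side: messages must be non-empty, reasoning_effort must be one of the listed values, and max_completion_tokens wins over max_tokens. The helper `build_payload` is hypothetical; the server remains the source of truth for validation.

```python
ALLOWED_REASONING_EFFORT = {"none", "minimal", "low", "medium", "high", "xhigh"}


def build_payload(model, messages, **options):
    """Assemble a request body, applying checks taken from the parameter table."""
    if not messages:
        raise ValueError("messages must contain at least one entry")
    effort = options.get("reasoning_effort")
    if effort is not None and effort not in ALLOWED_REASONING_EFFORT:
        raise ValueError(f"unsupported reasoning_effort: {effort!r}")
    payload = {"model": model, "messages": list(messages)}
    # Keep only the options the caller actually set.
    payload.update({k: v for k, v in options.items() if v is not None})
    # max_completion_tokens takes precedence, so the legacy field is redundant.
    if "max_completion_tokens" in payload:
        payload.pop("max_tokens", None)
    return payload


payload = build_payload(
    "deepseek/deepseek-v4-pro",
    [{"role": "user", "content": "Hello!"}],
    temperature=0.7,
    max_completion_tokens=256,
    max_tokens=512,  # dropped, because max_completion_tokens is also set
    stop=["\n\n"],
)
```

Dropping max_tokens locally is a design choice of this sketch; sending both is also allowed, since the table states which one takes precedence.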
Extended parameters
Beyond the above, you can pass the following common parameters. Whether they take effect depends on the model — not every model supports them.
| Parameter | Notes |
|---|---|
| presence_penalty, frequency_penalty | Sampling controls for repetition and topic drift. |
| response_format | Constrains the output format, e.g. JSON mode or structured output. |
| parallel_tool_calls | Whether the model may issue multiple tool calls in parallel within one reply. |
| n | Generate multiple candidate replies for the same messages. |
| logprobs, top_logprobs | Return log-probability information for generated tokens. |
| modalities, audio | Request non-text output modalities, e.g. audio. Only for models that support it. |
| prediction | Provide known output to help capable models generate faster. |
| seed | Fix the random seed for more reproducible results on supported models. |
| metadata, store, service_tier, verbosity, web_search_options | Extended capabilities specific to individual models. |
Message object
messages is a chronological array with at least one message.
| Field | Type | Required | Notes |
|---|---|---|---|
| role | string | Yes | The sender's role: system, developer, user, assistant, or tool. Other values return an error. |
| content | string, array, or null | No | The message content. Use a string for plain text or a content-block array for multimodal content (see Content blocks). May be null for an assistant message that only issues tool calls. |
| name | string | No | The sender's name, to distinguish participants within the same role. |
| tool_calls | array | No | Tool calls issued by an assistant message. |
| tool_call_id | string | Conditional | Required when role is tool; points at the corresponding tool call in the previous assistant message, linking the result to the call. |
What each role is for:
- system/developer: set the model's overall behavior and instructions.
- user: the end user's input.
- assistant: the model's reply, as text and/or tool calls.
- tool: the result of running a tool, used together with tool_call_id.
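The role and tool_call_id rules above can be checked before sending a request. The sketch below is one possible client-side check; `check_messages` is a hypothetical helper, and the `id` field on each tool call is assumed from the OpenAI format rather than documented on this page.

```python
ALLOWED_ROLES = {"system", "developer", "user", "assistant", "tool"}


def check_messages(messages):
    """Return a list of problems found in a messages array (empty if none)."""
    problems = []
    if not messages:
        problems.append("messages must contain at least one entry")
    call_ids = set()  # tool call IDs from the most recent assistant message
    for i, msg in enumerate(messages):
        role = msg.get("role")
        if role not in ALLOWED_ROLES:
            problems.append(f"messages[{i}]: unsupported role {role!r}")
        if role == "assistant":
            call_ids = {c.get("id") for c in msg.get("tool_calls") or []}
        if role == "tool":
            call_id = msg.get("tool_call_id")
            if call_id is None:
                problems.append(f"messages[{i}]: tool message needs tool_call_id")
            elif call_id not in call_ids:
                problems.append(
                    f"messages[{i}]: no tool call {call_id!r} in the previous assistant message"
                )
    return problems


conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Paris?"},
    {
        "role": "assistant",
        "content": None,  # null is allowed when the message only issues tool calls
        "tool_calls": [
            {
                "id": "call_1",
                "type": "function",
                "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
            }
        ],
    },
    {"role": "tool", "tool_call_id": "call_1", "content": '{"temp_c": 18}'},
]
problems = check_messages(conversation)
```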
Content blocks
When content is an array, each element is a content block with a type field:
| type | Fields | Notes |
|---|---|---|
| text | text | A piece of text input. |
| image_url | image_url.url, image_url.detail | Image input. url can be a public image URL or a data:image/...;base64,... data URL; detail controls parsing fidelity. |
| input_audio | input_audio.data, input_audio.format | Audio input. data is base64-encoded audio; format is the audio format, e.g. wav, mp3. |
| video_url | video_url.url | Video input. url is a model-reachable video URL; supported on some models only. |
| video | video | Video key-frame input, usually an array of image URLs; supported on some models only. |
| file | file | File input; supported on some models only. |
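A short sketch of building text, image, and audio content blocks in Python follows. The helper names are hypothetical, and the bytes used are placeholders rather than a real image. The accepted values for detail are not listed on this page, so the helper passes through whatever the caller supplies.

```python
import base64


def text_block(text):
    return {"type": "text", "text": text}


def data_url(image_bytes, mime_type="image/png"):
    """Encode raw image bytes as a data:image/...;base64,... URL."""
    encoded = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def image_block(url, detail=None):
    """Image block from a public URL or a data URL."""
    image_url = {"url": url}
    if detail is not None:
        image_url["detail"] = detail  # accepted values are not documented here
    return {"type": "image_url", "image_url": image_url}


def audio_block(audio_bytes, audio_format):
    """Audio block; data is base64-encoded, format is e.g. "wav" or "mp3"."""
    encoded = base64.b64encode(audio_bytes).decode("ascii")
    return {"type": "input_audio", "input_audio": {"data": encoded, "format": audio_format}}


placeholder_png = b"\x89PNG\r\n\x1a\n"  # placeholder bytes, not a complete image
message = {
    "role": "user",
    "content": [
        text_block("What is in this image?"),
        image_block(data_url(placeholder_png)),
    ],
}
```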
Function tools
With the tools field you declare functions the model can call. The model returns a structured call request when needed; your code runs it and passes the result back via a role: "tool" message.
{
"tools": [
{
"type": "function",
"function": {
"name": "get_weather",
"description": "Get current weather for a city.",
"parameters": {
"type": "object",
"properties": {
"city": { "type": "string" }
},
"required": ["city"]
},
"strict": true
}
}
],
"tool_choice": "auto"
}
| Field | Type | Required | Notes |
|---|---|---|---|
| tools[].type | string | Yes | Tool type; currently function. |
| tools[].function.name | string | Yes | The function name the model uses to reference the tool. |
| tools[].function.description | string | No | What the function does. A clear description helps the model decide when to call it. |
| tools[].function.parameters | object | No | JSON Schema for the function's parameters. Omit for no parameters. |
| tools[].function.strict | boolean | No | Whether to require the model to generate arguments strictly matching the schema. |
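The round trip described in this section (the model requests a call, your code runs it, and the result goes back as a role: "tool" message) might look like the sketch below. The shape of each tool_calls entry (id, function.name, and function.arguments as a JSON-encoded string) is assumed from the OpenAI format and is not spelled out on this page; `get_weather`, `REGISTRY`, and `run_tool_calls` are hypothetical names.

```python
import json


def get_weather(city):
    # Stand-in implementation; a real one would query a weather service.
    return {"city": city, "temp_c": 18}


REGISTRY = {"get_weather": get_weather}


def run_tool_calls(assistant_message, registry):
    """Run each requested call and return the tool messages to send back."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        function = call["function"]
        # Arguments are assumed to arrive as a JSON-encoded string.
        arguments = json.loads(function.get("arguments") or "{}")
        output = registry[function["name"]](**arguments)
        results.append(
            {
                "role": "tool",
                "tool_call_id": call["id"],  # links the result to the call
                "content": json.dumps(output),
            }
        )
    return results


assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [
        {
            "id": "call_1",
            "type": "function",
            "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
        }
    ],
}
tool_messages = run_tool_calls(assistant_message, REGISTRY)
```

To continue the conversation, append the assistant message and then the tool messages to messages and send the request again.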
Example request
curl https://api.crossmodel.ai/v1/chat/completions \
-H "Authorization: Bearer cm-YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek/deepseek-v4-pro",
"messages": [
{ "role": "system", "content": "You are a helpful assistant." },
{ "role": "user", "content": "Hello!" }
],
"temperature": 0.7,
"max_completion_tokens": 256
}'
Response
A non-streaming request returns a chat.completion object.
{
"id": "chatcmpl_abc123",
"object": "chat.completion",
"created": 1720000000,
"model": "deepseek/deepseek-v4-pro",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "Hello! How can I help?"
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 24,
"completion_tokens": 18,
"total_tokens": 42,
"prompt_tokens_details": {
"cached_tokens": 0
},
"completion_tokens_details": {
"reasoning_tokens": 0
}
}
}
| Field | Type | Notes |
|---|---|---|
| id | string | A unique ID for this reply. |
| object | string | Always chat.completion. |
| created | integer | When the reply was created, as a Unix timestamp (seconds). |
| model | string | The model ID used for this request. |
| choices | array | The generated replies, usually one. |
| usage | object | Token usage for this request, used for billing. |
The choice object
| Field | Type | Notes |
|---|---|---|
| index | integer | The reply's position in the choices array. |
| message | object | The generated assistant message. |
| finish_reason | string | Why it stopped: stop (normal), length (hit the token cap), tool_calls (switched to a tool call), content_filter (blocked by content filtering). |
The assistant message
| Field | Type | Notes |
|---|---|---|
| role | string | Always assistant. |
| content | string or null | The generated text. null if the model only issued tool calls. |
| tool_calls | array | The function tools the model requested. |
The usage object
| Field | Type | Notes |
|---|---|---|
| prompt_tokens | integer | Input tokens. |
| completion_tokens | integer | Output tokens. |
| total_tokens | integer | Input plus output tokens. |
| prompt_tokens_details.cached_tokens | integer | Input tokens that hit the cache. |
| completion_tokens_details.reasoning_tokens | integer | Tokens spent on reasoning. |
Every response carries an x-request-id header. Including it when you report a problem helps us track it down fast.
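To show how the response fields fit together, here is a small sketch that pulls the reply text, finish reason, and usage counts out of a parsed chat.completion object. `summarize_completion` is a hypothetical helper, and the sample dictionary mirrors the example response on this page.

```python
def summarize_completion(response):
    """Extract the commonly used fields from a parsed chat.completion object."""
    choice = response["choices"][0]  # usually the only choice
    message = choice["message"]
    usage = response.get("usage", {})
    return {
        "text": message.get("content"),  # None if only tool calls were issued
        "tool_calls": message.get("tool_calls") or [],
        "finish_reason": choice["finish_reason"],
        "truncated": choice["finish_reason"] == "length",  # hit the token cap
        "total_tokens": usage.get("total_tokens"),
        "cached_tokens": usage.get("prompt_tokens_details", {}).get("cached_tokens", 0),
        "reasoning_tokens": usage.get("completion_tokens_details", {}).get("reasoning_tokens", 0),
    }


sample_response = {
    "id": "chatcmpl_abc123",
    "object": "chat.completion",
    "created": 1720000000,
    "model": "deepseek/deepseek-v4-pro",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Hello! How can I help?"},
            "finish_reason": "stop",
        }
    ],
    "usage": {
        "prompt_tokens": 24,
        "completion_tokens": 18,
        "total_tokens": 42,
        "prompt_tokens_details": {"cached_tokens": 0},
        "completion_tokens_details": {"reasoning_tokens": 0},
    },
}
summary = summarize_completion(sample_response)
```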
Streaming response
Set stream to true and the endpoint returns text/event-stream. The reply is split into chunks: each event starts with data: and carries a chat.completion.chunk object, and the stream ends with data: [DONE].
curl https://api.crossmodel.ai/v1/chat/completions \
-H "Authorization: Bearer cm-YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{
"model": "deepseek/deepseek-v4-pro",
"messages": [{ "role": "user", "content": "Hi" }],
"stream": true
}'
data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1720000000,"model":"deepseek/deepseek-v4-pro","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}
data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1720000000,"model":"deepseek/deepseek-v4-pro","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}
data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1720000000,"model":"deepseek/deepseek-v4-pro","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}
data: [DONE]
Each chunk's delta carries the newly added content; concatenate every delta.content to rebuild the full reply. With stream_options.include_usage set to true, a final chunk with empty choices and a usage field is pushed before the end, giving you the total usage for the request.
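The reassembly described above can be sketched as a small parser over the event lines. `collect_stream` is a hypothetical helper that expects already-decoded text lines and assumes a single choice per chunk. The sample chunks are trimmed to the fields the parser reads, and the usage numbers are made up for illustration.

```python
import json


def collect_stream(lines):
    """Rebuild the reply text, finish reason, and usage (if sent) from SSE lines."""
    parts = []
    finish_reason = None
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break  # end-of-stream marker
        chunk = json.loads(data)
        if chunk.get("usage"):
            usage = chunk["usage"]  # only present with include_usage
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            if delta.get("content"):
                parts.append(delta["content"])
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason, usage


sample_lines = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"lo"},"finish_reason":null}]}',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    'data: {"choices":[],"usage":{"prompt_tokens":8,"completion_tokens":2,"total_tokens":10}}',
    "data: [DONE]",
]
text, finish_reason, usage = collect_stream(sample_lines)
```

When reading from a live HTTP response, each line would need decoding from bytes (for example with `.decode("utf-8")`) before being passed in.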
Errors
On error, the endpoint returns the matching HTTP status code with this shared error shape:
{
"error": {
"message": "Missing Bearer token in Authorization header.",
"type": "authentication_error",
"param": null,
"code": "missing_api_key"
}
}
| Field | Notes |
|---|---|
| message | A human-readable description. |
| type | The broad error category. |
| param | The offending parameter, or null when there isn't a specific one. |
| code | A specific error code to branch on in code. |
| HTTP status | type | Common code | Notes |
|---|---|---|---|
| 400 | invalid_request_error | invalid_json, invalid_messages, unsupported_message_role, unsupported_parameter, tool_call_id_mismatch | Malformed body JSON, message structure, or parameters. |
| 401 | authentication_error | missing_api_key, invalid_api_key | API key missing or invalid. |
| 402 | billing_error | insufficient_balance | Account out of balance. |
| 404 | invalid_request_error | model_not_found | The requested model doesn't exist or is currently unavailable. |
| 429 | rate_limit_error | rate_limit_exceeded, provider_rate_limited | Too many requests; the caller is being rate limited. |
| 502 | provider_error | provider_error | The model service returned an error or an abnormal response. |
| 503 | api_error | model_unavailable | Model temporarily unavailable; retry shortly. |
| 500 | api_error | internal_error | A CrossModel internal error. |
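One way to turn the shared error shape into something code can branch on is sketched below. The class name `CrossModelError` and the helper `parse_error` are hypothetical. Treating 429 and 503 as retryable is a policy assumption of this sketch: the table only says to retry shortly for 503, and backing off on rate limits is a common convention.

```python
import json

RETRYABLE_STATUSES = {429, 503}  # assumption of this sketch, not an API guarantee


class CrossModelError(Exception):
    """Carries the fields of the shared error shape plus the HTTP status."""

    def __init__(self, status, error_type, code, message, param=None):
        super().__init__(f"{status} {error_type}/{code}: {message}")
        self.status = status
        self.type = error_type
        self.code = code
        self.message = message
        self.param = param

    @property
    def retryable(self):
        return self.status in RETRYABLE_STATUSES


def parse_error(status, body):
    """Build a CrossModelError from an HTTP status and a JSON error body."""
    error = json.loads(body).get("error", {})
    return CrossModelError(
        status,
        error.get("type"),
        error.get("code"),
        error.get("message"),
        error.get("param"),
    )


sample_body = json.dumps(
    {
        "error": {
            "message": "Missing Bearer token in Authorization header.",
            "type": "authentication_error",
            "param": None,
            "code": "missing_api_key",
        }
    }
)
auth_error = parse_error(401, sample_body)
```

Branching on `code` rather than `message` keeps the handling stable if the human-readable text changes.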