CrossModel

Chat Completions

POST /v1/chat/completions

Send a list of conversation messages to a model and get a generated reply. This is CrossModel's core endpoint — it supports multi-turn chat, streaming, image input, and function/tool calling.

This endpoint is fully compatible with OpenAI Chat Completions. If you already have a project on the OpenAI SDK, point base_url at CrossModel, authenticate with your CrossModel API key, and set model to a CrossModel model ID. The rest of your code stays the same.

Endpoint

POST https://api.crossmodel.ai/v1/chat/completions
Authorization: Bearer cm-YOUR_KEY
Content-Type: application/json

Every request must carry a Bearer API key in the Authorization header. Create and manage keys on the console's API Keys page.

Create chat completion

curl https://api.crossmodel.ai/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $CROSSMODEL_API_KEY" \
  -d '{
    "model": "deepseek/deepseek-v4-pro",
    "messages": [
      {
        "role": "system",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
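
The same request can be assembled in Python with only the standard library. This is a minimal sketch: build_request is a hypothetical helper name, and the request is built but deliberately not sent, so you can inspect the headers and body first.

```python
import json
import urllib.request

API_URL = "https://api.crossmodel.ai/v1/chat/completions"


def build_request(api_key: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) an authenticated chat completion request."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "model": "deepseek/deepseek-v4-pro",
    "messages": [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
}
request = build_request("cm-YOUR_KEY", payload)
# To send it: urllib.request.urlopen(request), then json.load() the response body.
```

Replace cm-YOUR_KEY with a real key, ideally read from an environment variable rather than hard-coded.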

Request parameters

| Parameter | Type | Required | Notes |
| --- | --- | --- | --- |
| model | string | Yes | The model ID to use, e.g. deepseek/deepseek-v4-pro. Call /v1/models for the full list. |
| messages | array | Yes | The conversation so far, in chronological order, with at least one entry. See Message object below. |
| stream | boolean | No | Whether to stream the result. When true, the reply is pushed as Server-Sent Events. Default false. |
| stream_options | object | No | Extra streaming options; supports include_usage. Only applies when stream is true. |
| max_completion_tokens | integer | No | Max tokens to generate for this reply. Takes precedence over max_tokens when both are present. |
| max_tokens | integer | No | Max tokens to generate for this reply. Legacy parameter; new code should use max_completion_tokens. |
| temperature | number | No | Sampling temperature, typically 0 to 2. Higher is more random and varied; lower is more deterministic. |
| top_p | number | No | Nucleus-sampling threshold; the model samples only from candidates whose cumulative probability reaches top_p. Usually adjust either temperature or top_p, not both. |
| stop | string or string[] | No | One or more stop sequences. Generation halts at the first match, and the sequence itself isn't included in the output. |
| tools | array | No | The tools the model may call. See Function tools below. |
| tool_choice | string or object | No | Controls whether and how the model calls tools: auto, none, required, or an object naming a specific function. |
| reasoning_effort | string | No | How much reasoning the model spends before answering: none, minimal, low, medium, high, or xhigh. Other values return an error. The effect depends on whether the model is a reasoning model. |
| safety_identifier | string | No | A stable identifier for your end user, for safety and abuse detection. Pass a hashed user ID, not a raw email or similar. |
| user | string | No | Legacy alias for safety_identifier. When both are present, safety_identifier wins. |
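
A request body that follows the rules in the table might be assembled as in the sketch below. build_payload is a hypothetical helper, not part of any SDK. It checks reasoning_effort client-side, emits only max_completion_tokens (never the legacy max_tokens), and hashes the raw user ID before using it as safety_identifier. SHA-256 is an assumed choice here; the table only asks for a hashed ID.

```python
import hashlib

ALLOWED_REASONING_EFFORT = {"none", "minimal", "low", "medium", "high", "xhigh"}


def build_payload(model, messages, *, user_id=None, reasoning_effort=None,
                  max_completion_tokens=None, **extra):
    """Assemble a request body, applying the table's rules on the client side."""
    if not messages:
        raise ValueError("messages must contain at least one entry")
    payload = {"model": model, "messages": list(messages)}
    if reasoning_effort is not None:
        if reasoning_effort not in ALLOWED_REASONING_EFFORT:
            raise ValueError(f"unsupported reasoning_effort: {reasoning_effort!r}")
        payload["reasoning_effort"] = reasoning_effort
    if max_completion_tokens is not None:
        payload["max_completion_tokens"] = max_completion_tokens
    if user_id is not None:
        # Hash the raw ID so no email or account name leaves your system.
        payload["safety_identifier"] = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    payload.update(extra)  # pass-through for temperature, top_p, stop, and so on
    return payload


payload = build_payload(
    "deepseek/deepseek-v4-pro",
    [{"role": "user", "content": "Hello!"}],
    user_id="alice@example.com",
    reasoning_effort="low",
    max_completion_tokens=256,
    temperature=0.7,
)
```

Validating locally only saves a round trip; the endpoint remains the authority on which values it accepts.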

Extended parameters

Beyond the above, you can pass the following common parameters. Whether they take effect depends on the model — not every model supports them.

| Parameter | Notes |
| --- | --- |
| presence_penalty, frequency_penalty | Sampling controls for repetition and topic drift. |
| response_format | Constrains the output format, e.g. JSON mode or structured output. |
| parallel_tool_calls | Whether the model may issue multiple tool calls in parallel within one reply. |
| n | Generate multiple candidate replies for the same messages. |
| logprobs, top_logprobs | Return log-probability information for generated tokens. |
| modalities, audio | Request non-text output modalities, e.g. audio. Only for models that support it. |
| prediction | Provide known output to help capable models generate faster. |
| seed | Fix the random seed for more reproducible results on supported models. |
| metadata, store, service_tier, verbosity, web_search_options | Extended capabilities specific to individual models. |

Message object

messages is a chronological array with at least one message.

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| role | string | Yes | The sender's role: system, developer, user, assistant, or tool. Other values return an error. |
| content | string, array, or null | No | The message content. Use a string for plain text; use a content-block array for multimodal content (see Content blocks); an assistant message that only issues tool calls may be null. |
| name | string | No | The sender's name, to distinguish participants within the same role. |
| tool_calls | array | No | Tool calls issued by an assistant message. |
| tool_call_id | string | Conditional | Required when role is tool; points at the corresponding tool call in the previous assistant message, linking the result to the call. |

What each role is for:

  • system / developer: set the model's overall behavior and instructions.
  • user: the end user's input.
  • assistant: the model's reply — text and/or tool calls.
  • tool: the result of running a tool, used together with tool_call_id.
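
The role and tool_call_id rules above can be checked before a request is sent. The sketch below is illustrative only: check_messages is a hypothetical helper, and it assumes each tool_calls entry carries an id field, as in the OpenAI format this endpoint is compatible with.

```python
VALID_ROLES = {"system", "developer", "user", "assistant", "tool"}


def check_messages(messages):
    """Return a list of problems found in a messages array (empty if none)."""
    problems = []
    if not messages:
        problems.append("messages must contain at least one entry")
    open_call_ids = set()  # tool call IDs issued by the latest assistant message
    for i, msg in enumerate(messages):
        role = msg.get("role")
        if role not in VALID_ROLES:
            problems.append(f"messages[{i}]: unsupported role {role!r}")
        elif role == "assistant":
            open_call_ids = {call["id"] for call in msg.get("tool_calls") or []}
        elif role == "tool":
            call_id = msg.get("tool_call_id")
            if call_id is None:
                problems.append(f"messages[{i}]: tool message needs tool_call_id")
            elif call_id not in open_call_ids:
                problems.append(f"messages[{i}]: tool_call_id {call_id!r} matches no prior call")
    return problems


conversation = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "What's the weather in Paris?"},
    {"role": "assistant", "content": None, "tool_calls": [
        {"id": "call_1", "type": "function",
         "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'}},
    ]},
    {"role": "tool", "tool_call_id": "call_1", "content": '{"temp_c": 18}'},
]
problems = check_messages(conversation)
```

An empty problems list means the conversation passes these local checks; the endpoint may still apply stricter validation.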

Content blocks

When content is an array, each element is a content block with a type field:

| type | Fields | Notes |
| --- | --- | --- |
| text | text | A piece of text input. |
| image_url | image_url.url, image_url.detail | Image input. url can be a public image URL or a data:image/...;base64,... data URL; detail controls parsing fidelity. |
| input_audio | input_audio.data, input_audio.format | Audio input. data is base64-encoded audio; format is the audio format, e.g. wav, mp3. |
| video_url | video_url.url | Video input. url is a model-reachable video URL; supported on some models only. |
| video | video | Video key-frame input, usually an array of image URLs; supported on some models only. |
| file | file | File input; supported on some models only. |
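
As an illustration of the text and image_url blocks, the sketch below builds a multimodal user message from raw image bytes using a base64 data URL. The helper names text_block and image_block are hypothetical, and the eight bytes used here are only the PNG file signature standing in for a real image.

```python
import base64
from typing import Optional


def text_block(text: str) -> dict:
    """Wrap plain text in a text content block."""
    return {"type": "text", "text": text}


def image_block(data: bytes, mime_type: str = "image/png",
                detail: Optional[str] = None) -> dict:
    """Wrap raw image bytes in an image_url content block using a data URL."""
    encoded = base64.b64encode(data).decode("ascii")
    image_url = {"url": f"data:{mime_type};base64,{encoded}"}
    if detail is not None:
        image_url["detail"] = detail  # passed through unchanged
    return {"type": "image_url", "image_url": image_url}


message = {
    "role": "user",
    "content": [
        text_block("What is in this picture?"),
        # Placeholder bytes: read a real file with open(path, "rb").read() instead.
        image_block(b"\x89PNG\r\n\x1a\n", mime_type="image/png"),
    ],
}
```

For images that are already hosted publicly, pass the URL directly in image_url.url and skip the encoding step.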

Function tools

With the tools field you declare functions the model can call. The model returns a structured call request when needed; your code runs it and passes the result back via a role: "tool" message.

{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a city.",
        "parameters": {
          "type": "object",
          "properties": {
            "city": { "type": "string" }
          },
          "required": ["city"]
        },
        "strict": true
      }
    }
  ],
  "tool_choice": "auto"
}

| Field | Type | Required | Notes |
| --- | --- | --- | --- |
| tools[].type | string | Yes | Tool type; currently function. |
| tools[].function.name | string | Yes | The function name the model uses to reference the tool. |
| tools[].function.description | string | No | What the function does. A clear description helps the model decide when to call it. |
| tools[].function.parameters | object | No | JSON Schema for the function's parameters. Omit for no parameters. |
| tools[].function.strict | boolean | No | Whether to require the model to generate arguments strictly matching the schema. |
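
The round trip described at the top of this section (the model requests a call, your code runs it, the result goes back as a role: "tool" message) might look like the sketch below. It assumes the OpenAI-style tool call shape, where each entry has an id and function.arguments is a JSON-encoded string; this page does not spell that shape out. get_weather, TOOL_REGISTRY, and run_tool_calls are hypothetical names, and the weather data is made up.

```python
import json


def get_weather(city: str) -> dict:
    """Hypothetical local implementation of the get_weather tool."""
    return {"city": city, "temp_c": 18}


TOOL_REGISTRY = {"get_weather": get_weather}


def run_tool_calls(assistant_message: dict) -> list:
    """Run each requested function and return the role "tool" messages to send back."""
    results = []
    for call in assistant_message.get("tool_calls") or []:
        func = TOOL_REGISTRY[call["function"]["name"]]
        # Assumption: arguments arrive as a JSON string, as in the OpenAI format.
        arguments = json.loads(call["function"]["arguments"])
        results.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": json.dumps(func(**arguments)),
        })
    return results


assistant_message = {
    "role": "assistant",
    "content": None,
    "tool_calls": [{
        "id": "call_1",
        "type": "function",
        "function": {"name": "get_weather", "arguments": '{"city": "Paris"}'},
    }],
}
tool_messages = run_tool_calls(assistant_message)
```

Append the assistant message and then tool_messages to the conversation, and call the endpoint again so the model can use the results.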

Example request

curl https://api.crossmodel.ai/v1/chat/completions \
  -H "Authorization: Bearer cm-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v4-pro",
    "messages": [
      { "role": "system", "content": "You are a helpful assistant." },
      { "role": "user", "content": "Hello!" }
    ],
    "temperature": 0.7,
    "max_completion_tokens": 256
  }'

Response

A non-streaming request returns a chat.completion object.

{
  "id": "chatcmpl_abc123",
  "object": "chat.completion",
  "created": 1720000000,
  "model": "deepseek/deepseek-v4-pro",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 24,
    "completion_tokens": 18,
    "total_tokens": 42,
    "prompt_tokens_details": {
      "cached_tokens": 0
    },
    "completion_tokens_details": {
      "reasoning_tokens": 0
    }
  }
}

| Field | Type | Notes |
| --- | --- | --- |
| id | string | A unique ID for this reply. |
| object | string | Always chat.completion. |
| created | integer | When the reply was created, as a Unix timestamp (seconds). |
| model | string | The model ID used for this request. |
| choices | array | The generated replies, usually one. |
| usage | object | Token usage for this request, used for billing. |

The choice object

| Field | Type | Notes |
| --- | --- | --- |
| index | integer | The reply's position in the choices array. |
| message | object | The generated assistant message. |
| finish_reason | string | Why it stopped: stop (normal), length (hit the token cap), tool_calls (switched to a tool call), content_filter (blocked by content filtering). |
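
Client code usually branches on finish_reason. The sketch below maps each documented value to a next step; interpret_choice and the returned labels are hypothetical, chosen only for illustration.

```python
def interpret_choice(choice: dict) -> str:
    """Map a choice's finish_reason to the next step a client might take."""
    reason = choice.get("finish_reason")
    if reason == "stop":
        return "done"        # the reply is complete
    if reason == "length":
        return "truncated"   # hit the token cap; consider a higher max_completion_tokens
    if reason == "tool_calls":
        return "run_tools"   # run message.tool_calls, then call the endpoint again
    if reason == "content_filter":
        return "blocked"     # output was blocked by content filtering
    return "unknown"         # missing or unrecognized value


choice = {
    "index": 0,
    "message": {"role": "assistant", "content": "Hello! How can I help?"},
    "finish_reason": "stop",
}
next_step = interpret_choice(choice)
```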

The assistant message

| Field | Type | Notes |
| --- | --- | --- |
| role | string | Always assistant. |
| content | string or null | The generated text. null if the model only issued tool calls. |
| tool_calls | array | The function tools the model requested. |

The usage object

| Field | Type | Notes |
| --- | --- | --- |
| prompt_tokens | integer | Input tokens. |
| completion_tokens | integer | Output tokens. |
| total_tokens | integer | Input plus output tokens. |
| prompt_tokens_details.cached_tokens | integer | Input tokens that hit the cache. |
| completion_tokens_details.reasoning_tokens | integer | Tokens spent on reasoning. |
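
To track spend across several requests, the usage objects can be summed. The sketch below is one possible accumulator; UsageTotals is a hypothetical class, and it treats the two detail objects as optional in case a response omits them.

```python
from dataclasses import dataclass


@dataclass
class UsageTotals:
    """Running token totals across several chat.completion responses."""
    prompt_tokens: int = 0
    completion_tokens: int = 0
    cached_tokens: int = 0
    reasoning_tokens: int = 0

    def add(self, usage: dict) -> None:
        self.prompt_tokens += usage.get("prompt_tokens", 0)
        self.completion_tokens += usage.get("completion_tokens", 0)
        # Fall back to empty dicts in case the detail objects are missing or null.
        self.cached_tokens += (usage.get("prompt_tokens_details") or {}).get("cached_tokens", 0)
        self.reasoning_tokens += (usage.get("completion_tokens_details") or {}).get("reasoning_tokens", 0)

    @property
    def total_tokens(self) -> int:
        # Input plus output, matching the definition of total_tokens above.
        return self.prompt_tokens + self.completion_tokens


sample_usage = {
    "prompt_tokens": 24,
    "completion_tokens": 18,
    "total_tokens": 42,
    "prompt_tokens_details": {"cached_tokens": 0},
    "completion_tokens_details": {"reasoning_tokens": 0},
}
totals = UsageTotals()
totals.add(sample_usage)
totals.add(sample_usage)
```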

Every response carries an x-request-id header. Including it when you report a problem helps us track it down fast.

Streaming response

Set stream to true and the endpoint returns text/event-stream. The reply is split into chunks: each event starts with data: and carries a chat.completion.chunk object, ending with data: [DONE].

curl https://api.crossmodel.ai/v1/chat/completions \
  -H "Authorization: Bearer cm-YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek/deepseek-v4-pro",
    "messages": [{ "role": "user", "content": "Hi" }],
    "stream": true
  }'

The response body is a stream of events like these:

data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1720000000,"model":"deepseek/deepseek-v4-pro","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1720000000,"model":"deepseek/deepseek-v4-pro","choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}

data: {"id":"chatcmpl_abc123","object":"chat.completion.chunk","created":1720000000,"model":"deepseek/deepseek-v4-pro","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]

Each chunk's delta carries the newly added content; concatenate every delta.content to rebuild the full reply. With stream_options.include_usage set to true, a final chunk with empty choices and a usage field is pushed before the end, giving you the total usage for the request.
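
The reassembly described above can be sketched as a small parser over the event lines. collect_stream is a hypothetical helper; the sample lines are trimmed to the fields the parser reads, and the usage numbers in them are made up for illustration.

```python
import json


def collect_stream(lines):
    """Rebuild the reply text (plus finish_reason and usage, if sent) from SSE lines."""
    parts = []
    finish_reason = None
    usage = None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and anything that is not a data line
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            break
        chunk = json.loads(data)
        if chunk.get("usage"):
            usage = chunk["usage"]  # final chunk when stream_options.include_usage is true
        for choice in chunk.get("choices", []):
            content = choice.get("delta", {}).get("content")
            if content:
                parts.append(content)
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason, usage


sample_lines = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}',
    '',
    'data: {"choices":[{"index":0,"delta":{"content":"Hello"},"finish_reason":null}]}',
    '',
    'data: {"choices":[{"index":0,"delta":{"content":"!"},"finish_reason":null}]}',
    '',
    'data: {"choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}',
    '',
    'data: {"choices":[],"usage":{"prompt_tokens":8,"completion_tokens":2,"total_tokens":10}}',
    '',
    'data: [DONE]',
]
text, finish_reason, usage = collect_stream(sample_lines)
```

In a real client, feed the function the decoded lines of the HTTP response body as they arrive instead of a prepared list.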

Errors

On error, the endpoint returns the matching HTTP status code with this shared error shape:

{
  "error": {
    "message": "Missing Bearer token in Authorization header.",
    "type": "authentication_error",
    "param": null,
    "code": "missing_api_key"
  }
}

| Field | Notes |
| --- | --- |
| message | A human-readable description. |
| type | The broad error category. |
| param | The offending parameter, or null when there isn't a specific one. |
| code | A specific error code to branch on in code. |

| HTTP status | type | Common codes | Notes |
| --- | --- | --- | --- |
| 400 | invalid_request_error | invalid_json, invalid_messages, unsupported_message_role, unsupported_parameter, tool_call_id_mismatch | Malformed body JSON, message structure, or parameters. |
| 401 | authentication_error | missing_api_key, invalid_api_key | API key missing or invalid. |
| 402 | billing_error | insufficient_balance | Account out of balance. |
| 404 | invalid_request_error | model_not_found | The requested model doesn't exist or is currently unavailable. |
| 429 | rate_limit_error | rate_limit_exceeded, provider_rate_limited | Too many requests; rate limited. |
| 500 | api_error | internal_error | A CrossModel internal error. |
| 502 | provider_error | provider_error | The model service returned an error or an abnormal response. |
| 503 | api_error | model_unavailable | Model temporarily unavailable; retry shortly. |
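
One way to consume the shared error shape is to wrap it in an exception and branch on code. The sketch below is an assumption-laden example: CrossModelError and parse_error are hypothetical names, and the retry policy is a guess. Only 503 is described above as worth retrying; treating 429 as retryable after a delay is a common convention rather than something this page states.

```python
import json

# Assumption: statuses a client may retry after a short wait.
RETRYABLE_STATUSES = {429, 503}


class CrossModelError(Exception):
    """Hypothetical exception type wrapping the shared error shape."""

    def __init__(self, status: int, error: dict):
        super().__init__(error.get("message") or "unknown error")
        self.status = status
        self.type = error.get("type")
        self.param = error.get("param")
        self.code = error.get("code")

    @property
    def retryable(self) -> bool:
        return self.status in RETRYABLE_STATUSES


def parse_error(status: int, body: str) -> CrossModelError:
    """Turn an HTTP status and response body into a CrossModelError."""
    try:
        error = json.loads(body)["error"]
    except (ValueError, KeyError, TypeError):
        error = None
    if not isinstance(error, dict):
        error = {"message": body}  # body was not in the documented shape
    return CrossModelError(status, error)


body = (
    '{"error": {"message": "Missing Bearer token in Authorization header.", '
    '"type": "authentication_error", "param": null, "code": "missing_api_key"}}'
)
err = parse_error(401, body)
```

Branch on err.code for specific handling (for example, prompting for a key on missing_api_key) and on err.retryable to decide whether to back off and try again.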