POST /v1/chat/completions

Creates a chat completion. This is the primary endpoint for generating AI responses. Requests are routed to the appropriate government cloud provider based on the model you specify. Supports text, multi-turn conversations, tool use, image input (vision), and document input (PDF).

Request

Headers

Header       | Required | Description
x-api-key    | Yes      | Your API key
Content-Type | Yes      | application/json

Body Parameters

Parameter      | Type             | Required | Description
model          | string           | Yes      | Model ID to use (see Models)
messages       | array            | Yes      | List of messages (max 256)
temperature    | float            | No       | Sampling temperature, 0.0 to 2.0
max_tokens     | integer          | No       | Maximum tokens to generate
top_p          | float            | No       | Nucleus sampling parameter, 0.0 to 1.0
stream         | boolean          | No       | Whether to stream the response via SSE
stream_options | object           | No       | Streaming options (e.g. {"include_usage": true})
stop           | string or array  | No       | Stop sequence(s) to end generation
tools          | array            | No       | Tools the model may call (max 128). See Tool Use.
tool_choice    | string or object | No       | Controls tool selection: auto, none, required, or a specific tool

Message Object

Field        | Type            | Required | Description
role         | string          | Yes      | One of system, user, assistant, or tool
content      | string or array | Yes      | The message content. Can be a plain string (max 1 MB) or an array of content parts (text and image_url). See Image Input.
tool_calls   | array           | No       | Tool calls made by the assistant (on assistant messages, max 128)
tool_call_id | string          | No       | ID of the tool call this message responds to (on tool messages, max 256 chars)

Response

Non-Streaming

{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "claude-3-7-sonnet:il5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello! How can I help you today?"
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 12,
    "total_tokens": 22
  }
}

Streaming

Set stream: true to receive the response as Server-Sent Events (SSE). Each event is a JSON chunk prefixed with data: , and the stream ends with a final data: [DONE] event.
data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1700000000,"model":"claude-3-7-sonnet:il5","choices":[{"index":0,"delta":{"role":"assistant"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1700000000,"model":"claude-3-7-sonnet:il5","choices":[{"index":0,"delta":{"content":"Hello!"},"finish_reason":null}]}

data: {"id":"chatcmpl-abc123","object":"chat.completion.chunk","created":1700000000,"model":"claude-3-7-sonnet:il5","choices":[{"index":0,"delta":{},"finish_reason":"stop"}]}

data: [DONE]
Set stream_options: {"include_usage": true} to receive token usage in the final chunk.
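The chunk format above can be consumed with a small line-oriented parser. A sketch (iter_chunks and collect_text are illustrative helper names, not part of any SDK):

```python
import json

def iter_chunks(lines):
    """Yield parsed JSON chunks from SSE lines, stopping at data: [DONE]."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data: "):
            continue                      # skip blank keep-alive lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            return
        yield json.loads(payload)

def collect_text(lines):
    """Concatenate the delta.content fragments into the full assistant reply."""
    parts = []
    for chunk in iter_chunks(lines):
        delta = chunk["choices"][0]["delta"]
        if "content" in delta:
            parts.append(delta["content"])
    return "".join(parts)
```

Any HTTP client that exposes the response line by line works here; with httpx, for example, you can pass response.iter_lines() from a streaming request.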

Finish Reasons

Value          | Meaning
stop           | Natural end of response or stop sequence hit
length         | Hit max_tokens limit
tool_calls     | The model is invoking one or more tools
content_filter | Content was filtered by the provider’s safety system

Tool Use

Pass a tools array to let the model call functions. The model will respond with tool_calls when it wants to use a tool, and you send back the result in a tool message.

Tool Definition

Function name must match ^[a-zA-Z0-9_-]{1,64}$ (ASCII letters, digits, underscore, hyphen; 1 to 64 characters), matching the OpenAI and Anthropic tool name specs. Names outside this pattern return 400 invalid_request_error. description is limited to 65,536 characters.
{
  "tools": [
    {
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get weather for a location",
        "parameters": {
          "type": "object",
          "properties": {
            "location": { "type": "string" }
          },
          "required": ["location"]
        }
      }
    }
  ]
}
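If you validate tool definitions client-side before sending, the name pattern is easy to check up front. A sketch (validate_tool_names is our own helper, not part of the gateway or any SDK):

```python
import re

# Gateway tool-name pattern, as documented above.
TOOL_NAME_RE = re.compile(r"^[a-zA-Z0-9_-]{1,64}$")

def validate_tool_names(tools):
    """Return the function names in a tools array that would be rejected
    with 400 invalid_request_error by the gateway's name check."""
    return [
        tool["function"]["name"]
        for tool in tools
        if not TOOL_NAME_RE.fullmatch(tool["function"]["name"])
    ]
```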

Tool Call Response

When the model calls a tool, the response includes tool_calls instead of (or alongside) content:
{
  "choices": [
    {
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "get_weather",
              "arguments": "{\"location\": \"Washington, DC\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ]
}

Sending Tool Results

Include the tool call result in a follow-up message with role: "tool":
{
  "messages": [
    {"role": "user", "content": "What's the weather in DC?"},
    {"role": "assistant", "content": null, "tool_calls": [{"id": "call_abc123", "type": "function", "function": {"name": "get_weather", "arguments": "{\"location\": \"Washington, DC\"}"}}]},
    {"role": "tool", "tool_call_id": "call_abc123", "content": "72°F and sunny"}
  ]
}
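The round trip above can be sketched in a few lines of Python. run_tool_calls is a hypothetical helper, not part of any SDK: it appends the assistant turn, executes each tool call against a handlers mapping you supply, and appends the matching role: "tool" messages.

```python
import json

def run_tool_calls(messages, assistant_message, handlers):
    """Append the assistant turn, execute each tool call through the
    handlers mapping (tool name -> callable), and append one role:"tool"
    message per call, carrying the matching tool_call_id."""
    messages.append(assistant_message)
    for call in assistant_message.get("tool_calls") or []:
        fn = call["function"]
        args = json.loads(fn["arguments"])   # arguments arrive as a JSON string
        result = handlers[fn["name"]](**args)
        messages.append({
            "role": "tool",
            "tool_call_id": call["id"],
            "content": str(result),
        })
    return messages
```

Send the returned messages list in a second /v1/chat/completions request to get the model's final answer.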

Rejected Tool Schemas

The gateway rejects tool definitions whose parameter schemas include property names that clearly describe an outbound destination. Such schemas have the shape of a data-exfiltration tool, and accepting them at the gateway would be careless regardless of what the caller intends to do with the result. Rejected property names (case-insensitive):
  • Destination names: destination, destination_url, dest_url, dst_url
  • Webhook names: webhook, webhook_url, webhooks
  • Callback names: callback, callback_url
  • Send and forward names: forward_to, forward_url, send_to, post_to, push_to
  • Target names: target_url, target_host
  • Named sinks: upload_url, ingest_url, notification_url, notify_url, report_url, sink_url
  • Obvious intent: exfil_url, exfiltrate
The check walks the full JSON Schema tree, so hiding a denied name inside a nested property, an array items schema, a $defs entry, or a oneOf branch will not bypass it.

Ambiguous names that can legitimately read as well as write are not rejected: url, uri, endpoint, host, and hostname all pass at this layer. A database connection tool with host and port, or a tool that reads a record from an internal API by url, continues to work. The runtime check described in the next section handles what actually appears in the arguments.

A rejected request returns 400 invalid_request_error and identifies the offending tool and parameter:
{
  "error": {
    "type": "invalid_request_error",
    "message": "tools -> 0 -> function: Value error, Tool 'save_results' has a parameter named 'destination_url' that suggests an outbound destination. Tools with destination-bearing parameters are rejected by gateway governance to prevent data exfiltration via tool calls. If you have a legitimate use case for this parameter name, contact Consus support for an exception."
  }
}
If you have a real business case for a parameter that matches a rejected name, contact us and we can work through the exception together.
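If you want to catch rejections before a request ever leaves your application, a pre-flight check can mirror the rule. The denied list below is copied from this page; the gateway's actual implementation may differ, so treat this as a sketch rather than the authoritative check.

```python
# Denied property names, copied from the list above (all lowercase).
DENIED_NAMES = {
    "destination", "destination_url", "dest_url", "dst_url",
    "webhook", "webhook_url", "webhooks",
    "callback", "callback_url",
    "forward_to", "forward_url", "send_to", "post_to", "push_to",
    "target_url", "target_host",
    "upload_url", "ingest_url", "notification_url", "notify_url",
    "report_url", "sink_url",
    "exfil_url", "exfiltrate",
}

def find_denied_properties(schema):
    """Walk a JSON Schema and collect property names that match the denied
    list, case-insensitively. The generic recursion over dict values and
    list items reaches nested properties, array items schemas, $defs
    entries, and oneOf/anyOf/allOf branches."""
    hits = []
    if isinstance(schema, dict):
        for name in schema.get("properties", {}):
            if name.lower() in DENIED_NAMES:
                hits.append(name)
        for value in schema.values():
            hits.extend(find_denied_properties(value))
    elif isinstance(schema, list):
        for item in schema:
            hits.extend(find_denied_properties(item))
    return hits
```

Run it over each tool's parameters schema before sending; an empty result means the schema passes this layer.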

Tool Call Governance Metadata

When a model returns tool calls, the gateway scans each arguments payload for outbound destinations: URLs with a scheme (https://, ftp://, s3://, data:, mailto:) and raw IPv4 addresses. When any are found, the response includes an advisory field called x_consus_governance alongside the standard OpenAI body. The tool call itself is not modified; you still receive the real arguments so your application can run.

Response shape when destinations are detected:
{
  "id": "chatcmpl-abc123",
  "object": "chat.completion",
  "created": 1700000000,
  "model": "claude-3-7-sonnet:il5",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": null,
        "tool_calls": [
          {
            "id": "call_abc123",
            "type": "function",
            "function": {
              "name": "save_results",
              "arguments": "{\"url\": \"https://collector.example.com/ingest\", \"data\": \"...\"}"
            }
          }
        ]
      },
      "finish_reason": "tool_calls"
    }
  ],
  "usage": { "prompt_tokens": 20, "completion_tokens": 30, "total_tokens": 50 },
  "x_consus_governance": {
    "flags": [
      {
        "tool_call_id": "call_abc123",
        "tool_name": "save_results",
        "destinations": ["https://collector.example.com/ingest"],
        "reason": "external_destination"
      }
    ]
  }
}
When no destinations are found, the field is absent from the response.

This is an advisory signal. We do not block or redact the tool call. The purpose is to give your application something structured to act on before you execute a tool call whose destination came from the model output. The typical handling pattern is: check for the field, look up each destination against whatever allowlist or policy your application runs under, and surface it to a human or your policy engine when it looks unfamiliar.

Streaming responses carry the same field in the final SSE chunk (the chunk that includes finish_reason). Clients that already parse the final chunk for usage can pick up x_consus_governance from the same place.
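That handling pattern can be sketched as a small helper. The function name and the host-based allowlist policy are our own illustration; your policy engine may compare destinations differently.

```python
from urllib.parse import urlparse

def unapproved_destinations(response, allowed_hosts):
    """Return (tool_call_id, destination) pairs from x_consus_governance
    whose host is not on the caller's allowlist. Raw IPv4 destinations
    carry no scheme, so the bare string is compared when parsing yields
    no hostname."""
    flagged = []
    for flag in response.get("x_consus_governance", {}).get("flags", []):
        for dest in flag["destinations"]:
            host = urlparse(dest).hostname or dest
            if host not in allowed_hosts:
                flagged.append((flag["tool_call_id"], dest))
    return flagged
```

An empty result means every flagged destination was on your allowlist (or nothing was flagged); anything else is a candidate for human review before you execute the tool call.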

Examples

Basic Completion

curl -X POST https://api.consus.io/v1/chat/completions \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-7-sonnet:il5",
    "messages": [
      {"role": "user", "content": "What is FedRAMP?"}
    ]
  }'

Streaming

curl -X POST https://api.consus.io/v1/chat/completions \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-7-sonnet:il5",
    "messages": [
      {"role": "user", "content": "What is FedRAMP?"}
    ],
    "stream": true
  }'

With System Prompt and Parameters

curl -X POST https://api.consus.io/v1/chat/completions \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5:il5",
    "messages": [
      {"role": "system", "content": "You are a helpful government compliance assistant."},
      {"role": "user", "content": "Summarize CMMC Level 2 requirements."}
    ],
    "temperature": 0.3,
    "max_tokens": 2048
  }'

Multi-Turn Conversation

curl -X POST https://api.consus.io/v1/chat/completions \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-3-7-sonnet:il5",
    "messages": [
      {"role": "user", "content": "What is an ATO?"},
      {"role": "assistant", "content": "An ATO (Authority to Operate) is a formal authorization..."},
      {"role": "user", "content": "How long does it typically take to get one?"}
    ]
  }'

Image Input (Vision)

All available models (claude-3-7-sonnet:il5, claude-sonnet-4-5:il5, gemini-2-5-pro:il5, gemini-2-5-flash:il5) accept images. Supported formats: jpeg, png, gif, webp

How images are sent

To include an image, set content to an array instead of a plain string. Each element is a content part with a type field, either "text" or "image_url".
The image_url field name is inherited from OpenAI’s API format. Despite the name, you do not pass a URL. You pass the raw image bytes encoded as a base64 data URI. External URLs (https://...) are rejected with 400 to prevent data exfiltration.
A base64 data URI looks like this:
data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAA...
│    │         │       │
│    │         │       └─ your base64-encoded image bytes
│    │         └─ encoding must be "base64"
│    └─ MIME type, must match your actual image (image/jpeg, image/png, image/gif, image/webp)
└─ always starts with "data:"
You encode the raw file bytes to base64, prefix with data:<mime-type>;base64,, and put the whole string in the url field.

Request Format

{
  "model": "claude-3-7-sonnet:il5",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "What does this screenshot show?"
        },
        {
          "type": "image_url",
          "image_url": {
            "url": "data:image/png;base64,<YOUR_BASE64_BYTES_HERE>"
          }
        }
      ]
    }
  ]
}
  • content can contain any number of text and image_url parts in any order
  • Images are only valid in user messages. system and assistant messages must use a plain string for content.
  • Multiple images per message are supported (up to 20)

Size Limits

Limit                                 | Value
Per image (raw decoded)               | 3.5 MB
Total image data per message (base64) | 4.5 MB
Max images per message                | 20
Requests exceeding these limits are rejected with 400.
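To fail fast instead of waiting for the 400, you can check a message's image parts client-side. A sketch, assuming the limits above are binary megabytes (1024 × 1024 bytes); confirm the exact accounting with the gateway if you are near a boundary:

```python
import base64

MAX_IMAGE_RAW = int(3.5 * 1024 * 1024)    # per image, raw decoded bytes
MAX_MESSAGE_B64 = int(4.5 * 1024 * 1024)  # total base64 bytes per message
MAX_IMAGES = 20

def check_image_parts(parts):
    """Return a list of limit violations for a content array (empty = OK)."""
    problems = []
    total_b64 = 0
    images = [p for p in parts if p.get("type") == "image_url"]
    if len(images) > MAX_IMAGES:
        problems.append(f"{len(images)} images exceeds the 20-per-message cap")
    for i, part in enumerate(images):
        uri = part["image_url"]["url"]
        b64 = uri.split(",", 1)[1]            # drop the data:<mime>;base64, prefix
        total_b64 += len(b64)
        raw = len(base64.b64decode(b64))
        if raw > MAX_IMAGE_RAW:
            problems.append(f"image {i} is {raw} bytes decoded (limit 3.5 MB)")
    if total_b64 > MAX_MESSAGE_B64:
        problems.append(f"total base64 is {total_b64} bytes (limit 4.5 MB)")
    return problems
```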

Integrations handle this for you

If you’re using an integration like OpenCode or Cline, you don’t need to do any of this manually. Those tools encode images automatically when you paste a screenshot: just paste and send, and the base64 encoding and data URI formatting are handled behind the scenes. If you’re calling the API directly, read on.

Complete Examples

bash (curl)
# 1. Base64-encode your image file
IMAGE_B64=$(base64 -i screenshot.png)   # macOS
# IMAGE_B64=$(base64 -w 0 screenshot.png) # Linux

# 2. Send it. The mime type in the data URI must match your file (image/png, image/jpeg, etc.)
curl -X POST https://api.consus.io/v1/chat/completions \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"claude-3-7-sonnet:il5\",
    \"messages\": [{
      \"role\": \"user\",
      \"content\": [
        {\"type\": \"text\", \"text\": \"What does this screenshot show?\"},
        {\"type\": \"image_url\", \"image_url\": {\"url\": \"data:image/png;base64,${IMAGE_B64}\"}}
      ]
    }]
  }"
Python
import base64
import httpx

# 1. Read and encode the image
with open("screenshot.png", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

# 2. Send. Set the mime type to match your file.
httpx.post(
    "https://api.consus.io/v1/chat/completions",
    headers={"x-api-key": "YOUR_API_KEY"},
    json={
        "model": "claude-3-7-sonnet:il5",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "What does this screenshot show?"},
                    {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
                ],
            }
        ],
    },
)

Document Input (PDF)

All Claude and Gemini models support PDF document inputs. Supported file types: application/pdf

How documents are sent

To include a PDF, add a content part with "type": "file" to the content array. The file_data field must be a base64 data URI. External URLs are rejected for the same no-egress reason as images.
{
  "model": "claude-3-7-sonnet:il5",
  "messages": [
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "Summarize the key findings in this report."
        },
        {
          "type": "file",
          "file": {
            "filename": "report.pdf",
            "file_data": "data:application/pdf;base64,<YOUR_BASE64_BYTES_HERE>"
          }
        }
      ]
    }
  ]
}
  • Documents are only valid in user messages. system and assistant messages must use plain strings.
  • filename is metadata passed to the model; it is treated as untrusted input and sanitized before use
  • Text, images, and files can be mixed in a single message’s content array

Size Limits

Limit                                                      | Value
Per file (raw decoded)                                     | 3.5 MB
Total base64 content per message (images + files combined) | 4.5 MB
Max files per message                                      | 5
The combined image + file budget is shared at 4.5 MB per message, a limit imposed by Lambda’s 6 MB payload ceiling.

curl example

PDF_B64=$(base64 -i report.pdf)   # macOS
# PDF_B64=$(base64 -w 0 report.pdf) # Linux

curl -X POST https://api.consus.io/v1/chat/completions \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"claude-3-7-sonnet:il5\",
    \"messages\": [{
      \"role\": \"user\",
      \"content\": [
        {\"type\": \"text\", \"text\": \"Summarize this document.\"},
        {\"type\": \"file\", \"file\": {\"filename\": \"report.pdf\", \"file_data\": \"data:application/pdf;base64,${PDF_B64}\"}}
      ]
    }]
  }"

Python example

import base64
import httpx

with open("report.pdf", "rb") as f:
    b64 = base64.b64encode(f.read()).decode()

httpx.post(
    "https://api.consus.io/v1/chat/completions",
    headers={"x-api-key": "YOUR_API_KEY"},
    json={
        "model": "claude-3-7-sonnet:il5",
        "messages": [
            {
                "role": "user",
                "content": [
                    {"type": "text", "text": "Summarize this document."},
                    {"type": "file", "file": {"filename": "report.pdf", "file_data": f"data:application/pdf;base64,{b64}"}},
                ],
            }
        ],
    },
)