SDK Integration
The change
- baseURL: "https://api.anthropic.com" + baseURL: "http://localhost:7421"
That's the entire integration. The daemon accepts the Anthropic /v1/messages HTTP shape — same request body, same response shape, same content blocks, same tools field, same image/document blocks.
Your SDK doesn't know the difference. Your model doesn't either.
Python (anthropic SDK)
from anthropic import Anthropic
client = Anthropic(
api_key="unused",
base_url="http://localhost:7421",
)
msg = client.messages.create(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[{"role": "user", "content": "Reply with: OK"}],
)
print(msg.content[0].text)Streaming note: SSE streaming via stream: true is on the roadmap but not yet implementedin v1.1. The daemon parses the field and returns a buffered JSON response regardless. See the "What's NOT supported (today)" section below.
TypeScript / Node (@anthropic-ai/sdk)
import Anthropic from "@anthropic-ai/sdk";
const client = new Anthropic({
apiKey: "unused",
baseURL: "http://localhost:7421",
});
const msg = await client.messages.create({
model: "claude-sonnet-4-6",
max_tokens: 1024,
messages: [{ role: "user", content: "Reply with: OK" }],
});
console.log(msg.content[0].text);Go (anthropic-sdk-go)
package main
import (
"context"
"fmt"
"github.com/anthropics/anthropic-sdk-go"
"github.com/anthropics/anthropic-sdk-go/option"
)
func main() {
client := anthropic.NewClient(
option.WithAPIKey("unused"),
option.WithBaseURL("http://localhost:7421"),
)
msg, err := client.Messages.New(context.TODO(), anthropic.MessageNewParams{
Model: anthropic.F(anthropic.ModelClaudeSonnet4_6),
MaxTokens: anthropic.F(int64(1024)),
Messages: anthropic.F([]anthropic.MessageParam{
anthropic.NewUserMessage(anthropic.NewTextBlock("Reply with: OK")),
}),
})
if err != nil {
panic(err)
}
fmt.Println(msg.Content[0].Text)
}Rust (reqwest)
The daemon itself ships in Rust; dogfood example:
use serde_json::json;
#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
let client = reqwest::Client::new();
let body = json!({
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"messages": [{"role": "user", "content": "Reply with: OK"}]
});
let resp: serde_json::Value = client
.post("http://localhost:7421/v1/messages")
.json(&body)
.send()
.await?
.json()
.await?;
println!("{}", resp["content"][0]["text"].as_str().unwrap_or("?"));
Ok(())
}curl (any language with HTTP)
curl -X POST http://localhost:7421/v1/messages \
-H "Content-Type: application/json" \
-d '{
"model": "claude-sonnet-4-6",
"max_tokens": 1024,
"messages": [{"role": "user", "content": "Reply with: OK"}]
}'What works without changes
Every field on the standard messages.create request shape passes through transparently. The messages[] array (with role plus either a string content or an array of content blocks), top-level system prompts, sampling parameters (max_tokens, temperature, top_p) — the daemon parses the Anthropic shape and forwards unchanged.
Tool use works via the daemon's bundled MCP server. Send tools[] on the request body with the standard Anthropic shape (name, description, input_schema); tool calls come back as tool_use content blocks on the response. See Tools for the round-trip protocol.
Vision and document attachments ship as base64 content blocks inline in messages[].content — same shape the Anthropic API expects. The daemon writes the payload to a tempfile and mentions it to claude via @path. No size cap at this layer. See Attachments.
Multi-turn messages[] history passes through verbatim. By default the daemon renders the full history into a single prompt for a fresh Session — preserving the stateless contract. Set X-IR-Session-ID to opt into sticky multi-turn where the daemon maintains conversation state across calls.
What requires opt-in headers
X-IR-Session-ID (optional) routes the call to a sticky Session for multi-turn continuity:
curl -X POST http://localhost:7421/v1/messages \ -H "Content-Type: application/json" \ -H "X-IR-Session-ID: planner-loop-1" \ -d '...'
Without the header, the daemon mints a UUID per request → each call is stateless. Session id values must be ≤ 128 characters; longer values are silently ignored. See Sessions.
Streaming (v1.1.9+)
SDK calls with stream: true receive a standard Anthropic SSE event stream. Each text_deltaarrives at ~100ms intervals (the daemon's PTY poll cadence) as Claude renders the response — no longer buffered until end-of-turn. Tool-use blocks emit in a burst after the cooldown marker observes, matching Anthropic's wire protocol.
with client.messages.stream(
model="claude-sonnet-4-6",
max_tokens=1024,
messages=[{"role": "user", "content": "Explain monoids in two sentences."}],
) as stream:
for text in stream.text_stream:
print(text, end="", flush=True)What's NOT supported (today)
- API keys. The
api_keyfield is accepted but unused on the claude-cli path. The daemon authenticates via the license key entered at first launch + your Claude Code subscription. Don't put a real Anthropic API key inapi_key; it has no effect. anthropic_versionheader. Daemon ignores it; the SDK and the daemon always negotiate the same Anthropic-shape internally.anthropic-betafeatures not present in the bundled Claude Code. If your code uses a beta block, the daemon may pass it to claude and claude may not understand it.metadata,service_tierrequest fields — accepted, no-op.
Verification
# is the daemon up? curl http://localhost:7421/v1/health # is the license valid? curl http://localhost:7421/v1/license | jq .license.valid # how's the pool? curl http://localhost:7421/v1/sessions | jq .pool
OpenAI SDK (v1.1.15+)
The daemon also accepts the OpenAI Chat Completions wire shape at POST /v1/chat/completions. Apps coded against openai-python / @openai/openai-node / any OpenAI-compatible SDK point at IR with the same one-line change. Translation is bidirectional: request body, response shape, tool calling, and streaming all work.
from openai import OpenAI
client = OpenAI(
api_key="unused", # required by SDK, ignored by daemon
base_url="http://localhost:7421/v1", # the only line that changes
)
resp = client.chat.completions.create(
model="gpt-4o", # passed through; daemon's claude doesn't care
messages=[{"role": "user", "content": "Reply with: OK"}],
max_tokens=30,
)
print(resp.choices[0].message.content)Streaming works identically:
stream = client.chat.completions.create(
model="gpt-4o",
messages=[{"role": "user", "content": "Count 1 to 5"}],
stream=True,
)
for chunk in stream:
if chunk.choices[0].delta.content:
print(chunk.choices[0].delta.content, end="", flush=True)Tool calling works on the OpenAI function-calling shape — passed straight through to claude as Anthropic tool_use blocks.tool_calls come back on the response message; round-trip tool-role messages with tool_call_id + content for the next turn.
Vision (v1.1.16+) — image_url blocks with data:URLs work transparently. Claude reads the embedded image via the daemon's attachment writer. Remote http(s) URLs are not supported (would add an SSRF surface for marginal benefit) — fetch yourself and pass as data:image/png;base64,....
import base64
from openai import OpenAI
client = OpenAI(api_key="unused", base_url="http://localhost:7421/v1")
with open("chart.png", "rb") as f:
b64 = base64.b64encode(f.read()).decode()
resp = client.chat.completions.create(
model="gpt-4o",
messages=[{
"role": "user",
"content": [
{"type": "text", "text": "What's in this chart?"},
{"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
],
}],
max_tokens=300,
)
print(resp.choices[0].message.content)Not supported: legacy function_call field (use tools + tool_calls instead), audio modalities.
SDK quirks
A few notes that have surfaced in production integrations:
- Either SDK works. Anthropic SDK against
/v1/messagesOR OpenAI SDK against/v1/chat/completions. Same daemon, same subscription, two wire shapes. Pick whichever your app is already written for. - The Anthropic SDK retries 429s by default. The daemon doesn't emit 429s (no rate limit at this layer — your subscription pool is the limit). Disable SDK retries if you want explicit error handling.
- Long contexts work — the daemon bracketed-paste-routes prompts through the PTY, no argv length cap. Verified with the Bitcoin whitepaper (21KB).
Where to go next
- Pin a sticky session for multi-turn → Sessions
- Build an agent orchestrator → Agents Cookbook
- Tool use → Tools
- Vision / PDF / text attachments → Attachments
- Full HTTP surface → API Reference