Quickstart
Install inference-relay and make your first call from any language in five minutes.
Prerequisites
- An active Claude subscription (Pro, Max, or higher).
- A license key for inference-relay. Get one if you don't have it.
- macOS 12+, Windows 10+, or Linux with libc 2.31+.
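If you provision machines from a script, the OS requirement above can be checked before downloading anything. The following is a rough sketch, not part of inference-relay: `version_at_least` and `host_meets_minimum` are hypothetical helper names, and the Linux branch assumes a glibc-based system (on musl and similar, `platform.libc_ver()` may return empty strings, so the helper returns `None` rather than guessing).

```python
import platform
import re


def _parse(version):
    """Turn a dotted version such as "2.31" into a tuple of ints: (2, 31)."""
    return tuple(int(part) for part in re.findall(r"\d+", version))


def version_at_least(found, minimum):
    """Compare versions numerically, so "2.4" counts as older than "2.31"."""
    a, b = _parse(found), _parse(minimum)
    width = max(len(a), len(b))
    # Pad the shorter tuple with zeros so "12" and "12.0" compare as equal.
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def host_meets_minimum():
    """Return True or False, or None when the version cannot be determined."""
    system = platform.system()
    if system == "Darwin":
        found, minimum = platform.mac_ver()[0], "12"
    elif system == "Windows":
        # platform.release() reports the major release as a string, e.g. "10".
        found, minimum = platform.release(), "10"
    elif system == "Linux":
        # Assumption: glibc. Other libcs may yield an empty version string.
        found, minimum = platform.libc_ver()[1], "2.31"
    else:
        return False
    if not re.search(r"\d", found):
        return None
    return version_at_least(found, minimum)
```

The numeric comparison matters because a plain string comparison would rank "2.4" above "2.31".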
1. Install the daemon
macOS — Apple silicon:
curl -L https://r2.inference-relay.com/desktop/1.1.14/macos-aarch64/inference-relay_1.1.14_aarch64.dmg \
  -o ~/Downloads/inference-relay.dmg
open ~/Downloads/inference-relay.dmg
macOS — Intel: download inference-relay_1.1.14_x64.dmg and open.
Windows: download inference-relay_1.1.14_x64-setup.exe and run.
Linux: AppImage, .deb, or .rpm.
After install, the daemon starts automatically and binds 127.0.0.1:7421. You can verify with:
curl http://localhost:7421/v1/health
# → {"status":"ok","version":"1.1.11"}
2. First launch — clear the OS scare screen
We don't pay Apple's or Microsoft's certificate-of-trust tax. The cert authorities add latency, vendor lock-in, and zero security value for a localhost daemon. Your OS will warn you the app is “damaged” (macOS) or “unrecognized” (Windows). It isn't — it's just not signed by them. One command clears the misleading copy:
macOS
xattr -cr /Applications/inference-relay.app && \
  codesign --force --deep --sign - /Applications/inference-relay.app
This strips the quarantine and provenance attributes, then applies an ad-hoc signature so that Sequoia's hardened policy stops blocking the app. Relaunch from Applications afterward.
Windows
In the SmartScreen popup, click More info, then Run anyway.
Linux
.deb and .rpm install normally via your package manager. .AppImage needs one chmod:
chmod +x ./inference-relay_*.AppImage && ./inference-relay_*.AppImage
We don't pay the certificate-of-trust tax. Three reasons:
- Building software should be free. The web didn't require Apple's or Microsoft's permission to publish; neither should desktop software. A $99 / $299 annual gate on shipping is a tax the platform owners extract from every developer, and we won't compound it onto our customers.
- The cert doesn't verify trust. Anyone with $99 can get an Apple Developer ID — including bad actors. The check is “did this person have a working credit card,” not “is this software safe.”
- Zero security benefit for a localhost daemon. inference-relay binds 127.0.0.1 only; it never sees the network. The signing apparatus exists to vouch for software shipped over an untrusted distribution channel — that's not us.
3. Activate your license
On first run, the daemon prompts for your license key. Or set it via the environment:
IR_LICENSE_KEY=ir_live_xxxx inference-relay activate
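Before moving on to the first call, a script can wait until the daemon answers the health endpoint shown in step 1. This is a minimal sketch under stated assumptions: `fetch_health` and `wait_for_daemon` are hypothetical helper names, and the only thing taken from this page is that GET /v1/health returns JSON with `"status": "ok"`. The timeout and polling interval are arbitrary defaults.

```python
import json
import time
import urllib.request


def fetch_health(base_url):
    """GET {base_url}/v1/health and return the parsed JSON body."""
    with urllib.request.urlopen(f"{base_url}/v1/health", timeout=2) as resp:
        return json.loads(resp.read().decode("utf-8"))


def wait_for_daemon(base_url="http://localhost:7421", timeout_s=10.0,
                    interval_s=0.5, fetch=fetch_health):
    """Poll the health endpoint until it reports "ok" or the timeout expires.

    `fetch` is injectable so the loop can be exercised without a running daemon.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            payload = fetch(base_url)
            if isinstance(payload, dict) and payload.get("status") == "ok":
                return payload
            last_problem = f"unexpected payload {payload!r}"
        except (OSError, ValueError) as exc:
            # OSError covers refused connections and urllib's URLError;
            # ValueError covers a body that is not valid JSON.
            last_problem = repr(exc)
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"daemon not healthy after {timeout_s}s: {last_problem}"
            )
        time.sleep(interval_s)
```

Calling `wait_for_daemon()` with no arguments targets the default address from this guide and either returns the health payload or raises `TimeoutError`.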
4. Make your first call
From curl
curl -X POST http://localhost:7421/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 50,
    "messages": [{"role": "user", "content": "hello"}]
  }'
From Python
from anthropic import Anthropic

client = Anthropic(
    # Placeholder: the daemon authenticates with your Claude subscription, not this key.
    api_key="unused",
    base_url="http://localhost:7421",
)
msg = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=200,
    messages=[{"role": "user", "content": "Explain monoids."}],
)
print(msg.content[0].text)
From Node / TypeScript
import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: 'unused',
  baseURL: 'http://localhost:7421',
});
const msg = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 200,
  messages: [{ role: 'user', content: 'Explain monoids.' }],
});
What just happened
Your call hit the daemon at localhost:7421. The daemon pulled a pre-warmed Claude Code session from the Session Pool (~10 ms) and forwarded the request through that session, which authenticates against your Claude subscription rather than an Anthropic API key. It then returned the response as a single buffered JSON message in the Anthropic SDK's shape.
Your application can't tell the difference. The SDK is unchanged; only the base URL moved (base_url in Python, baseURL in TypeScript).
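To see how little is on the wire, here is a stdlib-only sketch of the same request the curl example sends, with no SDK involved. The helper names (`build_messages_request`, `first_text`, `ask`) are hypothetical. The sketch assumes the daemon's reply follows the Anthropic Messages shape, with a `content` list of blocks where text blocks carry `"type": "text"`.

```python
import json
import urllib.request


def build_messages_request(prompt, base_url="http://localhost:7421",
                           model="claude-sonnet-4-6", max_tokens=200):
    """Build (but do not send) a POST to /v1/messages, mirroring the curl example."""
    body = {
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/messages",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def first_text(response):
    """Return the text of the first text block in a Messages-shaped reply."""
    for block in response.get("content", []):
        if block.get("type") == "text":
            return block["text"]
    raise ValueError("response has no text content block")


def ask(prompt, **kwargs):
    """Send the request to the daemon and return the reply text."""
    request = build_messages_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as resp:
        return first_text(json.loads(resp.read().decode("utf-8")))
```

With the daemon running, `ask("hello", max_tokens=50)` would send the same body as the curl call. Pointing `base_url` somewhere else is the only change needed to target a different endpoint.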
Next steps
- SDK Integration — language-by-language patterns.
- Headless Operation — survive application restarts.
- Sessions — sticky multi-turn via X-IR-Session-ID.
- Agents Cookbook — five recipes.