Quickstart

Install inference-relay and make your first call from any language in five minutes.

Prerequisites

  • An active Claude subscription (Pro, Max, or higher).
  • A license key for inference-relay. Get one if you don't have it.
  • macOS 12+, Windows 10+, or Linux with libc 2.31+.

1. Install the daemon

macOS — Apple silicon:

curl -L https://r2.inference-relay.com/desktop/1.1.14/macos-aarch64/inference-relay_1.1.14_aarch64.dmg \
  -o ~/Downloads/inference-relay.dmg
open ~/Downloads/inference-relay.dmg

macOS — Intel: download inference-relay_1.1.14_x64.dmg and open.

Windows: download inference-relay_1.1.14_x64-setup.exe and run.

Linux: AppImage, .deb, or .rpm.

After install, the daemon starts automatically and binds 127.0.0.1:7421. You can verify with:

curl http://localhost:7421/v1/health
# → {"status":"ok","version":"1.1.14"}
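
In an install script it can help to wait for the daemon rather than check once. The following is a minimal sketch, not part of inference-relay itself: it polls the health endpoint shown above until the reply carries "status": "ok". The function name, timeout, and polling interval are illustrative assumptions.

```python
import json
import time
import urllib.error
import urllib.request

HEALTH_URL = "http://localhost:7421/v1/health"  # endpoint from this guide


def wait_for_daemon(url=HEALTH_URL, timeout=10.0, interval=0.25):
    """Poll the health endpoint until it reports status "ok".

    Returns the parsed health payload, or None if the deadline passes.
    The timeout and interval defaults are arbitrary choices for this sketch.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with urllib.request.urlopen(url, timeout=2) as resp:
                payload = json.load(resp)
            if isinstance(payload, dict) and payload.get("status") == "ok":
                return payload
        except (urllib.error.URLError, OSError, ValueError):
            pass  # not listening yet, or the reply was not valid JSON
        if time.monotonic() >= deadline:
            return None
        time.sleep(interval)
```

Calling wait_for_daemon() with no arguments targets the default address and returns the same JSON object that the curl command prints, or None if the daemon never answered.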

2. First launch — clear the OS scare screen

We don't pay Apple's or Microsoft's certificate-of-trust tax. The cert authorities add latency, vendor lock-in, and zero security value for a localhost daemon. Your OS will warn you the app is “damaged” (macOS) or “unrecognized” (Windows). It isn't — it's just not signed by them. One command clears the misleading copy:

macOS

xattr -cr /Applications/inference-relay.app && \
codesign --force --deep --sign - /Applications/inference-relay.app

The first command strips the quarantine and provenance attributes; the second re-stamps an ad-hoc signature so that Sequoia's hardened policy stops blocking the app. Then relaunch from Applications.

Windows

SmartScreen popup → More info → Run anyway. One click.

Linux

.deb and .rpm install normally via your package manager. .AppImage needs one chmod:

chmod +x ./inference-relay_*.AppImage && ./inference-relay_*.AppImage

Three reasons we don't pay the certificate-of-trust tax:

  • Building software should be free. The web didn't require Apple's or Microsoft's permission to publish; neither should desktop software. A $99 / $299 annual gate on shipping is a tax the platform owners extract from every developer, and we won't compound it onto our customers.
  • The cert doesn't verify trust. Anyone with $99 can get an Apple Developer ID — including bad actors. The check is “did this person have a working credit card,” not “is this software safe.”
  • Zero security benefit for a localhost daemon. inference-relay binds 127.0.0.1 only, so it accepts no connections from other machines. The signing apparatus exists to vouch for software shipped over an untrusted distribution channel, and that's not us.

3. Activate your license

On first run, the daemon prompts for your license key. Or set it via the environment:

IR_LICENSE_KEY=ir_live_xxxx inference-relay activate
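
For provisioning scripts, the same activation can be driven from Python. This is a sketch under assumptions: it presumes the inference-relay binary is on PATH and that activate exits non-zero on failure, neither of which this guide states. The helper names are hypothetical. The key is placed only in the child process's environment, mirroring the one-line shell form above.

```python
import os
import subprocess


def activation_command(license_key, base_env=None):
    """Return (argv, env) for running `inference-relay activate`.

    The key is added to a copy of the environment, so the calling
    process's own environment is left untouched.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["IR_LICENSE_KEY"] = license_key  # variable name from this guide
    return ["inference-relay", "activate"], env


def activate(license_key):
    """Run the activation command.

    Assumption: a failed activation produces a non-zero exit code,
    which check=True turns into subprocess.CalledProcessError.
    """
    argv, env = activation_command(license_key)
    subprocess.run(argv, env=env, check=True)
```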

4. Make your first call

From curl

curl -X POST http://localhost:7421/v1/messages \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-6",
    "max_tokens": 50,
    "messages": [{"role": "user", "content": "hello"}]
  }'

From Python

from anthropic import Anthropic

client = Anthropic(
    api_key="unused",
    base_url="http://localhost:7421",
)

msg = client.messages.create(
    model="claude-sonnet-4-6",
    max_tokens=200,
    messages=[{"role": "user", "content": "Explain monoids."}],
)
print(msg.content[0].text)
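
Where installing the SDK isn't an option, the curl request above can be reproduced with the standard library alone. The sketch below assumes the daemon replies in Anthropic's Messages response shape (a "content" list whose text blocks carry a "text" field); send_message is a hypothetical helper name, not part of any SDK.

```python
import json
import urllib.request

RELAY_URL = "http://localhost:7421"  # daemon address from this guide


def send_message(prompt, model="claude-sonnet-4-6", max_tokens=50,
                 base_url=RELAY_URL):
    """POST one user message to /v1/messages and return the reply text."""
    body = json.dumps({
        "model": model,
        "max_tokens": max_tokens,
        "messages": [{"role": "user", "content": prompt}],
    }).encode("utf-8")
    req = urllib.request.Request(
        base_url + "/v1/messages",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(req, timeout=60) as resp:
        reply = json.load(resp)
    # Assumed response shape: a list of content blocks; only the blocks
    # of type "text" are joined into the returned string.
    return "".join(
        block["text"] for block in reply.get("content", [])
        if block.get("type") == "text"
    )
```

With the daemon running, print(send_message("hello")) sends the same payload as the curl example.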

From Node / TypeScript

import Anthropic from '@anthropic-ai/sdk';

const client = new Anthropic({
  apiKey: 'unused',
  baseURL: 'http://localhost:7421',
});

const msg = await client.messages.create({
  model: 'claude-sonnet-4-6',
  max_tokens: 200,
  messages: [{ role: 'user', content: 'Explain monoids.' }],
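});

// Print the reply, narrowing the first content block to a text block.
const first = msg.content[0];
if (first.type === 'text') console.log(first.text);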

What just happened

Your call hit the daemon at localhost:7421. The daemon pulled a pre-warmed Claude Code session from the Session Pool (~10 ms) and forwarded the request through that session, which authenticates against your Claude subscription rather than an Anthropic API key. It then returned the response as a single buffered JSON message in the Anthropic SDK's response shape.

Your application can't tell the difference. The SDK is unchanged. Only the base URL moved.
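
Because the base URL is the only thing that changes, the choice of backend can live in configuration. A minimal sketch, assuming a hypothetical USE_INFERENCE_RELAY toggle (not a variable the daemon or the SDK reads) and the conventional ANTHROPIC_API_KEY variable for the direct path:

```python
import os

RELAY_URL = "http://localhost:7421"  # daemon address from this guide


def anthropic_client_kwargs(env=None):
    """Return constructor kwargs for anthropic.Anthropic.

    USE_INFERENCE_RELAY is a made-up toggle for this sketch. When it is
    "1", calls go to the local daemon; otherwise they go to the
    Anthropic API with a real key.
    """
    env = os.environ if env is None else env
    if env.get("USE_INFERENCE_RELAY") == "1":
        return {"api_key": "unused", "base_url": RELAY_URL}
    return {"api_key": env["ANTHROPIC_API_KEY"]}
```

The client is then built as Anthropic(**anthropic_client_kwargs()), and the rest of the application code stays identical on both paths.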

Next steps