Security
Threat model
inference-relay sits between your application and your Claude Code subscription. It is a dumb pipe: it accepts Anthropic-shape HTTP calls, translates them through a managed PTY to your claude subprocess, and returns the response. It executes no business logic, makes no scheduling decisions, and applies no policy beyond license and cap gating. The same dumb-pipe property that v1.0 documented under "Logic Synchronization Security" applies to v1.1 verbatim.
What inference-relay does see: the request body of every /v1/messages call (prompts, attachments, tool definitions, response content blocks). These flow through process memory and never persist to disk except as metadata-only entries in recent-calls.jsonl (below).
What inference-relay does not see: anything your claude install has access to outside the call. No auth tokens, no shell environment beyond what's needed to spawn claude, no file-system reads beyond the attachment tempfile dir.
What api.inference-relay.com (the license backend) sees: license key validation requests, signed usage events (call counts, token counts, durations — no prompt content), update manifest requests. Every server-to-client response is RS256-signed; the client rejects unsigned responses to prevent MITM.
Six security axes
1. Daemon binary integrity
The daemon binary distributed via /changelog/<version> ships as a release artifact. Updates are delivered as .app.tar.gz (macOS) or .nsis.zip (Windows) signed with an ed25519 keypair held by the inference-relay release infrastructure. The Tauri shell embeds the public half of the keypair at compile time; every update bundle is verified against the embedded pubkey before the binary on disk gets replaced.
The Tauri auto-updater refuses to apply unsigned or mismatched updates. Because the public key is embedded at compile time, rotating it (e.g., if the signing key gets compromised) requires a forced manual reinstall. This is a known constraint documented in Troubleshooting.
2. License trust chain (RS256)
License validation runs against api.inference-relay.com/v1/license. The server returns an RS256-signed JWS body; the client verifies the signature using a public key baked into the daemon binary at compile time, and only then trusts the decoded payload.
If a man-in-the-middle returns {"valid": true} over plain JSON, the client rejects it — only signed responses are honored. The same posture v1.0 ships with, unchanged.
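The rejection rule above can be sketched as follows. This is a minimal illustration, not the daemon's actual code: it assumes the response is a compact JWS (three base64url segments), the function names are hypothetical, and the public key is passed in as a raw modulus `n` and exponent `e`. It performs RSASSA-PKCS1-v1_5 verification with plain integer arithmetic so the example stays dependency-free; a real client would more likely rely on a vetted cryptography library.

```python
import base64
import hashlib
import hmac
import json

# ASN.1 DigestInfo prefix for SHA-256, as used by RSASSA-PKCS1-v1_5 (RFC 8017).
SHA256_DIGEST_INFO = bytes.fromhex("3031300d060960864801650304020105000420")


def b64url_decode(segment: str) -> bytes:
    """Decode unpadded base64url, the encoding used by compact JWS."""
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))


def verify_rs256(token: str, n: int, e: int) -> dict:
    """Return the JWS payload if the RS256 signature verifies.

    Raises ValueError otherwise, including for a plain JSON body.
    """
    parts = token.split(".")
    if len(parts) != 3:
        raise ValueError("not a compact JWS (unsigned body?)")
    header_b64, payload_b64, signature_b64 = parts

    header = json.loads(b64url_decode(header_b64))
    if not isinstance(header, dict) or header.get("alg") != "RS256":
        raise ValueError("unexpected or missing alg")

    key_len = (n.bit_length() + 7) // 8
    signature = b64url_decode(signature_b64)
    if len(signature) != key_len:
        raise ValueError("bad signature length")
    s = int.from_bytes(signature, "big")
    if s >= n:
        raise ValueError("signature out of range")

    # Recover the encoded message and compare it with the expected
    # padding + DigestInfo + SHA-256 digest of "header.payload".
    recovered = pow(s, e, n).to_bytes(key_len, "big")
    signing_input = f"{header_b64}.{payload_b64}".encode("ascii")
    tail = SHA256_DIGEST_INFO + hashlib.sha256(signing_input).digest()
    expected = b"\x00\x01" + b"\xff" * (key_len - len(tail) - 3) + b"\x00" + tail
    if not hmac.compare_digest(recovered, expected):
        raise ValueError("signature mismatch")

    return json.loads(b64url_decode(payload_b64))
```

With this shape, a body such as `{"valid": true}` fails at the first check because it has no signature segment at all, and a body whose payload was altered in transit fails at the digest comparison.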
The license key itself is stored at ~/.inference-relay/settings.json with file mode 0600 on Unix. License keys never appear in recent-calls.jsonl (redacted to ••••••••<last-4>), debug-bundle exports, or daemon stderr.
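As an illustration of the storage and redaction rules described above, here is a minimal sketch. The `license_key` field name and both function names are assumptions made for the example; only the 0600 file mode and the `••••••••<last-4>` redaction format come from this page.

```python
import json
import os
import stat
from pathlib import Path

REDACTION_PREFIX = "\u2022" * 8  # eight bullet characters, as in the documented format


def redact_license_key(key: str) -> str:
    """Mask a license key, keeping only its last four characters."""
    if len(key) <= 4:
        # Too short to reveal a suffix without revealing the whole key.
        return REDACTION_PREFIX
    return REDACTION_PREFIX + key[-4:]


def write_settings(path: Path, license_key: str) -> None:
    """Write the settings file with owner-only permissions (Unix)."""
    # Passing the mode to os.open applies it at creation time, so the file
    # is never briefly readable by other users.
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o600)
    with os.fdopen(fd, "w", encoding="utf-8") as fh:
        json.dump({"license_key": license_key}, fh)  # hypothetical field name
    # Re-assert the mode in case the file already existed with looser bits.
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)
```

Any code path that logs or exports a key would call `redact_license_key` first, so the full value only ever exists in the settings file and in process memory.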
3. Subprocess isolation
The daemon spawns claude as a child process inside a managed PTY. The subprocess inherits a deliberately narrow environment: TERM=xterm-256color, COLORTERM=truecolor, the user's PATH (so claude can find its dependencies), and nothing else from the daemon's process environment.
The subprocess can write to its own working directory and tempfile dirs. It does NOT inherit the daemon's keyring or ssh agent, nor any other secrets from the daemon's environment (which the daemon never reads in the first place). If your claude install is authenticated to an Anthropic subscription, the auth lives in claude's own credential store on disk; the daemon never reads it.
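The allowlist approach described in this section can be sketched like this. The function name is hypothetical, and the example only builds the environment mapping; it does not reproduce the PTY management.

```python
import os
from typing import Mapping


def build_child_env(parent_env: Mapping[str, str]) -> dict:
    """Build the child environment from an explicit allowlist.

    Starting from an empty dict (rather than copying the parent and deleting
    known-sensitive names) means a variable is excluded unless listed here.
    """
    env = {
        "TERM": "xterm-256color",
        "COLORTERM": "truecolor",
    }
    if "PATH" in parent_env:
        env["PATH"] = parent_env["PATH"]
    return env


# Illustrative use (not executed here): pass the mapping when spawning, e.g.
#   subprocess.Popen(["claude"], env=build_child_env(os.environ))
```

Building up from nothing is the safer default: a new secret added to the daemon's environment later cannot leak into the child unless someone deliberately adds it to the allowlist.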
4. Attachment isolation
Vision and document content blocks land at $TMPDIR/subscription-relay-attachments/ with UUID-prefixed names. File mode is the system default (typically 0600 on user-installed binaries). Files are unlinked when the call returns; orphans (from aborted calls or crashes) are swept every 15 minutes by a background loop that deletes anything older than 1 hour.
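A single pass of the orphan sweep could look roughly like the sketch below. The function and constant names are assumptions; the 1-hour age limit and 15-minute interval are the values stated above. The surrounding loop that would call it on a timer is omitted.

```python
import time
from pathlib import Path
from typing import List, Optional

MAX_AGE_SECONDS = 60 * 60         # delete anything older than 1 hour
SWEEP_INTERVAL_SECONDS = 15 * 60  # how often a background loop would call sweep_orphans


def sweep_orphans(directory: Path, now: Optional[float] = None) -> List[Path]:
    """Delete regular files in `directory` whose mtime is older than the limit.

    Returns the paths that were removed.
    """
    now = time.time() if now is None else now
    removed = []
    for entry in directory.iterdir():
        try:
            if entry.is_file() and now - entry.stat().st_mtime > MAX_AGE_SECONDS:
                entry.unlink()
                removed.append(entry)
        except FileNotFoundError:
            # The call that owned this file finished and unlinked it first.
            continue
    return removed
```

Tolerating `FileNotFoundError` matters because the sweep races with normal per-call cleanup: a file can legitimately disappear between the directory listing and the unlink.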
Attachment bytes never persist in recent-calls.jsonl — only mime type, size, and a content hash are logged. The raw bytes exist in memory for the duration of the call and on disk for the duration of the tempfile lifecycle.
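To make the metadata-only logging concrete, here is a minimal sketch of building such a log entry. The field names and the choice of SHA-256 are assumptions; the page specifies only that mime type, size, and a content hash are recorded.

```python
import hashlib
import json


def attachment_log_entry(mime_type: str, data: bytes) -> dict:
    """Describe an attachment without including any of its bytes."""
    return {
        "mime_type": mime_type,
        "size": len(data),
        "sha256": hashlib.sha256(data).hexdigest(),  # hash algorithm is an assumption
    }


def to_jsonl_line(entry: dict) -> str:
    """Serialize one entry as a single JSONL line."""
    return json.dumps(entry, separators=(",", ":")) + "\n"
```

The hash lets an auditor confirm that two calls carried the same attachment, or match a log entry to a file they already hold, without the log ever containing the content itself.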
5. MITM protection
Three TLS-protected endpoints carry the daemon's outbound traffic:
- api.inference-relay.com/v1/license: RS256-signed response
- api.inference-relay.com/v1/events: license-validated POST
- api.inference-relay.com/v1/desktop/update/...: ed25519-signed bundle URLs
Each rejects unsigned bodies. Each pins TLS to the production certificate chain. A proxy that intercepts and re-signs would need both the RS256 private key (server-side, never on a customer machine) and the ed25519 update key (release infrastructure, never on a customer machine). The threat model assumes those keys are not exposed.
6. Local network exposure
The daemon binds 127.0.0.1:7421. Loopback only. Nothing on your local network or the internet can connect to it. If you need multi-machine access (e.g., a CI runner calling a developer laptop), you'd front the daemon with an authenticated reverse proxy — but the daemon itself does not accept non-loopback connections.
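The loopback-only property comes down to the address passed to `bind`. A minimal sketch, with a hypothetical function name and using the default port from this page:

```python
import socket


def open_loopback_listener(port: int = 7421) -> socket.socket:
    """Open a TCP listener reachable only from the local machine."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    # Binding to 127.0.0.1 (not 0.0.0.0) means the kernel accepts connections
    # on the loopback interface only; other hosts cannot reach this socket.
    sock.bind(("127.0.0.1", port))
    sock.listen()
    return sock
```

A reverse proxy for multi-machine access would sit in front of this listener and connect to it over loopback, so the exposure decision stays in the proxy's configuration rather than in the daemon.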
Persistence inventory
Where state lives on disk:
- ~/.inference-relay/settings.json: License key (0600), working dir, port override. Persistence: Forever.
- ~/.inference-relay/recent-calls.jsonl: Call metadata, last 1000 calls. Persistence: Append-only, capped.
- ~/.inference-relay/mcp-state/: Per-session MCP IPC tools files. Persistence: Per-session lifetime.
- ~/.inference-relay/pty-pids.json: Process registry for orphan cleanup. Persistence: Across daemon restarts.
- $TMPDIR/subscription-relay-attachments/: Per-call attachment payloads. Persistence: Call lifetime + 1h max.
Nothing in registries, system services, or /usr/local/lib/. Uninstalling removes the binary; wiping ~/.inference-relay/ removes all state.
What's not yet covered
- Code signing on macOS (Apple Developer ID + notarization) — on the roadmap. v1.1 ships Gatekeeper-warned.
- EV code signing on Windows — on the roadmap. v1.1 ships SmartScreen-warned.
- Reproducible builds — daemon binaries are not yet reproducible-build verified. v1.0 documented a compile-time-proof posture; v1.1's Rust port has the substrate to support reproducible builds but the CI doesn't enforce it yet.
- SOC 2 / ISO 27001 compliance — the inference-relay backend doesn't currently carry these certifications. For regulated workloads, the Enterprise page covers the hosted-deploy alternative.
Audit interface
For security review or post-incident analysis:
# Snapshot of daemon state (license-redacted, attachment-byte-free)
curl http://localhost:7421/v1/debug-bundle > debug.json

# Full call history
curl http://localhost:7421/v1/recent-calls | jq .

# Current license + cap status
curl http://localhost:7421/v1/license
The debug-bundle endpoint is intended for support tickets and auditors; nothing in it identifies prompt or response content.
Where to go next
- Architectural deep-dive → Whitepaper (v1.0 page; applies to both versions)
- Compliance positioning → Enterprise
- Daemon failure modes → Troubleshooting