Troubleshooting
Connection refused on 127.0.0.1:7421
The single most common failure mode for new users — the daemon process isn't running, so the bind step never happened and no Session Pool exists to serve calls.
Verify:
curl http://localhost:7421/v1/health # curl: (7) Failed to connect to localhost port 7421: Connection refused
Causes + fixes:
- You never launched the desktop app. On macOS:
open /Applications/inference-relay.app. On Windows: launch from the Start menu. The daemon comes up automatically when the shell launches. - You launched the headless daemon but forgot the wizard. First-run requires the GUI to write a license key to
~/.inference-relay/settings.json. After that, headless launches work. See Headless for the wizard-bypass path. - The daemon crashed. Check
/tmp/inference-relay.err.log(macOS launchd) or the Event Viewer (Windows). Usual causes: port-in-use, missingclaudebinary, expired auth. - Wrong port. If
SR_DAEMON_PORTwas set to something other than 7421 at daemon launch, your SDK needs to point at that port. Daemon logs the bind on startup; check the dashboard's daemon status or the launchdStandardOutPathfile.
Port 7421 already in use
Error: failed to bind 127.0.0.1:7421: Address already in use
The daemon refuses to start when something else is listening on its port.
Find what's using it:
# macOS / Linux lsof -i :7421 # Windows PowerShell Get-NetTCPConnection -LocalPort 7421
Fixes:
- Another inference-relay daemon is already running. Kill it (
pkill -f inference-relay-daemon) — the new daemon will bind cleanly. - Unrelated process. Pick a different port: set
SR_DAEMON_PORT=7422(or any free port) in the launchd plist / Task Scheduler args / shell env, then point your SDK at the new port viabaseURL: 'http://localhost:7422'.
"License invalid" / 402 Payment Required on every call
The license cache says valid: false. Check the state:
curl http://localhost:7421/v1/license
Causes:
- Bad key in settings.
~/.inference-relay/settings.jsonhas a typo or a key from a different account. Re-paste in the wizard. - License revoked server-side. Check your subscription status at
inference-relay.com/dashboard. - Network can't reach validation endpoint. The daemon hits
api.inference-relay.comevery 5 minutes; outbound HTTPS must work. Verify:curl -I https://api.inference-relay.com/v1/health. - Cap exceeded. Response includes
code: "cap_exceeded"plustopUpUrl. Hit the URL to add usage.
"claude not found" / availability: {ok: false}
The daemon can't locate the claude binary on the host.
curl http://localhost:7421/v1/availability
# {"ok": false, "missing": ["claude"]}Fix:
# macOS / Linux npm install -g @anthropic-ai/claude-code claude login # Windows irm https://claude.ai/install.ps1 | iex claude login
The daemon searches PATH, %APPDATA%\npm\, ~/.bun/bin/, and the VSCode extension bundle. If your install is in an unusual location (e.g., a corporate-pinned path), symlink it to one of those.
Claude is logged in but calls hang / return auth errors
The daemon drove claude via PTY and claude itself prompted for login. Two ways this happens:
- Session expired while the daemon was running. The daemon doesn't detect this and won't auto-relogin. Recovery: open a terminal, run
claudeinteractively, complete the login flow, then restart the daemon. - Multiple
claudeinstalls on PATH with different auth states. The daemon picks the first one it finds; if that's a stale install (e.g., a VSCode-bundled copy without your subscription tokens), calls fail. Verify which binary the daemon is using:which claudefrom the same shell that launched the daemon.
Response contains chrome / startup banner instead of model output
If you see "Claude Code v..." or "Welcome to Claude" inside content, the turn extractor didn't anchor correctly. This was a v1.1.5 bug fixed in v1.1.7; if you're on the right version it shouldn't happen.
Fix:upgrade to v1.1.9+. The daemon's auto-updater applies updates on every 4-hour poll or via the tray menu's "Check for updates."
Calls return blank content + output_tokens: 0
Two known causes:
- First call to cold daemon. Session Pool empty → synchronous PTY spawn → ~2 s overhead → claude finished but the SDK timed out. Verify by issuing a warmup call after daemon launch and checking
GET /v1/sessionsto confirmidle >= 1. - Old claude version with unusual TUI glyphs. The daemon parses
⏺/●/•block markers; very oldclaudeversions emit different characters. Upgradeclaudeto the latest:npm update -g @anthropic-ai/claude-code.
stream: true returns a single buffered JSON response
Not a bug — SSE streaming is parsed but not implemented in v1.1. The daemon sees stream: trueon the request, accepts the field, and returns the full response buffered when the model finishes. SDK streaming wrappers (Python's client.messages.stream(...), TypeScript's for await over the stream object) will receive the entire response in a single chunk.
Native SSE is on the v1.2 roadmap. Until then, use buffered responses and chunk client-side if you need progressive UI.
License activation failed during the Setup Wizard
The wizard validates the license by POSTing to api.inference-relay.com/v1/license. Failure modes:
- "Validation failed — check that the key is correct" — server returned
valid: false. Causes: typo, key from a different account, key was revoked. Verify atinference-relay.com/dashboard. - Network error / hang — outbound HTTPS to
api.inference-relay.comis blocked. Test from the same shell:curl -I https://api.inference-relay.com/v1/health. Corporate proxies and offline VMs are common culprits. - "License invalid: cap exceeded" during activation — the subscription is past its monthly call cap. Top up via the dashboard link the wizard surfaces.
- No response at all — the daemon's license-refresh task hasn't run yet. Wait 5 seconds and click Re-check; if still no response, restart the app.
Streaming events arrive out of order
Shouldn't happen. The daemon proxies claude's stream-json output in-order through SSE. If you see it, recent-calls.jsonlwill have the daemon-side ordering; compare to your SDK consumer's observed ordering. Open an issue at inference-relay support with both timelines.
Auto-update failed / daemon reverted to old version
The Tauri shell's updater plugin downloads .app.tar.gz (macOS) or .nsis.zip (Windows) signed with ed25519, verifies against the embedded pubkey, then applies. Failure modes:
- Network down at update time — the shell retries on next launch.
- Pubkey mismatch (rare) — happens only if the signing key was rotated server-side without a corresponding shell release. Manual reinstall from
/changelogis the recovery path. - In-flight calls during update — the updater waits for active calls to finish before swapping the binary. If a call is stuck, the update defers indefinitely. Restart the daemon to unstick.
Two daemons accidentally running
Symptoms: intermittent 404s or behavior changes per call.
# macOS / Linux
pgrep -fl inference-relay-daemon
# Windows
Get-Process | ? { $_.Name -like "*inference-relay*" }If you see more than one PID, one of them won the port bind and the other is failing silently. Kill all of them, then launch one deliberately.
Where to go next
- Daemon install / uninstall / lifecycle → Daemon Lifecycle
- Security model / what the daemon does and doesn't see → Security
- Inspect recent calls →
GET /v1/recent-calls(see API Reference)