Troubleshooting

Connection refused on 127.0.0.1:7421

The single most common failure mode for new users — the daemon process isn't running, so the bind step never happened and no Session Pool exists to serve calls.

Verify:

curl http://localhost:7421/v1/health
# curl: (7) Failed to connect to localhost port 7421: Connection refused

Causes + fixes:

  • You never launched the desktop app. On macOS: open /Applications/inference-relay.app. On Windows: launch from the Start menu. The daemon comes up automatically when the shell launches.
  • You launched the headless daemon but forgot the wizard. First-run requires the GUI to write a license key to ~/.inference-relay/settings.json. After that, headless launches work. See Headless for the wizard-bypass path.
  • The daemon crashed. Check /tmp/inference-relay.err.log (macOS launchd) or the Event Viewer (Windows). Usual causes: port-in-use, missing claude binary, expired auth.
  • Wrong port. If SR_DAEMON_PORT was set to something other than 7421 at daemon launch, your SDK needs to point at that port. Daemon logs the bind on startup; check the dashboard's daemon status or the launchd StandardOutPath file.

Port 7421 already in use

Error: failed to bind 127.0.0.1:7421: Address already in use

The daemon refuses to start when something else is listening on its port.

Find what's using it:

# macOS / Linux
lsof -i :7421
# Windows PowerShell
Get-NetTCPConnection -LocalPort 7421

Fixes:

  • Another inference-relay daemon is already running. Kill it (pkill -f inference-relay-daemon) — the new daemon will bind cleanly.
  • Unrelated process. Pick a different port: set SR_DAEMON_PORT=7422 (or any free port) in the launchd plist / Task Scheduler args / shell env, then point your SDK at the new port via baseURL: 'http://localhost:7422'.

"License invalid" / 402 Payment Required on every call

The license cache says valid: false. Check the state:

curl http://localhost:7421/v1/license

Causes:

  • Bad key in settings. ~/.inference-relay/settings.json has a typo or a key from a different account. Re-paste in the wizard.
  • License revoked server-side. Check your subscription status at inference-relay.com/dashboard.
  • Network can't reach validation endpoint. The daemon hits api.inference-relay.com every 5 minutes; outbound HTTPS must work. Verify: curl -I https://api.inference-relay.com/v1/health.
  • Cap exceeded. Response includes code: "cap_exceeded" plus topUpUrl. Hit the URL to add usage.

"claude not found" / availability: {ok: false}

The daemon can't locate the claude binary on the host.

curl http://localhost:7421/v1/availability
# {"ok": false, "missing": ["claude"]}

Fix:

# macOS / Linux
npm install -g @anthropic-ai/claude-code
claude login

# Windows
irm https://claude.ai/install.ps1 | iex
claude login

The daemon searches PATH, %APPDATA%\npm\, ~/.bun/bin/, and the VSCode extension bundle. If your install is in an unusual location (e.g., a corporate-pinned path), symlink it to one of those.

Claude is logged in but calls hang / return auth errors

The daemon drove claude via PTY and claude itself prompted for login. Two ways this happens:

  • Session expired while the daemon was running. The daemon doesn't detect this and won't auto-relogin. Recovery: open a terminal, run claude interactively, complete the login flow, then restart the daemon.
  • Multiple claude installs on PATH with different auth states. The daemon picks the first one it finds; if that's a stale install (e.g., a VSCode-bundled copy without your subscription tokens), calls fail. Verify which binary the daemon is using: which claude from the same shell that launched the daemon.

Response contains chrome / startup banner instead of model output

If you see "Claude Code v..." or "Welcome to Claude" inside content, the turn extractor didn't anchor correctly. This was a v1.1.5 bug fixed in v1.1.7; if you're on the right version it shouldn't happen.

Fix:upgrade to v1.1.9+. The daemon's auto-updater applies updates on every 4-hour poll or via the tray menu's "Check for updates."

Calls return blank content + output_tokens: 0

Two known causes:

  • First call to cold daemon. Session Pool empty → synchronous PTY spawn → ~2 s overhead → claude finished but the SDK timed out. Verify by issuing a warmup call after daemon launch and checking GET /v1/sessions to confirm idle >= 1.
  • Old claude version with unusual TUI glyphs. The daemon parses / / block markers; very old claude versions emit different characters. Upgrade claude to the latest: npm update -g @anthropic-ai/claude-code.

stream: true returns a single buffered JSON response

Not a bug — SSE streaming is parsed but not implemented in v1.1. The daemon sees stream: trueon the request, accepts the field, and returns the full response buffered when the model finishes. SDK streaming wrappers (Python's client.messages.stream(...), TypeScript's for await over the stream object) will receive the entire response in a single chunk.

Native SSE is on the v1.2 roadmap. Until then, use buffered responses and chunk client-side if you need progressive UI.

License activation failed during the Setup Wizard

The wizard validates the license by POSTing to api.inference-relay.com/v1/license. Failure modes:

  • "Validation failed — check that the key is correct" — server returned valid: false. Causes: typo, key from a different account, key was revoked. Verify at inference-relay.com/dashboard.
  • Network error / hang — outbound HTTPS to api.inference-relay.com is blocked. Test from the same shell: curl -I https://api.inference-relay.com/v1/health. Corporate proxies and offline VMs are common culprits.
  • "License invalid: cap exceeded" during activation — the subscription is past its monthly call cap. Top up via the dashboard link the wizard surfaces.
  • No response at all — the daemon's license-refresh task hasn't run yet. Wait 5 seconds and click Re-check; if still no response, restart the app.

Streaming events arrive out of order

Shouldn't happen. The daemon proxies claude's stream-json output in-order through SSE. If you see it, recent-calls.jsonlwill have the daemon-side ordering; compare to your SDK consumer's observed ordering. Open an issue at inference-relay support with both timelines.

Auto-update failed / daemon reverted to old version

The Tauri shell's updater plugin downloads .app.tar.gz (macOS) or .nsis.zip (Windows) signed with ed25519, verifies against the embedded pubkey, then applies. Failure modes:

  • Network down at update time — the shell retries on next launch.
  • Pubkey mismatch (rare) — happens only if the signing key was rotated server-side without a corresponding shell release. Manual reinstall from /changelog is the recovery path.
  • In-flight calls during update — the updater waits for active calls to finish before swapping the binary. If a call is stuck, the update defers indefinitely. Restart the daemon to unstick.

Two daemons accidentally running

Symptoms: intermittent 404s or behavior changes per call.

# macOS / Linux
pgrep -fl inference-relay-daemon
# Windows
Get-Process | ? { $_.Name -like "*inference-relay*" }

If you see more than one PID, one of them won the port bind and the other is failing silently. Kill all of them, then launch one deliberately.

Where to go next

  • Daemon install / uninstall / lifecycleDaemon Lifecycle
  • Security model / what the daemon does and doesn't seeSecurity
  • Inspect recent callsGET /v1/recent-calls (see API Reference)