Headless Operation
The model
The Setup Wizard runs once. After it writes ~/.inference-relay/settings.json with a validated license key, the daemon binary itself has everything it needs to serve. The Tauri shell is just a host for the wizard and the dashboard renderer; closing it does not stop the daemon.
This is the load-bearing distinction between inference-relay v1.1 and v1.0: v1.0 was an npm module that ran inside your Node process. v1.1 is a standalone binary that runs anywhere a Unix-like process can run, without a UI.
After first-run setup, the daemon executable lives at a stable path:
- macOS:
/Applications/inference-relay.app/Contents/MacOS/inference-relay-daemon - Windows:
C:\Users\<user>\AppData\Local\inference-relay\inference-relay-daemon.exe - Linux:
~/.local/share/inference-relay/inference-relay-daemon(roadmap)
You can launch the daemon directly without the Tauri shell.
Run headless on macOS
Foreground (for debugging)
/Applications/inference-relay.app/Contents/MacOS/inference-relay-daemon
The daemon writes logs to stderr and binds 127.0.0.1:7421. Ctrl-C stops it.
Background via launchd (auto-start on login)
Two prerequisites first. Apple's Gatekeeper quarantines new .app bundles with an extended attribute that fails silently under launchd — interactive double-click prompts the user, but a launchd-spawned process gets killed without surfacing an error. Strip the attribute before configuring launchd:
xattr -dr com.apple.quarantine /Applications/inference-relay.app
Then save as ~/Library/LaunchAgents/com.inference-relay.daemon.plist:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>Label</key><string>com.inference-relay.daemon</string>
<key>ProgramArguments</key>
<array>
<string>/Applications/inference-relay.app/Contents/MacOS/inference-relay-daemon</string>
</array>
<key>RunAtLoad</key><true/>
<key>KeepAlive</key><true/>
<key>ProcessType</key><string>Background</string>
<key>EnvironmentVariables</key>
<dict>
<!-- Override the default :7421 by setting SR_DAEMON_PORT here.
Inherited shell env is NOT visible to launchd-spawned processes,
so this is the only way to pin a non-default port headless. -->
<!-- <key>SR_DAEMON_PORT</key><string>7422</string> -->
</dict>
<key>StandardOutPath</key><string>/tmp/inference-relay.out.log</string>
<key>StandardErrorPath</key><string>/tmp/inference-relay.err.log</string>
</dict>
</plist>ProcessType=Background gives the daemon the right macOS QoS (low-priority, long-running) so it stays out of the way of foreground applications.
Load it:
launchctl load ~/Library/LaunchAgents/com.inference-relay.daemon.plist
The daemon now starts at login, restarts on crash, logs to /tmp/inference-relay.{out,err}.log. Stop with:
launchctl unload ~/Library/LaunchAgents/com.inference-relay.daemon.plist
Run headless on Windows
Foreground (for debugging)
C:\Users\PC\AppData\Local\inference-relay\inference-relay-daemon.exe
Background via PowerShell (no window)
Start-Process -FilePath "C:\Users\PC\AppData\Local\inference-relay\inference-relay-daemon.exe" -WindowStyle Hidden
The daemon comes up on :7421 in ~1 second.
Auto-start on logon via Task Scheduler
schtasks /create /tn "Inference Relay Daemon" ` /tr "C:\Users\PC\AppData\Local\inference-relay\inference-relay-daemon.exe" ` /sc onlogon /rl highest /f
The daemon now starts at every logon. Inspect / remove:
schtasks /query /tn "Inference Relay Daemon" schtasks /delete /tn "Inference Relay Daemon" /f
Defender note. Windows Defender applies stricter heuristic scoring to unsigned binaries launched from Task Scheduler than to interactive launches. If the first auto-start fails silently, check Defender quarantine — you may need to add the daemon path to the exclusion list while the v1.1 release ships unsigned. Signed releases are on the roadmap; see Daemon Lifecycle.
Verify
Hit the daemon's health endpoint from any process on the same machine:
curl http://localhost:7421/v1/health
# {"port":7421,"sessions":0,"status":"healthy","uptime":3128}Hit /v1/license to confirm the license is still valid:
curl http://localhost:7421/v1/license | jq .license.valid # true
Issue a real call to confirm the Pool warms up:
curl -X POST http://localhost:7421/v1/messages \
-H "Content-Type: application/json" \
-d '{"model":"claude-sonnet-4-6","max_tokens":50,
"messages":[{"role":"user","content":"PROBE_OK"}]}' | jq .contentOperational notes
The daemon binds 127.0.0.1:7421 by default. If something else is already listening on 7421, it exits with a bind error during start. Override the port by setting SR_DAEMON_PORT=<port> in the environment that spawns the daemon — for headless installs, that means the launchd plist EnvironmentVariables block or the Task Scheduler /tn /tr ... /ri argument. Your SDK then points at the new port via the same baseURL override.
The Session Pool starts empty on cold launch. The first /v1/messages call spawns a fresh PTY synchronously (~2 seconds for claude to reach the idle prompt) and is therefore slower than steady-state. Subsequent calls reuse Pre-warmed PTYs at ~10 ms grab latency. For latency-sensitive workloads, issue a single warmup call after the daemon starts so the Pool has time to fill before real traffic.
License re-validation runs in a background loop every 5 minutes against api.inference-relay.com. No user interaction needed. Long-running headless installs survive token expiry transparently — provided the host has outbound HTTPS to the validation endpoint.
Every request lands in ~/.inference-relay/recent-calls.jsonl (one line per call, license key redacted). The file is append-only and caps at 1000 entries; older calls scroll out. Useful for replay, debugging, and post-incident timeline reconstruction.
When NOT to run headless
- First-time setup. The wizard needs to write the license file. You can scriptize this by writing
~/.inference-relay/settings.jsondirectly withlicenseKey: "ir_live_..."and letting the daemon validate on start, but the wizard is faster on the first machine. - Auto-updates require the Tauri shell. The shell runs Tauri's updater plugin and applies signed updates on launch. Headless daemons receive update notifications but don't self-apply — you'd reinstall manually or invoke the shell briefly to take an update. See Daemon Lifecycle.