MCP Server
@inference-relay/mcp is a Model Context Protocol server that turns any MCP-compatible IDE into a live operational console for your inference-relay account. 19 tools cover financial intelligence, operational health, security verification, manifest sync, and fleet management.
Once installed, ask your IDE in plain English — “show me my inference-relay savings this week” or “verify the JWS handshake” — and the model picks the right tool automatically.
Prerequisites
- An inference-relay license key — get one at inference-relay.com/pricing
- Node.js 18+ (for the
npxinstall path) - An MCP-compatible client — Claude Code, Claude Desktop, Cursor, or any other MCP host
Setup — Claude Code (recommended)
The claude mcp add command writes the MCP entry for you, no JSON editing required. Run from your terminal:
claude mcp add inference-relay \ --env IR_LICENSE_KEY=ir_live_xxxx \ -- npx -y @inference-relay/mcp
Replace ir_live_xxxx with your real key. Then verify the entry:
claude mcp get inference-relay
You should see Status: ✓ Connected and an Environment block with your IR_LICENSE_KEY.
Setup — Claude Desktop / Cursor / other MCP clients
Edit your client's MCP config file. On macOS, Claude Desktop's lives at:
~/Library/Application Support/Claude/claude_desktop_config.json
Add this entry under mcpServers (create the section if it doesn't exist):
{
"mcpServers": {
"inference-relay": {
"command": "npx",
"args": ["-y", "@inference-relay/mcp"],
"env": {
"IR_LICENSE_KEY": "ir_live_xxxx"
}
}
}
}The -y flag is important. Without it, npxblocks waiting for “OK to install? (y/N)” on first run and the MCP stdio handshake hangs. The -y flag answers yes automatically.
Verify the install
Restart your IDE (the MCP server is loaded at session start), then ask:
List the inference-relay MCP tools you have access to.
You should see 19 tool names. If you see fewer, the MCP server isn't connecting — check the command and args in your config.
You can also verify your license key independently with the bundled CLI:
IR_LICENSE_KEY=ir_live_xxxx npx inference-relay verify
[ir] IDENTITY: VALIDATED
[ir] TIER: PRO
[ir] RELAY: ACTIVE
Claude CLI: v2.1.96
Tier: max
Usage: 19 / 5,000 (0.4%)Try these prompts
The model will pick the right tool for each prompt automatically:
- “Show me inference-relay model pricing.” —
get_model_pricing(no auth, instant) - “Verify inference-relay privacy integrity.” —
verify_privacy_integrity - “What's my inference-relay usage cap?” —
monitor_usage_caps - “Validate the inference-relay JWS handshake.” —
validate_jws_handshake - “Show me my inference-relay fleet status.” —
get_fleet_status - “Show me inference-relay savings this month.” —
get_savings_summary
The 19 tools
Financial Intelligence (5)
get_savings_summary— Real-time P&L: relay cost vs direct API costanalyze_workflow_efficiency— Per-model ROI ranking from recent eventsget_projected_burn— Monthly forecast: with relay vs without relaymonitor_usage_caps— Current usage against tier cap with gaugeget_model_pricing— Per-model pricing table for all providers
Operational Health (5)
probe_provider_availability— Status grid for CLI / API / Ollamaget_duration_benchmarks— p50 / p95 / p99 latency per providerlist_fallback_events— Recent cascade events with failure reasonsexplain_fallback_chain— Plain English narration of a fallback chainprobe_environment— Bloomberg-style environment diagnostic
Security & Compliance (4)
verify_privacy_integrity— Compile-time guarantee that audit events never carry prompt contentget_audit_trail— Audit entries with optional SHA-256 hash chain verificationscan_leak_telemetry— Runtime scan for anomalous telemetry stringsvalidate_jws_handshake— RS256 verification of/v1/validateand/v1/manifest
Manifest Sync (2)
check_manifest_sync— Compare local LAST_KNOWN_GOOD against remote manifestsimulate_cli_drift— What-if: simulate a manifest field change and report impact
Fleet Management (3)
rotate_relay_keys— Generate new key, revoke old (Pro / Enterprise)get_fleet_status— Status table for all keys in your fleetget_activity_log— Operational activity log with filtering by event type
Troubleshooting
MCP server fails to connect
The most common cause is a missing -y flag in args. Without it, npx blocks waiting for install confirmation and the MCP stdio handshake times out. Use npx -y @inference-relay/mcp.
License-gated tools say IR_LICENSE_KEY is required
The envblock in your MCP config wasn't picked up. For Claude Code, verify with claude mcp get inference-relay — the output should include an Environment section. If it doesn't, remove and re-add with the --env flag.
Tools return “401 Invalid license key”
Either the key is wrong, expired, or revoked. Run npx inference-relay verify to test the key independently of MCP, or check the dashboard at inference-relay.com/dashboard.
Want to skip the npx slowness
Pre-install the MCP package globally and use the binary directly:
npm install -g @inference-relay/mcp
Then in your MCP config:
{
"mcpServers": {
"inference-relay": {
"command": "inference-relay-mcp",
"env": {
"IR_LICENSE_KEY": "ir_live_xxxx"
}
}
}
}Next Steps
- Getting Started — Install the core inference-relay library
- Security Model — How JWS handshake validation protects against MITM
- Fallback Behavior — What the operational health tools surface
- Dashboard — The same data the MCP tools query, but in a browser