One library. Pennies for orchestration.

Inference Relay is licensed to enterprises running AI at scale. Pricing is sized to your fleet, your spend, and your security posture.

PATENT PENDING

One license key. Works with the inference-relay npm library (v1.0) or the standalone daemon (v1.1+) — same key, your choice.

Solo
$50/mo
All providers, fallback cascade, streaming (v1.0), auto-patch, analytics. 3,000 calls/month.
Get Started
Pro
$100/mo
15,000 calls/month. Warm process pool, advanced routing DSL, tamper-evident audit trail.
Get Started
Enterprise
Custom
Annual license. Volume and terms quoted per organization.
Routing
  • Provider cascade across Anthropic, OpenAI, Google, Together
  • Streaming with auto-patch integration
  • Advanced routing DSL (per-call provider, model, temperature)
  • Warm process pool
Fleet
  • Org-wide key management
  • Fleet policy (MDM-signed)
  • Tamper-evident audit trail
  • Usage analytics across teams and projects
Security & support
  • SSO / SCIM (planned)
  • NDA code audit on request
  • Dedicated support channel
  • Custom call volumes — sized to your fleet
Field of use. We license Inference Relay broadly across industries — healthcare, logistics, financial services, SaaS, manufacturing, media, and many others. Tell us what you’re building and we’ll work out terms that fit.
Economic Impact Analysis
Monthly API Spend
$
$0$1K$10K$50K$250K
Current Monthly Cost$15,000
Relay Monthly Cost$255
Monthly Savings$14,745
Current Gross Margin15%
Relay Gross Margin98.3%
Projected Annual Savings$176,940
Before
$15,000
After
$255
Frequently Asked Questions
Getting Started
Pricing & Business Model
Use Cases & Positioning
Technical / How It Works
Comparison
Privacy & Security
Operational
Compliance, Legal, Anthropic
Patent / IP
Talk to us about a license.
Tell us about your fleet and your current AI spend. We'll come back with a quote and a deployment plan.
Read the Docs →
or email contact@inference-relay.com