Pay per token,
save by volume.
A flat discount on every provider's published rate, plus volume bonuses that stack as monthly spend grows. Built for production.
Spend more, save more
Additional percentage points stack on top of every provider discount.
Standard
For new teams and prototypes. The published per-provider discount with zero commitment.
- 10-25% off published rates
- Pay-as-you-go, no minimums
- Unified API, request log, retries
- Basic budget alerts
Scale
For production teams. Predictable invoiced billing, guardrails, and dedicated support.
- Everything in Standard
- +3% on every provider discount
- Net-30 invoiced billing
- Cost guardrails & per-team budgets
- Shared Slack support channel
Commit
For mission-critical workloads. Custom commit pricing, dedicated capacity, and a real SLA.
- Everything in Scale
- +5-8% negotiated discount
- Annual commit, quarterly true-up
- SSO, audit log, custom DPA
- Named CSM & 99.95% SLA
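As a rough illustration of how a tier bonus stacks, the sketch below adds the tier's extra points to a provider's base discount and applies the total to a published rate. It assumes the bonus is additive in percentage points, which this page does not spell out. The 15% base discount and the $3.00 rate are placeholder numbers, not quoted prices. Commit is left out because its bonus is negotiated.

```python
from decimal import Decimal

# Tier bonus in percentage points, taken from the plan list above.
# Commit is omitted because its 5-8% bonus is negotiated per contract.
TIER_BONUS_POINTS = {"standard": Decimal("0"), "scale": Decimal("3")}


def effective_rate(published_rate, provider_discount_pct, tier):
    """Return the discounted per-unit rate.

    Assumption: tier points are added to the provider discount
    (15% + 3 points = 18%), not multiplied with it.
    """
    total_pct = provider_discount_pct + TIER_BONUS_POINTS[tier]
    return published_rate * (Decimal("100") - total_pct) / Decimal("100")


# Placeholder inputs: $3.00 per million tokens, 15% provider discount.
standard_rate = effective_rate(Decimal("3.00"), Decimal("15"), "standard")
scale_rate = effective_rate(Decimal("3.00"), Decimal("15"), "scale")
print(standard_rate, scale_rate)
```

Decimal is used instead of float so that billing arithmetic does not pick up binary rounding noise.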
Fixed flat discount per provider
Same rate for every model in that provider's catalog.
Caching, batch, tools, multimodal: line-item details
Real workloads use more than chat completions.
Prompt caching
Cached prompt reads bill at the provider's reduced cache rate; the gateway discount stacks on top.
- Anthropic cache write: +25% over input
- Anthropic cache read: 90% off input
- OpenAI cached input: 50% off input
- Google context caching: 75% off input
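To make the stacking concrete, here is a minimal sketch of how cached and uncached input might be priced. The multipliers come from the list above. The function name, the $3.00 input rate, and the 15% gateway discount are illustrative assumptions, and "stacks on top" is read here as applying the gateway discount to the already-adjusted cache rate.

```python
from decimal import Decimal

# Multipliers relative to the provider's normal input rate, from the list above.
CACHE_MULTIPLIER = {
    "uncached": Decimal("1"),
    "anthropic_write": Decimal("1.25"),  # +25% over input
    "anthropic_read": Decimal("0.10"),   # 90% off input
    "openai_read": Decimal("0.50"),      # 50% off input
    "google_read": Decimal("0.25"),      # 75% off input
}


def input_cost(tokens, rate_per_mtok, kind, gateway_discount_pct):
    """Cost of `tokens` input tokens after cache pricing and gateway discount.

    Assumption: the gateway discount multiplies the provider's cache-adjusted
    cost; the page does not give the exact formula.
    """
    provider_cost = Decimal(tokens) / Decimal(1_000_000) * rate_per_mtok * CACHE_MULTIPLIER[kind]
    return provider_cost * (Decimal("100") - gateway_discount_pct) / Decimal("100")


# Placeholder inputs: 1M tokens at $3.00 per million, 15% gateway discount.
uncached_cost = input_cost(1_000_000, Decimal("3.00"), "uncached", Decimal("15"))
cache_read_cost = input_cost(1_000_000, Decimal("3.00"), "anthropic_read", Decimal("15"))
cache_write_cost = input_cost(1_000_000, Decimal("3.00"), "anthropic_write", Decimal("15"))
print(uncached_cost, cache_read_cost, cache_write_cost)
```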
Batch & async jobs
For workloads that can tolerate up to 24 hours of turnaround. Pass the X-UOU-Batch: true header on supported models.
- OpenAI / Anthropic / Google batch: 50% off rate
- Gateway routing: free for batch
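A request flagged for batch processing could be built roughly as follows. This is a sketch only: it assumes X-UOU-Batch is sent as an HTTP request header, and the endpoint URL, bearer-token auth scheme, and payload shape are hypothetical stand-ins for whatever your account documentation specifies.

```python
import json
import urllib.request

# Hypothetical endpoint; substitute the gateway URL from your account.
GATEWAY_URL = "https://gateway.example.com/v1/chat/completions"


def build_batch_request(api_key, payload):
    """Build (but do not send) a POST request flagged for batch processing."""
    return urllib.request.Request(
        GATEWAY_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
            "Content-Type": "application/json",
            "X-UOU-Batch": "true",  # flag named on this page
        },
        method="POST",
    )


# Placeholder key and payload; nothing is sent over the network here.
request = build_batch_request("YOUR_API_KEY", {"model": "example-model", "messages": []})
print(request.get_method(), request.full_url)
```

Passing the request to urllib.request.urlopen would send it; that step is left out so the example runs without credentials.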
Tool & function calls
Tool definitions and results are billed as ordinary input/output, with no agent surcharge.
- Tool defs / results: input rate
- Parallel tool calls: no surcharge
Image, audio, file inputs
Multimodal inputs convert to tokens using each provider's published formula, shown in the request log.
- Image (Anthropic): ~1.6K tok / img
- Image (OpenAI high-detail): 765 + 170/tile
- Audio (Gemini): ~30K tok / min
- Gateway transit fee: none
Estimate what you'd save
Drag the sliders to match your spend and provider mix.
Mistral & xAI not shown; contact sales for a custom mix.
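If you prefer to run the numbers yourself, the estimate can be approximated with a few lines of arithmetic. The sketch below is not the calculator behind this page. The per-provider discounts are placeholders chosen from within the 10-25% Standard range, and the bonus is assumed to add percentage points to each provider's discount.

```python
from decimal import Decimal

# Placeholder discounts inside the 10-25% Standard range; not quoted rates.
EXAMPLE_DISCOUNT_PCT = {
    "openai": Decimal("10"),
    "anthropic": Decimal("15"),
    "google": Decimal("25"),
}


def estimate_monthly_savings(spend_at_published_rates, mix, bonus_points=Decimal("0")):
    """Estimate monthly savings in dollars.

    `mix` maps provider name to its share of spend; shares must sum to 1.
    Assumption: tier bonus points are added to each provider's discount.
    """
    if sum(mix.values()) != Decimal("1"):
        raise ValueError("provider shares must sum to 1")
    savings = Decimal("0")
    for provider, share in mix.items():
        pct = EXAMPLE_DISCOUNT_PCT[provider] + bonus_points
        savings += spend_at_published_rates * share * pct / Decimal("100")
    return savings


# Placeholder scenario: $10,000 per month split 50/30/20 across providers.
mix = {"openai": Decimal("0.5"), "anthropic": Decimal("0.3"), "google": Decimal("0.2")}
standard_savings = estimate_monthly_savings(Decimal("10000"), mix)
scale_savings = estimate_monthly_savings(Decimal("10000"), mix, bonus_points=Decimal("3"))
print(standard_savings, scale_savings)
```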
Ready to go over your estimate with sales? We respond within one business day.
Talk to sales →