Two ways to pay less.

Per-token API discounts for teams. Flat subscriptions for individuals. Same models, same API - both cheaper than going direct.

For teams · API gateway

Pay per token,
save by volume.

A flat discount on every provider's published rate, plus volume bonuses that stack as monthly spend grows. Built for production.

Volume tiers

Spend more, save more

Additional points stack on top of every provider discount.

Pay as you go

Standard

For new teams and prototypes. The published per-provider discount with zero commitment.

  • 10-25% off published rates
  • Pay-as-you-go, no minimums
  • Unified API, request log, retries
  • Basic budget alerts
+0%
stacked discount
Monthly spend
$0 - $5,000
Start free
Most teams

Scale

For production teams. Predictable invoiced billing, guardrails, and dedicated support.

  • Everything in Standard
  • +3% on every provider discount
  • Net-30 invoiced billing
  • Cost guardrails & per-team budgets
  • Shared Slack support channel
+3%
stacked discount
Monthly spend
$5,000 - $50,000
Talk to sales
Enterprise

Commit

For mission-critical workloads. Custom commit pricing, dedicated capacity, and a real SLA.

  • Everything in Scale
  • +5-8% negotiated discount
  • Annual commit, quarterly true-up
  • SSO, audit log, custom DPA
  • Named CSM & 99.95% SLA
+5-8%
stacked discount
Monthly spend
$50,000+
Contact us
Per-provider rates

Fixed flat discount per provider

Same rate for every model in that provider's catalog.

A
Anthropic
Claude family — long-context reasoning, careful coding agents, multimodal vision.
15%
AI
OpenAI
GPT and o-series — flagship multimodal models and dedicated reasoning engines.
12%
G
Google
Gemini family — 1M+ context, native multimodal, fast pricing tier.
10%
DS
DeepSeek
V3 / R1 / Coder — strong reasoning at deep-discount rates.
22%
M
Mistral
Large / Medium / Codestral / Ministral — European general-purpose models.
15%
X
xAI
Grok 4 / Grok 4 mini / Grok 3 — fresh-data reasoning models.
12%
Caching, batch, tools, multimodal — line-item details
Real workloads use more than chat completions. Click to expand.
+

Prompt caching

Cached prompt reads bill at the provider's reduced cache rate; the gateway discount stacks on top.

  • Anthropic cache write+25% over input
  • Anthropic cache read-90% off input
  • OpenAI cached input-50% off input
  • Google context caching-75% off input

Batch & async jobs

For 24h-tolerant workloads. Pass X-UOU-Batch: true on supported models.

  • OpenAI / Anthropic / Google batch-50% off rate
  • Gateway routingfree for batch

Tool & function calls

Tool definitions and results are billed as ordinary input/output - no agent surcharge.

  • Tool defs / resultsinput rate
  • Parallel tool callsno surcharge

Image, audio, file inputs

Multimodal inputs convert to tokens using each provider's published formula, shown in the request log.

  • Image (Anthropic)~1.6K tok / img
  • Image (OpenAI high-detail)765 + 170/tile
  • Audio (Gemini)~30K tok / min
  • Gateway transit feenone
Savings calculator

Estimate what you'd save

Drag the sliders to match your spend and provider mix.

Monthly spend$5,000
Provider mix · auto-normalized
AAnthropic35%
AIOpenAI30%
GGoogle20%
DSDeepSeek15%

Mistral & xAI not shown; contact sales for a custom mix.

Your savings
Vs. paying providers directly
$858/ mo
$10,290 over a year
Effective discount
17.2%
You'd pay
$4,143

Ready to estimate, then connect with a salesperson? We respond within one business day.

Talk to sales
For individuals · Subscription + PAYG

Flat monthly fee,
or 10% off PAYG.

Like Claude Pro or ChatGPT Plus, but routed through one API across the three major providers. Allowance billed at official rates.

Subscriptions

Three plans, monthly fee

Daily & weekly limits keep usage predictable. No hourly cliff.

Free

Trial

Kick the tires on a small set of models, no card required.

  • Haiku, GPT-4o mini, Gemini Flash-Lite
  • 5 RPM · 20K TPM
  • Daily: 25 requests
  • No PAYG top-up · no flagship models
$0
Trial credit
$1.00
official-rate
Start free
Most popular

Plus

For solo devs and indie projects. All major models, smooth daily limits.

  • All Anthropic, OpenAI & Google models
  • 30 RPM · 100K TPM
  • Daily $0.75 · Weekly $4
  • PAYG fallback (10% off) after allowance
$10/ mo
Included
$10$15
+50% value
Get Plus
Power user

Pro

For heavy personal builds. Higher limits and headroom on premium models.

  • All Anthropic, OpenAI & Google models
  • 60 RPM · 200K TPM
  • Daily $2 · Weekly $10
  • Priority routing on flagship models
$25/ mo
Included
$25$40
+60% value
Get Pro
Per-provider rates

Flat 10% off, all available providers

Subscription allowance bills at official rates; overflow and PAYG bill at 10% off. DeepSeek, Mistral & xAI are team-only for now.

A
Anthropic
All Claude models · in + out
10%
AI
OpenAI
All GPT and o-series models
10%
G
Google
Gemini family · text + multimodal
10%
DS
DeepSeek
Team plans only
M
Mistral
Team plans only
X
xAI
Team plans only
How rate & spend limits work
No hourly cliff. Daily and weekly caps keep usage predictable.
+
  • Hourly limit— none —
  • Daily limit (Plus / Pro)$0.75 / $2.00
  • Weekly limit (Plus / Pro)$4.00 / $10.00
  • After allowance exhaustedPAYG, 10% off
  • Hard spend capUser-set, default $50/mo
Alternative

Personal PAYG estimator

No subscription, no minimum. Drag the slider to see what you'd pay.

Monthly usage · official rates$50
What's included
  • No subscription commitment
  • Flat 10% off official rates
  • Same logs, keys, and provider routing
Your PAYG cost
Flat 10% off official rates
$45/ mo
You'd save $5 / mo vs paying providers direct
Effective discount
10.0%
Default cap
$50 / mo

Ready to start small? Sign up, get $1 in trial credit, no card needed.

Start free
Common questions

FAQ.

How are discount rates calculated and where do they come from?+

We negotiate volume rates directly with each provider and pass the bulk of that delta back as a flat discount on the public per-token price. The rate is fixed per provider and applies to both input and output tokens.

What's the difference between the Plus subscription and Personal PAYG?Individual+

A subscription is cheaper when you use the gateway regularly. PAYG makes sense if your usage is irregular or very small.

What are the daily / weekly limits on subscription plans?Individual+

Both Plus and Pro enforce daily and weekly spending caps so usage stays roughly proportional through the month. There is intentionally no hourly limit.

Why is the PAYG discount lower than what teams get?Individual+

Personal accounts have more operational overhead per dollar of revenue and lower aggregate volume. The 10% flat rate covers that overhead while still landing below the official provider price.

Do volume tier bonuses stack on the provider discount?Team+

Yes. Scale (+3%) on Anthropic (15%) = 18% effective discount across all Claude models. Enterprise commit tiers are negotiated based on annual volume.

Is there a markup or hidden fee on top of the model rate?+

No. Your invoice line items show provider rate x (1 - discount). There is no per-request fee, no egress fee, and no routing surcharge.

How is billing handled? Prepaid, postpaid, invoiced?Team+

Pay-as-you-go is prepaid via card. Above $5K/mo we can move you to monthly invoiced billing with Net-30 terms.

What's the discount on cached, batch, or prompt-cached tokens?+

Provider's reduced rate applies first, then our flat discount stacks on that. A cache read on Claude effectively costs input x 0.10 x 0.85.

Are there refunds for failed requests or upstream outages?+

Requests that fail upstream before generating output tokens are not billed. Partial generations bill only for delivered tokens.

Stop overpaying
for the same models.

One key, transparent rates, every provider. Start free as an individual or talk to sales about your team's volume.