Pricing

Honest ranges. Locked rates on commitment.

Per-token rates depend on the model tier, your region, and whether you're on shared or dedicated capacity. We publish ranges so you can size a workload before talking to anyone, then quote a specific rate once the shape of your workload is clear.

Pay-as-you-go
Developer
$0/mo base
Per-token billing across the full catalog. No commitment, no minimums — start with a trial key, scale when you're ready.
  • All catalog models
  • Shared inference pools
  • Streaming, tools, vision, JSON mode
  • Email support, business hours
  • 10K req/min default ceiling
Request a trial key
Reserved capacity
Enterprise
Custom, monthly term
Reserved replicas, BYO-provider, data-residency, zero-retention routes, and the paperwork your security team requires.
  • Dedicated GPU replicas
  • Custom regions & data residency
  • Zero-retention inference routes
  • BYO-provider key support
  • SOC 2 Type II artifacts on request
Contact sales
Per-token ranges

Public pay-as-you-go rates.

Ranges represent the spread between regions, capacity types, and load conditions. The rate you actually pay is determined by your route and is shown in the response headers of every request.

Model              Tier       Context  Input / 1M tokens  Output / 1M tokens
Claude Opus 4.7    frontier   1M       $13.00 – $18.00    $65.00 – $90.00
Gemini 3.1 Pro     frontier   1M       $2.20 – $3.50      $13.00 – $18.00
GPT 5.5            frontier   1M       $5.00 – $8.00      $30.00 – $40.00
Claude Sonnet 4.6  balanced   1M       $3.00 – $4.50      $15.00 – $20.00
GPT-5.4 mini       balanced   400K     $0.80 – $1.40      $4.50 – $6.50
Gemini 3 Flash     balanced   1M       $0.55 – $1.00      $3.00 – $4.50
Mistral Large 3    balanced   256K     $0.55 – $1.20      $1.60 – $3.50
Llama 4 Maverick   efficient  1M       $0.20 – $0.55      $0.70 – $1.60
DeepSeek V4        efficient  1M       $0.20 – $0.50      $0.80 – $1.80
Qwen3-Max          efficient  256K     $0.85 – $1.40      $4.00 – $5.50
Command R+         efficient  128K     $2.50 – $3.50      $10.00 – $13.00
Phi-4              efficient  16K      $0.14 – $0.25      $0.50 – $0.80

Ranges in USD. Committed customers see flat locked rates inside this range; the exact rate is set in your order form. Dedicated capacity is priced per replica-month and quoted separately.
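
To size a workload against these ranges, multiply your monthly input and output token counts by the low and high ends of the published rates. The sketch below does that for a few rows copied from the table above. The names `RateRange`, `RATES`, and `estimate_monthly_cost` are illustrative and not part of any SDK, and the result is a planning estimate rather than an invoice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RateRange:
    """Published pay-as-you-go range, in USD per 1M tokens."""
    input_low: float
    input_high: float
    output_low: float
    output_high: float


# A few rows copied from the public table; add the models you plan to use.
RATES = {
    "Claude Sonnet 4.6": RateRange(3.00, 4.50, 15.00, 20.00),
    "Gemini 3 Flash": RateRange(0.55, 1.00, 3.00, 4.50),
    "Phi-4": RateRange(0.14, 0.25, 0.50, 0.80),
}


def estimate_monthly_cost(model: str, input_tokens: int, output_tokens: int) -> tuple[float, float]:
    """Return the (low, high) monthly cost in USD for the given token volumes."""
    r = RATES[model]
    low = (input_tokens * r.input_low + output_tokens * r.output_low) / 1_000_000
    high = (input_tokens * r.input_high + output_tokens * r.output_high) / 1_000_000
    return low, high


if __name__ == "__main__":
    low, high = estimate_monthly_cost("Claude Sonnet 4.6", 200_000_000, 40_000_000)
    print(f"Estimated monthly cost: ${low:,.2f} to ${high:,.2f}")
```

For 200M input tokens and 40M output tokens a month on Claude Sonnet 4.6, this gives a range of $1,200 to $1,700 before any add-ons.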

Add-ons

Pricing for the things around the model.

Most workloads only need the per-token meter. These line items show up for teams who need more control over where their inference runs.

Add-on What it covers Price range
Dedicated replica Reserved GPU capacity for a single model, in a region of your choice. $3,500 – $14,000 per replica-month
Zero-retention route Prompt and completion bytes never leave volatile memory. Audit log only. +10 – 20% on token rates
Data-residency lock Requests pinned to a specific region with hard failover off. +5 – 12% on token rates
Premium support 24/7 paging, named contact, quarterly architecture review. $1,500 – $4,000 per month
BYO-provider routing Use your own provider key under our schema, routing, and observability layer. $0.10 – $0.30 per 1K requests
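
As a rough illustration of how the percentage add-ons affect a token rate, the sketch below applies the zero-retention and data-residency surcharges to a base range. It assumes the percentages stack additively on the base rate, which this page does not specify, so your order form is the authority on the actual method. The function names are hypothetical.

```python
# Surcharge ranges from the add-ons table, as (low %, high %).
ZERO_RETENTION = (10, 20)
DATA_RESIDENCY = (5, 12)


def apply_surcharges(base_rate: float, surcharge_pcts: list[float]) -> float:
    """Add percentage surcharges to a base rate.

    Assumption: surcharges are summed and applied once (additive stacking).
    """
    return base_rate * (1 + sum(surcharge_pcts) / 100)


def surcharged_range(base_low: float, base_high: float,
                     addons: list[tuple[float, float]]) -> tuple[float, float]:
    """Return the (low, high) rate after applying every add-on's surcharge range."""
    low = apply_surcharges(base_low, [lo for lo, _ in addons])
    high = apply_surcharges(base_high, [hi for _, hi in addons])
    return low, high


if __name__ == "__main__":
    # Base input range of $3.00 to $4.50 per 1M tokens, with both add-ons enabled.
    low, high = surcharged_range(3.00, 4.50, [ZERO_RETENTION, DATA_RESIDENCY])
    print(f"Input rate with add-ons: ${low:.2f} to ${high:.2f} per 1M tokens")
```

Under that assumption, a $3.00 to $4.50 input range becomes roughly $3.45 to $5.94 per 1M tokens with both add-ons enabled.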
Pricing FAQ

Common questions from finance reviews.

Why ranges instead of single prices?
Because the rate genuinely varies. A request routed to a US-West shared pool off-peak isn't priced the same as one served by a Tokyo dedicated replica under peak load. Publishing one number would mean either undercharging until we go out of business or overcharging customers who don't need the premium path. Ranges are honest; the exact rate is in your response headers and your invoice.
How do committed rates work?
You commit to a monthly token volume (or a dollar floor) for a term of 3, 6, or 12 months. In exchange, your per-token rate is locked at a single number inside the public range for the duration. Unused commitment doesn't roll over by default; we can quote a roll-over clause if it matters to your forecasting.
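
For forecasting, a committed month can be modelled as "pay for the larger of what you used and what you committed to." The sketch below assumes a single blended locked rate, a token-volume commitment, no roll-over, and usage above the commitment billed at the same locked rate. Those simplifications are ours, not contract terms, and `monthly_invoice` is a hypothetical name.

```python
def monthly_invoice(tokens_used: int, committed_tokens: int,
                    locked_rate_per_1m: float) -> float:
    """Return the month's charge in USD under a token-volume commitment.

    Assumptions (not contract terms): one blended rate for all tokens,
    unused commitment does not roll over, and overage is billed at the
    same locked rate.
    """
    usage_charge = tokens_used / 1_000_000 * locked_rate_per_1m
    commitment_charge = committed_tokens / 1_000_000 * locked_rate_per_1m
    return max(usage_charge, commitment_charge)


if __name__ == "__main__":
    rate = 3.50  # hypothetical locked rate, USD per 1M tokens
    for used in (400_000_000, 600_000_000):
        total = monthly_invoice(used, 500_000_000, rate)
        print(f"{used:,} tokens used -> ${total:,.2f}")
```

With a 500M-token commitment at a hypothetical $3.50 locked rate, a 400M-token month still bills $1,750, while a 600M-token month bills $2,100.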
How is billing measured?
Per token, as reported by the upstream provider, with no rounding up. Cached prompt tokens are billed at the standard cached rate (usually 25–50% of input price, depending on the model). Tool-use and structured-output overhead is included in the token count; we don't charge a separate "agent surcharge."
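
To see what the cached rate does to an input bill, the sketch below splits prompt tokens into fresh and cached and prices the cached share at a fraction of the input rate. The fraction is a parameter because it varies by model, and the 0.25 and 0.50 values used here are just the ends of the usual range quoted above. `input_cost` is an illustrative helper, not an API.

```python
def input_cost(fresh_tokens: int, cached_tokens: int,
               input_rate_per_1m: float, cached_fraction: float) -> float:
    """Return the input-side cost in USD when part of the prompt is served from cache.

    cached_fraction is the cached rate as a fraction of the input price,
    for example 0.25 for a model whose cached tokens cost 25% of input.
    """
    if not 0 <= cached_fraction <= 1:
        raise ValueError("cached_fraction must be between 0 and 1")
    fresh = fresh_tokens / 1_000_000 * input_rate_per_1m
    cached = cached_tokens / 1_000_000 * input_rate_per_1m * cached_fraction
    return fresh + cached


if __name__ == "__main__":
    # 10M fresh and 30M cached prompt tokens at a $3.00 input rate.
    for fraction in (0.25, 0.50):
        cost = input_cost(10_000_000, 30_000_000, 3.00, fraction)
        print(f"cached at {fraction:.0%} of input price -> ${cost:.2f}")
```

At a $3.00 input rate, 10M fresh plus 30M cached tokens cost $52.50 when cached tokens bill at 25% of input price and $75.00 at 50%, compared with $120.00 if nothing were cached.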
Do you offer free credits or a trial?
Yes — every new account starts with a trial credit large enough to validate a real workload, not just hello-world. Email sales@idclinks.com with a short description of what you'd test and we'll size the credit accordingly.
What payment methods do you accept?
Credit card, ACH/wire, and invoice (net-30) for committed customers. We can support most procurement systems and master service agreements; talk to sales for the paperwork.

Want a quote for your shape?

Send us your monthly token estimate, model mix, and region. We'll come back with a locked-rate offer inside the ranges above — usually within one business day.