A unified inference gateway

One API.
Every frontier
model.

idclinks is a single endpoint for the world's most capable language and multimodal models. Switch between providers with a string, pay predictable rates, and ship without renegotiating contracts every quarter.

All systems normal v2026.05 release SOC 2 type II
Built for teams that take infrastructure seriously
SOC 2 TYPE II
GDPR · DPA
HIPAA BAA
ISO 27001*
99.95% SLA
6 REGIONS
Catalog

Discover models available on idclinks.

We test, benchmark, and route to the highest-performing inference path for each model. Filter by capability — you write the model name, we handle the rest.

20% off
g google · gemini
Gemini-3.1-Pro
o openai
GPT-5.5
10% off
a anthropic
Claude-Opus-4.7
20% off
d deepseek
DeepSeek-V4
a anthropic
Claude-Sonnet-4.6
30% off
o openai
GPT-5.4-mini
g google · gemini
Gemini-3-Flash
m meta · llama
Llama-4-Maverick
40% off
q qwen
Qwen3-Max
m mistral
Mistral-Large-3
c cohere
Command-R+
p phi
Phi-4
15% off
x diffuse
Diffuse-XL-3
r recraft
Recraft-V4
r reel
Reel-3-Director
v voicepro
VoicePro-9
99.95%
Endpoint uptime, T90
~120ms
P50 time-to-first-token
40+
Models, one schema
6
Routed regions worldwide
Why teams pick idclinks

Boring infrastructure for an exciting industry.

We don't make the models. We make sure the models you depend on stay reachable, predictable, and accountable — so your roadmap isn't a hostage to anyone else's quota.

01 / 04

One schema across every provider.

OpenAI-compatible REST and streaming. Swap models with a string — the same code path works for completions, tool-calling, vision, and structured JSON. Provider-specific extensions are exposed via documented headers, never silent gotchas.

02 / 04

Smart routing that respects your SLA.

Requests are routed to the lowest-latency healthy replica in your region. Saturated upstream? We failover to the next pool before your retry budget runs out. You see the same model name; we handle the plumbing.

03 / 04

Pricing that survives a finance review.

Per-token pricing in transparent ranges. Volume committed traffic gets dedicated capacity and locked rates for the term. No surprise per-region multipliers, no priority-tier upcharges hiding in a PDF.

04 / 04

Observability you'd build yourself.

Per-request traces with provider, replica, region, token counts, and latency breakdown. Stream them to your existing pipeline via webhook, S3, or OpenTelemetry — no proprietary dashboard you have to live in.

Pricing

Pay-as-you-go ranges. Committed pricing on request.

Per-token rates depend on the model tier, region, and whether you're on shared or dedicated capacity. Below are the public PAYG ranges. Committed customers see lower, locked rates — talk to sales.

Model tier Example models Input / 1M tokens Output / 1M tokens
Frontier premium Claude Opus 4.7 · Gemini 3.1 Pro · GPT 5.5 $4.50 – $7.50 $18.00 – $30.00 Details
Balanced default Claude Sonnet 4.6 · Gemini 3 Flash · GPT-5.4 mini $0.80 – $1.80 $3.00 – $6.50 Details
Efficient Llama 4 · DeepSeek V4 · Qwen3-Max $0.20 – $0.80 $0.80 – $2.40 Details
Dedicated capacity Any catalog model, reserved replicas Custommonthly term Custommonthly term Talk to sales

Ranges reflect the spread between regions and capacity tiers; the rate you pay depends on your route. Exact per-model rates and committed pricing are quoted on request.

FAQ

Things people ask before signing.

How is idclinks different from calling each provider directly?
Three things, mostly. One schema means you stop maintaining adapter code for every new release. Routing keeps your tail latency stable when an upstream brown-outs. And committed-capacity contracts give you reservable throughput that individual providers often won't sell to a single buyer below enterprise scale.
What happens to my data?
Requests are forwarded to the provider you select. We don't train on your traffic and we don't log prompt or completion content by default — only the metadata you'd expect (model, latency, token counts, status). Zero-retention routes are available for regulated workloads on request.
Which regions do you route from?
Six routed regions today: US East, US West, EU Frankfurt, Singapore, Tokyo, and Sydney. Requests are pinned to your nearest healthy region; failover is automatic and surfaced in the response headers so you always know where your tokens were generated.
Can I bring my own provider account?
Yes — BYO-key routes are supported on Enterprise plans. You keep your provider relationship and any negotiated rates; we handle the schema unification, routing, and observability layer on top.
How do I get started?
Email sales@idclinks.com with a short description of your workload — expected volume, the models you care about, and any compliance constraints. We typically respond within one business day with a trial key and a pricing sheet for your shape.

Ready to ship on one key?

Tell us about your workload — model mix, monthly tokens, region, and any compliance needs. We'll come back with a quote and a trial key.