Products — idclinks

Frontier tier · premium reasoning

When the answer matters more than the dollar.

Top-of-stack models for complex reasoning, long-form synthesis, and high-stakes tool-using agents. Priced higher per token, but the per-task economics often beat the Balanced tier once you account for retry rates and human review.
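To make the per-task argument concrete, here is a minimal sketch of the arithmetic. It assumes each attempt succeeds independently with a fixed probability, failed attempts are retried until one succeeds, and a fixed share of tasks goes to human review. Every number below is a hypothetical placeholder, not a quoted price or a measured rate.

```python
def expected_cost_per_task(cost_per_attempt: float, success_rate: float,
                           review_rate: float, review_cost: float) -> float:
    """Expected spend per completed task under a simple retry-until-success model."""
    if not 0 < success_rate <= 1:
        raise ValueError("success_rate must be in (0, 1]")
    # With independent attempts, the expected number of tries is 1 / success_rate.
    expected_attempts = 1 / success_rate
    return cost_per_attempt * expected_attempts + review_rate * review_cost


# Hypothetical inputs: the frontier model costs 4x more per attempt,
# but fails less often and needs review on fewer tasks.
frontier = expected_cost_per_task(0.04, 0.95, review_rate=0.05, review_cost=2.00)
balanced = expected_cost_per_task(0.01, 0.70, review_rate=0.20, review_cost=2.00)

print(f"frontier: {frontier:.2f} per task")  # frontier: 0.14 per task
print(f"balanced: {balanced:.2f} per task")  # balanced: 0.41 per task
```

With these placeholder inputs the review term dominates, which is why the cheaper model ends up costlier per task. Plug in your own measured retry and review rates before drawing conclusions.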

Gemini 3.1 Pro

Frontier · 1M ctx · multimodal

One-million-token context window with strong native multimodal understanding. Our preferred choice for repository-scale code review and long PDF analysis.

text vision audio tools 1M context

GPT 5.5

Frontier · 1M ctx · generalist

The pragmatic default. Best-in-class structured outputs, mature function calling, and the most predictable latency profile under bursty traffic.

text vision tools json mode

Claude Opus 4.7

Frontier · 1M ctx · long-form

The strongest long-form reasoning and instruction adherence of any model we test. Pairs well with agentic frameworks and is our top pick for code-writing pipelines.

text vision tools artifacts

Balanced tier · production workhorses

Where most of your traffic should land.

Frontier-adjacent quality at a fraction of the per-token cost. These are the models we recommend for the bulk of customer-facing surfaces — chat, drafting, summarization, retrieval-augmented Q&A.

Gemini 3 Flash

Balanced · 1M ctx · fast

Long-context understanding at sub-second TTFT. Excellent for streaming UIs where users see the first token before they finish breathing.

text vision audio tools fast

GPT-5.4 mini

Balanced · 400K ctx · efficient

The same generalist instincts as its bigger sibling, tuned for lower cost and higher throughput. Solid retry target when frontier capacity is saturated.

text vision tools

Claude Sonnet 4.6

Balanced · 1M ctx · agentic

A favorite for agent loops: strong tool use, a low refusal rate on ambiguous prompts, and very stable behavior across long multi-turn sessions.

text vision tools agents

Efficient tier · open weights & specialists

Volume work, settled economics.

When unit economics decide whether a feature ships at all. Open-weights flagships and purpose-built specialists, available on shared capacity or as reserved replicas.

Llama 4 Maverick

Open · 1M ctx · flagship

Open-weights flagship with strong general capability. Choose dedicated replicas for data residency, or shared capacity for unbeatable per-token economics.

text vision open-weights dedicated

DeepSeek V4

Specialist · 1M ctx · code & math

Punches above its price on code and math benchmarks. The right call for cost-sensitive developer-tooling pipelines that still demand top-decile output.

text code value

Qwen3-Max

Multilingual · 256K ctx · open

Strongest multilingual coverage in the catalog, especially for Chinese, Japanese, and Korean. Pairs well with vision tasks for cross-border consumer surfaces.

text multilingual

Mistral Large 3

Balanced · 256K ctx · EU-hosted

European-hosted option with reliable instruction following and competitive pricing. Often the simplest answer for EU data-residency requirements.

text tools eu region

Command R+

Specialist · 128K ctx · RAG

Purpose-built for retrieval-augmented generation. Excellent citation behavior and grounded responses make it our top pick for knowledge-base applications.

text rag citations

Phi-4

Compact · 16K ctx · edge

A small-but-capable specialist for classification, intent routing, and structured extraction at scale. Costs pennies; runs everywhere.

text compact cheap

Use cases

Shapes of work we route well.

A model name on a benchmark doesn't tell you which one will hold up under your traffic. Here's how teams compose the catalog in production.

01 · Agents & copilots

Long-running tool-using sessions.

Claude Sonnet 4.6 or GPT 5.5 for the loop; Claude Opus 4.7 for hard subtasks; an efficient specialist for the structured tool arguments. We expose a single conversation ID and stitch traces so you can audit every hop.
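One way to express that split on the client side is a small lookup from step type to model. This is a sketch only: the step names are hypothetical, and of the model IDs below only "claude-sonnet-4.6" appears in the integration examples on this page. The other two are assumed to follow the same naming pattern.

```python
# Hypothetical step kinds mapped to model IDs.
ROUTES = {
    "loop": "claude-sonnet-4.6",        # ID taken from the integration example
    "hard_subtask": "claude-opus-4.7",  # assumed ID
    "tool_args": "phi-4",               # assumed ID for an efficient specialist
}


def pick_model(step_kind: str) -> str:
    """Return the model for a step, defaulting to the loop model for unknown kinds."""
    return ROUTES.get(step_kind, ROUTES["loop"])


print(pick_model("hard_subtask"))  # claude-opus-4.7
print(pick_model("summarize"))     # claude-sonnet-4.6
```

Defaulting unknown step kinds to the loop model keeps a new step type from silently landing on the most expensive tier.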

02 · RAG & search

Grounded answers with citations.

Command R+ as default; Gemini 3 Flash when the grounding context outgrows Command R+'s 128K window; Phi-4 for the embedding-adjacent classifier work. Streaming JSON keeps your search UI responsive even on multi-document grounding.
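The context-size switch can be sketched as a simple threshold check. The 128K figure comes from the Command R+ catalog entry (read here as 128,000 tokens). The model IDs, the reserved output allowance, and the assumption that you already have a token count for the prompt are all illustrative.

```python
COMMAND_R_PLUS_CTX = 128_000  # "128K ctx" from the catalog, read as 128,000 tokens


def pick_rag_model(prompt_tokens: int, reserved_output_tokens: int = 4_000) -> str:
    """Prefer the RAG specialist; switch to the long-context model when it won't fit."""
    if prompt_tokens < 0 or reserved_output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if prompt_tokens + reserved_output_tokens <= COMMAND_R_PLUS_CTX:
        return "command-r-plus"  # assumed model ID
    return "gemini-3-flash"      # assumed model ID


print(pick_rag_model(30_000))   # command-r-plus
print(pick_rag_model(300_000))  # gemini-3-flash
```

Reserving room for the answer matters because the window covers prompt and completion together. A prompt that just fits can still fail once the model starts generating.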

03 · Code generation

Editor-quality code across repos.

Claude Opus 4.7 or Gemini 3.1 Pro on the writing side, DeepSeek V4 on the verification side. We route diff-style requests to whichever pool has the lowest TTFT in your region — the user just sees fast.
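Lowest-TTFT routing can be approximated as "pick the pool with the smallest recent median time to first token". The sketch below illustrates that idea and is not a description of the production router. The pool names and sample values are hypothetical.

```python
from statistics import median


def fastest_pool(ttft_samples_ms: dict[str, list[float]]) -> str:
    """Return the pool whose recent TTFT samples have the lowest median."""
    # Pools with no samples yet are ignored rather than treated as fast.
    medians = {pool: median(samples)
               for pool, samples in ttft_samples_ms.items() if samples}
    if not medians:
        raise ValueError("no pool has TTFT samples")
    return min(medians, key=medians.get)


samples = {
    "eu-west-a": [210.0, 190.0, 400.0],  # median 210 ms
    "eu-west-b": [180.0, 185.0, 175.0],  # median 180 ms
    "eu-west-c": [],                     # no data yet
}
print(fastest_pool(samples))  # eu-west-b
```

Using the median rather than the mean keeps one slow outlier, such as the 400 ms sample above, from disqualifying an otherwise fast pool.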

04 · Customer-facing chat

Predictable cost per conversation.

Balanced tier as the default, with auto-failover into the efficient tier under saturation. Per-tenant rate budgets prevent one heavy user from starving the rest.

05 · Batch & pipelines

Cost-first asynchronous workloads.

Open-weights flagships with dedicated capacity and overnight-window scheduling. We can quote pricing per million completed jobs, not per token, when that's the shape your finance team prefers.

Integration

Drop-in. Not rip-and-replace.

The API mirrors the OpenAI REST schema, so existing SDKs and tooling work unmodified: change the base URL and the API key, and you're in.

Python

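import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.idclinks.com/v1",
    api_key=os.environ["IDCLINKS_KEY"],  # raises KeyError if the variable is unset
)

# Request a streamed completion; chunks arrive as they are generated.
resp = client.chat.completions.create(
    model="claude-sonnet-4.6",
    messages=[{"role": "user", "content": "Hi"}],
    stream=True,
)

for chunk in resp:
    # Some chunks carry no choices or no text (for example the final one); skip them.
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="", flush=True)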

TypeScript

import OpenAI from "openai";

const client = new OpenAI({
  baseURL: "https://api.idclinks.com/v1",
  apiKey: process.env.IDCLINKS_KEY,
});

const stream = await client.chat.completions.create({
  model: "gpt-5.5",
  messages: [{ role: "user", content: "Hi" }],
  stream: true,
});

for await (const chunk of stream) {
  // Guard against chunks that carry no choices or no text.
  process.stdout.write(chunk.choices[0]?.delta?.content ?? "");
}
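If you are not using an SDK, the same call can be made over plain HTTP. The sketch below builds a non-streaming request with the Python standard library, assuming the endpoint follows the OpenAI chat-completions shape as described above. The helper name is hypothetical, and the request is constructed here without being sent.

```python
import json
import urllib.request

BASE_URL = "https://api.idclinks.com/v1"


def build_chat_request(model: str, user_text: str, api_key: str) -> urllib.request.Request:
    """Build a POST to /chat/completions in the OpenAI request format."""
    body = json.dumps({
        "model": model,
        "messages": [{"role": "user", "content": user_text}],
    }).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_chat_request("gpt-5.5", "Hi", api_key="YOUR_KEY")
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req), then parse the JSON body.
# In the OpenAI schema the reply text is at choices[0].message.content.
```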

A curated catalog of every model worth paying for.