Free AI model router

Pick the right model for each task. win.sh scores AI models by benchmark quality, coding strength, reasoning strength, speed, and price so your app stops overpaying for easy work.

Are you an AI agent?

Start here: discovery, specs, model data, raw routes, and the skill file.

$ What is the cheapest useful model?

$ curl https://win.sh/router/cheapest

{
  "model": "ministral-3b-2512",
  "name": "Ministral 3 3B",
  "price": "$0.100/1M tokens",
  "why": "Ministral 3 3B is the cheapest useful model at $0.100/1M tokens."
}
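A minimal sketch of reading that response from Python, assuming the endpoint returns exactly the JSON shape shown above. The helper names `parse_cheapest` and `fetch_cheapest` are illustrative and not part of win.sh.

```python
import json
import urllib.request

CHEAPEST_URL = "https://win.sh/router/cheapest"


def parse_cheapest(body: str) -> tuple[str, float]:
    """Return (model id, price in dollars per 1M tokens) from the JSON body."""
    data = json.loads(body)
    # "price" looks like "$0.100/1M tokens"; keep only the number after "$".
    price = float(data["price"].lstrip("$").split("/")[0])
    return data["model"], price


def fetch_cheapest(timeout: float = 10.0) -> tuple[str, float]:
    """Call the live endpoint (needs network access)."""
    with urllib.request.urlopen(CHEAPEST_URL, timeout=timeout) as resp:
        return parse_cheapest(resp.read().decode("utf-8"))


# Offline check against the sample response shown above.
sample = json.dumps({
    "model": "ministral-3b-2512",
    "name": "Ministral 3 3B",
    "price": "$0.100/1M tokens",
    "why": "Ministral 3 3B is the cheapest useful model at $0.100/1M tokens.",
})
model_id, price = parse_cheapest(sample)
print(model_id, price)
```

Parsing is kept separate from fetching so the price handling can be checked without a network call.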

Route by task

Pick the best model before you spend tokens.

Choose the task, latency target, budget, and quality floor. The AI model router sends each request to the cheapest model that can do the job, returns a fallback, and explains the choice so your app stops burning premium tokens on simple work.

Task: Write code

Model region: Any region

Latency: As fast as possible

Budget: Spend as little as possible

Quality floor: High quality only

$ curl "https://win.sh/router?task=code&latency=fast&budget=low&quality=high"
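The same query can be assembled in code rather than by hand. This sketch assumes only the four query parameters visible in the curl command above; `build_router_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

ROUTER_URL = "https://win.sh/router"


def build_router_url(task: str, latency: str, budget: str, quality: str) -> str:
    """Build the /router query string from the four settings in the curl example."""
    params = {"task": task, "latency": latency, "budget": budget, "quality": quality}
    return f"{ROUTER_URL}?{urlencode(params)}"


url = build_router_url(task="code", latency="fast", budget="low", quality="high")
print(url)
```

Using `urlencode` keeps values escaped correctly if a setting ever contains characters that are not URL-safe.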

Recommendation

DeepSeek: DeepSeek V4 Flash

DeepSeek: DeepSeek V4 Flash has the strongest useful intelligence per blended dollar today.

Price: $0.117/1M tokens
Intel: 72
Speed: 210 tokens/sec

Coming soon

Intent Router

Soon you will send one plain task and skip the knobs. win.sh will classify the job, infer quality, latency, risk, and budget pressure, then route to the cheapest model that should succeed.

Instant answers

The common questions already have GET endpoints.

Which model gives the best value?

JSON: https://win.sh/router/best-value
RAW: https://win.sh/router/best-value/raw

DeepSeek: DeepSeek V4 Flash

DeepSeek: DeepSeek V4 Flash has the strongest useful intelligence per blended dollar today.

What is the cheapest useful model?

JSON: https://win.sh/router/cheapest
RAW: https://win.sh/router/cheapest/raw

Ministral 3 3B

Ministral 3 3B has the lowest useful blended price at $0.100/1M tokens.

Which model answers fastest?

JSON: https://win.sh/router/fastest
RAW: https://win.sh/router/fastest/raw

Ministral 3 3B

Ministral 3 3B has the highest estimated generation speed at 320 tokens/sec.

Which model scores highest?

JSON: https://win.sh/router/smartest
RAW: https://win.sh/router/smartest/raw

OpenAI: GPT-5.5 Pro

OpenAI: GPT-5.5 Pro has the highest blended intelligence score in today's index.
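The four instant-answer endpoints follow one pattern: a JSON URL, plus the same URL with /raw appended. A small sketch of that mapping is below; the `QUESTIONS` dictionary and `endpoint_pair` helper are illustrative names, and the slugs are taken from the URLs listed above.

```python
BASE = "https://win.sh/router"

# Slugs copied from the instant-answer URLs above.
QUESTIONS = {
    "best-value": "Which model gives the best value?",
    "cheapest": "What is the cheapest useful model?",
    "fastest": "Which model answers fastest?",
    "smartest": "Which model scores highest?",
}


def endpoint_pair(slug: str) -> tuple[str, str]:
    """Return (JSON URL, raw URL) for one of the known instant-answer slugs."""
    if slug not in QUESTIONS:
        raise KeyError(f"unknown instant answer: {slug}")
    return f"{BASE}/{slug}", f"{BASE}/{slug}/raw"


for slug, question in QUESTIONS.items():
    json_url, raw_url = endpoint_pair(slug)
    print(f"{question}\n  JSON: {json_url}\n  RAW:  {raw_url}")
```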

Compact leaderboard

Models ranked for routing, not bragging rights.

Model	Provider	Best for	Intel	Code	Reason	$/1M	Speed	Value
DeepSeek: DeepSeek V4 Flash	DeepSeek	Fast, Cheap	72	73	71	$0.117	210/s	1675.2
DeepSeek: DeepSeek V4 Pro	DeepSeek	Reason, Value	88	89	88	$0.566	78/s	1590.1
Google: Gemini 3.1 Flash Lite	Google	Fast, Cheap	76	72	75	$0.625	168/s	518.4
Z.ai: GLM 5.2	Z.ai	Open, Value	86	84	86	$1.56	92/s	503.2
OpenAI: GPT-5.4 Nano	OpenAI	Fast, Cheap	74	70	73	$0.515	154/s	497.1
Qwen: Qwen3.6 Flash	Qwen	Fast, Cheap	73	70	72	$0.469	190/s	479.7
MoonshotAI: Kimi K2.7 Code	MoonshotAI	Code, Agent	85	92	84	$1.57	74/s	464.9
Qwen: Qwen3.7 Max	Qwen	Value, General	87	86	87	$2.00	86/s	420.5
Mistral Medium 3.5	Mistral	Fast, General	82	80	82	$1.90	104/s	303.2
Google: Gemini 3.1 Pro Preview	Google	Long ctx, Reason	92	90	92	$5.00	64/s	231.2
Anthropic: Claude Sonnet 4.6	Anthropic	Code, Agent	94	95	93	$6.60	52/s	196.4
Anthropic: Claude Opus 4.8	Anthropic	Frontier, Reason	99	96	99	$11.00	28/s	152.8
OpenAI: GPT-5.5	OpenAI	Frontier, General	96	94	96	$12.50	44/s	115.5
OpenAI: GPT-5.5 Pro	OpenAI	Frontier, Reason	100	97	100	$75.00	24/s	23.5
Ministral 3 3B	Mistral	Fast, Cheapest	58	52	56	$0.100	320/s	10

Method

Benchmarks set the floor. The router makes the tradeoff.

The index starts from benchmark-style scores for intelligence, coding, reasoning, latency, speed, and context size. It then applies a task policy so each request gets the cheapest model that still clears the quality bar.

Benchmark basis

Each verified model has an intelligence score plus separate coding and reasoning scores. Configured benchmark feeds can extend the seed table.

Value score

Prices are normalized into a blended dollars per million tokens number, then compared against the useful quality floor.

Task policy

Code favors coding score. Planning and hard analysis favor reasoning. Summaries and extraction favor cheap reliable execution.
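One way to picture a task policy is a set of per-task weights over the three scores, followed by a "cheapest model above the floor" pick. The weights below are invented for illustration and are not win.sh's actual policy; the model rows are copied from the leaderboard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Model:
    name: str
    intel: int
    code: int
    reason: int
    price: float  # blended dollars per 1M tokens


# Hypothetical weights: each task leans on the score the policy says it favors.
TASK_WEIGHTS = {
    "code": {"intel": 0.2, "code": 0.7, "reason": 0.1},
    "plan": {"intel": 0.2, "code": 0.1, "reason": 0.7},
    "summarize": {"intel": 1.0, "code": 0.0, "reason": 0.0},
}

MODELS = [
    Model("DeepSeek: DeepSeek V4 Flash", 72, 73, 71, 0.117),
    Model("DeepSeek: DeepSeek V4 Pro", 88, 89, 88, 0.566),
    Model("Z.ai: GLM 5.2", 86, 84, 86, 1.56),
    Model("MoonshotAI: Kimi K2.7 Code", 85, 92, 84, 1.57),
]


def task_score(model: Model, task: str) -> float:
    """Weighted blend of the three scores for the given task."""
    w = TASK_WEIGHTS[task]
    return w["intel"] * model.intel + w["code"] * model.code + w["reason"] * model.reason


def cheapest_above(models: list[Model], task: str, floor: float) -> Optional[Model]:
    """Cheapest model whose task score clears the floor, or None if none do."""
    eligible = [m for m in models if task_score(m, task) >= floor]
    return min(eligible, key=lambda m: m.price) if eligible else None


pick = cheapest_above(MODELS, "code", floor=85)
print(pick.name if pick else "no model clears the floor")
```

Raising the floor shifts the pick toward stronger and pricier models, which is the tradeoff the policy describes.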

Fallback built in

Every route returns a backup model so applications can retry without sending the same task back through guesswork.
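A retry built on that backup might look like the sketch below. It assumes the application has already read a primary and a fallback model id from the route response (the response field names are not shown on this page), and `run` stands in for whatever function the application uses to call a model. The ids `model-a` and `model-b` are placeholders.

```python
from typing import Callable


def call_with_fallback(primary: str, fallback: str, run: Callable[[str], str]) -> tuple[str, str]:
    """Try the primary model once, then the fallback. Returns (model used, output)."""
    try:
        return primary, run(primary)
    except Exception as err:  # broad on purpose: any failure triggers the backup
        print(f"{primary} failed ({err}); retrying on {fallback}")
        return fallback, run(fallback)


def fake_run(model_id: str) -> str:
    """Stand-in for a real model call; the primary always times out here."""
    if model_id == "model-a":
        raise TimeoutError("simulated timeout")
    return f"ok from {model_id}"


used, output = call_with_fallback("model-a", "model-b", fake_run)
print(used, output)
```

If the fallback also fails, its exception propagates to the caller, so the application still sees the final error.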

How a recommendation is made

Index updated Jun 29, 2026.

1. Normalize model price, speed, context, and benchmark-style scores into one comparable table.
2. Calculate value as useful intelligence per blended dollar after filtering out weak models.
3. Apply the task, latency, budget, and quality settings from the request.
4. Return the top model, fallback model, policy, and plain-English reason.
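The steps above can be sketched as a small ranking function. This is a simplified illustration, not the site's scoring code: it uses a naive intelligence-divided-by-price ratio, which does not reproduce the Value column in the leaderboard, and the `min_intel` floor is a hypothetical parameter. The rows are copied from the leaderboard.

```python
from typing import Optional

# (model name, intelligence score, blended dollars per 1M tokens)
ROWS = [
    ("DeepSeek: DeepSeek V4 Flash", 72, 0.117),
    ("DeepSeek: DeepSeek V4 Pro", 88, 0.566),
    ("Qwen: Qwen3.6 Flash", 73, 0.469),
    ("OpenAI: GPT-5.5 Pro", 100, 75.00),
    ("Ministral 3 3B", 58, 0.100),
]


def recommend(rows: list[tuple[str, int, float]], min_intel: int) -> Optional[dict]:
    """Filter out weak models, rank by intel per dollar, return top and fallback."""
    useful = [r for r in rows if r[1] >= min_intel]
    ranked = sorted(useful, key=lambda r: r[1] / r[2], reverse=True)
    if not ranked:
        return None
    top = ranked[0]
    fallback = ranked[1][0] if len(ranked) > 1 else None
    return {
        "model": top[0],
        "fallback": fallback,
        "reason": f"{top[0]} has the highest intelligence per blended dollar "
                  f"among models with intel >= {min_intel}.",
    }


result = recommend(ROWS, min_intel=70)
print(result["model"], "| fallback:", result["fallback"])
```

Filtering before ranking matters: without the floor, a very cheap but weak model could win on ratio alone.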

FAQ

AI model router FAQ

What is an AI model router?

An AI model router chooses the best model for a task by weighing benchmark quality, coding or reasoning strength, speed, context, and token price.

Is the win.sh AI model router free?

Yes. The public GET endpoints for route recommendations, raw model ids, category winners, OpenAPI, llms.txt, and the model index are free to read.

How should an AI agent use this router?

Use /llms.txt for discovery, /openapi.json for the contract, /router/models for the full table, and /router/raw when the agent only needs a model id.

Can I limit routing to specific providers or regions?

Yes. Add providers=anthropic,openai or regions=us,eu,china to /router or /router/raw. The router only chooses from matching models and returns 400 if none match.
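A sketch of building a filtered request and treating the 400 response as "no matching model". It assumes /router/raw accepts the same `task` parameter as /router; the helper names are illustrative.

```python
import urllib.error
import urllib.request
from typing import Optional, Sequence
from urllib.parse import urlencode


def build_filtered_url(task: str, providers: Sequence[str] = (),
                       regions: Sequence[str] = (), raw: bool = False) -> str:
    """Build a /router or /router/raw URL limited to given providers and regions."""
    base = "https://win.sh/router/raw" if raw else "https://win.sh/router"
    params = {"task": task}
    if providers:
        params["providers"] = ",".join(providers)
    if regions:
        params["regions"] = ",".join(regions)
    # safe="," keeps the comma-separated lists readable instead of %2C.
    return f"{base}?{urlencode(params, safe=',')}"


def fetch_or_none(url: str, timeout: float = 10.0) -> Optional[str]:
    """Return the response body, or None when the router answers 400 (no match)."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.read().decode("utf-8")
    except urllib.error.HTTPError as err:
        if err.code == 400:
            return None
        raise


url = build_filtered_url("code", providers=["anthropic", "openai"], regions=["us", "eu"])
print(url)
```

Returning `None` on 400 lets the caller relax the filter and retry, while other HTTP errors still surface as exceptions.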

How are models scored?

The index keeps separate intelligence, coding, reasoning, speed, latency, context, and blended price signals. Task routing changes the weights before returning a model and fallback.

When should I use raw endpoints?

Use raw endpoints in scripts, CI jobs, or agents that want a plain text model id without parsing JSON.
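In a script, a raw endpoint can be read as a single string. The sketch below assumes the body is one model id, possibly followed by a newline; the sample body reuses the model id from the JSON example on this page. The `get` parameter is injectable so the logic can be exercised without network access.

```python
import urllib.request
from typing import Callable


def http_get(url: str) -> str:
    """Fetch a URL and return the body as text (needs network access)."""
    with urllib.request.urlopen(url, timeout=10.0) as resp:
        return resp.read().decode("utf-8")


def pick_model_id(url: str, get: Callable[[str], str] = http_get) -> str:
    """Read a raw endpoint and return the model id, rejecting unexpected bodies."""
    model_id = get(url).strip()
    if not model_id or any(ch.isspace() for ch in model_id):
        raise ValueError(f"unexpected raw response from {url!r}")
    return model_id


# Offline demo with a stubbed response body.
model_id = pick_model_id(
    "https://win.sh/router/cheapest/raw",
    get=lambda url: "ministral-3b-2512\n",
)
print(model_id)
```

The whitespace check is a cheap guard against an HTML error page or multi-line body being passed along as a model id.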

Can I install this as an agent skill?

Yes. The public SKILL.md explains when to call the router, which endpoint to use, and how to validate the selected model.