Free AI model router

Pick the right model for each task. win.sh scores AI models by benchmark quality, coding strength, reasoning strength, speed, and price so your app stops overpaying for easy work.


Are you an AI agent?

Start here: discovery, specs, model data, raw routes, and the skill file.

Route by task

Pick the best model before you spend tokens.

Choose the task, latency target, budget, and quality floor. The AI model router sends each request to the cheapest model that can do the job, returns a fallback, and explains the choice so your app stops burning premium tokens on simple work.
Task: Write code
Model region: Any region
Latency: As fast as possible
Budget: Spend as little as possible
Quality floor: High quality only

Best model for writing code?

Recommendation
DeepSeek: DeepSeek V4 Flash

DeepSeek: DeepSeek V4 Flash has the strongest useful intelligence per blended dollar today.

Price: $0.117
Intel: 72
Speed: 210/s

Coming soon

Intent Router

Soon you will send one plain task and skip the knobs. win.sh will classify the job, infer quality, latency, risk, and budget pressure, then route to the cheapest model that should succeed.

Instant answers

The common questions already have GET endpoints.

Compact leaderboard

Models ranked for routing, not bragging rights.

| Model | Best for | Intel | Code | Reason | $/1M | Speed | Value |
| --- | --- | --- | --- | --- | --- | --- | --- |
| DeepSeek: DeepSeek V4 Flash | Fast, Cheap | 72 | 73 | 71 | $0.117 | 210/s | 1675.2 |
| DeepSeek: DeepSeek V4 Pro | Reason, Value | 88 | 89 | 88 | $0.566 | 78/s | 1590.1 |
| Google: Gemini 3.1 Flash Lite | Fast, Cheap | 76 | 72 | 75 | $0.625 | 168/s | 518.4 |
| Z.ai: GLM 5.2 | Open, Value | 86 | 84 | 86 | $1.56 | 92/s | 503.2 |
| OpenAI: GPT-5.4 Nano | Fast, Cheap | 74 | 70 | 73 | $0.515 | 154/s | 497.1 |
| Qwen: Qwen3.6 Flash | Fast, Cheap | 73 | 70 | 72 | $0.469 | 190/s | 479.7 |
| MoonshotAI: Kimi K2.7 Code | Code, Agent | 85 | 92 | 84 | $1.57 | 74/s | 464.9 |
| Qwen: Qwen3.7 Max | Value, General | 87 | 86 | 87 | $2.00 | 86/s | 420.5 |
| Mistral Medium 3.5 | Fast, General | 82 | 80 | 82 | $1.90 | 104/s | 303.2 |
| Google: Gemini 3.1 Pro Preview | Long ctx, Reason | 92 | 90 | 92 | $5.00 | 64/s | 231.2 |
| Anthropic: Claude Sonnet 4.6 | Code, Agent | 94 | 95 | 93 | $6.60 | 52/s | 196.4 |
| Anthropic: Claude Opus 4.8 | Frontier, Reason | 99 | 96 | 99 | $11.00 | 28/s | 152.8 |
| OpenAI: GPT-5.5 | Frontier, General | 96 | 94 | 96 | $12.50 | 44/s | 115.5 |
| OpenAI: GPT-5.5 Pro | Frontier, Reason | 100 | 97 | 100 | $75.00 | 24/s | 23.5 |
| Ministral 3 3B | Fast, Cheapest | 58 | 52 | 56 | $0.100 | 320/s | 10 |

Method

Benchmarks set the floor. The router makes the tradeoff.

The index starts from benchmark-style scores for intelligence, coding, reasoning, latency, speed, and context size. It then applies a task policy so each request gets the cheapest model that still clears the quality bar.

Benchmark basis

Each verified model has an intelligence score plus separate coding and reasoning scores. Configured benchmark feeds can extend the seed table.

Value score

Prices are normalized into a blended dollars per million tokens number, then compared against the useful quality floor.
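One way to picture the value score is the sketch below. The 3:1 input-to-output token mix, the floor of 60, and both function names are assumptions made for illustration. The page does not state the exact formula, so this sketch is not expected to reproduce the Value column in the leaderboard.

```python
def blended_price(input_per_m: float, output_per_m: float,
                  input_share: float = 0.75) -> float:
    """Blend input and output prices into one dollars-per-million-tokens figure.

    input_share is the assumed fraction of tokens that are input tokens.
    """
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_share * input_per_m + (1.0 - input_share) * output_per_m


def value_score(intel: float, price_per_m: float,
                quality_floor: float = 60.0) -> float:
    """Quality points above the floor per blended dollar.

    Models at or below the floor score 0, so a very cheap but weak model
    cannot win on price alone.
    """
    if price_per_m <= 0:
        raise ValueError("price_per_m must be positive")
    useful = intel - quality_floor
    if useful <= 0:
        return 0.0
    return useful / price_per_m
```

Subtracting the floor before dividing is one reading of "compared against the useful quality floor"; a hard filter followed by a plain ratio would be another.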

Task policy

Code favors coding score. Planning and hard analysis favor reasoning. Summaries and extraction favor cheap reliable execution.
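A task policy of this kind can be sketched as a weight table over the three quality scores. The weights, task names, and `task_quality` function are hypothetical; only the score triples come from the leaderboard rows for Kimi K2.7 Code and Qwen3.7 Max. Price is left out here because it is handled by the value score.

```python
# Hypothetical weights per task, ordered (intel, code, reason). Each row sums to 1.
TASK_WEIGHTS = {
    "code": (0.2, 0.6, 0.2),      # coding score dominates
    "planning": (0.2, 0.1, 0.7),  # reasoning score dominates
    "summary": (0.6, 0.2, 0.2),   # general quality; cheapness is applied elsewhere
}


def task_quality(task: str, intel: float, code: float, reason: float) -> float:
    """Weighted quality for one task. Raises KeyError for an unknown task."""
    w_intel, w_code, w_reason = TASK_WEIGHTS[task]
    return w_intel * intel + w_code * code + w_reason * reason


# Score triples (intel, code, reason) taken from the leaderboard.
KIMI_K27_CODE = (85, 92, 84)
QWEN37_MAX = (87, 86, 87)
```

With these weights the coding task ranks Kimi K2.7 Code above Qwen3.7 Max, while the planning task reverses the order, which is the behavior the policy describes.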

Fallback built in

Every route returns a backup model so applications can retry without sending the same task back through guesswork.
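An application might use the backup model roughly as follows. The `model` and `fallback` keys and the `call_model` callback are assumptions for this sketch; check /openapi.json for the real response fields.

```python
from typing import Callable


def run_with_fallback(route: dict,
                      call_model: Callable[[str, str], str],
                      prompt: str) -> tuple[str, str]:
    """Try the recommended model, then the fallback.

    Returns (model_id, output) from the first model that succeeds.
    `route` is assumed to look like {"model": "...", "fallback": "..."}.
    """
    last_error = None
    for model_id in (route["model"], route["fallback"]):
        try:
            return model_id, call_model(model_id, prompt)
        except Exception as exc:  # broad on purpose: any provider error triggers the retry
            last_error = exc
    raise RuntimeError("primary and fallback models both failed") from last_error
```

The retry never goes back to the router, so a failed request costs at most one extra model call.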

How a recommendation is made

Index updated Jun 29, 2026.

  1. Normalize model price, speed, context, and benchmark-style scores into one comparable table.

  2. Calculate value as useful intelligence per blended dollar after filtering out weak models.

  3. Apply the task, latency, budget, and quality settings from the request.

  4. Return the top model, fallback model, policy, and plain-English reason.
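The steps above can be sketched end to end. This is a simplified illustration, not the router's implementation: value is taken as plain intel per dollar, the model ids are made up, and the result keys are assumed. Only the intel, price, and speed figures come from the leaderboard.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    id: str        # hypothetical id, for illustration only
    intel: int     # intelligence score
    price: float   # blended dollars per 1M tokens
    speed: int     # tokens per second


# Step 1: one comparable table (figures from the leaderboard).
LEADERBOARD_SAMPLE = [
    Model("deepseek-v4-flash", 72, 0.117, 210),
    Model("deepseek-v4-pro", 88, 0.566, 78),
    Model("claude-sonnet-4.6", 94, 6.60, 52),
    Model("gpt-5.5-pro", 100, 75.00, 24),
]


def recommend(models: list, quality_floor: int, min_speed: int = 0) -> dict:
    """Filter by the request settings, rank by value, return top and fallback."""
    # Step 3: apply the request's quality floor and latency target.
    eligible = [m for m in models
                if m.intel >= quality_floor and m.speed >= min_speed]
    if not eligible:
        raise ValueError("no model clears the requested settings")
    # Step 2: value as intel per blended dollar (assumed simplification).
    ranked = sorted(eligible, key=lambda m: m.intel / m.price, reverse=True)
    top = ranked[0]
    fallback = ranked[1] if len(ranked) > 1 else None
    # Step 4: top model, fallback, and a plain-English reason.
    return {
        "model": top.id,
        "fallback": fallback.id if fallback else None,
        "reason": (f"{top.id} has the highest intel per blended dollar among "
                   f"{len(eligible)} models with intel >= {quality_floor}"),
    }
```

Raising the floor from 70 to 85 moves the pick from the cheapest model to the cheapest one that still clears the bar, which is the tradeoff the method describes.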

FAQ

AI model router FAQ

What is an AI model router?

An AI model router chooses the best model for a task by weighing benchmark quality, coding or reasoning strength, speed, context, and token price.

Is the win.sh AI model router free?

Yes. The public GET endpoints for route recommendations, raw model ids, category winners, OpenAPI, llms.txt, and the model index are free to read.

How should an AI agent use this router?

Use /llms.txt for discovery, /openapi.json for the contract, /router/models for the full table, and /router/raw when the agent only needs a model id.

Can I limit routing to specific providers or regions?

Yes. Add providers=anthropic,openai or regions=us,eu,china to /router or /router/raw. The router only chooses from matching models and returns 400 if none match.
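A small helper for building such a filtered URL might look like this. The `https://win.sh` base and the `router_url` name are assumptions; the `providers` and `regions` parameters and the two paths are the ones named above.

```python
from urllib.parse import urlencode

BASE_URL = "https://win.sh"  # assumed host and scheme


def router_url(path: str = "/router", providers=(), regions=()) -> str:
    """Build a /router or /router/raw URL with optional provider and region filters."""
    params = {}
    if providers:
        params["providers"] = ",".join(providers)
    if regions:
        params["regions"] = ",".join(regions)
    # safe="," keeps the comma-separated lists readable instead of %2C.
    query = urlencode(params, safe=",")
    return f"{BASE_URL}{path}" + (f"?{query}" if query else "")
```

For example, `router_url(providers=["anthropic", "openai"], regions=["us", "eu"])` yields a URL ending in `/router?providers=anthropic,openai&regions=us,eu`. A caller should treat a 400 response as "no model matches these filters" and widen them rather than retry unchanged.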

How are models scored?

The index keeps separate intelligence, coding, reasoning, speed, latency, context, and blended price signals. Task routing changes the weights before returning a model and fallback.

When should I use raw endpoints?

Use raw endpoints in scripts, CI jobs, or agents that want a plain text model id without parsing JSON.
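In a script, reading a raw endpoint could be as small as the sketch below. The URL is an assumed form of /router/raw on win.sh, the function names are hypothetical, and the check that the body is a single whitespace-free token is a defensive guess about the plain-text format.

```python
import urllib.request

RAW_URL = "https://win.sh/router/raw"  # assumed full URL for the raw route


def parse_model_id(body: bytes) -> str:
    """Turn a plain-text response body into a model id, rejecting anything else."""
    model_id = body.decode("utf-8").strip()
    if not model_id or any(ch.isspace() for ch in model_id):
        raise ValueError(f"unexpected raw response: {model_id!r}")
    return model_id


def fetch_model_id(url: str = RAW_URL, timeout: float = 10.0) -> str:
    """GET the raw endpoint and return the model id. Performs a network call."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return parse_model_id(resp.read())
```

Keeping the parsing separate from the network call lets a CI job validate the id before passing it to a provider SDK.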

Can I install this as an agent skill?

Yes. The public SKILL.md explains when to call the router, which endpoint to use, and how to validate the selected model.