How is the monthly cost calculated for each model?

The formula is: monthly = tasks × (input ÷ 1,000,000 × input price + output ÷ 1,000,000 × output price). You set the number of tasks per month, the input and output tokens per task, and the price per million tokens for each tier. All arithmetic runs in your browser on the numbers you enter.

Are the default model prices accurate?

No — the defaults are generic illustrative tiers (Frontier, Balanced, Economy) and are not tied to any specific provider's pricing. Model prices change frequently. Always enter the current price per million tokens from your provider's official pricing page before using the result to make a decision.

Which LLM is cheapest for coding?

It depends on your workload and the current prices of the models you are considering. Lower-priced models can be significantly cheaper for high-volume, repetitive tasks, but may require more turns to complete complex reasoning steps — which adds tokens and erases the saving. Enter your real token counts and current prices into the comparator to find out for your specific case.

Does merido automatically route to the cheapest model?

merido routes across the providers and API keys you own, by cost and latency, with automatic failover. It uses your own keys (BYOK), is self-hosted, and never pools or shares credentials. It does not promise a specific savings percentage — the right model for each request depends on your routing policy and task complexity.

LLM cost comparator — compare AI model pricing for your coding workload

Your coding workload

Input tokens per task: 800k tokens

Output tokens per task: 40k tokens

Tasks per month: 60

Model tiers (Frontier, Balanced, Economy): prices editable, with one input price and one output price per tier, in $ per 1M tokens

Generic tiers — model prices change often; enter your model's current price per million tokens from the provider's official pricing page.

Monthly cost comparison

Frontier, Balanced, Economy: one monthly cost per tier, plus the monthly difference between cheapest and most expensive.

Formula: monthly = tasks × (input ÷ 1M × input price + output ÷ 1M × output price). All arithmetic is done in your browser on the numbers you enter. The cheapest tier is highlighted in teal.
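As a minimal sketch, the formula above can be reproduced in a few lines of Python. The workload figures are the ones shown in the form (800k input tokens, 40k output tokens, 60 tasks per month). The tier prices are made-up placeholders, not any provider's pricing, and the names `monthly_cost` and `TIERS` are illustrative only.

```python
def monthly_cost(tasks, input_tokens, output_tokens, input_price, output_price):
    """Monthly cost in dollars; prices are dollars per 1M tokens."""
    per_task = (input_tokens / 1_000_000 * input_price
                + output_tokens / 1_000_000 * output_price)
    return tasks * per_task

# Workload shown in the form above.
TASKS, INPUT_TOKENS, OUTPUT_TOKENS = 60, 800_000, 40_000

# Placeholder prices (input $, output $ per 1M tokens). NOT real provider prices.
TIERS = {
    "Frontier": (10.00, 30.00),
    "Balanced": (2.00, 8.00),
    "Economy": (0.50, 2.00),
}

costs = {
    name: monthly_cost(TASKS, INPUT_TOKENS, OUTPUT_TOKENS, p_in, p_out)
    for name, (p_in, p_out) in TIERS.items()
}
cheapest = min(costs, key=costs.get)
# Monthly difference between the most expensive and the cheapest tier.
spread = max(costs.values()) - min(costs.values())

for name, cost in costs.items():
    print(f"{name}: ${cost:,.2f}")
print(f"cheapest: {cheapest}, difference: ${spread:,.2f}")
```

With these placeholder prices the three tiers come to about $552.00, $115.20, and $28.80 per month. Swap in current prices from your provider's pricing page before drawing any conclusion.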

How model price interacts with the cumulative-input tax

Switching from a frontier model to a cheaper one is an obvious lever, but it only captures part of the story. Agentic coding tools like Claude Code, Cursor, and Cline talk to stateless model APIs, so every turn re-sends the entire conversation so far. That means a 60-turn session doesn't cost 60× a single turn; its input cost follows the sum of a growing series, which rises roughly with the square of the number of turns. This is what we call the cumulative-input tax.

A cheaper model lowers the multiplier, but a longer session raises the base aggressively. Read more about this mechanism in why AI coding bills explode. The comparator above works with per-task token counts, so if you enter counts that already reflect a long session, the price difference between tiers will look large — because it is.
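The cumulative-input tax described above can be illustrated with a small calculation. This sketch assumes, purely for illustration, that each turn adds a fixed 2,000 tokens to the conversation and that the full history is re-sent as input on every turn. It ignores output tokens, and the function name is hypothetical.

```python
def cumulative_input_tokens(turns, tokens_added_per_turn, base_context=0):
    """Total input tokens sent when every turn re-sends the full history."""
    total = 0
    context = base_context
    for _ in range(turns):
        context += tokens_added_per_turn  # the history grows each turn
        total += context                  # the whole history is sent again
    return total

TURNS = 60
ADDED = 2_000  # hypothetical tokens added to the history per turn

naive = TURNS * ADDED                       # "60x a single turn"
actual = cumulative_input_tokens(TURNS, ADDED)

print(f"naive estimate: {naive:,} input tokens")
print(f"with re-sends:  {actual:,} input tokens")
print(f"ratio:          {actual / naive:.1f}x")
```

Under these assumptions the total is ADDED × TURNS × (TURNS + 1) ÷ 2, or 3,660,000 input tokens against a naive estimate of 120,000. Real sessions grow unevenly, so treat the ratio as an order-of-magnitude guide.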

The right model depends on the task

A cheaper, smaller model can handle high-volume, repetitive coding steps (linting, simple refactors, test generation from templates) at dramatically lower cost. But the same model may require several extra turns to reason through a complex architectural decision — turns that each re-send the full context and erode the saving.

Practical rule of thumb

Use cheaper tiers for well-scoped, self-contained tasks where the output is easy to validate. Reserve frontier models for tasks that need strong reasoning in one pass, where extra turns are more expensive than the price difference between tiers.
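To make the rule of thumb concrete, the sketch below estimates how many turns a cheaper tier can take before it costs more than a frontier model that finishes in a few turns. Every number here is an assumption chosen for illustration (starting context, tokens added per turn, output per turn, and both sets of prices), and `session_cost` is a hypothetical helper.

```python
def session_cost(turns, context, added_per_turn, output_per_turn,
                 input_price, output_price):
    """Dollar cost of a session where each turn re-sends the growing context."""
    input_tokens = sum(context + i * added_per_turn for i in range(turns))
    output_tokens = turns * output_per_turn
    return (input_tokens * input_price + output_tokens * output_price) / 1_000_000

# Hypothetical task shape.
CONTEXT, ADDED, OUTPUT = 50_000, 3_000, 1_000

# Placeholder prices per 1M tokens. NOT real provider prices.
frontier = session_cost(3, CONTEXT, ADDED, OUTPUT, 10.0, 30.0)

# Smallest number of turns at which the cheaper tier costs more than frontier.
turns = 1
while session_cost(turns, CONTEXT, ADDED, OUTPUT, 2.0, 8.0) <= frontier:
    turns += 1

print(f"frontier, 3 turns: ${frontier:.2f}")
print(f"cheaper tier costs more from {turns} turns onward")
```

With these placeholder values the cheaper tier stays ahead until it needs 12 turns to do what the frontier model did in 3. The break-even point moves with your own prices and context sizes, which is why entering real numbers matters.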

See also the AI coding cost calculator for a session-level breakdown that shows how the cumulative-input tax compounds across turns at any price point.

How merido helps without hand-picking per request

Manually choosing a model for each task doesn't scale — and the comparator above shows why the tradeoff isn't always obvious. merido is an open-source, self-hosted AI gateway that sits between your coding CLI and the providers you already have accounts with:

Routes by cost and latency across the API keys and providers you own, automatically.
Fails over gracefully — if one provider is slow or rate-limited, the next account or provider picks up the request.
Compresses bulky tool output losslessly, reducing the re-send tax before it hits the wire.
BYOK, self-hosted — merido uses your own keys and never pools or resells them.
No promised savings percentage — it routes on real signals and reports what it actually observed.
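For readers who want a feel for what routing by cost and latency with failover means, here is a generic sketch. It is not merido's implementation, and its actual routing policy is not described on this page. The `Route` fields, the ranking rule, and the provider names are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Route:
    name: str                  # hypothetical provider/account label
    cost_per_1m_input: float   # placeholder price, dollars per 1M input tokens
    latency_ms: float          # recently observed latency (assumed to be tracked)

def rank_routes(routes, max_latency_ms):
    """Routes within the latency budget first, cheapest first within each group."""
    return sorted(
        routes,
        key=lambda r: (r.latency_ms > max_latency_ms, r.cost_per_1m_input, r.latency_ms),
    )

def send_with_failover(routes, max_latency_ms, send):
    """Try routes in ranked order and move on when `send` raises."""
    errors = []
    for route in rank_routes(routes, max_latency_ms):
        try:
            return route.name, send(route)
        except Exception as exc:  # broad on purpose in this sketch (timeouts, rate limits)
            errors.append((route.name, repr(exc)))
    raise RuntimeError(f"all routes failed: {errors}")

routes = [
    Route("provider-a", cost_per_1m_input=0.50, latency_ms=900),
    Route("provider-b", cost_per_1m_input=2.00, latency_ms=400),
    Route("provider-c", cost_per_1m_input=0.40, latency_ms=5000),
]

def fake_send(route):
    """Stand-in for a real API call; provider-a is simulated as rate-limited."""
    if route.name == "provider-a":
        raise TimeoutError("simulated rate limit")
    return f"handled by {route.name}"

print(send_with_failover(routes, max_latency_ms=2000, send=fake_send))
```

In this toy run the cheapest in-budget route fails, so the request falls through to the next one. A production gateway would also need per-request quality constraints and health tracking, which this sketch leaves out.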

Honest framing

merido does not tell you it will cut your bill by a headline percentage. The comparator above shows the arithmetic; whether the saving is real for you depends on which providers you have, your task mix, and whether quality is acceptable at the lower price point. merido measures what actually happened on your traffic and shows it — no fabricated baselines.

Route your real traffic, measure what moves

Open source, one self-hosted binary, on your own keys. Point your CLI at merido and let it route and compress automatically.


Related guides

AI coding cost calculator — session-level breakdown with the cumulative-input tax visualized.
Why AI coding bills explode — the mechanics behind the re-send tax.
How to reduce Claude Code costs — 8 tactics that attack the tax at its source.
Open-source AI gateway alternatives — self-hosted options for routing and cost control.

Frequently asked questions

How is the monthly cost calculated?

monthly = tasks × (input ÷ 1,000,000 × input price + output ÷ 1,000,000 × output price). You supply tasks per month, input and output tokens per task, and the price per million tokens for each tier. All computation happens in your browser.

Are the default prices real?

No — the defaults are generic illustrative tiers (Frontier, Balanced, Economy) with no specific vendor attached. Model prices change frequently. Enter the current per-million price from your provider's official pricing page for an accurate comparison.

Which model is cheapest for coding?

It depends on your workload and current prices. Cheaper models can save a lot on repetitive, well-scoped tasks, but may need extra turns on hard reasoning — which adds tokens and can erase the saving. Enter your real numbers to find out.

Does merido guarantee savings?

No. merido routes on real signals and reports what it actually observed. The comparator only shows the arithmetic for the prices you enter; the actual saving depends on your task mix, provider availability, and acceptable quality at each tier.