<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"><channel><title>merido blog</title><description>Guides on controlling AI coding costs — the cumulative-input tax, tool-output compression, prompt caching, routing, and self-hosting your LLM gateway.</description><link>https://merido.dev</link><item><title>Claude Code vs Cursor: how their pricing really works (and how to cut both)</title><link>https://merido.dev/blog/claude-code-vs-cursor</link><guid isPermaLink="true">https://merido.dev/blog/claude-code-vs-cursor</guid><description>Claude Code vs Cursor — a fair comparison of interaction model, billing mechanism, and context behavior. The real cost driver is the same for both: agentic loops resend cumulative input every turn. Here&apos;s what actually differs, and how to cut the bill whichever you pick.</description><pubDate>Sat, 20 Jun 2026 00:00:00 GMT</pubDate></item><item><title>How to reduce Claude Code costs: 8 ways that actually work</title><link>https://merido.dev/blog/reduce-claude-code-cost</link><guid isPermaLink="true">https://merido.dev/blog/reduce-claude-code-cost</guid><description>Claude Code bills explode because every turn resends the whole conversation. Here are 8 concrete ways to cut the cost — context hygiene, model routing, prompt caching, tool-output compression — and how merido automates them.</description><pubDate>Sat, 20 Jun 2026 00:00:00 GMT</pubDate></item><item><title>How to reduce Cursor AI costs (without slowing down)</title><link>https://merido.dev/blog/reduce-cursor-cost</link><guid isPermaLink="true">https://merido.dev/blog/reduce-cursor-cost</guid><description>Hitting Cursor usage limits or a surprise bill? Cursor&apos;s cost is driven by cumulative input tokens — every turn resends the whole conversation. 
Here are concrete ways to cut it: context hygiene, model routing, prompt caching, tool-output compression — and how merido (BYOK, self-hosted) automates them.</description><pubDate>Sat, 20 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Self-hosted LLM gateway: own your keys, your data, your routing</title><link>https://merido.dev/blog/self-hosted-llm-gateway</link><guid isPermaLink="true">https://merido.dev/blog/self-hosted-llm-gateway</guid><description>A self-hosted, open-source LLM gateway in Rust. Keep your API keys and data on your own infrastructure, route across 40+ providers with failover, compress tool output, and cap AI spend — BYOK, OpenAI-compatible, single static binary.</description><pubDate>Sat, 20 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Prompt caching economics for coding agents: when it actually saves money</title><link>https://merido.dev/blog/prompt-caching-economics</link><guid isPermaLink="true">https://merido.dev/blog/prompt-caching-economics</guid><description>Prompt caching can cut input costs significantly — but only under specific conditions. Here is how cache economics work, when caching backfires, and how to know whether it is actually helping you.</description><pubDate>Sat, 20 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Open-source AI gateway alternatives in 2026: LiteLLM, Portkey, Helicone, and merido</title><link>https://merido.dev/blog/open-source-ai-gateway-alternatives</link><guid isPermaLink="true">https://merido.dev/blog/open-source-ai-gateway-alternatives</guid><description>A fair comparison of the open-source and self-hostable AI gateway landscape in 2026 — LiteLLM, Portkey, Helicone, and merido — across BYOK, self-hosting, cost optimization depth, observability, license, and governance.
Find the right fit for your stack.</description><pubDate>Sat, 20 Jun 2026 00:00:00 GMT</pubDate></item><item><title>Why your AI coding bill explodes: the cumulative-input tax explained</title><link>https://merido.dev/blog/why-ai-coding-bills-explode</link><guid isPermaLink="true">https://merido.dev/blog/why-ai-coding-bills-explode</guid><description>Agentic coding costs grow super-linearly — not because models get pricier, but because every turn resends the whole conversation. Here is the mechanism, the math, and the levers that actually fix it.</description><pubDate>Sat, 20 Jun 2026 00:00:00 GMT</pubDate></item><item><title>How to reduce Cline costs (without slowing your agent down)</title><link>https://merido.dev/blog/reduce-cline-cost</link><guid isPermaLink="true">https://merido.dev/blog/reduce-cline-cost</guid><description>Cline&apos;s API bill climbs because every agent turn resends the entire conversation — and Cline makes a lot of tool calls. Here are concrete ways to cut it: context hygiene, scoped reads, cheaper models for simple steps, prompt caching, terminal-output compression, budget caps, and routing across the keys you already own.</description><pubDate>Sat, 20 Jun 2026 00:00:00 GMT</pubDate></item><item><title>How to monitor and control AI coding costs: a FinOps playbook for engineering teams</title><link>https://merido.dev/blog/monitor-ai-coding-costs</link><guid isPermaLink="true">https://merido.dev/blog/monitor-ai-coding-costs</guid><description>A practical FinOps playbook for engineering leads whose AI coding bill is growing and unpredictable. 
Covers visibility, caps, measured savings, and chargeback — with honest advice on what you can and cannot prove.</description><pubDate>Sat, 20 Jun 2026 00:00:00 GMT</pubDate></item><item><title>GitHub Copilot agent costs: how to understand and control them</title><link>https://merido.dev/blog/reduce-copilot-cost</link><guid isPermaLink="true">https://merido.dev/blog/reduce-copilot-cost</guid><description>Copilot&apos;s agent mode runs the same agentic loop every other coding assistant does — cumulative context resent each turn, premium models billing per request. Here is where the money goes, what levers you actually have inside Copilot, and where an open gateway helps for the tools you can control.</description><pubDate>Sat, 20 Jun 2026 00:00:00 GMT</pubDate></item><item><title>How to reduce Aider costs (without losing repo context)</title><link>https://merido.dev/blog/reduce-aider-cost</link><guid isPermaLink="true">https://merido.dev/blog/reduce-aider-cost</guid><description>Aider&apos;s API bill climbs because it resends cumulative input every turn, and its repo map plus whole-file edits make those turns heavy. Here are concrete ways to cut it: keep your file set tight, start fresh per task, match the model to the job, use prompt caching strategically, compress command output, cap a budget, route across the provider keys you own, and measure before trusting a number.</description><pubDate>Sat, 20 Jun 2026 00:00:00 GMT</pubDate></item></channel></rss>