Usage & the Advisor

merido records every request and estimates its cost, then turns that history into specific savings recommendations through the Token-Optimization Advisor, its headline differentiator.

Usage & cost tracking

Each request records its token counts (prompt, completion, cache-read) and an estimated cost, tagged by provider, model, and source. You can view this data in three places:

CLI: merido gain reports requests, total tokens, and estimated cost.
API: GET /api/usage returns a summary. GET /api/reports returns invoice-ready showback/chargeback rollups grouped by cost_center, api_key (shown as the key's name), user (the key owner), source, model, or day, as JSON, CSV, or Parquet.
Dashboard: live request feed and usage charts.
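
The per-request record described above lends itself to a simple cost formula. The sketch below shows one plausible way to derive an estimated cost from the three token counts. The price table, the model name, and the field names are hypothetical and are not merido's actual pricing data or schema; it also assumes the three counts do not overlap.

```python
from dataclasses import dataclass

# Hypothetical prices in USD per million tokens (illustration only,
# not merido's real price table).
PRICES_PER_MTOK = {
    "example-model": {"prompt": 3.00, "completion": 15.00, "cache_read": 0.30},
}


@dataclass
class UsageRecord:
    # Field names are assumptions chosen to mirror the prose above.
    provider: str
    model: str
    source: str
    prompt_tokens: int
    completion_tokens: int
    cache_read_tokens: int


def estimate_cost(record: UsageRecord) -> float:
    """Estimated USD cost: each token class times its per-million price."""
    price = PRICES_PER_MTOK[record.model]
    total = (
        record.prompt_tokens * price["prompt"]
        + record.completion_tokens * price["completion"]
        + record.cache_read_tokens * price["cache_read"]
    )
    return total / 1_000_000


record = UsageRecord(
    provider="example-provider",
    model="example-model",
    source="cursor",
    prompt_tokens=1_000_000,
    completion_tokens=100_000,
    cache_read_tokens=2_000_000,
)
print(f"{estimate_cost(record):.2f}")
```

Pricing cache-read tokens separately matters because providers typically bill them at a fraction of the normal prompt rate, which is also why missed cache opportunities show up as savings.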

Chargeback works with zero client config. Every request is attributed to the gateway key it authenticated with and to that key's owner (user). Provision one key per project or person and the Reports page breaks spend down by either, with no headers required. To roll several keys into one chargeback bucket, assign them the same cost_center on the Gateway Keys page.
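
To make the attribution model concrete, here is a minimal sketch of how the same request records can be rolled up by key, by owner, or by cost center. The record layout, the sample values, and the "(unassigned)" bucket label are assumptions for illustration, not merido's report schema.

```python
from collections import defaultdict

# Hypothetical request records; field names and values are illustrative.
requests = [
    {"api_key": "proj-a", "user": "alice", "cost_center": "platform", "cost": 1.20},
    {"api_key": "proj-b", "user": "bob", "cost_center": "platform", "cost": 0.80},
    {"api_key": "proj-c", "user": "alice", "cost_center": None, "cost": 0.50},
]


def rollup(records, dimension):
    """Sum estimated cost per value of one dimension (api_key, user, cost_center)."""
    totals = defaultdict(float)
    for rec in records:
        # Keys without a value for this dimension land in a catch-all bucket.
        bucket = rec[dimension] or "(unassigned)"
        totals[bucket] += rec["cost"]
    return dict(totals)


by_cost_center = rollup(requests, "cost_center")
by_user = rollup(requests, "user")
print(by_cost_center)
print(by_user)
```

Here two keys share the "platform" cost center, so they collapse into one bucket, while the per-user view still separates alice from bob.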

Source (which tool) is detected automatically from the caller's User-Agent — Claude Code, Cursor, Codex, and common SDKs self-identify, so the "Spend by source" breakdown is populated out of the box. Send an explicit x-merido-source request header to override it with a project/tool label of your choosing.
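
The precedence described above (explicit header first, User-Agent second) can be sketched as follows. The substring table, the sample User-Agent values, and the "unknown" fallback are hypothetical; the gateway's real matching rules are not documented here.

```python
from typing import Mapping

# Hypothetical User-Agent substrings mapped to source labels (illustration only).
KNOWN_AGENTS = {
    "claude-code": "Claude Code",
    "cursor": "Cursor",
    "codex": "Codex",
}


def detect_source(headers: Mapping[str, str]) -> str:
    """Return the source label for a request, preferring the explicit override."""
    # HTTP header names are case-insensitive, so normalize them first.
    normalized = {name.lower(): value for name, value in headers.items()}

    override = normalized.get("x-merido-source", "").strip()
    if override:
        return override

    agent = normalized.get("user-agent", "").lower()
    for needle, label in KNOWN_AGENTS.items():
        if needle in agent:
            return label
    return "unknown"  # assumed fallback label


print(detect_source({"User-Agent": "cursor/1.0"}))
print(detect_source({"User-Agent": "cursor/1.0", "X-Merido-Source": "billing-bot"}))
```

The second call shows the override winning even though the User-Agent would otherwise match a known tool.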

The Token-Optimization Advisor

The advisor analyzes your recorded usage and recommends specific optimizations, each with an estimated monthly saving and a severity.

CLI: merido advise lists current recommendations; merido discover does a retrospective scan — total spend, spend anomalies (days that spiked 3×+ over their trailing baseline), and the aggregate monthly savings still on the table.
API: GET /api/advisor.
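
The spend-anomaly rule (a day that spikes 3× or more over its trailing baseline) can be illustrated with a short sketch. The 7-day window and the use of a simple mean as the baseline are assumptions; the source only states the 3× threshold.

```python
from statistics import mean


def spend_anomalies(daily_spend, window=7, factor=3.0):
    """Return (day_index, spend, baseline) for days at or above factor x baseline.

    The baseline is the mean of the previous `window` days (an assumed definition).
    """
    anomalies = []
    for i in range(window, len(daily_spend)):
        baseline = mean(daily_spend[i - window:i])
        # Skip days with no prior spend to avoid flagging every first request.
        if baseline > 0 and daily_spend[i] >= factor * baseline:
            anomalies.append((i, daily_spend[i], baseline))
    return anomalies


# Hypothetical daily spend in USD; day 7 is the spike.
spend = [10, 12, 9, 11, 10, 8, 10, 45, 11]
print(spend_anomalies(spend))
```

Note that a spike raises the baseline for the days that follow it, so a sustained increase is flagged once rather than every day.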

Typical things it detects: requests that could run on a cheaper capable model, missed prompt-cache opportunities, output verbosity that Caveman mode would cut, and repetitive prompts a cache would absorb.

Apply engine — with probation & auto-rollback

The advisor doesn't just advise; it can apply a recommendation for you, safely:

POST /api/advisor/apply applies one action behind a probation window.
A background guard watches the action's baseline metrics during probation. If applying the action makes them worse, it is rolled back automatically, so you don't get stuck with a bad change.
GET /api/advisor/applied lists applied actions; POST /api/advisor/confirm/{id} promotes one past probation; POST /api/advisor/rollback/{id} reverts it manually.

This "recommend → apply with rollback" loop is what makes the advisor safe to act on rather than just read.
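
The probation lifecycle can be pictured as a small state machine. The sketch below is a simplified model under stated assumptions, not merido's implementation: the watched metric (an error rate), the tolerance, and all class and function names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PROBATION = "probation"
    CONFIRMED = "confirmed"
    ROLLED_BACK = "rolled_back"


@dataclass
class AppliedAction:
    action_id: str
    baseline_error_rate: float  # assumed metric, captured before applying
    status: Status = Status.PROBATION


def guard_check(action, current_error_rate, tolerance=0.02):
    """Roll back an action on probation if the metric regressed past tolerance."""
    if action.status is not Status.PROBATION:
        return action.status  # confirmed or rolled-back actions are left alone
    if current_error_rate > action.baseline_error_rate + tolerance:
        action.status = Status.ROLLED_BACK
    return action.status


def confirm(action):
    """Promote an action past probation (the manual confirm step)."""
    if action.status is Status.PROBATION:
        action.status = Status.CONFIRMED
    return action.status


good = AppliedAction("act-1", baseline_error_rate=0.01)
bad = AppliedAction("act-2", baseline_error_rate=0.01)
print(guard_check(good, current_error_rate=0.012))  # stays on probation
print(guard_check(bad, current_error_rate=0.08))    # regressed, rolled back
print(confirm(good))
```

Once an action is confirmed, the guard no longer touches it, which mirrors the distinction between probation and promotion described above.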

Related

Token saving and Caching: the levers the advisor pulls.
Virtual models & fallback: where cheaper-target recommendations land.
