Usage & the Advisor
merido records every request and estimates its cost, then turns that history into concrete savings recommendations through the Token-Optimization Advisor, its headline differentiator.
Usage & cost tracking
Each request records its tokens (prompt, completion, cache-read) and an estimated cost, tagged by provider, model, and source. See it:
- CLI: merido gain shows requests, total tokens, and estimated cost.
- API: GET /api/usage (summary), GET /api/reports (invoice-ready showback/chargeback rollups by cost_center / api_key (shown as the key's name) / user (the key owner) / source / model / day, as JSON, CSV, or Parquet).
- Dashboard: live request feed and usage charts.
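To make the per-request cost estimate concrete, here is a minimal sketch of how three token counts could be turned into a dollar figure. The Price class, the estimate_cost function, and the prices themselves are hypothetical, not merido's actual pricing table or code. The sketch also assumes that cache-read tokens are counted separately from prompt tokens.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Price:
    """Hypothetical USD prices per one million tokens (not merido's real table)."""
    prompt: float
    completion: float
    cache_read: float


def estimate_cost(prompt_tokens: int, completion_tokens: int,
                  cache_read_tokens: int, price: Price) -> float:
    """Estimate one request's cost in USD from its three token counts."""
    tokens_per_unit = 1_000_000  # prices above are quoted per million tokens
    total = (prompt_tokens * price.prompt
             + completion_tokens * price.completion
             + cache_read_tokens * price.cache_read)
    return total / tokens_per_unit


# Illustrative prices only; cache reads are assumed to be much cheaper than prompt tokens.
example_price = Price(prompt=3.0, completion=15.0, cache_read=0.3)
cost = estimate_cost(1_000, 500, 4_000, example_price)
print(f"estimated cost: ${cost:.4f}")
```

Because the result is an estimate, treat it as a basis for showback and comparison rather than as a replacement for the provider's invoice.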
Chargeback works with zero client config. Every request is attributed to the gateway key it authenticated with and that key's owner (user) — provision one key per project/person and the Reports page breaks spend down by either, no headers required. Assign a cost_center to a key on the Gateway Keys page to roll several keys into one chargeback bucket.
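The attribution described above can be sketched as a small rollup over usage rows. Everything here is illustrative: the KEYS and USAGE structures, the rollup function, and the "(unassigned)" bucket for keys without a cost_center are assumptions, not merido's storage schema or report logic.

```python
from collections import defaultdict

# Hypothetical key metadata: key name -> (owner, cost_center or None).
KEYS = {
    "web-app": ("alice", "platform"),
    "batch-jobs": ("bob", "platform"),
    "notebook": ("carol", None),
}

# Hypothetical usage rows: (key name, estimated cost in USD).
USAGE = [
    ("web-app", 1.25),
    ("batch-jobs", 0.75),
    ("notebook", 0.5),
    ("web-app", 0.25),
]


def rollup(usage, keys, dimension):
    """Sum estimated cost by 'api_key', 'user', or 'cost_center'."""
    totals = defaultdict(float)
    for key_name, cost in usage:
        owner, cost_center = keys[key_name]
        if dimension == "api_key":
            bucket = key_name
        elif dimension == "user":
            bucket = owner
        elif dimension == "cost_center":
            # Assumption: keys with no cost center fall into one shared bucket.
            bucket = cost_center or "(unassigned)"
        else:
            raise ValueError(f"unknown dimension: {dimension}")
        totals[bucket] += cost
    return dict(totals)


by_key = rollup(USAGE, KEYS, "api_key")
by_center = rollup(USAGE, KEYS, "cost_center")
print(by_key)
print(by_center)
```

Note how the two "platform" keys merge into a single chargeback bucket while still appearing separately in the per-key view.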
Source (which tool) is detected automatically from the caller's User-Agent — Claude Code, Cursor, Codex, and common SDKs self-identify, so the "Spend by source" breakdown is populated out of the box. Send an explicit x-merido-source request header to override it with a project/tool label of your choosing.
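The precedence between the explicit header and the User-Agent can be sketched as follows. The KNOWN_AGENTS substrings, the resolve_source function, and the "unknown" fallback are assumptions for illustration; the gateway's real detection rules are not documented here.

```python
from typing import Mapping

# Hypothetical User-Agent substrings mapped to source labels.
KNOWN_AGENTS = (
    ("claude-code", "Claude Code"),
    ("cursor", "Cursor"),
    ("codex", "Codex"),
)


def resolve_source(headers: Mapping[str, str]) -> str:
    """Pick a source label: explicit x-merido-source wins, else sniff the User-Agent."""
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    explicit = lowered.get("x-merido-source", "").strip()
    if explicit:
        return explicit

    agent = lowered.get("user-agent", "").lower()
    for needle, label in KNOWN_AGENTS:
        if needle in agent:
            return label
    return "unknown"  # assumption: a fallback label for unrecognized callers


print(resolve_source({"User-Agent": "cursor/1.2"}))
print(resolve_source({"User-Agent": "cursor/1.2", "X-Merido-Source": "billing-bot"}))
```

A client that wants its own label only needs to add the x-merido-source header to its requests; no other configuration is implied by this sketch.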
The Token-Optimization Advisor
The advisor analyzes your recorded usage and recommends specific optimizations, each with an estimated monthly saving and a severity.
- CLI: merido advise lists current recommendations; merido discover does a retrospective scan: total spend, spend anomalies (days that spiked 3×+ over their trailing baseline), and the aggregate monthly savings still on the table.
- API: GET /api/advisor.
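The spend-anomaly rule mentioned above (a day that spikes 3× or more over its trailing baseline) can be sketched like this. The 7-day window, the use of a simple mean as the baseline, and the spend_anomalies function are assumptions; the source only states the 3× threshold.

```python
from statistics import mean


def spend_anomalies(daily_spend, window=7, factor=3.0):
    """Return (day_index, spend, baseline) for days at or above factor x the trailing mean.

    Assumptions: the baseline is the plain mean of the previous `window` days,
    and the first `window` days are skipped because they have no full baseline.
    """
    anomalies = []
    for i in range(window, len(daily_spend)):
        baseline = mean(daily_spend[i - window:i])
        if baseline > 0 and daily_spend[i] >= factor * baseline:
            anomalies.append((i, daily_spend[i], baseline))
    return anomalies


# Made-up daily totals in USD; day 7 is the spike.
spend = [10.0, 12.0, 11.0, 9.0, 10.0, 11.0, 10.0, 45.0, 10.0]
for day, amount, baseline in spend_anomalies(spend):
    print(f"day {day}: ${amount:.2f} vs baseline ${baseline:.2f}")
```

One design caveat of this simple form: a spike enters the baseline of the following days and raises it, so a median or a baseline that excludes flagged days may be more robust.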
Typical things it detects: requests that could run on a cheaper capable model, missed prompt-cache opportunities, output verbosity that Caveman mode would cut, and repetitive prompts a cache would absorb.
Apply engine — with probation & auto-rollback
The advisor doesn't just advise; it can apply a recommendation for you, safely:
- POST /api/advisor/apply applies one action behind a probation window.
- A background guard watches the action's baseline metrics during probation. If applying it makes things worse, it is rolled back automatically, so you don't get stuck with a bad change.
- GET /api/advisor/applied lists applied actions; POST /api/advisor/confirm/{id} promotes one past probation; POST /api/advisor/rollback/{id} reverts it manually.
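As a rough sketch, the four endpoints above could be wrapped in small request builders. The paths come from this page; the BASE_URL, the JSON body passed to the apply call, and the helper names are assumptions, and authentication is left out because it is not described here. The functions only build requests and send nothing until you pass one to urllib.request.urlopen.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "http://localhost:8080"  # assumption: adjust to your gateway's address


def advisor_request(method, path, payload=None):
    """Build (but do not send) a request for an advisor endpoint."""
    headers = {"Accept": "application/json"}
    data = None
    if payload is not None:
        data = json.dumps(payload).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return Request(BASE_URL + path, data=data, headers=headers, method=method)


def apply_action(payload):
    """POST /api/advisor/apply; the payload schema is not documented here."""
    return advisor_request("POST", "/api/advisor/apply", payload)


def list_applied():
    """GET /api/advisor/applied"""
    return advisor_request("GET", "/api/advisor/applied")


def confirm(action_id):
    """POST /api/advisor/confirm/{id}: promote an action past probation."""
    return advisor_request("POST", f"/api/advisor/confirm/{quote(str(action_id), safe='')}")


def rollback(action_id):
    """POST /api/advisor/rollback/{id}: revert an applied action manually."""
    return advisor_request("POST", f"/api/advisor/rollback/{quote(str(action_id), safe='')}")


req = confirm("abc123")
print(req.get_method(), req.full_url)
```

A typical flow under these assumptions would be to apply an action, poll the applied list during probation, and then either confirm the action or roll it back by its id.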
This "recommend → apply with rollback" loop is what makes the advisor safe to act on rather than just read.
Related
- Token saving and Caching — the levers the advisor pulls.
- Virtual models & fallback — where cheaper-target recommendations land.