API endpoints
merido exposes two HTTP surfaces on the same port: the data plane (/v1/*, the LLM request path) and the control plane (/api/*, the dashboard/admin API). This page lists the user-relevant endpoints.
Authentication
- /v1/* (data plane): a client API key sent as Authorization: Bearer <key>. When MERIDO_REQUIRE_API_KEY=true, anonymous requests are rejected. Create keys with merido keys create or in the dashboard.
- /api/* (control plane): a dashboard session (POST /api/login → bearer JWT) or a management token. When no dashboard password is configured (the local default), these endpoints are open.
- /healthz, /metrics: unauthenticated.
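
As a minimal sketch of the bearer scheme described above, the helper below builds (but does not send) a request with Python's standard library. The base URL `http://localhost:8080`, the key value `sk-example`, and the helper name `bearer_request` are placeholders assumed for illustration; they are not values defined by merido.

```python
import json
import urllib.request

# Assumed placeholder; substitute the address your merido instance listens on.
BASE_URL = "http://localhost:8080"


def bearer_request(path, token, body=None):
    """Build a request carrying `Authorization: Bearer <token>`.

    On /v1/* the token is a client API key; on /api/* it would be the
    session JWT returned by POST /api/login.
    """
    data = None if body is None else json.dumps(body).encode("utf-8")
    req = urllib.request.Request(BASE_URL + path, data=data)
    req.add_header("Authorization", f"Bearer {token}")
    if data is not None:
        req.add_header("Content-Type", "application/json")
    return req


# Data-plane call authenticated with a client key (placeholder value).
req = bearer_request("/v1/models", "sk-example")
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen` would perform the call. How a management token is transmitted is not stated on this page, so the sketch covers only the bearer cases.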
Data plane (/v1/*)
| Endpoint | Method | Wire format / purpose |
|---|---|---|
| /v1/chat/completions | POST | OpenAI Chat Completions (and Gemini generate-content). |
| /v1/messages | POST | Anthropic Messages. |
| /v1/messages/count_tokens | POST | Anthropic token counting. |
| /v1/responses | POST | OpenAI Responses API. |
| /v1/responses/compact | POST | Compact a Responses conversation. |
| /v1/embeddings | POST | OpenAI embeddings. |
| /v1/rerank | POST | Cohere/Jina/Voyage-style document reranking. |
| /v1beta/models/{model}:generateContent | POST | Native Gemini surface (:streamGenerateContent for SSE). |
| /v1/images/generations | POST | Image generation. |
| /v1/audio/speech | POST | Text-to-speech. |
| /v1/audio/transcriptions | POST | Speech-to-text (multipart upload, up to 64 MB). |
| /v1/models | GET | List available models (client key required). OpenAI-compatible objects (id, object, created, owned_by), enriched with context_window, max_output_tokens, mode, and a capabilities block when known. |
| /v1/model_group/info | GET | Rich per-model / per-virtual-model metadata: context window, output ceiling, mode, and supports_* capability flags. Virtual models aggregate across targets: largest context window, smallest output ceiling, and the union of capabilities (the group can serve a capability if any target supports it). |
| /v1/web/fetch | POST | Server-side web fetch. |
| /v1/search | POST | Search. |
| /v1/batches | POST / GET | Submit (inline requests) / list async batch jobs. |
| /v1/batches/{id} | GET | Get a batch's status and results; /{id}/cancel (POST) cancels it. |
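
The aggregation rule stated for virtual models in the /v1/model_group/info row can be sketched as follows. This is an illustration of the stated rule, not merido's implementation: the `TargetInfo` type, the `aggregate` function, and the specific capability names are hypothetical, and the sketch assumes every target's metadata is known.

```python
from dataclasses import dataclass, field


@dataclass
class TargetInfo:
    # Hypothetical per-target metadata; field names mirror the table above.
    context_window: int
    max_output_tokens: int
    capabilities: set = field(default_factory=set)


def aggregate(targets):
    """Combine target metadata using the rule described for virtual models."""
    if not targets:
        raise ValueError("a virtual model needs at least one target")
    return TargetInfo(
        # Largest context window across targets.
        context_window=max(t.context_window for t in targets),
        # Smallest output ceiling across targets.
        max_output_tokens=min(t.max_output_tokens for t in targets),
        # Union: the group has a capability if any target has it.
        capabilities=set().union(*(t.capabilities for t in targets)),
    )


# Illustrative values only; the capability names are assumed.
group = aggregate([
    TargetInfo(128_000, 4_096, {"supports_vision"}),
    TargetInfo(200_000, 8_192, {"supports_function_calling"}),
])
print(group.context_window, group.max_output_tokens, sorted(group.capabilities))
```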
Chat responses carry an x-merido-cache: hit|miss header (x-merido-cache-type: exact|semantic on a hit) so clients can observe the response cache.
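
A client could read these cache headers with a small helper like the one below. This is a sketch that assumes the response headers are available as a plain name-to-value mapping; `cache_status` is a hypothetical name.

```python
def cache_status(headers):
    """Return (hit, cache_type) from a mapping of response headers.

    Lookups are case-insensitive because HTTP header names are.
    """
    lowered = {k.lower(): v.strip().lower() for k, v in headers.items()}
    hit = lowered.get("x-merido-cache") == "hit"
    # The type header is only meaningful on a hit.
    cache_type = lowered.get("x-merido-cache-type") if hit else None
    return hit, cache_type


print(cache_status({"X-Merido-Cache": "hit", "X-Merido-Cache-Type": "semantic"}))
print(cache_status({"x-merido-cache": "miss"}))
```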
All formats translate through one canonical representation, so a request in one dialect can be served by a model in another (or by a virtual model). The model field is a provider/model string or a virtual-model name.
/v1/* also accepts Helicone-compatible request headers, rewritten to merido's native equivalents.
Control plane (/api/*)
A selection of the most useful management endpoints (full CRUD shapes vary by resource):
Providers, keys, accounts
| Endpoint | Methods | Purpose |
|---|---|---|
| /api/providers | GET, POST | List / add upstream provider connections. |
| /api/providers/{id} | DELETE | Remove a connection. |
| /api/keys | GET, POST | List / create client (gateway) keys. Create accepts per-key scoping: allowed_models (model globs), rate_limit_rpm, rate_limit_tpm. |
| /api/keys/{id} | DELETE, PATCH | Revoke / update a key; /{id}/rotate (POST) rotates it. |
| /api/oauth/providers | GET | List OAuth-capable providers. |
| /api/oauth/accounts | GET | List connected OAuth accounts (/{id} DELETE to remove). |
| /api/registry | GET | The provider registry. |
| /api/models | GET | Session-authed model discovery for the dashboard. |
| /api/models/info | GET | Session-authed rich model/group metadata (the dashboard twin of /v1/model_group/info). |
| /api/model-catalog | GET | Grouped per-account model suggestions (each model badged with its context window and capabilities). |
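
To illustrate the per-key scoping fields accepted by POST /api/keys, the sketch below serializes a request body and approximates glob matching locally. Several things are assumptions here: that `allowed_models` is a list of strings, that the rate limits are integers, that merido's globs behave like shell-style patterns (Python's `fnmatch`), and the model names themselves. Any other fields the endpoint may require are not shown.

```python
import json
from fnmatch import fnmatchcase


def key_create_body(allowed_models, rate_limit_rpm, rate_limit_tpm):
    """Serialize a key-creation body using the scoping fields named above."""
    return json.dumps({
        "allowed_models": list(allowed_models),  # assumed: list of glob strings
        "rate_limit_rpm": rate_limit_rpm,        # assumed: requests per minute
        "rate_limit_tpm": rate_limit_tpm,        # assumed: tokens per minute
    })


def model_allowed(model, allowed_models):
    """Local approximation of glob scoping with shell-style patterns."""
    return any(fnmatchcase(model, pattern) for pattern in allowed_models)


patterns = ["openai/*", "anthropic/claude-*"]  # illustrative patterns
print(key_create_body(patterns, 60, 100_000))
print(model_allowed("openai/gpt-4o", patterns))
print(model_allowed("google/gemini-pro", patterns))
```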
Virtual models
| Endpoint | Methods | Purpose |
|---|---|---|
| /api/virtual-models | GET, POST | List / create virtual models. Create/update return 409 if the name is already taken in the org, 400 on invalid strategy/targets. |
| /api/virtual-models/{id} | GET, PUT, DELETE | Read / update / delete one. |
| /api/virtual-models/{id}/toggle | POST | Enable / disable. |
| /api/virtual-models/{id}/preview | GET | Read-only routing preview: the order targets would be tried right now, with live cost/latency/quota signals and which targets are dropped (locked/quota-exhausted). Never advances the rotation cursor. |
| /api/virtual-models/reorder | POST | Reorder. |
Usage, savings, advisor
| Endpoint | Methods | Purpose |
|---|---|---|
| /api/usage | GET | Usage summary. |
| /api/reports | GET | Invoice-ready showback/chargeback rollups (JSON / CSV / Parquet). |
| /api/savings | GET | Savings-ledger receipts (/totals, /rollup, /export). |
| /api/token-saver/filters | GET | Active token-saver filters. |
| /api/advisor | GET | Token-Optimization Advisor recommendations. |
| /api/advisor/apply | POST | Apply one action behind a probation window. |
| /api/advisor/applied | GET | List applied actions. |
| /api/advisor/confirm/{id} | POST | Promote an action past probation. |
| /api/advisor/rollback/{id} | POST | Roll an applied action back. |
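
The advisor endpoints form a small lifecycle: list recommendations, apply one under probation, then either confirm it or roll it back. The sketch below maps each step to its method and path as listed in the table. The function name `advisor_call`, the step labels, and the action id `act_123` are hypothetical; request and response bodies are not documented here and are left out.

```python
def advisor_call(step, action_id=None):
    """Return (method, path) for one step of the advisor lifecycle."""
    if step == "recommendations":
        return "GET", "/api/advisor"
    if step == "apply":
        return "POST", "/api/advisor/apply"
    if step == "applied":
        return "GET", "/api/advisor/applied"
    if step in ("confirm", "rollback"):
        # Both terminal steps address a previously applied action by id.
        if action_id is None:
            raise ValueError(f"{step} needs an action id")
        return "POST", f"/api/advisor/{step}/{action_id}"
    raise ValueError(f"unknown step: {step}")


for step in ("recommendations", "apply", "applied"):
    print(*advisor_call(step))
print(*advisor_call("confirm", "act_123"))   # placeholder id
print(*advisor_call("rollback", "act_123"))  # placeholder id
```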
Settings, budgets, policy, quota
| Endpoint | Methods | Purpose |
|---|---|---|
| /api/settings | GET, PUT | Feature toggles (guardrails, cache injection, semantic cache, …). |
| /api/budgets | GET, POST | Budgets (/{id} PUT/DELETE; /{id}/increase POST). |
| /api/policy | GET | Credential ToS policy (/{provider_id} PUT to set a mode). |
| /api/quota | GET | Provider quota snapshots (/refresh POST). |
| /api/pricing/resolve | GET | Resolved prices; /api/pricing/overrides to manage overrides. |
Auth & session
| Endpoint | Methods | Purpose |
|---|---|---|
| /api/login | POST | Dashboard login → session JWT. |
| /api/auth/signup, /login, /verify, /forgot, /reset | POST/GET | Self-serve auth (multi-tenant). |
| /api/auth/me | GET | Current session identity. |
| /api/events | GET | SSE stream of completed requests (live dashboard). |
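
Since /api/events is a Server-Sent Events stream, a client needs to split the response into events. The generator below is a minimal sketch of standard SSE framing, working on any iterable of text lines. It makes no assumption about what merido puts inside each `data:` payload, and it ignores the `event:`, `id:`, and `retry:` fields. The sample lines are made up for the demonstration.

```python
def iter_sse_data(lines):
    """Yield the data payload of each event from an iterable of text lines."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            continue  # comment / keep-alive line
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # SSE strips a single leading space after the colon.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Other fields are ignored in this sketch.


sample = [": keep-alive\n", "data: first\n", "\n", "data: a\n", "data: b\n", "\n"]
print(list(iter_sse_data(sample)))
```

An event that is not followed by a blank line is never yielded, which matches how SSE treats an incomplete trailing event.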
Health & metrics
| Endpoint | Method | Purpose |
|---|---|---|
| /healthz | GET | Status, profile, database + Redis health. Unauthenticated. |
| /health/liveliness | GET | Liveness probe: the process is up (no dependency checks). |
| /health/readiness | GET | Readiness probe: returns 503 when the database is unreachable. |
| /metrics | GET | Prometheus exposition (core + OpenTelemetry GenAI metrics). |
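
A deployment script might poll the readiness probe before sending traffic. The sketch below keeps the HTTP client out of the picture: `probe` is any zero-argument callable that returns the status code of GET /health/readiness. Treating 200 as "ready" is an assumption; this page only states that the probe returns 503 when the database is unreachable. The name `wait_until_ready` and its defaults are hypothetical.

```python
import time


def wait_until_ready(probe, attempts=5, delay=0.5, sleep=time.sleep):
    """Poll the readiness probe until it reports ready or attempts run out."""
    for attempt in range(attempts):
        if probe() == 200:  # assumed success status
            return True
        if attempt < attempts - 1:
            sleep(delay)  # back off before the next try
    return False


# Simulated probe: two 503 responses, then ready. No real waiting.
statuses = iter([503, 503, 200])
print(wait_until_ready(lambda: next(statuses), sleep=lambda _: None))
```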
An unrouted /api/* or /v1/* path returns a JSON 404 (not the dashboard HTML), so clients get a clean error for not-yet-implemented endpoints.