API endpoints

merido exposes two HTTP surfaces on the same port: the data plane (/v1/*, the LLM request path) and the control plane (/api/*, the dashboard/admin API). This page lists the user-relevant endpoints.

Authentication

/v1/* (data plane) — a client API key as Authorization: Bearer <key>. When MERIDO_REQUIRE_API_KEY=true, anonymous requests are rejected. Create keys with merido keys create or in the dashboard.
/api/* (control plane) — a dashboard session (POST /api/login → bearer JWT) or a management token. When no dashboard password is configured (local default), these are open.
/healthz, /metrics — unauthenticated.

Data plane (`/v1/*`)

Endpoint	Method	Wire format / purpose
`/v1/chat/completions`	POST	OpenAI Chat Completions (and Gemini generate-content).
`/v1/messages`	POST	Anthropic Messages.
`/v1/messages/count_tokens`	POST	Anthropic token counting.
`/v1/responses`	POST	OpenAI Responses API.
`/v1/responses/compact`	POST	Compact a Responses conversation.
`/v1/embeddings`	POST	OpenAI embeddings.
`/v1/rerank`	POST	Cohere/Jina/Voyage-style document reranking.
`/v1beta/models/{model}:generateContent`	POST	Native Gemini surface (`:streamGenerateContent` for SSE).
`/v1/images/generations`	POST	Image generation.
`/v1/audio/speech`	POST	Text-to-speech.
`/v1/audio/transcriptions`	POST	Speech-to-text (multipart upload, up to 64 MB).
`/v1/models`	GET	List available models (client key required). OpenAI-compatible objects (`id`, `object`, `created`, `owned_by`), enriched with `context_window`, `max_output_tokens`, `mode`, and a `capabilities` block when known.
`/v1/model_group/info`	GET	Rich per-model / per-virtual-model metadata: context window, output ceiling, `mode`, and `supports_*` capability flags. Virtual models aggregate across targets (largest context window, smallest output ceiling, union of capabilities — the group can serve a capability if any target supports it).
`/v1/web/fetch`	POST	Server-side web fetch.
`/v1/search`	POST	Search.
`/v1/batches`	POST / GET	Submit (inline `requests`) / list async batch jobs.
`/v1/batches/{id}`	GET	Get a batch's status + results; `/{id}/cancel` (POST) cancels it.

Chat responses carry an x-merido-cache: hit|miss header (x-merido-cache-type: exact|semantic on a hit) so clients can observe the response cache.

All formats translate through one canonical representation, so a request in one dialect can be served by a model in another (or by a virtual model). The model field is a provider/model string or a virtual-model name.

/v1/* also accepts Helicone-compatible request headers, rewritten to merido's native equivalents.

Control plane (`/api/*`)

A selection of the most useful management endpoints (full CRUD shapes vary by resource):

Providers, keys, accounts

Endpoint	Methods	Purpose
`/api/providers`	GET, POST	List / add upstream provider connections.
`/api/providers/{id}`	DELETE	Remove a connection.
`/api/keys`	GET, POST	List / create client (gateway) keys. Create accepts per-key scoping: `allowed_models` (model globs), `rate_limit_rpm`, `rate_limit_tpm`.
`/api/keys/{id}`	DELETE, PATCH	Revoke / update a key; `/{id}/rotate` (POST) rotates it.
`/api/oauth/providers`	GET	List OAuth-capable providers.
`/api/oauth/accounts`	GET	List connected OAuth accounts (`/{id}` DELETE to remove).
`/api/registry`	GET	The provider registry.
`/api/models`	GET	Session-authed model discovery for the dashboard.
`/api/models/info`	GET	Session-authed rich model/group metadata (the dashboard twin of `/v1/model_group/info`).
`/api/model-catalog`	GET	Grouped per-account model suggestions (each model badged with its context window + capabilities).

Virtual models

Endpoint	Methods	Purpose
`/api/virtual-models`	GET, POST	List / create virtual models. Create/update return `409` if the name is already taken in the org, `400` on invalid strategy/targets.
`/api/virtual-models/{id}`	GET, PUT, DELETE	Read / update / delete one.
`/api/virtual-models/{id}/toggle`	POST	Enable / disable.
`/api/virtual-models/{id}/preview`	GET	Read-only routing preview: the order targets would be tried right now, with live cost/latency/quota signals and which targets are dropped (locked/quota-exhausted). Never advances the rotation cursor.
`/api/virtual-models/reorder`	POST	Reorder.

Usage, savings, advisor

Endpoint	Methods	Purpose
`/api/usage`	GET	Usage summary.
`/api/reports`	GET	Invoice-ready showback/chargeback rollups (JSON / CSV / Parquet).
`/api/savings`	GET	Savings-ledger receipts (`/totals`, `/rollup`, `/export`).
`/api/token-saver/filters`	GET	Active token-saver filters.
`/api/advisor`	GET	Token-Optimization Advisor recommendations.
`/api/advisor/apply`	POST	Apply one action behind a probation window.
`/api/advisor/applied`	GET	List applied actions.
`/api/advisor/confirm/{id}`	POST	Promote an action past probation.
`/api/advisor/rollback/{id}`	POST	Roll an applied action back.

Settings, budgets, policy, quota

Endpoint	Methods	Purpose
`/api/settings`	GET, PUT	Feature toggles (guardrails, cache injection, semantic cache, …).
`/api/budgets`	GET, POST	Budgets (`/{id}` PUT/DELETE; `/{id}/increase` POST).
`/api/policy`	GET	Credential ToS policy (`/{provider_id}` PUT to set a mode).
`/api/quota`	GET	Provider quota snapshots (`/refresh` POST).
`/api/pricing/resolve`	GET	Resolved prices; `/api/pricing/overrides` to manage overrides.

Auth & session

Endpoint	Methods	Purpose
`/api/login`	POST	Dashboard login → session JWT.
`/api/auth/signup`, `/login`, `/verify`, `/forgot`, `/reset`	POST/GET	Self-serve auth (multi-tenant).
`/api/auth/me`	GET	Current session identity.
`/api/events`	GET	SSE stream of completed requests (live dashboard).

Health & metrics

Endpoint	Method	Purpose
`/healthz`	GET	Status, profile, database + Redis health. Unauthenticated.
`/health/liveliness`	GET	Liveness probe — process is up (no dependency checks).
`/health/readiness`	GET	Readiness probe — `503` when the database is unreachable.
`/metrics`	GET	Prometheus exposition (core + OpenTelemetry GenAI metrics).

An unrouted /api/* or /v1/* path returns a JSON 404 (not the dashboard HTML), so clients get a clean error for not-yet-implemented endpoints.

API endpoints ​

Authentication ​

Data plane (/v1/*) ​

Control plane (/api/*) ​

Providers, keys, accounts ​

Virtual models ​

Usage, savings, advisor ​

Settings, budgets, policy, quota ​

Auth & session ​

Health & metrics ​