Skip to content

API endpoints

merido exposes two HTTP surfaces on the same port: the data plane (/v1/*, the LLM request path) and the control plane (/api/*, the dashboard/admin API). This page lists the user-relevant endpoints.

Authentication

  • /v1/* (data plane) — a client API key as Authorization: Bearer <key>. When MERIDO_REQUIRE_API_KEY=true, anonymous requests are rejected. Create keys with merido keys create or in the dashboard.
  • /api/* (control plane) — a dashboard session (POST /api/login → bearer JWT) or a management token. When no dashboard password is configured (local default), these are open.
  • /healthz, /metrics — unauthenticated.

Data plane (/v1/*)

EndpointMethodWire format / purpose
/v1/chat/completionsPOSTOpenAI Chat Completions (and Gemini generate-content).
/v1/messagesPOSTAnthropic Messages.
/v1/messages/count_tokensPOSTAnthropic token counting.
/v1/responsesPOSTOpenAI Responses API.
/v1/responses/compactPOSTCompact a Responses conversation.
/v1/embeddingsPOSTOpenAI embeddings.
/v1/rerankPOSTCohere/Jina/Voyage-style document reranking.
/v1beta/models/{model}:generateContentPOSTNative Gemini surface (:streamGenerateContent for SSE).
/v1/images/generationsPOSTImage generation.
/v1/audio/speechPOSTText-to-speech.
/v1/audio/transcriptionsPOSTSpeech-to-text (multipart upload, up to 64 MB).
/v1/modelsGETList available models (client key required). OpenAI-compatible objects (id, object, created, owned_by), enriched with context_window, max_output_tokens, mode, and a capabilities block when known.
/v1/model_group/infoGETRich per-model / per-virtual-model metadata: context window, output ceiling, mode, and supports_* capability flags. Virtual models aggregate across targets (largest context window, smallest output ceiling, union of capabilities — the group can serve a capability if any target supports it).
/v1/web/fetchPOSTServer-side web fetch.
/v1/searchPOSTSearch.
/v1/batchesPOST / GETSubmit (inline requests) / list async batch jobs.
/v1/batches/{id}GETGet a batch's status + results; /{id}/cancel (POST) cancels it.

Chat responses carry an x-merido-cache: hit|miss header (x-merido-cache-type: exact|semantic on a hit) so clients can observe the response cache.

All formats translate through one canonical representation, so a request in one dialect can be served by a model in another (or by a virtual model). The model field is a provider/model string or a virtual-model name.

/v1/* also accepts Helicone-compatible request headers, rewritten to merido's native equivalents.

Control plane (/api/*)

A selection of the most useful management endpoints (full CRUD shapes vary by resource):

Providers, keys, accounts

EndpointMethodsPurpose
/api/providersGET, POSTList / add upstream provider connections.
/api/providers/{id}DELETERemove a connection.
/api/keysGET, POSTList / create client (gateway) keys. Create accepts per-key scoping: allowed_models (model globs), rate_limit_rpm, rate_limit_tpm.
/api/keys/{id}DELETE, PATCHRevoke / update a key; /{id}/rotate (POST) rotates it.
/api/oauth/providersGETList OAuth-capable providers.
/api/oauth/accountsGETList connected OAuth accounts (/{id} DELETE to remove).
/api/registryGETThe provider registry.
/api/modelsGETSession-authed model discovery for the dashboard.
/api/models/infoGETSession-authed rich model/group metadata (the dashboard twin of /v1/model_group/info).
/api/model-catalogGETGrouped per-account model suggestions (each model badged with its context window + capabilities).

Virtual models

EndpointMethodsPurpose
/api/virtual-modelsGET, POSTList / create virtual models. Create/update return 409 if the name is already taken in the org, 400 on invalid strategy/targets.
/api/virtual-models/{id}GET, PUT, DELETERead / update / delete one.
/api/virtual-models/{id}/togglePOSTEnable / disable.
/api/virtual-models/{id}/previewGETRead-only routing preview: the order targets would be tried right now, with live cost/latency/quota signals and which targets are dropped (locked/quota-exhausted). Never advances the rotation cursor.
/api/virtual-models/reorderPOSTReorder.

Usage, savings, advisor

EndpointMethodsPurpose
/api/usageGETUsage summary.
/api/reportsGETInvoice-ready showback/chargeback rollups (JSON / CSV / Parquet).
/api/savingsGETSavings-ledger receipts (/totals, /rollup, /export).
/api/token-saver/filtersGETActive token-saver filters.
/api/advisorGETToken-Optimization Advisor recommendations.
/api/advisor/applyPOSTApply one action behind a probation window.
/api/advisor/appliedGETList applied actions.
/api/advisor/confirm/{id}POSTPromote an action past probation.
/api/advisor/rollback/{id}POSTRoll an applied action back.

Settings, budgets, policy, quota

EndpointMethodsPurpose
/api/settingsGET, PUTFeature toggles (guardrails, cache injection, semantic cache, …).
/api/budgetsGET, POSTBudgets (/{id} PUT/DELETE; /{id}/increase POST).
/api/policyGETCredential ToS policy (/{provider_id} PUT to set a mode).
/api/quotaGETProvider quota snapshots (/refresh POST).
/api/pricing/resolveGETResolved prices; /api/pricing/overrides to manage overrides.

Auth & session

EndpointMethodsPurpose
/api/loginPOSTDashboard login → session JWT.
/api/auth/signup, /login, /verify, /forgot, /resetPOST/GETSelf-serve auth (multi-tenant).
/api/auth/meGETCurrent session identity.
/api/eventsGETSSE stream of completed requests (live dashboard).

Health & metrics

EndpointMethodPurpose
/healthzGETStatus, profile, database + Redis health. Unauthenticated.
/health/livelinessGETLiveness probe — process is up (no dependency checks).
/health/readinessGETReadiness probe — 503 when the database is unreachable.
/metricsGETPrometheus exposition (core + OpenTelemetry GenAI metrics).

An unrouted /api/* or /v1/* path returns a JSON 404 (not the dashboard HTML), so clients get a clean error for not-yet-implemented endpoints.

MIT / Apache-2.0 licensed.