
Free token pool (merido/free)

The free token pool lets operators provision a shared set of free-tier provider accounts (Groq, Gemini, Cerebras, DeepSeek, …) and expose them to every user in the org under the single model name merido/free — no per-user provider setup required.

How it works

  1. The operator adds provider API keys to the pool (one per provider).
  2. merido automatically builds and maintains the merido/free virtual model — one unpinned, cost_optimized target per distinct provider using that provider's default model.
  3. Any user calls model: "merido/free" and gets served by the pool.
  4. A user who has their own org-level virtual model also named merido/free is served by their model instead — giving power users a clean override path.

Step 1 — Add provider keys to the pool

Add each provider API key through the dashboard at /app/admin/free-pool or through the HTTP API:

http
POST /api/system/pool/keys
Content-Type: application/json

{
  "provider": "groq",
  "api_key": "gsk_...",
  "label": "Groq free-tier"
}

Repeat for each provider you want in the pool (Gemini, Cerebras, DeepSeek, etc.). merido immediately reconciles merido/free: each distinct provider gets one unpinned, cost_optimized target pointing to that provider's default model. No manual virtual-model configuration is needed.
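As a rough illustration, the request above could be built from Python with only the standard library. The base URL and token below are placeholders, and the bearer-token header assumes the admin authentication described under Management endpoints; treat this as a sketch rather than an official client.

```python
import json
import urllib.request
from typing import Optional


def build_add_key_request(base_url: str, admin_token: str, provider: str,
                          api_key: str, label: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST /api/system/pool/keys request."""
    body = {"provider": provider, "api_key": api_key}
    if label is not None:
        body["label"] = label  # the label is optional
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/system/pool/keys",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {admin_token}",
        },
    )


# Hypothetical host, token, and key; send with urllib.request.urlopen(req).
req = build_add_key_request("https://merido.example.com", "ADMIN_TOKEN",
                            "groq", "gsk_placeholder", "Groq free-tier")
```

Calling the builder once per provider mirrors the "repeat for each provider" step.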

Note: Manually editing the merido/free virtual model via /api/system/virtual-models is possible for power users, but merido will overwrite those edits whenever pool membership changes. Treat the auto-managed VM as read-only.

Prevent runaway consumption with an environment variable:

MERIDO_FREE_POOL_GLOBAL_DAILY_TOKEN_CAP=500000

This is a UTC-day token budget shared across the entire pool. Once the counter reaches the cap, every subsequent request to merido/free receives HTTP 429 until the counter resets at midnight UTC. The cap is off (unlimited) when the variable is unset.
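The behaviour described here can be modelled with a counter keyed by the current UTC date. The class below is only an illustrative sketch: the class and method names are made up, and the real counter is presumably persisted rather than held in memory.

```python
from datetime import datetime, timezone


class GlobalDailyCap:
    """Pool-wide token budget that resets at midnight UTC (illustrative model)."""

    def __init__(self, cap):
        self.cap = cap    # None models the unset (unlimited) variable
        self.day = None   # UTC date the counter currently belongs to
        self.used = 0

    def _roll(self, now):
        today = now.astimezone(timezone.utc).date()
        if today != self.day:  # first call in a new UTC day: start from zero
            self.day, self.used = today, 0

    def allows(self, now):
        """Return False when the request should receive HTTP 429."""
        self._roll(now)
        return self.cap is None or self.used < self.cap

    def record(self, tokens, now):
        """Add consumed tokens to today's counter."""
        self._roll(now)
        self.used += tokens
```

With a cap of 500000, a request at 23:00 UTC after 500000 recorded tokens is refused, while one at 00:00 UTC the next day is allowed again.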

See Environment variables for this knob and MERIDO_FREE_POOL_VM.

User override

If a user's org already has a virtual model whose name matches merido/free, merido serves that model instead of the system one. This lets individual users substitute their own provider keys — e.g. a higher-rate paid Groq account — without disrupting other users.
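The lookup order can be pictured as a two-level name resolution. This is a sketch under the assumption that virtual models are looked up by exact name; the function and the dictionary shapes are hypothetical.

```python
def resolve_virtual_model(name, org_models, system_models):
    """Return (scope, model) for a requested model name, preferring the org's own."""
    if name in org_models:
        return "org", org_models[name]        # the org's model shadows the system one
    if name in system_models:
        return "system", system_models[name]  # fall back to the shared pool model
    return None, None


system_models = {"merido/free": {"targets": ["groq", "gemini"]}}
org_with_override = {"merido/free": {"targets": ["groq-paid"]}}
org_without_override = {}
```

An org that defines its own merido/free resolves to the "org" scope; every other org keeps resolving to "system".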

Response headers

Every response routed through the pool includes transparency headers:

| Header | Description |
| --- | --- |
| X-Routed-Via | The provider/model that actually served the request (e.g. groq/llama-3.3-70b-versatile). |
| X-Fallback-Attempts | Number of targets tried before a successful response. 0 means the first target served it. |
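A client that wants to log where a request landed might read the two headers as follows. The sketch assumes the provider is the part of X-Routed-Via before the first slash, as in the example value; the type and function names are illustrative.

```python
from typing import Mapping, NamedTuple


class RoutingInfo(NamedTuple):
    provider: str
    model: str
    fallback_attempts: int


def parse_routing_headers(headers: Mapping[str, str]) -> RoutingInfo:
    """Extract provider, model, and fallback count from the response headers."""
    lowered = {k.lower(): v for k, v in headers.items()}  # header names are case-insensitive
    provider, _, model = lowered["x-routed-via"].partition("/")
    return RoutingInfo(provider, model, int(lowered["x-fallback-attempts"]))


info = parse_routing_headers({
    "X-Routed-Via": "groq/llama-3.3-70b-versatile",
    "X-Fallback-Attempts": "0",
})
```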

Per-user token quota

In addition to the pool-wide daily cap, operators can set per-user token limits on rolling hourly, daily, and weekly windows:

| Variable | Window | Default |
| --- | --- | --- |
| MERIDO_FREE_TIER_TOKEN_LIMIT_HOUR | rolling 60 minutes | unset (no per-user hourly cap) |
| MERIDO_FREE_TIER_TOKEN_LIMIT_DAY | rolling 24 hours | unset (no per-user daily cap) |
| MERIDO_FREE_TIER_TOKEN_LIMIT_WEEK | rolling 7 days | unset (no per-user weekly cap) |

When unset, no per-user quota is enforced for that window; only the pool-wide MERIDO_FREE_POOL_GLOBAL_DAILY_TOKEN_CAP (Phase 1) applies.
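To make the "unset means no limit" rule concrete, here is a small sketch of reading the three variables into a window-to-limit mapping. Treating an empty string the same as unset is an assumption of this example, not documented behaviour.

```python
import os

WINDOW_VARS = {
    "hour": "MERIDO_FREE_TIER_TOKEN_LIMIT_HOUR",
    "day": "MERIDO_FREE_TIER_TOKEN_LIMIT_DAY",
    "week": "MERIDO_FREE_TIER_TOKEN_LIMIT_WEEK",
}


def read_user_limits(env=os.environ):
    """Map each window to its token limit, or None when the variable is unset."""
    limits = {}
    for window, var in WINDOW_VARS.items():
        raw = env.get(var, "").strip()
        limits[window] = int(raw) if raw else None  # None: no quota for this window
    return limits


# Only the hourly limit is configured in this example environment.
example = read_user_limits({"MERIDO_FREE_TIER_TOKEN_LIMIT_HOUR": "200000"})
```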

Per-user = per-org (anti-sybil)

Quota is tracked at the org level. All API keys belonging to the same org share a single counter for each window, so creating additional API keys does not multiply a user's free allowance.

Note: Per-user quota only applies to requests that carry an org — i.e. org-scoped keys in hosted (multi-tenant) mode. A key with no org (legacy or system-wide keys, and all keys in single-user/local mode) is not subject to per-user windows; such requests are still bounded by the pool-wide global daily cap. Provision org-scoped keys for users you want metered per-user.

Enforcement: most-restrictive wins

Before serving a request to merido/free, merido checks every configured window. If the user's usage for any set window has reached or exceeded its limit, the request is rejected immediately with HTTP 429:

json
{
  "error": {
    "message": "Free tier quota reached. Wait for the window to reset, attach your own provider key, or upgrade.",
    "type": "rate_limit_error",
    "retry_after_ms": 1847000
  }
}

The retry_after_ms field and the standard Retry-After response header (in seconds) both point to the soonest window reset, that is, the window that will free up capacity first. The client can use this to schedule an exact retry rather than back off blindly.
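On the client side, scheduling that retry could look like the sketch below. It prefers retry_after_ms from the body and falls back to the Retry-After header; the function name is illustrative, and only the delta-seconds form of Retry-After is handled.

```python
from typing import Mapping, Optional


def retry_delay_seconds(body: Mapping, headers: Mapping[str, str]) -> Optional[float]:
    """Seconds to wait after a 429, or None if the server gave no usable hint."""
    retry_ms = body.get("error", {}).get("retry_after_ms")
    if isinstance(retry_ms, (int, float)):
        return retry_ms / 1000.0
    lowered = {k.lower(): v for k, v in headers.items()}
    value = lowered.get("retry-after", "")
    # Retry-After may also be an HTTP date; this sketch ignores that form.
    return float(value) if value.isdigit() else None


delay = retry_delay_seconds({"error": {"retry_after_ms": 1847000}}, {})
```

A caller would typically sleep for the returned delay (plus a little jitter) before resending the request.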

Interaction with the global daily cap

Per-user quotas and the pool-wide global cap are independent guards; both can trigger a 429. A user who has not exhausted their personal quota is still blocked for the remainder of the UTC day if the global cap for the entire pool is reached, and vice versa.
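The two guards can be pictured as independent checks that each have veto power. In this sketch the order of the checks and all names are assumptions; the "reached or exceeded" comparison follows the text.

```python
def admit(user_used, user_limits, global_used, global_cap):
    """Return None to serve the request, or the name of the guard that blocks it."""
    for window, limit in user_limits.items():
        # An unset window (None) is never enforced.
        if limit is not None and user_used.get(window, 0) >= limit:
            return f"user:{window}"
    if global_cap is not None and global_used >= global_cap:
        return "global"
    return None


limits = {"hour": 200000, "day": 1000000, "week": None}
```

For example, a user with 10 tokens used is still blocked when the pool counter has hit the global cap, and a user at their hourly limit is blocked even when the pool has plenty left.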

Single-user / local mode

Per-user quota enforcement requires a multi-tenant deployment where requests carry an org identity. In single-user local mode (no org attached to the request), per-user quota checks are skipped entirely. The pool-wide MERIDO_FREE_POOL_GLOBAL_DAILY_TOKEN_CAP remains the only token guard in that mode.

A balanced starting point for a hosted deployment:

MERIDO_FREE_TIER_TOKEN_LIMIT_HOUR=200000
MERIDO_FREE_TIER_TOKEN_LIMIT_DAY=1000000
MERIDO_FREE_TIER_TOKEN_LIMIT_WEEK=5000000

Tune these based on the number of users and the capacity of your provider pool. The hourly limit is the most effective guard against a single user monopolising the pool in a burst; the weekly limit provides a soft ceiling for sustained heavy users.

Onboarding (Phase 3a)

Verify-email gate

Set MERIDO_FREE_TIER_REQUIRES_VERIFIED_EMAIL=true (default) to require that the user's org has at least one verified email address before merido/free is accessible. When the gate fires, merido returns HTTP 403:

json
{
  "error": {
    "type": "email_unverified",
    "message": "Verify your email to use the free pool (merido/free)."
  }
}

The user must complete email verification (the link arrives at signup or via a re-send from the dashboard). Once verified, subsequent requests proceed normally.

Set MERIDO_FREE_TIER_REQUIRES_VERIFIED_EMAIL=false to disable the gate — useful for self-hosted deployments where you control user onboarding through other means.

This gate applies only in multi-tenant mode. In single-user local mode it is ignored.
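Put together, the gate's decision reduces to a few boolean checks. The function below is a hypothetical sketch (its name and parameters are not part of merido); the status code and error body are the ones shown above.

```python
def email_gate(requires_verified, multi_tenant, org_has_verified_email):
    """Return None to let the request through, or (status, body) to reject it."""
    if not multi_tenant or not requires_verified:
        return None  # gate ignored in single-user mode or when disabled
    if org_has_verified_email:
        return None
    return 403, {
        "error": {
            "type": "email_unverified",
            "message": "Verify your email to use the free pool (merido/free).",
        }
    }
```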

Starter API key on signup

POST /api/auth/signup now returns a one-time api_key field alongside the normal session response:

json
{
  "token": "...",
  "api_key": "md-..."
}

This is an org-scoped proxy key pre-configured for merido/free. The user can paste it into their coding CLI immediately after signing up (and after verifying their email if the gate is on). The key is shown once and cannot be retrieved later — if it is lost, the user can create a new key from the dashboard API Keys page.

Quota endpoint

Authenticated users (session cookie or bearer token — dashboard context, not the proxy key) can query their current free-pool usage across all configured windows:

http
GET /api/free-pool/quota
Authorization: Bearer <session-token>

Response — multi-tenant, org present:

json
{
  "applicable": true,
  "hour":  { "limit": 200000, "used": 42000, "remaining": 158000, "resets_at": 1750000000000 },
  "day":   { "limit": 1000000, "used": 42000, "remaining": 958000, "resets_at": 1750000000000 },
  "week":  { "limit": 5000000, "used": 42000, "remaining": 4958000, "resets_at": 1750000000000 }
}

Each window object contains:

| Field | Type | Description |
| --- | --- | --- |
| limit | number or null | Token cap for the window (null if unset / unlimited). |
| used | number | Tokens consumed in the current rolling window. |
| remaining | number or null | limit - used, or null if there is no limit. |
| resets_at | number | Epoch milliseconds at the window's fixed bucket boundary, (bucket + 1) × window_ms, when the counter rolls over to zero. |
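As a worked example of the (bucket + 1) × window_ms formula, the sketch below assumes that bucket is the integer division of the current epoch time by the window length, i.e. that buckets are aligned to the Unix epoch. That alignment is an assumption of this example.

```python
WINDOW_MS = {
    "hour": 60 * 60 * 1000,
    "day": 24 * 60 * 60 * 1000,
    "week": 7 * 24 * 60 * 60 * 1000,
}


def resets_at(now_ms: int, window: str) -> int:
    """Epoch milliseconds of the next bucket boundary for the given window."""
    window_ms = WINDOW_MS[window]
    bucket = now_ms // window_ms      # assumed: index of the current fixed bucket
    return (bucket + 1) * window_ms   # start of the next bucket


next_hour_reset = resets_at(1_750_000_000_000, "hour")  # 1_750_003_200_000
```

Here 1_750_000_000_000 falls in hourly bucket 486111, so the counter would roll over at 486112 × 3_600_000 ms.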

Response — single-user mode or no org:

json
{ "applicable": false }
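A dashboard or CLI could use the response to show which window is closest to running out. The helper below is a sketch with an illustrative name; it skips windows whose remaining value is null (unlimited) and returns None when quotas are not applicable.

```python
def tightest_window(quota):
    """Name of the window with the fewest remaining tokens, or None."""
    if not quota.get("applicable"):
        return None  # single-user mode or no org
    best = None
    for name in ("hour", "day", "week"):
        window = quota.get(name)
        if not window or window.get("remaining") is None:
            continue  # window absent or unlimited
        if best is None or window["remaining"] < quota[best]["remaining"]:
            best = name
    return best


sample = {
    "applicable": True,
    "hour": {"limit": 200000, "used": 42000, "remaining": 158000, "resets_at": 1750000000000},
    "day": {"limit": 1000000, "used": 42000, "remaining": 958000, "resets_at": 1750000000000},
    "week": {"limit": 5000000, "used": 42000, "remaining": 4958000, "resets_at": 1750000000000},
}
```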

Rich 429 body

When a per-user quota window is exceeded, merido returns HTTP 429 with a structured error body and a standard Retry-After header:

http
HTTP/1.1 429 Too Many Requests
Retry-After: 1847

json
{
  "error": {
    "type": "free_pool_quota_exceeded",
    "message": "Free tier quota reached. Wait for the window to reset, attach your own provider key, or upgrade.",
    "retry_after_ms": 1847000,
    "resets_at": 1750000000000
  }
}

retry_after_ms and resets_at point to the soonest window reset across all configured windows — so the client knows the earliest time it can retry successfully.

Operator console

Admins manage the free pool at /app/admin/free-pool (admin-only — non-admins receive HTTP 403).

The page shows a list of provider keys currently in the pool. The operator's only actions are:

  • Add a key — choose a provider, paste its API key, and optionally give it a label. merido immediately adds a target for that provider to merido/free (one target per distinct provider).
  • Remove a key — hard-deletes that connection. merido reconciles merido/free: if that was the last key for its provider, the target is removed. If the pool is now empty, the merido/free VM is removed entirely.
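The reconcile rule behind both actions can be sketched as a pure function from the key list to the virtual model. The dictionary shape and the field names (strategy, pinned) are hypothetical and chosen only to mirror the wording "one unpinned, cost_optimized target per distinct provider".

```python
def reconcile_free_vm(pool_keys):
    """Derive the merido/free model from the pool: one target per distinct provider."""
    providers = []
    for key in pool_keys:
        if key["provider"] not in providers:  # extra keys for a provider add no target
            providers.append(key["provider"])
    if not providers:
        return None  # empty pool: the VM is removed entirely
    return {
        "name": "merido/free",
        "targets": [
            {"provider": p, "strategy": "cost_optimized", "pinned": False}
            for p in providers
        ],
    }


keys = [{"provider": "groq"}, {"provider": "groq"}, {"provider": "gemini"}]
vm = reconcile_free_vm(keys)
```

Removing one of the two Groq keys leaves both targets in place; removing the only Gemini key drops the Gemini target.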

The console also displays the active limit knobs (MERIDO_FREE_TIER_*, MERIDO_FREE_POOL_GLOBAL_DAILY_TOKEN_CAP) read-only for observability. To change a limit, update the env var and restart the server.

There is no UI for configuring virtual models, targets, strategy, or account pinning — merido handles all of that automatically.

Management endpoints

All routes below require an admin session (session cookie or Authorization: Bearer <admin-token>). Non-admins receive HTTP 403.

http
GET    /api/system/pool                    # Pool state: keys list + active VM snapshot
POST   /api/system/pool/keys               # Add a provider key
                                           # Body: { "provider": "groq", "api_key": "gsk_...", "label"?: "..." }
DELETE /api/system/pool/keys/{id}          # Remove a provider key (hard-delete)

The lower-level /api/system/virtual-models and /api/system/accounts endpoints still exist for power users but are no longer used by the console.

Limit knobs are read-only in the UI. MERIDO_FREE_TIER_TOKEN_LIMIT_HOUR, MERIDO_FREE_TIER_TOKEN_LIMIT_DAY, MERIDO_FREE_TIER_TOKEN_LIMIT_WEEK, and MERIDO_FREE_POOL_GLOBAL_DAILY_TOKEN_CAP are set via environment variables and displayed in the console for observability. To change a limit, update the env var and restart the server.

  • Add providers & keys: add and manage provider accounts.
  • Virtual models & fallback: strategies, circuit breaker, and fallback chain.
  • Environment variables: MERIDO_FREE_POOL_VM, MERIDO_FREE_POOL_GLOBAL_DAILY_TOKEN_CAP, MERIDO_FREE_TIER_TOKEN_LIMIT_HOUR, MERIDO_FREE_TIER_TOKEN_LIMIT_DAY, MERIDO_FREE_TIER_TOKEN_LIMIT_WEEK, MERIDO_FREE_TIER_REQUIRES_VERIFIED_EMAIL.

MIT / Apache-2.0 licensed.