# Rate Limits
The Balchemy API enforces rate limits at three levels: per-IP for global HTTP traffic, per-user for AI request quotas, and per-user for fast (own-key) AI requests. All limits use Redis INCR + EXPIRE counters with a fail-open policy — if Redis is unavailable, traffic is allowed through rather than blocked.
## Mechanism
Rate limiting is implemented as a NestJS guard applied globally to all API routes. The algorithm is a fixed window counter:
- On each request, the guard increments a Redis key scoped to IP + route.
- On the first increment, a TTL is set for the window duration.
- If the counter exceeds the limit, the guard throws 429 Too Many Requests.
- If Redis is unreachable, the guard fails open (the request is allowed).
Rate limit state persists in Redis only. Restarting the backend does not reset counters.
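The steps above can be sketched as a small limiter class. This is an illustrative sketch, not the actual guard's code: the `FixedWindowLimiter` and `RedisLike` names are assumptions, and a minimal Redis-like interface stands in for the real client.

```typescript
// Minimal Redis surface the sketch needs (INCR + EXPIRE).
interface RedisLike {
  incr(key: string): Promise<number>;
  expire(key: string, seconds: number): Promise<void>;
}

class FixedWindowLimiter {
  constructor(
    private redis: RedisLike,
    private windowMs: number,
    private max: number,
  ) {}

  /** Returns true if the request is allowed. Fails open on Redis errors. */
  async allow(ip: string, route: string): Promise<boolean> {
    const key = `rl:${ip}:${route}`; // key format is illustrative
    try {
      const count = await this.redis.incr(key);
      if (count === 1) {
        // First hit in this window: start the TTL clock.
        await this.redis.expire(key, Math.ceil(this.windowMs / 1000));
      }
      return count <= this.max;
    } catch {
      // Fail open: if Redis is unreachable, let the request through.
      return true;
    }
  }
}
```

Because the counter lives only in Redis, restarting the process that holds a `FixedWindowLimiter` instance has no effect on in-flight windows.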
## Global API rate limit (per IP)
Applied to all HTTP API routes except health endpoints (/api/nest/health/*, /health).
| Parameter | Value |
|---|---|
| Window | 15 minutes |
| Max requests | 100 per window |
| Key | IP address + route scope |
| Configurable | Yes — via GlobalSettings.rateLimitConfig in MongoDB |
The limit policy can be updated at runtime by an admin without a server restart. The guard polls MongoDB every 30 seconds and caches the active policy.
### Override per endpoint
Individual endpoints can override the global limit using the @RateLimit() decorator:
```typescript
@RateLimit({ windowMs: 60_000, max: 10, message: "Slow down." })
@Post("/some-route")
async sensitiveEndpoint() { ... }
```

Endpoints annotated with @SkipRateLimit() bypass the guard entirely.
## AI usage quota (per user)
Applied to AI message processing — ask_bot, trade_command, and the web chat endpoint.
### Platform API keys (daily quota)
When users route AI requests through Balchemy's API keys:
| Parameter | Value |
|---|---|
| Window | 24 hours (rolling from first request) |
| Redis key | rl:daily:<userId> |
| Limit | Configured per user via calculateUserLimits() |
| Unlimited | Limit value of -1 means no cap |
| Fail-open | true — Redis outage allows all traffic |
### Own API keys (monthly quota)
When users bring their own LLM API keys:
| Parameter | Value |
|---|---|
| Window | 30 days (rolling from first request) |
| Redis key | rl:monthly:<userId> |
| Limit | Configured per user via calculateUserLimits() |
| Disabled | Limit value of 0 means feature not provisioned |
| Unlimited | Limit value of -1 means no cap |
| Fail-open | true |
AI quotas are not AI-model-specific — they count request events, not token usage.
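The sentinel values from the two tables above (-1 = unlimited, 0 = not provisioned for the own-key quota) can be captured in a single decision function. This is a sketch; the function and parameter names are illustrative, not the actual calculateUserLimits() API.

```typescript
type QuotaResult = "allowed" | "quota_exceeded" | "not_provisioned";

// limit: the per-user cap from the tables above.
// usedInWindow: request events already counted in the rolling window.
function checkAiQuota(limit: number, usedInWindow: number): QuotaResult {
  if (limit === 0) return "not_provisioned"; // own-key feature not enabled
  if (limit === -1) return "allowed";        // no cap
  return usedInWindow < limit ? "allowed" : "quota_exceeded";
}
```

Note the order of the checks matters: the 0 and -1 sentinels must be tested before the numeric comparison, since 0 used against a limit of 0 would otherwise read as "exceeded" rather than "not provisioned".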
## MCP rate limit
MCP tool calls go through the global per-IP guard and additionally respect the user's AI quota when ask_bot or trade_command is called.
There is no separate per-bot MCP rate limit distinct from the above. High-throughput integrations should distribute load across time rather than burst.
## Rate limit response headers
When the global IP guard is active, every response includes:
| Header | Value |
|---|---|
| X-RateLimit-Limit | Maximum requests allowed in the window |
| X-RateLimit-Remaining | Requests remaining in the current window |
| X-RateLimit-Reset | Unix timestamp (seconds) when the window resets |
| Retry-After | Seconds to wait before retrying (only on 429 responses) |
Example response headers when approaching the limit:

```
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 3
X-RateLimit-Reset: 1742389200
```

Example response headers on a 429:

```
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1742389200
Retry-After: 47
Content-Type: application/json
```

```json
{
  "statusCode": 429,
  "message": "Too many requests from this IP, please try again later."
}
```

## Rate limits by endpoint category
| Category | Window | Limit | Notes |
|---|---|---|---|
| All API routes (default) | 15 min | 100 / IP | Configurable via GlobalSettings |
| Health endpoints | — | Exempt | /api/nest/health/*, /health |
| AI requests (platform keys) | 24 h | Per-user limit | Fails open if Redis unavailable |
| AI requests (own keys) | 30 days | Per-user limit | 0 = feature disabled |
| MCP tool calls | Inherits global | 100 / 15 min / IP | Plus AI quota on LLM-backed tools |
## Quota exceeded behavior
When a user's AI daily quota is exhausted, the API does not return a 429. Instead, the message handler returns a 200 response with a user-friendly message:
```
You have reached your daily request limit. Your quota resets in 4 hours.
```
When the IP rate limit is exhausted, the API returns a strict 429 Too Many Requests with Retry-After.
## Best practices
**Respect Retry-After.** When you receive a 429, always read the Retry-After header and wait at least that many seconds before retrying. The SDK does this automatically.
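For clients not using the SDK, the wait can be derived from the headers documented above. The helper below is an illustrative sketch (its name and fallback behavior are assumptions); it uses the standard fetch Headers type.

```typescript
// Compute how long to wait (ms) after a 429, preferring Retry-After and
// falling back to the window reset timestamp.
function waitMsFor429(
  headers: Headers,
  nowSec = Math.floor(Date.now() / 1000),
): number {
  const retryAfter = headers.get("Retry-After");
  if (retryAfter !== null) return Number(retryAfter) * 1000;
  // Fall back to X-RateLimit-Reset if Retry-After is absent.
  const reset = headers.get("X-RateLimit-Reset");
  if (reset !== null) return Math.max(0, (Number(reset) - nowSec) * 1000);
  return 1000; // conservative default when neither header is present
}
```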
**Use exponential backoff.** For programmatic retries, use exponential backoff with jitter rather than fixed delays. The balchemy-agent-sdk ships a retry utility:

```typescript
import { withRetry } from "@balchemy/agent-sdk";

const result = await withRetry(() => mcp.agentExecute({ instruction: "..." }), {
  maxAttempts: 3,
  baseDelayMs: 200,
  maxDelayMs: 5000,
  jitter: true,
});
```

**Spread load.** If your agent processes many tokens in parallel, introduce a delay between calls to avoid IP rate limit bursts. Batching reads into a single multi-token call (e.g. trading_market_dexscreener_tokens) is more efficient than calling single-token tools in a loop.
**Monitor X-RateLimit-Remaining.** For production agents, log the remaining header and emit an alert before it reaches 0, so you can throttle proactively rather than receive 429s.
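One simple form of such a check is a threshold on remaining capacity as a fraction of the limit. This helper is a sketch, not part of the SDK:

```typescript
// Flag when remaining capacity drops to or below a fraction of the limit,
// so the caller can throttle before hitting 429s.
function shouldThrottle(headers: Headers, threshold = 0.1): boolean {
  const limit = Number(headers.get("X-RateLimit-Limit"));
  const remaining = Number(headers.get("X-RateLimit-Remaining"));
  if (!Number.isFinite(limit) || !Number.isFinite(remaining) || limit <= 0) {
    return false; // headers absent or malformed: nothing to act on
  }
  return remaining / limit <= threshold;
}
```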
**Cache read results.** Market data tools (dexscreener_pairs, geckoterminal_pool_details, etc.) return data that is valid for seconds to minutes. Cache the response locally rather than re-calling within the same processing cycle.
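A minimal in-memory TTL cache is enough for this; the class below is an illustrative sketch (not an SDK utility), with the TTL chosen by the caller to match how long the market data stays useful.

```typescript
// Tiny TTL cache: entries expire ttlMs after being set.
class TtlCache<T> {
  private store = new Map<string, { value: T; expiresAt: number }>();

  constructor(private ttlMs: number) {}

  get(key: string, now = Date.now()): T | undefined {
    const entry = this.store.get(key);
    if (!entry) return undefined;
    if (now >= entry.expiresAt) {
      this.store.delete(key); // lazy eviction on read
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: T, now = Date.now()): void {
    this.store.set(key, { value, expiresAt: now + this.ttlMs });
  }
}
```

A cache miss means the tool should be called again; a hit within the TTL avoids both the network round trip and a count against the IP rate limit.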
## WebSocket rate limit
WebSocket connections (Telegram/Discord bot gateway) go through a separate WS guard with a per-connection message rate limit. The limit is configurable and enforced independently of the HTTP guard.