Rate Limit - Basic¶
The Rate Limit - Basic policy enforces a simple request count limit on an LLM Provider or Proxy. Use it to cap the total number of requests allowed within a time window, regardless of token consumption.
Built-in Rate Limiting vs. Rate Limit - Basic Policy
The AI Workspace also offers a first-class Rate Limiting tab on LLM Providers that enforces request count and token count limits at the Backend scope (Per Consumer coming soon). The Rate Limit - Basic policy is a lightweight alternative that can be attached through the Guardrails interface when you need a simple per-route request cap.
Configuration Parameters¶
| Parameter | Required | Description |
|---|---|---|
| Limits | Yes | A list of rate limit rules. Each rule specifies the maximum request count and the time window. |
| Limits[].Count | Yes | The maximum number of requests allowed within the duration. |
| Limits[].Duration | Yes | The time window for the limit (e.g., 60s, 1m, 1h). |
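The parameters above can be sketched as a small Python model (the class and field names here are illustrative, not the policy's actual schema; the supported duration units are assumed to be seconds, minutes, and hours as in the examples):

```python
import re
from dataclasses import dataclass

@dataclass
class LimitRule:
    """Hypothetical model of one entry in the Limits list."""
    count: int     # maximum requests allowed within the window
    duration: str  # time window, e.g. "60s", "1m", "1h"

    def window_seconds(self) -> int:
        """Convert a duration string like '60s', '1m', or '1h' to seconds."""
        match = re.fullmatch(r"(\d+)([smh])", self.duration)
        if not match:
            raise ValueError(f"unsupported duration: {self.duration}")
        value, unit = int(match.group(1)), match.group(2)
        return value * {"s": 1, "m": 60, "h": 3600}[unit]

# Example: at most 100 requests per minute.
rule = LimitRule(count=100, duration="1m")
print(rule.window_seconds())  # 60
```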
Add This Policy¶
- Navigate to AI Workspace > LLM Providers or LLM Proxies.
- Click on the provider or proxy name.
- Go to the Guardrails tab.
- Click + Add Guardrail and select Rate Limit - Basic from the sidebar.
- Add one or more limit rules with a Count and Duration.
- Click Add (for providers) or Submit (for proxies).
- Deploy the provider or proxy to apply the changes.
Behavior¶
- Incoming requests are counted per route or API.
- When the request count exceeds the configured limit within the time window, the gateway returns HTTP 429 Too Many Requests.
- The counter resets after the duration elapses.
- When multiple limit rules are configured, the most restrictive limit is enforced.
Related¶
- Policies Overview
- Policy Hub — Full policy specification and latest version