Rate Limiting in ASP.NET Core APIs and Its Types
A practical ASP.NET Core rate-limiting guide with architecture review, policy examples, and UML diagrams for Fixed Window, Sliding Window, Token Bucket, Concurrency Limiter, and Leaky Bucket.

Intro
Rate limiting protects your ASP.NET Core APIs from abuse, accidental traffic spikes, and expensive resource saturation. In production systems, it is not only a security control but also a reliability control.
This article reviews the five demos in the RateLimitingSuite repository:
- FixedWindowRateLimiterDemo
- SlidingWindowRateLimiterDemo
- TokenBucketRateLimiterDemo
- ConcurrencyLimiterDemo
- LeakyBucketRateLimiterDemo
The goal is to understand how each limiter works, where it fits, and what tradeoffs you should expect in a real .NET Core API.
Why Rate Limiting Matters in ASP.NET Core APIs
Without rate limiting, one noisy client can degrade performance for all users. Typical failure patterns are:
- CPU spikes from repeated expensive endpoints
- connection pool exhaustion on DB-heavy calls
- brute-force attempts on login endpoints
- retry storms from unstable clients
Rate limiting enforces fairness between clients and keeps the load on your backend within limits you control.
Base Setup in ASP.NET Core
ASP.NET Core ships built-in rate limiting middleware (in Microsoft.AspNetCore.RateLimiting, available since .NET 7), built on the limiter primitives in System.Threading.RateLimiting.
    using System.Threading.RateLimiting;

    var builder = WebApplication.CreateBuilder(args);

    builder.Services.AddAuthentication("Bearer")
        .AddJwtBearer("Bearer", options =>
        {
            // JWT configuration here
        });

    builder.Services.AddRateLimiter(options =>
    {
        options.RejectionStatusCode = StatusCodes.Status429TooManyRequests;

        // A named policy that can be applied globally or per endpoint.
        options.AddPolicy("api-policy", context =>
        {
            var userKey = context.User.Identity?.IsAuthenticated == true
                ? context.User.Identity!.Name ?? "unknown-user"
                : context.Connection.RemoteIpAddress?.ToString() ?? "anonymous";

            return RateLimitPartition.GetFixedWindowLimiter(
                partitionKey: userKey,
                factory: _ => new FixedWindowRateLimiterOptions
                {
                    PermitLimit = 10,
                    Window = TimeSpan.FromSeconds(30),
                    QueueProcessingOrder = QueueProcessingOrder.OldestFirst,
                    QueueLimit = 0,
                    AutoReplenishment = true
                });
        });
    });

    var app = builder.Build();

    app.UseAuthentication();
    app.UseRateLimiter();
    app.UseAuthorization();

    app.MapGet("/test", () => Results.Ok("allowed"))
        .RequireRateLimiting("api-policy")
        .RequireAuthorization();

    app.Run();

FixedWindowRateLimiterDemo
How Fixed Window Works
Fixed window divides time into equal windows, for example 10 requests per 30 seconds. Once the limit is hit, requests are rejected until the next window starts.
Best Use Cases for Fixed Window
- simple per-user or per-IP quotas
- predictable API monetization tiers
- low-overhead baseline protection
Important Tradeoff for Fixed Window
It can allow burst behavior near window boundaries, for example 10 requests at the end of one window and 10 more immediately after rollover.
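The boundary effect is easy to reproduce with a small model. The sketch below is a simplified stand-in, not the built-in limiter: it assumes windows aligned to multiples of 30 seconds, and the `TryAcquire` and `SendBurst` helpers are hypothetical names that take the request time as an explicit argument so the run is deterministic.

```csharp
using System;
using System.Collections.Generic;

// Simplified fixed-window model (an illustration, not the built-in limiter):
// one counter per aligned 30-second window, at most 10 permits per window.
const int PermitLimit = 10;
const int WindowSeconds = 30;
var usedPerWindow = new Dictionary<int, int>();

// Hypothetical helper: time is passed in explicitly to keep the run deterministic.
bool TryAcquire(double timeSeconds)
{
    int window = (int)(timeSeconds / WindowSeconds);
    usedPerWindow.TryGetValue(window, out int used);
    if (used >= PermitLimit)
    {
        return false; // quota for this window is exhausted
    }
    usedPerWindow[window] = used + 1;
    return true;
}

// Sends `count` requests 50 ms apart starting at `startSeconds`
// and returns how many were accepted.
int SendBurst(double startSeconds, int count)
{
    int accepted = 0;
    for (int i = 0; i < count; i++)
    {
        if (TryAcquire(startSeconds + i * 0.05))
        {
            accepted++;
        }
    }
    return accepted;
}

int acceptedBeforeBoundary = SendBurst(29.0, 12); // all 12 fall into window 0
int acceptedAfterBoundary = SendBurst(30.0, 12);  // all 12 fall into window 1

Console.WriteLine($"Accepted before boundary: {acceptedBeforeBoundary}");
Console.WriteLine($"Accepted after boundary:  {acceptedAfterBoundary}");
```

Each burst gets 10 requests through, so 20 requests pass within roughly 1.5 seconds even though the policy reads as 10 per 30 seconds.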
UML Diagram for Fixed Window
    classDiagram
        class Client
        class AspNetEndpoint
        class FixedWindowRateLimiter
        class TimeWindow
        class CounterStore
        class Response429

        Client --> AspNetEndpoint : HTTP request
        AspNetEndpoint --> FixedWindowRateLimiter : Acquire(partition)
        FixedWindowRateLimiter --> TimeWindow : CurrentWindow()
        FixedWindowRateLimiter --> CounterStore : IncrementAndCheck()
        FixedWindowRateLimiter --> AspNetEndpoint : Allow or Reject
        AspNetEndpoint --> Response429 : On reject

Minimal Policy Example for Fixed Window
    // The Add*Limiter extension methods require: using Microsoft.AspNetCore.RateLimiting;
    options.AddFixedWindowLimiter("fixed", limiterOptions =>
    {
        limiterOptions.PermitLimit = 100;
        limiterOptions.Window = TimeSpan.FromMinutes(1);
        limiterOptions.QueueLimit = 0;
        limiterOptions.AutoReplenishment = true;
    });

SlidingWindowRateLimiterDemo
How Sliding Window Works
Sliding window splits a window into segments. Instead of resetting all at once, it continuously recalculates allowed requests from recent segments, reducing boundary burst issues.
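To make the segment idea concrete, here is a minimal sketch of the bookkeeping. It is a simplified model under stated assumptions, not the built-in limiter's code: the limit is 6 permits per window split into 3 segments, time is advanced by calling a hypothetical `AdvanceSegment` helper instead of a timer, and `TryAcquire` and `SendBurst` are illustrative names.

```csharp
using System;

// Simplified sliding-window model (an illustration, not the built-in limiter).
const int PermitLimit = 6;
const int SegmentsPerWindow = 3;

int[] usedPerSegment = new int[SegmentsPerWindow]; // ring buffer of per-segment usage
int current = 0;                                   // index of the active segment
int available = PermitLimit;                       // permits left in the rolling window

bool TryAcquire()
{
    if (available == 0)
    {
        return false;
    }
    available--;
    usedPerSegment[current]++;
    return true;
}

// Moves to the next segment. The slot being reused holds the usage of the
// segment that just left the window, so those permits become available again.
void AdvanceSegment()
{
    current = (current + 1) % SegmentsPerWindow;
    available += usedPerSegment[current];
    usedPerSegment[current] = 0;
}

// Sends `count` requests in the active segment and returns how many pass.
int SendBurst(int count)
{
    int accepted = 0;
    for (int i = 0; i < count; i++)
    {
        if (TryAcquire())
        {
            accepted++;
        }
    }
    return accepted;
}

int firstSegment = SendBurst(4);   // 4 of 6 permits used
AdvanceSegment();
int secondSegment = SendBurst(4);  // only 2 permits remain
AdvanceSegment();
int thirdSegment = SendBurst(1);   // window is still full
AdvanceSegment();                  // first segment expires, its 4 permits return
int fourthSegment = SendBurst(4);

Console.WriteLine($"{firstSegment}, {secondSegment}, {thirdSegment}, {fourthSegment}");
```

Permits come back one segment at a time rather than all at once, which is what removes the full reset at a window boundary.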
Best Use Cases for Sliding Window
- public APIs where fairness matters
- endpoints sensitive to sudden edge-window spikes
- systems that need smoother throughput than fixed window
Important Tradeoff for Sliding Window
It adds slightly more tracking overhead than fixed window, because usage is recorded per segment rather than in a single counter.
UML Diagram for Sliding Window
    classDiagram
        class Client
        class AspNetEndpoint
        class SlidingWindowRateLimiter
        class SegmentRingBuffer
        class RollingCounter
        class Response429

        Client --> AspNetEndpoint : HTTP request
        AspNetEndpoint --> SlidingWindowRateLimiter : Acquire(partition)
        SlidingWindowRateLimiter --> SegmentRingBuffer : AdvanceSegment()
        SlidingWindowRateLimiter --> RollingCounter : ComputeActiveCount()
        SlidingWindowRateLimiter --> AspNetEndpoint : Allow or Reject
        AspNetEndpoint --> Response429 : On reject

Minimal Policy Example for Sliding Window
    options.AddSlidingWindowLimiter("sliding", limiterOptions =>
    {
        limiterOptions.PermitLimit = 100;
        limiterOptions.Window = TimeSpan.FromMinutes(1);
        limiterOptions.SegmentsPerWindow = 6;
        limiterOptions.QueueLimit = 0;
        limiterOptions.AutoReplenishment = true;
    });

TokenBucketRateLimiterDemo
How Token Bucket Works
Token bucket refills tokens at a configured rate, up to a maximum bucket size. Each request consumes one token. If a token is available, the request is allowed; otherwise it is rejected or queued.
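The refill-and-consume cycle can be sketched in a few lines. This is a simplified model, not the built-in limiter: it assumes a bucket of 5 tokens refilled by 2 tokens per period, rejects instead of queueing, and uses hypothetical `TryConsume`, `Replenish`, and `SendBurst` helpers with replenishment triggered by hand rather than by a timer.

```csharp
using System;

// Simplified token-bucket model (an illustration, not the built-in limiter).
const int TokenLimit = 5;       // bucket capacity, which bounds the burst size
const int TokensPerPeriod = 2;  // tokens added on each replenishment

int tokens = TokenLimit;        // the bucket starts full

bool TryConsume()
{
    if (tokens == 0)
    {
        return false; // bucket is empty: reject
    }
    tokens--;
    return true;
}

// Adds one period's worth of tokens without exceeding the capacity.
void Replenish()
{
    tokens = Math.Min(TokenLimit, tokens + TokensPerPeriod);
}

// Sends `count` requests back to back and returns how many pass.
int SendBurst(int count)
{
    int accepted = 0;
    for (int i = 0; i < count; i++)
    {
        if (TryConsume())
        {
            accepted++;
        }
    }
    return accepted;
}

int initialBurst = SendBurst(7);   // the full bucket absorbs 5 of 7 requests
Replenish();
int afterOneRefill = SendBurst(3); // only the 2 refilled tokens are available

for (int i = 0; i < 10; i++)
{
    Replenish();                   // a long idle period never overfills the bucket
}
int tokensAfterIdle = tokens;

Console.WriteLine($"{initialBurst}, {afterOneRefill}, {tokensAfterIdle}");
```

The capacity sets how large a burst can be, while the refill amount and period set the sustained rate once the bucket has been drained.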
Best Use Cases for Token Bucket
- APIs that must allow short bursts
- mobile or IoT clients with uneven traffic patterns
- internal APIs with occasional fan-out bursts
Important Tradeoff for Token Bucket
If refill rate and bucket size are too high, backend pressure can still become significant.
UML Diagram for Token Bucket
    classDiagram
        class Client
        class AspNetEndpoint
        class TokenBucketRateLimiter
        class TokenPool
        class RefillScheduler
        class Response429

        Client --> AspNetEndpoint : HTTP request
        AspNetEndpoint --> TokenBucketRateLimiter : Acquire(partition)
        TokenBucketRateLimiter --> TokenPool : ConsumeToken()
        RefillScheduler --> TokenPool : Replenish()
        TokenBucketRateLimiter --> AspNetEndpoint : Allow or Reject
        AspNetEndpoint --> Response429 : On reject

Minimal Policy Example for Token Bucket
    options.AddTokenBucketLimiter("token", limiterOptions =>
    {
        limiterOptions.TokenLimit = 200;
        limiterOptions.TokensPerPeriod = 20;
        limiterOptions.ReplenishmentPeriod = TimeSpan.FromSeconds(5);
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
        limiterOptions.QueueLimit = 0;
        limiterOptions.AutoReplenishment = true;
    });

ConcurrencyLimiterDemo
How Concurrency Limiter Works
Concurrency limiter controls simultaneous in-flight requests, not requests per second. It protects CPU, thread pool, and downstream resources by capping active operations.
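The lease model behind this can be seen by using `ConcurrencyLimiter` from System.Threading.RateLimiting directly, outside the middleware. The sketch below assumes a limit of 2 in-flight operations and no queue; in a project that is not an ASP.NET Core app, the System.Threading.RateLimiting NuGet package would need to be referenced.

```csharp
using System;
using System.Threading.RateLimiting;

// At most 2 operations may hold a lease at the same time; nothing is queued.
using var limiter = new ConcurrencyLimiter(new ConcurrencyLimiterOptions
{
    PermitLimit = 2,
    QueueLimit = 0,
    QueueProcessingOrder = QueueProcessingOrder.OldestFirst
});

RateLimitLease first = limiter.AttemptAcquire();
RateLimitLease second = limiter.AttemptAcquire();
RateLimitLease third = limiter.AttemptAcquire(); // both slots are busy

bool firstAcquired = first.IsAcquired;
bool secondAcquired = second.IsAcquired;
bool thirdAcquired = third.IsAcquired;

// Disposing a lease marks the operation as finished and frees its slot.
first.Dispose();

RateLimitLease fourth = limiter.AttemptAcquire();
bool fourthAcquired = fourth.IsAcquired;

Console.WriteLine($"{firstAcquired}, {secondAcquired}, {thirdAcquired}, {fourthAcquired}");

// Release the remaining leases.
second.Dispose();
third.Dispose();
fourth.Dispose();
```

No clock is involved: a slot becomes available only when a lease is disposed, which is why this limiter bounds in-flight work rather than requests per second.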
Best Use Cases for Concurrency Limiter
- expensive report generation
- DB-heavy write endpoints
- external API fan-out operations
Important Tradeoff for Concurrency Limiter
It does not directly control request rate over time. A rapid stream of short requests can still add up to a high total request count.
UML Diagram for Concurrency Limiter
    classDiagram
        class Client
        class AspNetEndpoint
        class ConcurrencyLimiter
        class ActiveLeasePool
        class RequestQueue
        class Response429

        Client --> AspNetEndpoint : HTTP request
        AspNetEndpoint --> ConcurrencyLimiter : AcquireLease()
        ConcurrencyLimiter --> ActiveLeasePool : ReserveSlot()
        ConcurrencyLimiter --> RequestQueue : OptionalQueue()
        ConcurrencyLimiter --> AspNetEndpoint : Allow or Reject
        AspNetEndpoint --> Response429 : On reject

Minimal Policy Example for Concurrency Limiter
    options.AddConcurrencyLimiter("concurrency", limiterOptions =>
    {
        limiterOptions.PermitLimit = 20;
        limiterOptions.QueueLimit = 40;
        limiterOptions.QueueProcessingOrder = QueueProcessingOrder.OldestFirst;
    });

LeakyBucketRateLimiterDemo
How Leaky Bucket Works
Leaky bucket enforces a near-constant outflow rate. Incoming requests may be queued and are then processed at a steady drain speed.
In practice, ASP.NET Core ships built-in fixed/sliding/token/concurrency limiters. Leaky bucket behavior is typically implemented as a custom queue + background drain pattern.
Best Use Cases for Leaky Bucket
- endpoints that must emit stable, smooth traffic to downstream systems
- integrations where burst traffic causes contractual or infrastructure issues
Important Tradeoff for Leaky Bucket
Higher request latency due to queueing. Queue sizing and timeout strategy are critical.
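On the admission side, a bounded channel is one way to keep the queue size explicit. The sketch below is an assumption about how such a queue could be wired, not the demo's actual code: the capacity of 2 is arbitrary, `TryEnqueue` is a hypothetical helper, and timeouts for items that wait too long are not covered.

```csharp
using System;
using System.Threading;
using System.Threading.Channels;
using System.Threading.Tasks;

// Bounded queue: the capacity caps how much waiting work (and latency) can build up.
var queue = Channel.CreateBounded<Func<CancellationToken, Task>>(
    new BoundedChannelOptions(capacity: 2)
    {
        // In Wait mode, TryWrite returns false instead of dropping items when the queue is full.
        FullMode = BoundedChannelFullMode.Wait
    });

// Hypothetical admission helper: true means the work was accepted for later draining,
// false means the caller should be rejected (for example with a 429 response).
bool TryEnqueue(Func<CancellationToken, Task> workItem) => queue.Writer.TryWrite(workItem);

Func<CancellationToken, Task> noOp = _ => Task.CompletedTask;

bool firstAccepted = TryEnqueue(noOp);
bool secondAccepted = TryEnqueue(noOp);
bool thirdAccepted = TryEnqueue(noOp);  // queue is full

// The drain worker takes one item off the queue, which frees a slot.
bool drainedOne = queue.Reader.TryRead(out _);
bool fourthAccepted = TryEnqueue(noOp);

Console.WriteLine($"{firstAccepted}, {secondAccepted}, {thirdAccepted}, {fourthAccepted}");
```

With a fixed drain interval, the worst-case wait for an accepted item is roughly the capacity multiplied by that interval, so the two values are best chosen together.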
UML Diagram for Leaky Bucket
    classDiagram
        class Client
        class AspNetEndpoint
        class InboundQueue
        class LeakyBucketWorker
        class ConstantDrainTimer
        class Response429

        Client --> AspNetEndpoint : HTTP request
        AspNetEndpoint --> InboundQueue : Enqueue()
        AspNetEndpoint --> Response429 : Reject when queue full
        ConstantDrainTimer --> LeakyBucketWorker : Tick()
        LeakyBucketWorker --> InboundQueue : DequeueAtFixedRate()

Conceptual Implementation Sketch
    using System.Threading.Channels;
    using Microsoft.Extensions.Hosting;

    public sealed class LeakyBucketProcessor : BackgroundService
    {
        private readonly Channel<Func<CancellationToken, Task>> _queue;

        // One work item is drained per interval, so 100 ms means at most 10 items per second.
        private readonly TimeSpan _drainInterval = TimeSpan.FromMilliseconds(100);

        public LeakyBucketProcessor(Channel<Func<CancellationToken, Task>> queue) => _queue = queue;

        protected override async Task ExecuteAsync(CancellationToken stoppingToken)
        {
            using var timer = new PeriodicTimer(_drainInterval);

            // Each tick drains at most one queued item; an empty queue simply skips the tick.
            while (await timer.WaitForNextTickAsync(stoppingToken))
            {
                if (_queue.Reader.TryRead(out var workItem))
                {
                    await workItem(stoppingToken);
                }
            }
        }
    }

Side-by-Side Review
| Limiter | Strength | Limitation | Typical API Scenario |
|---|---|---|---|
| Fixed Window | Very simple and predictable | Boundary burst | Basic per-user quota |
| Sliding Window | Better fairness near boundaries | More bookkeeping | Public API fairness |
| Token Bucket | Burst-friendly plus long-term cap | Needs careful refill tuning | Bursty client traffic |
| Concurrency | Protects expensive in-flight workload | Not a time-rate limiter | CPU/DB intensive endpoints |
| Leaky Bucket | Smooth, constant processing rate | Added latency and queue management | Downstream systems needing steady flow |
Which One Should You Use
A pragmatic production pattern for .NET Core APIs:
- Use FixedWindow or SlidingWindow for login/auth endpoints.
- Use TokenBucket for business endpoints that require burst tolerance.
- Add ConcurrencyLimiter on expensive operations.
- Use the LeakyBucket pattern when downstream systems require smooth, constant request flow.
For authenticated APIs, partition by stable identity (for example user ID claim). For anonymous routes, partition by IP or fingerprint.
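A partition key resolver along those lines might look like the sketch below. `ResolvePartitionKey` is a hypothetical helper, and the choice of `ClaimTypes.NameIdentifier` as the user ID claim is an assumption; the claim that actually carries a stable ID depends on your token configuration. The `user:` and `ip:` prefixes keep the two key spaces from colliding.

```csharp
using System;
using System.Net;
using System.Security.Claims;

// Hypothetical helper: derives a rate-limit partition key for the caller.
string ResolvePartitionKey(ClaimsPrincipal user, IPAddress? remoteIp)
{
    if (user.Identity?.IsAuthenticated == true)
    {
        // Assumption: the stable user ID is carried in the NameIdentifier claim.
        string? userId = user.FindFirst(ClaimTypes.NameIdentifier)?.Value;
        if (!string.IsNullOrEmpty(userId))
        {
            return $"user:{userId}";
        }
    }

    // Anonymous callers, or authenticated callers without an ID claim, fall back to the IP.
    return remoteIp is null ? "ip:unknown" : $"ip:{remoteIp}";
}

var authenticated = new ClaimsPrincipal(new ClaimsIdentity(
    new[] { new Claim(ClaimTypes.NameIdentifier, "42") },
    authenticationType: "Bearer"));
var anonymous = new ClaimsPrincipal(new ClaimsIdentity());
IPAddress clientIp = IPAddress.Parse("203.0.113.7");

string userKey = ResolvePartitionKey(authenticated, clientIp);
string ipKey = ResolvePartitionKey(anonymous, clientIp);
string unknownKey = ResolvePartitionKey(anonymous, null);

Console.WriteLine($"{userKey}, {ipKey}, {unknownKey}");
```

Inside a policy, the same function could be called with `context.User` and `context.Connection.RemoteIpAddress`. Behind a reverse proxy the remote address is often the proxy's, so the IP fallback only helps when forwarded headers are handled correctly.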
Final Takeaway
Rate limiting in ASP.NET Core is most effective when treated as layered traffic governance, not a single switch.
Use the built-in middleware as your first line of protection, then compose limiter strategies per endpoint risk profile. The five demos in RateLimitingSuite are a practical baseline for implementing that approach in real-world APIs.
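One way to layer limiters is `PartitionedRateLimiter.CreateChained`, which grants a request only if every limiter in the chain does. The sketch below is a console-style illustration with assumed values (2 requests per minute per caller, 3 in-flight requests overall) and a plain string as the resource. In an ASP.NET Core app the resource type would be `HttpContext` and the chained limiter would typically be assigned to `options.GlobalLimiter`. Outside an ASP.NET Core project, the System.Threading.RateLimiting NuGet package is assumed to be referenced.

```csharp
using System;
using System.Threading.RateLimiting;

// Layer 1: per-caller quota of 2 requests per minute.
PartitionedRateLimiter<string> perCaller = PartitionedRateLimiter.Create<string, string>(caller =>
    RateLimitPartition.GetFixedWindowLimiter(caller, _ => new FixedWindowRateLimiterOptions
    {
        PermitLimit = 2,
        Window = TimeSpan.FromMinutes(1),
        QueueLimit = 0
    }));

// Layer 2: shared cap of 3 in-flight requests across all callers.
PartitionedRateLimiter<string> shared = PartitionedRateLimiter.Create<string, string>(caller =>
    RateLimitPartition.GetConcurrencyLimiter("shared", _ => new ConcurrencyLimiterOptions
    {
        PermitLimit = 3,
        QueueLimit = 0
    }));

// A request passes only if both layers grant a lease.
using PartitionedRateLimiter<string> chained = PartitionedRateLimiter.CreateChained(perCaller, shared);

RateLimitLease alice1 = chained.AttemptAcquire("alice");
RateLimitLease alice2 = chained.AttemptAcquire("alice");
RateLimitLease alice3 = chained.AttemptAcquire("alice"); // per-caller quota exhausted
RateLimitLease bob1 = chained.AttemptAcquire("bob");     // takes the third shared slot
RateLimitLease bob2 = chained.AttemptAcquire("bob");     // shared concurrency cap reached

bool alice3Acquired = alice3.IsAcquired;
bool bob1Acquired = bob1.IsAcquired;
bool bob2Acquired = bob2.IsAcquired;

alice1.Dispose(); // a finished request frees one shared slot
RateLimitLease carol1 = chained.AttemptAcquire("carol");
bool carol1Acquired = carol1.IsAcquired;

Console.WriteLine($"{alice3Acquired}, {bob1Acquired}, {bob2Acquired}, {carol1Acquired}");
```

Note that a rejection by a later layer does not refund permits already counted by an earlier window-based layer, so the order of the chain affects how quotas are consumed.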




