杰瑞科技汇

How do you implement rate limiting in Python with a ratelimiter?

Here's a comprehensive guide to rate limiting in Python, covering the concepts, popular libraries, and code examples.


What is Rate Limiting?

Rate limiting is a technique used to control the rate at which a user or client can make requests to a server or API. It's a crucial mechanism for:

  • Preventing Abuse: Stopping bots or malicious actors from overwhelming your service.
  • Ensuring Fairness: Guaranteeing that all users have equitable access to a resource.
  • Managing Load: Protecting your backend from being overloaded, which can cause slowdowns or crashes.
  • Cost Control: For third-party APIs, it helps you stay within your usage quota and avoid unexpected charges.

Core Concepts

Before diving into code, let's understand the main types of rate limiters:

  1. Fixed Window Counter: This is the simplest approach. You define a window (e.g., 1 minute) and a limit (e.g., 100 requests). The counter resets at the start of each new window.

    • Pros: Simple to implement.
    • Cons: "Burstiness" problem. If a user makes 100 requests in the last second of a window, they can immediately make another 100 in the first second of the next window.
  2. Sliding Window Log: This approach is more precise. For each request, you store its timestamp. When a new request comes in, you remove all timestamps older than your time window (e.g., 1 minute). You then check if the number of remaining timestamps is under the limit.

    • Pros: Prevents the burstiness problem.
    • Cons: Can be memory-intensive, as you need to store timestamps for every request in the current window.
  3. Token Bucket: This is a popular model (sometimes conflated with the sliding window counter, but it is a distinct algorithm). Imagine a "bucket" that holds tokens. Tokens are added to the bucket at a fixed rate (e.g., 10 tokens per second). Each request consumes one token. If the bucket is empty, the request is denied.

    • Pros: Smooths out request rates and is very efficient. It allows for bursts of traffic (if the bucket has tokens) but enforces a long-term average rate.
    • Cons: Can be slightly more complex to implement from scratch.
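As a concrete illustration, the Sliding Window Log can be implemented in a few lines with collections.deque. This is a minimal, single-process sketch; the class name and the injectable clock parameter are choices made here for clarity and testability, not from any particular library:

```python
import time
from collections import deque

class SlidingWindowLog:
    """Allow at most `limit` requests in any trailing `window` seconds."""
    def __init__(self, limit, window, clock=time.monotonic):
        self.limit = limit
        self.window = window
        self.clock = clock      # injectable for deterministic testing
        self.log = deque()      # timestamps of accepted requests

    def allow(self):
        now = self.clock()
        # Evict timestamps that have fallen out of the window
        while self.log and self.log[0] <= now - self.window:
            self.log.popleft()
        if len(self.log) < self.limit:
            self.log.append(now)
            return True
        return False
```

Memory use grows with the number of requests held in the window, which is exactly the trade-off noted above.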

Method 1: The Manual Approach (Token Bucket)

Understanding how to build a rate limiter yourself is a great learning experience. Here's a simple, thread-safe implementation of a Token Bucket limiter using time and threading.Lock.

import time
import threading
class TokenBucket:
    def __init__(self, rate, capacity):
        """
        Initializes the Token Bucket.
        :param rate: The rate at which tokens are added (tokens per second).
        :param capacity: The maximum number of tokens the bucket can hold.
        """
        self.rate = rate  # tokens per second
        self.capacity = capacity
        self.tokens = capacity
        self.last_refilled = time.monotonic()
        self.lock = threading.Lock()
    def _refill(self):
        """Refills the bucket based on elapsed time. Caller must hold the lock."""
        now = time.monotonic()
        time_passed = now - self.last_refilled
        # Add tokens, but don't exceed capacity
        self.tokens = min(self.capacity, self.tokens + time_passed * self.rate)
        self.last_refilled = now
    def consume(self, tokens=1):
        """
        Consumes a number of tokens from the bucket.
        :param tokens: The number of tokens to consume.
        :return: True if tokens were consumed, False otherwise.
        """
        with self.lock:
            # Refill and check under one lock, so two threads can't
            # double-count the same elapsed time or over-consume tokens.
            self._refill()
            if self.tokens >= tokens:
                self.tokens -= tokens
                return True
            return False
# --- Example Usage ---
if __name__ == "__main__":
    # Allow 10 requests per second, with a burst capacity of 10
    limiter = TokenBucket(rate=10, capacity=10)
    for i in range(15):
        if limiter.consume(1):
            print(f"Request {i+1}: Allowed")
        else:
            print(f"Request {i+1}: Rate Limited! Waiting...")
            # In a real app, you might wait and try again
            time.sleep(0.1) # Simulate waiting a bit
        time.sleep(0.05) # Simulate a small delay between requests
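If you would rather block until a token is available than reject the request, a small polling wrapper can be layered on top. This is a hypothetical helper, not part of the class above; it works with any object exposing a consume() method:

```python
import time

def acquire_blocking(bucket, tokens=1, poll_interval=0.01, timeout=None):
    """Block until bucket.consume(tokens) succeeds, or until `timeout`
    seconds elapse (returns False in that case)."""
    deadline = None if timeout is None else time.monotonic() + timeout
    while not bucket.consume(tokens):
        if deadline is not None and time.monotonic() >= deadline:
            return False            # gave up waiting
        time.sleep(poll_interval)   # back off before retrying
    return True
```

Polling trades a little latency (up to poll_interval) for simplicity; a condition variable would avoid the busy wait but complicates the bucket itself.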

Method 2: Using Popular Libraries

For production code, it's almost always better to use a well-tested, feature-rich library.

ratelimit

A simple and popular decorator-based library.

Installation:

pip install ratelimit

Example:

from ratelimit import limits, sleep_and_retry
import time
# Define the rate limit: 10 calls per 10 seconds
CALLS_PER_10_SECONDS = 10
@sleep_and_retry
@limits(calls=CALLS_PER_10_SECONDS, period=10)
def limited_api_call():
    print("API call successful at:", time.time())
    return "data"
if __name__ == "__main__":
    print("Starting 15 rapid calls...")
    for i in range(15):
        limited_api_call()
        # The @sleep_and_retry decorator will automatically pause
        # if the limit is reached.

pyrate-limiter

A more powerful and flexible library that supports different rate-limiting algorithms (Fixed Window, Sliding Window, etc.) and can be used as a decorator or a direct class.

Installation:

pip install pyrate-limiter

Example (Decorator, pyrate-limiter 2.x API):

from pyrate_limiter import Limiter, Duration, RequestRate, BucketFullException
import time
# Define the rate limit: 10 calls per 10 seconds
limiter = Limiter(RequestRate(10, Duration.SECOND * 10))
@limiter.ratelimit("api_call")
def api_call():
    print("API call successful at:", time.time())
    return "data"
if __name__ == "__main__":
    print("Starting 15 rapid calls...")
    for i in range(15):
        try:
            api_call()
        except BucketFullException:
            print(f"Call {i+1} was rate-limited.")
            # The decorator raises BucketFullException by default;
            # pass delay=True to have it wait instead of raising.

fastapi-limiter

If you are using the FastAPI web framework, this is the go-to library. It integrates seamlessly with FastAPI's dependency injection system.

Installation:

pip install fastapi-limiter

Example (FastAPI Application):

from fastapi import Depends, FastAPI
from fastapi_limiter import FastAPILimiter
from fastapi_limiter.depends import RateLimiter
import redis.asyncio as redis
import uvicorn
app = FastAPI()
@app.on_event("startup")
async def startup():
    # fastapi-limiter requires a Redis backend; pass it a redis-py asyncio client
    connection = redis.from_url("redis://localhost", encoding="utf-8", decode_responses=True)
    await FastAPILimiter.init(connection)
@app.on_event("shutdown")
async def shutdown():
    await FastAPILimiter.close()
@app.get("/public")
def public_endpoint():
    return {"message": "This is a public endpoint, no rate limit."}
# RateLimiter is a FastAPI dependency, not a decorator: allow 5 calls per 10 seconds
@app.get("/limited", dependencies=[Depends(RateLimiter(times=5, seconds=10))])
async def limited_endpoint():
    return {"message": "This is a rate-limited endpoint."}
if __name__ == "__main__":
    uvicorn.run("your_file_name:app", host="0.0.0.0", port=8000)

Method 3: Using a Dedicated Service (Redis)

For distributed systems (e.g., multiple servers or containers), an in-memory limiter won't work. You need a shared, centralized store. Redis is the perfect tool for this.

Redis's EVAL command runs Lua scripts atomically on the server (exposed in the redis library as r.eval). This makes it the most reliable way to implement a rate limiter in Redis: the check and the update happen as a single operation.

Installation:

pip install redis

Concept: We'll use the Sorted Set commands ZADD, ZREMRANGEBYSCORE, and ZCARD.

  • The Sorted Set will store request timestamps as members.
  • The score will be the timestamp itself.
  • We'll remove all timestamps older than our time window.
  • We'll check the number of remaining items in the set.

Python Code:

import redis
import time
# Connect to your Redis server
r = redis.Redis(host='localhost', port=6379, db=0)
def is_rate_allowed(key, limit, window_seconds):
    """
    Checks if a request is allowed based on a rate limit.
    :param key: A unique key for the user or client (e.g., user_id:ip).
    :param limit: The maximum number of allowed requests.
    :param window_seconds: The size of the time window in seconds.
    :return: True if allowed, False if rate limited.
    """
    # Lua script for atomic check-and-set
    lua_script = """
    local key = KEYS[1]
    local limit = tonumber(ARGV[1])
    local window = tonumber(ARGV[2])
    local now = tonumber(ARGV[3])
    -- Remove all timestamps older than the window
    redis.call('ZREMRANGEBYSCORE', key, 0, now - window)
    -- Get the current count
    local current = redis.call('ZCARD', key)
    -- Check if we can add the new request
    if current < limit then
        redis.call('ZADD', key, now, now)
        redis.call('EXPIRE', key, window)
        return 1
    else
        return 0
    end
    """
    now = time.time()
    # Execute the script atomically; the 1 is the number of KEYS
    allowed = r.eval(lua_script, 1, key, limit, window_seconds, now)
    return bool(allowed)
# --- Example Usage ---
USER_KEY = "user:123:rate_limit"
LIMIT = 5
WINDOW = 10 # seconds
for i in range(7):
    if is_rate_allowed(USER_KEY, LIMIT, WINDOW):
        print(f"Request {i+1}: Allowed")
    else:
        print(f"Request {i+1}: Rate Limited!")
    time.sleep(1)
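If the sorted set feels like more machinery than you need, a fixed window counter can be built from just INCR and EXPIRE. A minimal sketch (the function name is illustrative; `client` is any redis-py-compatible client). It inherits the burstiness problem described earlier, and the INCR/EXPIRE pair is not atomic, so wrap it in a Lua script if a missed expiry would matter:

```python
def fixed_window_allowed(client, key, limit, window_seconds):
    """Fixed-window limiter: at most `limit` requests per window.

    `client` is any object with redis-py style incr()/expire() methods.
    """
    current = client.incr(key)  # bump this window's counter (atomic in Redis)
    if current == 1:
        # First request of a fresh window: start the expiry clock
        client.expire(key, window_seconds)
    return current <= limit
```

With the real client from the example above, you would call it as fixed_window_allowed(r, "user:123:fw", 5, 10).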

Summary: Which One to Choose?

| Method | Best For | Pros | Cons |
|---|---|---|---|
| Manual (Token Bucket) | Learning, simple scripts, single-process apps | Full control, no external dependencies | Easy to get wrong (thread-safety), not scalable |
| ratelimit | Simple scripts, functions, or methods | Very easy to use with decorators | Less flexible, not designed for distributed systems |
| pyrate-limiter | General-purpose, more complex applications | Flexible algorithms, usable as a class or decorator | Slightly more complex setup than ratelimit |
| fastapi-limiter | FastAPI applications | Seamless integration, production-ready (Redis support) | Only for FastAPI |
| Redis | Distributed systems, microservices, high-traffic apps | Highly scalable, fast, atomic operations | Requires a Redis server, more complex setup |