Implementing Rate Limiting in Chalice

Chalice (AWS's Python serverless micro-framework) does not offer built-in support for throttling/rate limiting. In this post I will show you how to implement your own custom rate limiting capabilities for your serverless API.

Rate Limiting Algorithms

There are several well-known rate limiting algorithms, each with its own pros and cons. For this post I settled on the Generic Cell Rate Algorithm (GCRA). Below is a basic implementation in Python using Redis:

# chalicelib/
import logging
from datetime import timedelta

import redis

logger = logging.getLogger(__name__)


class RateLimiter:
    """Manages rate limiting for a specific user across the API.

    Rate limiting is applied using Generic Cell Rate Algorithm (GCRA).
    GCRA works by tracking remaining limit through a time called
    Theoretical Arrival Time (TAT) on each request. Subsequent
    requests are limited if the arrival time is less than the
    stored TAT.
    """

    def __init__(self, user_id: int, *, limit: int, period: timedelta):
        self.user_id = user_id
        self.limit = limit
        self.period = period
        self.period_in_seconds = int(period.total_seconds())

        self.redis_client = redis.Redis("your-redis-host-address")
        self.redis_key = f"throttling-{user_id}-{self.period_in_seconds}"

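    def request_is_limited(self) -> bool:
        """Determines if an incoming request should be allowed for a user.

        Returns:
            bool: True if the user has exceeded the allowed limit, False
            otherwise.
        """
        # Use the Redis server clock so every Lambda instance agrees on "now".
        time: int = self.redis_client.time()[0]

        # Minimum spacing (in seconds) between requests at the sustained rate.
        separation = round(self.period_in_seconds / self.limit)
        # Initialise the stored TAT on the user's first request.
        self.redis_client.setnx(self.redis_key, 0)

        try:
            with self.redis_client.lock("lock:" + self.redis_key, blocking_timeout=5):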
                redis_val = self.redis_client.get(self.redis_key)
                assert redis_val is not None

                tat: int = max(int(redis_val), time)

                if tat - time <= self.period_in_seconds - separation:
                    new_tat = max(tat, time) + separation
                    self.redis_client.set(self.redis_key, new_tat)
                    return False

                return True
        except redis.exceptions.LockError as ex:
            # Fail closed: treat the request as limited if the lock cannot be acquired.
            logger.warning("Redis lock error: %s", ex)
            return True

This rate limiter class is keyed on a unique user ID, which is usually persisted in a database and available on each request after successful authentication with the API. You can adjust this based on your requirements.
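
To see the GCRA arithmetic without Redis, here is a minimal in-memory sketch. It uses the same comparison as the class above but takes the current time as an argument. The `InMemoryGCRA` name and the example values are assumptions for illustration only, and the sketch handles a single user in a single process.

```python
from datetime import timedelta


class InMemoryGCRA:
    """GCRA for a single key, with the clock passed in explicitly."""

    def __init__(self, *, limit: int, period: timedelta):
        self.period_in_seconds = int(period.total_seconds())
        # Minimum spacing (in seconds) between requests at the sustained rate.
        self.separation = round(self.period_in_seconds / limit)
        # Theoretical Arrival Time, in seconds; 0 means "no request seen yet".
        self.tat = 0

    def request_is_limited(self, now: int) -> bool:
        tat = max(self.tat, now)
        # Allow the request only if the TAT has not run too far ahead of "now".
        if tat - now <= self.period_in_seconds - self.separation:
            self.tat = tat + self.separation
            return False
        return True


# Hypothetical configuration: 5 requests per 10 seconds.
limiter = InMemoryGCRA(limit=5, period=timedelta(seconds=10))

# Six requests arriving in the same second: the sixth exceeds the burst.
burst = [limiter.request_is_limited(now=100) for _ in range(6)]

# Two seconds later one "cell" has drained, so a request is allowed again.
later = limiter.request_is_limited(now=102)
```

With these values `separation` is 2 seconds, so each allowed request pushes the TAT forward by 2 and a burst of 5 fits into the 10 second window. One thing to watch for with whole-second arithmetic: when `limit` is much larger than the period in seconds, `round(period / limit)` becomes 0 and no request is ever limited.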

Rate Limiting as Middleware

Let's take a look at how we can apply our rate limiter as middleware. This means the rate limiting configuration is applied to every request on every endpoint.

Let's register our middleware:

from datetime import timedelta

from chalice import TooManyRequestsError

from chalicelib.throttling import RateLimiter


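# Register this function on your Chalice app, e.g. in app.py:
#     app.register_middleware(rate_limiting_middleware, "http")
def rate_limiting_middleware(event, get_response):
    # For HTTP events, `event` is the current Chalice request.
    user_id = event.context.get("authorizer", {}).get("principalId")

    if user_id:
        # Example values (assumption): 100 requests per hour. Adjust as needed.
        limiter = RateLimiter(user_id, limit=100, period=timedelta(hours=1))

        if limiter.request_is_limited():
            raise TooManyRequestsError(
                "You have exceeded the maximum number of requests"
            )

    response = get_response(event)
    return response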

This middleware will only work if there is a current user set in the request, meaning that the API client user has already authenticated with the API.
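
As a quick illustration of that condition, the lookup can be tried against plain dictionaries. The `get_user_id` helper and the two sample contexts below are hypothetical; they only mimic the shape of the request context used in the middleware.

```python
from typing import Optional


def get_user_id(context: dict) -> Optional[str]:
    """Return the authorizer's principal ID, or None if the request is anonymous."""
    # `or {}` also covers a context where "authorizer" is present but None.
    return (context.get("authorizer") or {}).get("principalId")


# Hypothetical request contexts, for illustration only.
authenticated = {"authorizer": {"principalId": "user-42"}}
anonymous = {}

authenticated_id = get_user_id(authenticated)
anonymous_id = get_user_id(anonymous)
```

When the lookup returns `None`, the `if user_id:` branch is skipped, so unauthenticated requests pass through without being rate limited.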

You can adjust the limit of requests as well as the time period in the limiter instance. You can even create more rate limiter instances to apply multiple rate limiting configurations (per hour, per day, etc.).
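
One way to combine several configurations is to loop over them and stop at the first limit that is exceeded. The sketch below is only an assumption about how this could be structured: `RATE_LIMITS`, `request_is_limited_by_any` and the `make_limiter` parameter are hypothetical names, and the limit values are examples.

```python
from datetime import timedelta

# Hypothetical configuration: (limit, period) pairs with example values.
RATE_LIMITS = [
    (100, timedelta(hours=1)),
    (1000, timedelta(days=1)),
]


def request_is_limited_by_any(user_id, make_limiter, rate_limits=RATE_LIMITS) -> bool:
    """Return True as soon as one of the configured limits is exceeded.

    `make_limiter` is any callable with the same signature as the
    RateLimiter constructor: make_limiter(user_id, limit=..., period=...).
    """
    for limit, period in rate_limits:
        limiter = make_limiter(user_id, limit=limit, period=period)
        if limiter.request_is_limited():
            return True
    return False
```

In the middleware this would be called as `request_is_limited_by_any(user_id, RateLimiter)`. Because the Redis key includes the period in seconds, each configuration keeps its own counter. Note that a request rejected by a later limit has already been counted against the earlier ones.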

Rate Limiting as a Decorator

We can also implement a decorator approach where we manually select and decorate the routes we want to rate limit. To do this we need a decorator that takes the Chalice route/blueprint reference as an argument. Here is an example of what this looks like:

from chalice import Blueprint

my_routes = Blueprint(__name__)

@my_routes.route("/hello", methods=["GET"], authorizer=some_auth)
@throttle(route=my_routes)
def index():
    return "foobar"

Here's how we can implement this decorator:

from datetime import timedelta
from functools import wraps

from chalice import Blueprint, TooManyRequestsError

from chalicelib.throttling import RateLimiter


def throttle(*, route: Blueprint):
    """Route decorator for applying
    throttling/rate limiting to endpoints.

    This decorator needs to be placed below the Chalice
    `route` registration decorator. The blueprint
    reference needs to be passed as an argument to
    the decorator.
    """

    def decorator(func):
        @wraps(func)
        def decorated_function(*args, **kwargs):
            request = route.current_request
            user_id = request.context.get("authorizer", {}).get("principalId")

            if user_id:
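                # Example values (assumption): 100 requests per hour. Adjust as needed.
                limiter = RateLimiter(user_id, limit=100, period=timedelta(hours=1))
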
                if limiter.request_is_limited():
                    raise TooManyRequestsError("You have exceeded the maximum number of requests")

            return func(*args, **kwargs)

        return decorated_function

    return decorator



