Implementing Smart Rate Limiting in Spring Boot
A practical guide to building a system-aware rate limiter with Bucket4j
Introduction
Protecting your API from abuse is crucial, and rate limiting is a key part of that defense: it prevents denial-of-service attacks, manages resources, and ensures fair usage among clients. Together, Spring Boot 3 and Bucket4j provide a robust, flexible way to add rate limiting to your applications.
In this article, we will explore how to implement rate limiting with Bucket4j in a Spring Boot 3 application. We will cover several approaches and provide practical examples that you can adapt to your needs.
Prerequisites
Before you start, ensure you have:
Java 17 or higher.
Basic understanding of Java, Spring Boot, and API development.
Implementation
The first step is to add the required dependencies to your pom.xml or build.gradle.
<dependency>
<groupId>com.bucket4j</groupId>
<artifactId>bucket4j-core</artifactId>
<version>8.3.0</version>
</dependency>
<dependency>
<groupId>com.bucket4j</groupId>
<artifactId>bucket4j-caffeine</artifactId>
<version>8.3.0</version>
</dependency>
<dependency>
<groupId>com.github.ben-manes.caffeine</groupId>
<artifactId>caffeine</artifactId>
<version>3.1.8</version>
</dependency>
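If you use Gradle, the same coordinates in build.gradle look like this:
implementation 'com.bucket4j:bucket4j-core:8.3.0'
implementation 'com.bucket4j:bucket4j-caffeine:8.3.0'
implementation 'com.github.ben-manes.caffeine:caffeine:3.1.8'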
We are not going to rush toward final code; instead, we will build rate limiting step by step. Let’s start by creating a basic REST controller.
package me.vrnsky.ratelimitingexamples.controller;
import org.springframework.web.bind.annotation.GetMapping;
import org.springframework.web.bind.annotation.RequestMapping;
import org.springframework.web.bind.annotation.RestController;
@RestController
@RequestMapping("/api")
public class RateLimitedController {
@GetMapping("/greeting")
public String getGreeting() {
return "Hello, World!";
}
}
Then we need to configure our rate limiting.
package me.vrnsky.ratelimitingexamples.config;
import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;
import io.github.bucket4j.Refill;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import java.time.Duration;
@Configuration
public class RateLimitConfig {
@Bean
public Bucket createNewBucket() {
long overdraft = 50;
Refill refill = Refill.intervally(40, Duration.ofMinutes(1));
Bandwidth limit = Bandwidth.classic(overdraft, refill);
return Bucket.builder()
.addLimit(limit)
.build();
}
}
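The bean above defines a classic token bucket: a capacity of 50 tokens (the overdraft) that is refilled with 40 tokens every minute. A throwaway sketch, separate from the application code, illustrates the behaviour:
// Standalone illustration: the first 50 calls succeed immediately,
// further calls are rejected until the next refill adds 40 tokens.
Bucket bucket = Bucket.builder()
        .addLimit(Bandwidth.classic(50, Refill.intervally(40, Duration.ofMinutes(1))))
        .build();
for (int i = 0; i < 55; i++) {
    System.out.println("request " + (i + 1) + " allowed: " + bucket.tryConsume(1));
}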
Now it’s time to set up a rate-limiting interceptor.
package me.vrnsky.ratelimitingexamples.interceptor;
import io.github.bucket4j.Bucket;
import io.github.bucket4j.ConsumptionProbe;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import lombok.RequiredArgsConstructor;
import org.springframework.http.HttpStatus;
import org.springframework.stereotype.Component;
import org.springframework.web.servlet.HandlerInterceptor;
@Component
@RequiredArgsConstructor
public class RateLimitInterceptor implements HandlerInterceptor {
private final Bucket bucket;
@Override
public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) throws Exception {
ConsumptionProbe probe = bucket.tryConsumeAndReturnRemaining(1);
if (probe.isConsumed()) {
response.addHeader("X-Rate-Limit-Remaining", String.valueOf(probe.getRemainingTokens()));
return true;
}
long waitForRefill = probe.getNanosToWaitForRefill() / 1_000_000_000;
response.addHeader("X-Rate-Limit-Retry-After-Seconds", String.valueOf(waitForRefill));
response.sendError(HttpStatus.TOO_MANY_REQUESTS.value(),
"You have exhausted your API Request Quota");
return false;
}
}
At the moment, the interceptor is not registered anywhere; let's fix that.
package me.vrnsky.ratelimitingexamples.config;
import me.vrnsky.ratelimitingexamples.interceptor.RateLimitInterceptor;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.config.annotation.InterceptorRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;
@Configuration
public class WebMvcConfig implements WebMvcConfigurer {
private final RateLimitInterceptor interceptor;
public WebMvcConfig(RateLimitInterceptor interceptor) {
this.interceptor = interceptor;
}
@Override
public void addInterceptors(InterceptorRegistry registry) {
registry.addInterceptor(interceptor)
.addPathPatterns("/api/**");
}
}
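With the interceptor registered, you can watch the limiter work by calling the endpoint in a loop and inspecting the headers. Here is a throwaway client sketch using the JDK's built-in HttpClient (the port and URL are assumptions; adjust them to your setup):
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
public class RateLimitProbe {
    public static void main(String[] args) throws Exception {
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder(
                URI.create("http://localhost:8080/api/greeting")).build();
        for (int i = 0; i < 60; i++) {
            HttpResponse<String> response =
                    client.send(request, HttpResponse.BodyHandlers.ofString());
            // Print the status code and the remaining-token header added by the interceptor
            System.out.println(response.statusCode() + " remaining="
                    + response.headers().firstValue("X-Rate-Limit-Remaining").orElse("-"));
        }
    }
}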
We now have a basic rate limiter in place, but a single shared bucket is rarely suitable for real-world scenarios: every client draws from the same quota. IP-based limits are closer to what you would deploy in practice and give you more granular, per-client control.
package me.vrnsky.ratelimitingexamples.interceptor;
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;
import io.github.bucket4j.ConsumptionProbe;
import io.github.bucket4j.Refill;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import org.springframework.http.HttpStatus;
import org.springframework.stereotype.Component;
import org.springframework.web.servlet.HandlerInterceptor;
import java.time.Duration;
import java.util.concurrent.TimeUnit;
@Component
public class IpBasedRateLimitInterceptor implements HandlerInterceptor {
private final Cache<String, Bucket> cache;
public IpBasedRateLimitInterceptor() {
this.cache = Caffeine.newBuilder()
// Keep buckets at least as long as the refill window (one minute);
// a one-second expiry would hand every client a fresh bucket almost every request
.expireAfterWrite(1, TimeUnit.HOURS)
.build();
}
@Override
public boolean preHandle(HttpServletRequest request, HttpServletResponse response, Object handler) throws Exception {
String ip = getClientIP(request);
Bucket bucket = cache.get(ip, this::newBucket);
ConsumptionProbe probe = bucket.tryConsumeAndReturnRemaining(1);
if (probe.isConsumed()) {
response.addHeader("X-Rate-Limit-Remaining", String.valueOf(probe.getRemainingTokens()));
return true;
}
long waitForRefill = probe.getNanosToWaitForRefill() / 1_000_000_000;
response.addHeader("X-Rate-Limit-Retry-After-Seconds", String.valueOf(waitForRefill));
response.sendError(HttpStatus.TOO_MANY_REQUESTS.value(),
"Rate limit exceeded. Try again in " + waitForRefill + " seconds");
return false;
}
private String getClientIP(HttpServletRequest request) {
String xfHeader = request.getHeader("X-Forwarded-For");
if (xfHeader == null) {
return request.getRemoteAddr();
}
return xfHeader.split(",")[0];
}
private Bucket newBucket(String ip) {
return Bucket.builder()
.addLimit(Bandwidth.classic(10, Refill.intervally(10, Duration.ofMinutes(1))))
.build();
}
}
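Note that this interceptor still has to be registered. One option is to swap it in for the basic one in WebMvcConfig; a sketch, assuming the new interceptor is injected the same way as before:
@Override
public void addInterceptors(InterceptorRegistry registry) {
    // Register the IP-based interceptor instead of the shared-bucket one
    registry.addInterceptor(ipBasedRateLimitInterceptor)
            .addPathPatterns("/api/**");
}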
Of course, we need unit tests to check that our implementation works.
package me.vrnsky.ratelimitingexamples;
import org.junit.jupiter.api.Test;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.test.context.SpringBootTest;
import org.springframework.boot.test.web.client.TestRestTemplate;
import org.springframework.boot.test.web.server.LocalServerPort;
import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import static org.junit.jupiter.api.Assertions.assertEquals;
@SpringBootTest(webEnvironment = SpringBootTest.WebEnvironment.RANDOM_PORT)
class RateLimitedControllerTest {
@LocalServerPort
private int port;
@Autowired
private TestRestTemplate restTemplate;
@Test
void whenExceedingRateLimit_thenReceive429() {
String url = "http://localhost:" + port + "/api/greeting";
// Make 11 requests (one more than the IP-based limit of 10)
for (int i = 0; i < 11; i++) {
ResponseEntity<String> response = restTemplate.getForEntity(url, String.class);
if (i < 10) {
assertEquals(HttpStatus.OK, response.getStatusCode());
} else {
assertEquals(HttpStatus.TOO_MANY_REQUESTS, response.getStatusCode());
}
}
}
}
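You can also verify that the rate-limit headers are exposed. A small additional test for the same class (it assumes a static import of assertNotNull next to assertEquals) might look like this:
@Test
void rateLimitHeadersArePresent() {
    String url = "http://localhost:" + port + "/api/greeting";
    ResponseEntity<String> response = restTemplate.getForEntity(url, String.class);
    if (response.getStatusCode().is2xxSuccessful()) {
        // Successful requests carry the remaining-token header
        assertNotNull(response.getHeaders().getFirst("X-Rate-Limit-Remaining"));
    } else {
        // Throttled requests carry the retry-after header instead
        assertEquals(HttpStatus.TOO_MANY_REQUESTS, response.getStatusCode());
        assertNotNull(response.getHeaders().getFirst("X-Rate-Limit-Retry-After-Seconds"));
    }
}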
Last but not least, let's build one more rate limiter: one that adjusts its limits based on the current load of the application.
package me.vrnsky.ratelimitingexamples.monitoring;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Component;
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
@Slf4j
@Component
public class SystemMetricsCollector {
private final OperatingSystemMXBean osBean;
public SystemMetricsCollector() {
this.osBean = ManagementFactory.getOperatingSystemMXBean();
}
public SystemMetrics collectMetrics() {
double cpuLoad = getProcessCpuLoad();
long freeMemory = Runtime.getRuntime().freeMemory();
long totalMemory = Runtime.getRuntime().totalMemory();
double memoryUsage = 1.0 - (double) freeMemory / totalMemory;
return new SystemMetrics(cpuLoad, memoryUsage);
}
private double getProcessCpuLoad() {
if (osBean instanceof com.sun.management.OperatingSystemMXBean) {
return ((com.sun.management.OperatingSystemMXBean) osBean)
.getProcessCpuLoad();
}
// Fallback: the system load average, which may exceed 1.0 or be -1 if unavailable
return osBean.getSystemLoadAverage();
}
}
and the SystemMetrics holder it returns:
package me.vrnsky.ratelimitingexamples.monitoring;
import lombok.AllArgsConstructor;
import lombok.Data;
@Data
@AllArgsConstructor
public class SystemMetrics {
private double cpuLoad;
private double memoryUsage;
}
Next, we need the component that calculates the rate limit from these metrics.
package me.vrnsky.ratelimitingexamples.monitoring;
import lombok.extern.slf4j.Slf4j;
import org.springframework.stereotype.Component;
import java.time.Duration;
@Component
@Slf4j
public class DynamicRateLimitCalculator {
private static final int BASE_LIMIT = 100;
private static final double CPU_THRESHOLD_HIGH = 0.8;
private static final double CPU_THRESHOLD_MEDIUM = 0.5;
private static final double MEMORY_THRESHOLD_HIGH = 0.8;
private static final double MEMORY_THRESHOLD_MEDIUM = 0.5;
public RateLimitConfig calculateLimit(SystemMetrics metrics) {
int limit = BASE_LIMIT;
// Apply CPU load factor
limit = adjustLimitBasedOnCpu(limit, metrics.getCpuLoad());
// Apply memory usage factor
limit = adjustLimitBasedOnMemory(limit, metrics.getMemoryUsage());
Duration refillDuration = calculateRefillDuration(metrics);
log.debug("Calculated rate limit: {}/{}s", limit,
refillDuration.getSeconds());
return new RateLimitConfig(limit, refillDuration);
}
private int adjustLimitBasedOnCpu(int currentLimit, double cpuLoad) {
if (cpuLoad > CPU_THRESHOLD_HIGH) {
return (int) (currentLimit * 0.3); // Severe reduction
} else if (cpuLoad > CPU_THRESHOLD_MEDIUM) {
return (int) (currentLimit * 0.6); // Moderate reduction
}
return currentLimit;
}
private int adjustLimitBasedOnMemory(int currentLimit,
double memoryUsage) {
if (memoryUsage > MEMORY_THRESHOLD_HIGH) {
return (int) (currentLimit * 0.4);
} else if (memoryUsage > MEMORY_THRESHOLD_MEDIUM) {
return (int) (currentLimit * 0.7);
}
return currentLimit;
}
private Duration calculateRefillDuration(SystemMetrics metrics) {
double maxLoad = Math.max(metrics.getCpuLoad(),
metrics.getMemoryUsage());
if (maxLoad > 0.8) {
return Duration.ofMinutes(2);
} else if (maxLoad > 0.5) {
return Duration.ofMinutes(1);
}
return Duration.ofSeconds(30);
}
}
package me.vrnsky.ratelimitingexamples.monitoring;
import lombok.AllArgsConstructor;
import lombok.Data;
import java.time.Duration;
@Data
@AllArgsConstructor
public class RateLimitConfig {
private int limit;
private Duration refillDuration;
}
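To make the numbers concrete: with a CPU load of 0.85 and memory usage of 0.6, the base limit of 100 is first cut to 30 (severe CPU reduction) and then to 21 (moderate memory reduction), and because the maximum load exceeds 0.8 the refill window stretches to two minutes, giving an effective limit of 21 requests per two minutes.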
Let’s create a flexible rate limiter that will act as a handler interceptor.
package me.vrnsky.ratelimitingexamples.interceptor;
import com.github.benmanes.caffeine.cache.Cache;
import com.github.benmanes.caffeine.cache.Caffeine;
import io.github.bucket4j.Bandwidth;
import io.github.bucket4j.Bucket;
import io.github.bucket4j.ConsumptionProbe;
import io.github.bucket4j.Refill;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import jakarta.annotation.PreDestroy;
import jakarta.servlet.http.HttpServletRequest;
import jakarta.servlet.http.HttpServletResponse;
import lombok.extern.slf4j.Slf4j;
import me.vrnsky.ratelimitingexamples.monitoring.DynamicRateLimitCalculator;
import me.vrnsky.ratelimitingexamples.monitoring.RateLimitConfig;
import me.vrnsky.ratelimitingexamples.monitoring.RateLimitConfigProvider;
import me.vrnsky.ratelimitingexamples.monitoring.RateLimitMetrics;
import me.vrnsky.ratelimitingexamples.monitoring.SystemMetrics;
import me.vrnsky.ratelimitingexamples.monitoring.SystemMetricsCollector;
import org.springframework.http.HttpStatus;
import org.springframework.http.MediaType;
import org.springframework.stereotype.Component;
import org.springframework.web.servlet.HandlerInterceptor;
import java.io.IOException;
import java.time.Duration;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
@Slf4j
@Component
public class DynamicRateLimitInterceptor implements HandlerInterceptor, RateLimitConfigProvider {
private final Cache<String, Bucket> bucketCache;
private final SystemMetricsCollector metricsCollector;
private final DynamicRateLimitCalculator calculator;
private final AtomicReference<RateLimitConfig> currentConfig;
private final ScheduledExecutorService scheduler;
private final RateLimitMetrics metrics;
public DynamicRateLimitInterceptor(SystemMetricsCollector metricsCollector,
DynamicRateLimitCalculator calculator, MeterRegistry meterRegistry) {
this.metricsCollector = metricsCollector;
this.calculator = calculator;
this.currentConfig = new AtomicReference<>(
new RateLimitConfig(100, Duration.ofMinutes(1))
);
this.bucketCache = Caffeine.newBuilder()
.expireAfterWrite(1, TimeUnit.HOURS)
.build();
this.scheduler = Executors.newSingleThreadScheduledExecutor();
this.metrics = new RateLimitMetrics(meterRegistry, this);
startMetricsUpdateTask();
}
private void startMetricsUpdateTask() {
scheduler.scheduleAtFixedRate(
this::updateRateLimitConfig,
0,
10,
TimeUnit.SECONDS
);
}
private void updateRateLimitConfig() {
try {
SystemMetrics systemMetrics = metricsCollector.collectMetrics();
RateLimitConfig newConfig = calculator.calculateLimit(systemMetrics);
RateLimitConfig oldConfig = currentConfig.get();
if (hasSignificantChange(oldConfig, newConfig)) {
currentConfig.set(newConfig);
log.info("Rate limit updated: {}/{}s",
newConfig.getLimit(),
newConfig.getRefillDuration().getSeconds());
// Clear cache to force bucket recreation with new limits
bucketCache.invalidateAll();
}
} catch (Exception e) {
log.error("Error updating rate limit config", e);
}
}
private boolean hasSignificantChange(RateLimitConfig oldConfig,
RateLimitConfig newConfig) {
double limitChange = Math.abs(1.0 -
(double) newConfig.getLimit() / oldConfig.getLimit());
return limitChange > 0.2; // 20% change threshold
}
public RateLimitConfig getRateLimitConfig() {
return this.currentConfig.get();
}
@Override
public boolean preHandle(HttpServletRequest request,
HttpServletResponse response,
Object handler) throws Exception {
String path = request.getRequestURI();
String method = request.getMethod();
Timer.Sample timerSample = metrics.startTimer();
boolean rateLimited = false;
try {
metrics.recordRequest();
String clientId = getClientIdentifier(request);
Bucket bucket = bucketCache.get(clientId, this::createBucket);
ConsumptionProbe probe = bucket.tryConsumeAndReturnRemaining(1);
if (probe.isConsumed()) {
addRateLimitHeaders(response, probe);
return true;
}
rateLimited = true;
metrics.incrementRateLimitExceeded();
handleRateLimitExceeded(response, probe);
return false;
} finally {
metrics.stopTimer(timerSample, path, method, rateLimited);
}
}
private Bucket createBucket(String clientId) {
RateLimitConfig config = currentConfig.get();
return Bucket.builder()
.addLimit(Bandwidth.classic(
config.getLimit(),
Refill.intervally(config.getLimit(),
config.getRefillDuration())
))
.build();
}
private String getClientIdentifier(HttpServletRequest request) {
// Could combine multiple factors: IP, user ID, API key, etc.
return request.getRemoteAddr();
}
private void addRateLimitHeaders(HttpServletResponse response,
ConsumptionProbe probe) {
RateLimitConfig config = currentConfig.get();
response.addHeader("X-Rate-Limit-Limit",
String.valueOf(config.getLimit()));
response.addHeader("X-Rate-Limit-Remaining",
String.valueOf(probe.getRemainingTokens()));
response.addHeader("X-Rate-Limit-Reset",
String.valueOf(probe.getNanosToWaitForRefill() /
1_000_000_000));
}
private void handleRateLimitExceeded(HttpServletResponse response,
ConsumptionProbe probe)
throws IOException {
response.setStatus(HttpStatus.TOO_MANY_REQUESTS.value());
response.setContentType(MediaType.APPLICATION_JSON_VALUE);
String errorMessage = String.format(
"Rate limit exceeded. Try again in %d seconds",
probe.getNanosToWaitForRefill() / 1_000_000_000
);
response.getWriter().write(
String.format(
"{\"error\": \"%s\", \"retryAfter\": %d}",
errorMessage,
probe.getNanosToWaitForRefill() / 1_000_000_000
)
);
}
@PreDestroy
public void shutdown() {
scheduler.shutdown();
}
}
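One piece is still missing: the RateLimitConfigProvider interface that the interceptor implements and that RateLimitMetrics (shown below) consumes does not appear anywhere above. A minimal version, assuming it lives in the monitoring package, could look like this:
package me.vrnsky.ratelimitingexamples.monitoring;
// Exposes the currently active rate-limit configuration, e.g. to the metrics component.
public interface RateLimitConfigProvider {
    RateLimitConfig getRateLimitConfig();
}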
Now we need to configure our Spring Boot application to use it; this registration takes the place of the earlier interceptor wiring.
package me.vrnsky.ratelimitingexamples.config;
import me.vrnsky.ratelimitingexamples.interceptor.DynamicRateLimitInterceptor;
import org.springframework.context.annotation.Configuration;
import org.springframework.web.servlet.config.annotation.InterceptorRegistry;
import org.springframework.web.servlet.config.annotation.WebMvcConfigurer;
@Configuration
public class DynamicRateLimitWebConfig implements WebMvcConfigurer {
private final DynamicRateLimitInterceptor rateLimiter;
public DynamicRateLimitWebConfig(DynamicRateLimitInterceptor rateLimiter) {
this.rateLimiter = rateLimiter;
}
@Override
public void addInterceptors(InterceptorRegistry registry) {
registry.addInterceptor(rateLimiter)
.addPathPatterns("/api/**");
}
}
It is always a good idea to track performance, load, memory consumption, and other aspects of your app. Let’s create custom metrics.
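Both the dynamic interceptor above and the metrics class below rely on Micrometer's MeterRegistry, which none of the dependencies we added at the start provide. If it is not already on your classpath, the usual route is the Spring Boot Actuator starter, which pulls in micrometer-core:
<dependency>
<groupId>org.springframework.boot</groupId>
<artifactId>spring-boot-starter-actuator</artifactId>
</dependency>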
package me.vrnsky.ratelimitingexamples.monitoring;
import io.micrometer.core.instrument.Counter;
import io.micrometer.core.instrument.Gauge;
import io.micrometer.core.instrument.MeterRegistry;
import io.micrometer.core.instrument.Timer;
import java.util.Map;
public class RateLimitMetrics {
private final MeterRegistry meterRegistry;
private final Counter rateLimitExceeded;
private final Counter requestsTotal;
private final Gauge currentLimit;
public RateLimitMetrics(MeterRegistry registry,
RateLimitConfigProvider configProvider) {
this.meterRegistry = registry;
this.rateLimitExceeded = Counter.builder("rate_limit.exceeded")
.description("Number of rate limit exceeded events")
.tag("type", "exceeded")
.register(registry);
this.requestsTotal = Counter.builder("rate_limit.requests")
.description("Total number of requests processed")
.tag("type", "total")
.register(registry);
this.currentLimit = Gauge.builder("rate_limit.current",
configProvider,
this::getCurrentLimit)
.description("Current rate limit value")
.tag("type", "limit")
.register(registry);
}
public Timer.Sample startTimer() {
return Timer.start();
}
public void stopTimer(Timer.Sample sample, String path, String method, boolean rateLimited) {
Timer timer = Timer.builder("rate_limit.request.duration")
.description("Request duration through rate limiter")
.tags(
"path", path,
"method", method,
"rate_limited", String.valueOf(rateLimited),
"component", "rate_limiter"
)
.register(meterRegistry);
sample.stop(timer);
}
public void incrementRateLimitExceeded() {
rateLimitExceeded.increment();
}
public void recordRequest() {
requestsTotal.increment();
}
private double getCurrentLimit(RateLimitConfigProvider provider) {
return provider.getRateLimitConfig().getLimit();
}
public Map<String, Number> getCurrentMetrics() {
return Map.of(
"rateLimitExceeded", rateLimitExceeded.count(),
"totalRequests", requestsTotal.count(),
"currentLimit", currentLimit.value()
);
}
}
Best practices and considerations
Cache Implementation: For production use in clustered environments, back the rate limiter with a distributed cache such as Redis so that all instances share the same limits.
Headers: Always include rate limit information in response headers; it helps clients manage their request rates. For example:
1. X-Rate-Limit-Remaining
2. X-Rate-Limit-Retry-After-Seconds
Error Handling: Provide clear error messages when users exceed rate limits.
Monitoring: Collect metrics to track rate-limiting events and adjust your limits based on actual usage patterns.
Conclusion
This article showed a straightforward way to implement rate limiting in a Spring Boot 3 application with Bucket4j. We covered three approaches: a single shared bucket, IP-based limits, and limits that adapt dynamically to system load. Real-world requirements will differ, so treat these examples as a starting point.
Rate limiting is only one part of a comprehensive API security strategy; by combining it with other security measures, you can build strong API protection.