Cache stampede and subpar performance #520

Open
demming opened this issue Oct 24, 2022 · 2 comments
demming commented Oct 24, 2022

Expected Behavior

While load and soak testing several cached endpoints in the services I've been running, I've come across multiple instances of what is commonly referred to as a cache stampede. Three issues in this repository were closed: #95, #107, #233. I've raised similar issues in the ASP.NET Core repo (mitigations are in as of the most recent version, 7), in the Play2 repo (no mitigations), and in the Quarkus repo (mitigated via a lock).

Consider a simple microservice at localhost:8080 that only sanitizes HTML data from a given resource.

Running

bombardier -c 100 -d 10s -k -l "http://localhost:8080/website?address=http://localhost:8081"

(or against any other source of HTML data specified in the address query param) spawns 100 concurrent inbound connections.

Expectations:

  1. On an initial run I expect all of them to result in a cache miss, but the cache to be populated only once, not over 300 times. For some reason the methods are evaluated multiple times (see the sketch after this list).
  2. The JVM should not crash (a 64m max heap results in thread starvation and an OutOfMemoryError once all invocations begin populating the cache).
  3. The cache should expire as configured in application.yml.
  4. HTTP Cache-Control headers should be set automatically and correspond to the configured values.
  5. The performance of the cache should be on par with Akka HTTP using Caffeine.
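
To illustrate expectation 1: the coalescing I'd expect is what Caffeine's AsyncCache already does when used directly. A minimal sketch of that behaviour (class and helper names are made up for illustration and are not part of the service below):

import com.github.benmanes.caffeine.cache.AsyncCache;
import com.github.benmanes.caffeine.cache.Caffeine;

import java.time.Duration;
import java.util.concurrent.CompletableFuture;

class CoalescingCacheSketch {

  // Same bounds as in application.yml further down: 90s expiry, at most 100 entries.
  private final AsyncCache<String, String> cache = Caffeine.newBuilder()
      .expireAfterWrite(Duration.ofSeconds(90))
      .maximumSize(100)
      .buildAsync();

  // Concurrent callers that miss on the same key all receive the same
  // in-flight CompletableFuture, so the expensive load runs exactly once.
  CompletableFuture<String> getSanitizedWebsite(String address) {
    return cache.get(address, this::loadAndSanitize);
  }

  // Hypothetical stand-in for the actual HTTP fetch and HTML sanitization.
  private String loadAndSanitize(String address) {
    return "<sanitized content of " + address + ">";
  }
}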

Baseline Akka HTTP latencies and throughput (-Xmx64m):

Statistics        Avg      Stdev        Max
  Reqs/sec      7981.90    1781.13   11763.68
  Latency       12.52ms    15.00ms   741.51ms
  Latency Distribution
     50%    11.03ms
     75%    14.60ms
     90%    19.13ms
     95%    22.78ms
     99%    37.76ms
  HTTP codes:
    1xx - 0, 2xx - 79891, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:     5.36GB/s

For more info just see my remarks in the other repos.

Actual Behaviour

    • For 10 concurrent connections: between 16 and 19 method invocations take place;
    • for 100 concurrent connections: over 300 (the JVM then crashes)
      (instead of just 1).
  1. The cache does not expire (90s is configured in application.yml).
  2. No corresponding Cache-Control headers are set automatically.
  3. The performance is not bad, but it is only a fraction of the baseline.

Micronaut with -Xmx128m (64m simply crashes due to the stampede); the figures are slightly better without a heap bound:

Statistics        Avg      Stdev        Max
  Reqs/sec      1381.17     625.48    2954.49
  Latency       72.14ms    75.33ms   786.00ms
  Latency Distribution
     50%    44.35ms
     75%    94.29ms
     90%   172.51ms
     95%   225.97ms
     99%   348.29ms
  HTTP codes:
    1xx - 0, 2xx - 13889, 3xx - 0, 4xx - 0, 5xx - 0
    others - 0
  Throughput:     0.93GB/s

Steps To Reproduce

Controller

@Slf4j
@CacheConfig("website-sanitizer-controller")
@Controller
public class WebsiteSanitizerController {

  private final WebsiteSanitizerService service;
  // Not thread-safe; used only to observe how often the method is invoked.
  private int _controllerCount = 0;

  public WebsiteSanitizerController(WebsiteSanitizerService service) {
    this.service = service;
  }

  @Cacheable
  @Get("/website")
  public Mono<String> getSanitizedWebsite(final String address) {
    _controllerCount += 1;
    log.info(">>> Controller invocation #{}", _controllerCount);

    return service.sanitizeWebsite(address);
  }
}

application.yml

micronaut:
  caches:
    "website-sanitizer-controller":
      expire-after-write: 90s
      charset: 'UTF-8'
      maximum-size: 100

HttpClientService

@Slf4j
@Singleton
public class HttpClientService {

  private final HttpClient httpClient;
  // Not thread-safe; used only to observe how often get() is invoked.
  private int _serviceInvocation = 0;

  public HttpClientService(HttpClient httpClient) {
    this.httpClient = httpClient;
  }

  public Mono<String> get(final String address) {
    _serviceInvocation += 1;
    log.info(">> HttpClientService.get invocation #{}", _serviceInvocation);

    var request = HttpRequest.GET(address);

    return Mono.from(httpClient.retrieve(request));
  }
}

WebsiteSanitizerService

@Slf4j
@Singleton
@CacheConfig("website-sanitizer-service")
public class WebsiteSanitizerService {

  private final HttpClientService service;

  private static final PolicyFactory policy =
    Sanitizers.FORMATTING
      .and(Sanitizers.LINKS)
      .and(Sanitizers.TABLES)
      .and(Sanitizers.BLOCKS)
      .and(Sanitizers.IMAGES)
      .and(Sanitizers.STYLES);

  // Not thread-safe; used only to observe how often the method is invoked.
  private int _serviceCounter = 0;

  public WebsiteSanitizerService(HttpClientService service) {
    this.service = service;
  }

  @CachePut(parameters = {"address"})
  public Mono<String> sanitizeWebsite(String address) {
    _serviceCounter += 1;
    log.info(">> Service.sanitizeWebsite invocation #{}", _serviceCounter);

    return service.get(address).map(policy::sanitize);
  }
}
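
For reference, a possible service-level workaround I've sketched (not the Micronaut cache API; the class name and details are illustrative, assuming Project Reactor and the OWASP sanitizer already on the classpath): share one in-flight Mono per address so that concurrent misses subscribe to the same upstream call instead of each triggering their own.

import java.time.Duration;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

import org.owasp.html.PolicyFactory;
import org.owasp.html.Sanitizers;

import jakarta.inject.Singleton;
import reactor.core.publisher.Mono;

@Singleton
public class CoalescingWebsiteSanitizerService {

  private static final PolicyFactory policy =
      Sanitizers.FORMATTING.and(Sanitizers.LINKS);

  private final HttpClientService service;

  // One shared Mono per address; note this sketch never evicts keys,
  // so it is unbounded across distinct addresses.
  private final Map<String, Mono<String>> inFlight = new ConcurrentHashMap<>();

  public CoalescingWebsiteSanitizerService(HttpClientService service) {
    this.service = service;
  }

  public Mono<String> sanitizeWebsite(String address) {
    // computeIfAbsent hands every concurrent caller the same Mono;
    // cache(90s) replays the result until expiry, then re-subscribes upstream.
    return inFlight.computeIfAbsent(address, key ->
        service.get(key)
               .map(policy::sanitize)
               .cache(Duration.ofSeconds(90)));
  }
}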

Environment Information

macOS 12.6
OpenJDK 19.0.1

Example Application

No response

Version

3.6.2

graemerocher (Contributor) commented

Regarding cache headers: the annotations are method level, not HTTP layer level.
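
For illustration, a minimal sketch of setting the header explicitly on the HTTP layer (a hypothetical, standalone controller reusing the service from the report, assuming Micronaut's HttpResponse and HttpHeaders API):

import io.micronaut.http.HttpHeaders;
import io.micronaut.http.HttpResponse;
import io.micronaut.http.MutableHttpResponse;
import io.micronaut.http.annotation.Controller;
import io.micronaut.http.annotation.Get;

import reactor.core.publisher.Mono;

@Controller
public class CacheControlExampleController {

  private final WebsiteSanitizerService service;

  public CacheControlExampleController(WebsiteSanitizerService service) {
    this.service = service;
  }

  @Get("/website")
  public Mono<MutableHttpResponse<String>> getSanitizedWebsite(String address) {
    return service.sanitizeWebsite(address)
        // Mirror the 90s expire-after-write from application.yml on the HTTP layer.
        .map(html -> HttpResponse.ok(html)
            .header(HttpHeaders.CACHE_CONTROL, "public, max-age=90"));
  }
}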


demming commented Oct 25, 2022

Thanks, I wasn't aware of that. I'd gotten used to relying on annotations for headers in other frameworks.
