Debugging Apollo Cache

So you have implemented your API in GraphQL and deployed your Apollo server to production. Congratulations! To make it production-ready and scalable, you added caching to the server. Then you find out your data stays stale for longer than you expected! I am documenting this for future reference because the cause was not immediately noticeable and is hidden from developers.

Problem

The new architecture we are implementing on Article consists of a Node server for server-side rendering with Vue and an Apollo server for GraphQL, which talks to internal services and provides data for our Vue application. To ensure performance in production, we added CloudFront and Redis caches for both servers. Everything works great, except that some data is not updated even long after the TTL has passed. Initially we thought it could be the CDN cache, so we invalidated it on both the server-side rendering server and the Apollo server, but the problem persisted. Next we flushed the Redis cache on top of the CDN cache invalidation, since the data the SSR server gets would come from the Apollo Redis cache, but it still happened. However, when we fetched the data directly from the responsible service, we got the latest data. So something inside Apollo was causing this. It seemed to happen only sometimes, and the pattern initially appeared random, until we found out that every time we made a new deployment, the new data would be served immediately after flushing the Redis cache. A deployment restarts the server process, which pointed to something being held in the Apollo server's own memory.

The Culprit

Most of our data has been migrated to new services, and these services communicate over gRPC. Some of the older APIs are still on the legacy server and use a traditional RESTful API. Apollo has a very handy class called RESTDataSource (from the apollo-datasource-rest package), detailed in the Apollo documentation. It lets you send an HTTP request to the designated server very easily, without writing complicated fetch calls. To improve performance and avoid duplicated requests, it also caches the promise / response internally so it doesn't hammer the server too much. The duration of the cache is based on the cache-control header of the response, as the documentation mentions. So now we have the CDN cache, the Redis cache, and this internal memory cache. Normally that's fine, but these caches can get out of sync with each other: when the Redis cache expires, the internal cache has probably not been cleared yet, so Redis caches the stale value again.
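To make the drift between layers concrete, here is a minimal sketch that simulates two TTL caches stacked on top of each other. It does not use Redis or Apollo at all: the `TtlCache` class, the 60 and 100 second TTLs, and the `read` helper are all assumptions made up for illustration, and time is passed in explicitly so the timeline is easy to follow.

```javascript
// Tiny TTL cache driven by an explicit clock (seconds), for illustration only.
class TtlCache {
  constructor(ttlSeconds) {
    this.ttl = ttlSeconds;
    this.entries = new Map();
  }
  get(key, now) {
    const entry = this.entries.get(key);
    return entry && now < entry.expiresAt ? entry.value : undefined;
  }
  set(key, value, now) {
    this.entries.set(key, { value, expiresAt: now + this.ttl });
  }
}

// Hypothetical TTLs: "outer" stands in for Redis, "inner" for the in-process
// cache whose lifetime follows the upstream cache-control header.
const outer = new TtlCache(60);
const inner = new TtlCache(100);
let upstream = 'v1'; // what the legacy service would return right now

function read(key, now) {
  let value = outer.get(key, now);
  if (value !== undefined) return value;
  value = inner.get(key, now);
  if (value === undefined) {
    value = upstream; // only here do we actually reach the source
    inner.set(key, value, now);
  }
  outer.set(key, value, now); // the outer layer re-caches whatever it was given
  return value;
}

const timeline = [];
timeline.push(read('article', 0));   // both layers cold: fetches v1
upstream = 'v2';                     // the data changes at the source
timeline.push(read('article', 30));  // outer hit: v1
timeline.push(read('article', 70));  // outer expired, inner still valid: v1 re-cached until 130
timeline.push(read('article', 110)); // inner expired at 100, outer still valid: v1
timeline.push(read('article', 140)); // both expired: finally v2
console.log(timeline); // [ 'v1', 'v1', 'v1', 'v1', 'v2' ]
```

In this toy timeline the outer TTL is 60 seconds, yet the stale value survives for well over twice that, because the outer layer refills itself from the inner one instead of from the source.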

In your data source that extends the RESTDataSource class, you can see whether there are any cached responses by logging this.memoizedResults to the console. Every time you send a request with the built-in methods, it checks the TTL of the response and sets an item in the memoizedResults map, which is a map of string-Promise pairs. It may look something like this:

jsx
Map(1) {
  'https://website.com/api/test' => Promise {
    { success: true, page: [Object] }
  }
}
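As a rough sketch of how a map like this ends up serving repeated requests, the snippet below memoizes promises by URL. `MemoizingClient` and `fakeFetch` are hypothetical stand-ins written for this example; this is not the actual RESTDataSource implementation, only the general technique.

```javascript
// Stand-in for a data source that memoizes requests by URL.
class MemoizingClient {
  constructor(fetcher) {
    this.fetcher = fetcher;           // hypothetical async function: (url) => data
    this.memoizedResults = new Map(); // URL string -> Promise of the response body
  }
  get(url) {
    let promise = this.memoizedResults.get(url);
    if (!promise) {
      promise = this.fetcher(url);
      this.memoizedResults.set(url, promise);
    }
    return promise;
  }
}

let upstreamCalls = 0;
// Fake upstream service that counts how often it is actually called.
const fakeFetch = async (url) => {
  upstreamCalls += 1;
  return { success: true, url };
};

const client = new MemoizingClient(fakeFetch);
const url = 'https://website.com/api/test';

const first = client.get(url);
const second = client.get(url);

console.log(first === second); // true: the very same Promise is handed back
console.log(upstreamCalls);    // 1: the second call never reached the upstream
```

As long as the entry stays in the map, every caller gets the first response, no matter what the upstream service would return now.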

The best option for me is to not keep responses in the internal cache at all, since our schema is not complicated and has no recursion that would fetch the same data over and over again. Since I could not find a way to turn it off, I clear it after each response:

jsx
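// Add this method to your data source class that extends RESTDataSource
didReceiveResponse(response, request) {
  // Drop every memoized promise so the next request goes back to the source
  this.memoizedResults.clear();
  // Then let the default implementation handle the response as usual
  return super.didReceiveResponse(response, request);
}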

Now I have removed any uncertainty about where the data lives and how long it is kept. I personally think an in-memory cache like this is a good idea, since it prevents the server from pinging other services continuously, but what suits us better is to align it with the TTLs of our other caching layers, so that instead of following the source it follows whatever max age I set. Everything comes down to what your problem is and what you need.
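The idea of following your own max age rather than the source's can be sketched as a small policy function. Both `maxAgeFrom` and `effectiveTtl` are hypothetical helpers written for this example, not part of Apollo. As far as I know, apollo-datasource-rest also accepts a per-request `cacheOptions` with a `ttl` value for a similar purpose, but treat that as something to verify against the documentation for your version.

```javascript
// Parse max-age (in seconds) out of a Cache-Control header value; 0 if absent.
function maxAgeFrom(cacheControl) {
  const match = /(?:^|,)\s*max-age=(\d+)/i.exec(cacheControl || '');
  return match ? Number(match[1]) : 0;
}

// Hypothetical policy: a TTL we configure ourselves always wins over
// whatever the upstream service advertises in its header.
function effectiveTtl(cacheControl, configuredTtl) {
  return configuredTtl !== undefined ? configuredTtl : maxAgeFrom(cacheControl);
}

console.log(effectiveTtl('public, max-age=3600', 60)); // 60: our own setting wins
console.log(effectiveTtl('public, max-age=3600'));     // 3600: falls back to the source
console.log(effectiveTtl('no-store'));                 // 0: nothing to cache
```

With a policy like this, the in-memory layer can be given the same TTL as Redis and the CDN, so all three layers expire on the same schedule.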

If you find this useful, please tweet, share, and let me know if it helps you.