4 Tips to Make Slow APIs Fast Enough


Very often we decide to reuse existing APIs for new business cases. These APIs can be too slow for the new use case, for example when you need to expose them to a larger audience or open the data to BI. By slow I don’t mean that the APIs are badly designed, implemented or operated. On the contrary, they may work perfectly for their original needs. Their performance simply doesn’t fit our new challenges.

Here are 4 tips on how you can make the APIs faster without reimplementing them from scratch.


Caching

Using a cache is a typical way to improve performance, and also the cheapest one, with the lowest effort. It can be applied not only to API interfaces but at many levels of the application’s stack: controllers, services, methods … Simply put, the purpose of a cache is to serve responses without recalculating them.

Before starting the integration, it’s important to check whether you need to serve real-time data. Real-time data and caching hardly go together.
The hit rate is another thing to consider. Assuming you already know your users’ behavior, how high will the hit rate be once the cache is applied? Too many random calls cause a low hit rate, and in that case you lose the benefits of a caching mechanism.
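
One way to get a feel for the expected hit rate before committing to a cache is to replay a sample of request keys through a small simulated cache. The sketch below is only an illustration: the `estimate_hit_rate` helper, the LRU eviction policy, the capacity and the sample paths are all assumptions, not part of any particular API.

```python
from collections import OrderedDict


def estimate_hit_rate(request_keys, capacity):
    """Replay request keys through a simulated LRU cache; return hits / total."""
    cache = OrderedDict()  # keys only; insertion order tracks recency
    hits = 0
    total = 0
    for key in request_keys:
        total += 1
        if key in cache:
            hits += 1
            cache.move_to_end(key)  # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used key
    return hits / total if total else 0.0


# Hypothetical traffic samples: a few popular keys versus all-unique keys.
repetitive = ["/users/1", "/users/2", "/users/1", "/users/1", "/users/2", "/users/3"]
random_like = [f"/users/{i}" for i in range(6)]

print(estimate_hit_rate(repetitive, capacity=2))
print(estimate_hit_rate(random_like, capacity=2))
```

Running it against a real access log, if you have one, gives a rough upper bound on what a cache of that size could deliver.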

When you don’t have these or similar constraints, a cache can speed up the APIs significantly. The main benefits are:

  • lower latency
  • fewer IO calls in the background
  • faster response
  • less load on the backend
  • it can sometimes also “cover” an outage of the servers or the network

The cache can be implemented in different flavors:

  • using HTTP cache headers (depending on the API)
  • as an application cache
  • with a distributed memory service such as Redis or Memcached
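
As a rough illustration of the application-cache flavor, here is a minimal in-process cache with a time-to-live. It is a sketch under assumptions: the `TTLCache` class, the 30-second TTL and the `slow_api_call` stand-in are hypothetical names invented for this example, and a production setup would more likely rely on an existing library or a service like Redis.

```python
import time


class TTLCache:
    """Tiny in-process cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds, clock=time.monotonic):
        self.ttl_seconds = ttl_seconds
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._entries = {}  # key -> (expires_at, value)

    def get_or_compute(self, key, compute):
        now = self.clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]  # fresh hit: skip the slow call
        value = compute()  # miss or expired entry: call the slow API
        self._entries[key] = (now + self.ttl_seconds, value)
        return value


calls = []  # records every real invocation of the slow API


def slow_api_call(user_id):
    """Stand-in for the slow API (hypothetical)."""
    calls.append(user_id)
    return {"id": user_id, "name": f"user-{user_id}"}


cache = TTLCache(ttl_seconds=30)
first = cache.get_or_compute(("user", 1), lambda: slow_api_call(1))
second = cache.get_or_compute(("user", 1), lambda: slow_api_call(1))
```

The second lookup is served from memory, so the slow call runs only once until the entry expires. The TTL is the knob that trades freshness for speed.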


Proxying of the original API

By putting a proxy in front of the API you’ll be able to:

  • Scale the original API better. With the introduction of a new endpoint (the proxy), the proxied service can be scaled horizontally. Of course, some element such as the DB or the network will always remain a bottleneck, but it can be tackled separately.
  • Establish custom caching. Some requests are probably easier to cache than others that have more demanding requirements. With the proxy you can enable caching for a subset of the requests.
  • Split and redirect the requests. This is very useful when you want to improve or reimplement only the parts that need better performance. You will also need a good API contract that supports this kind of split. After the split, a new API can be responsible for only part of the requests.
  • Support different versions at the same time. With the proxy it is easier to establish and support API versioning.
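
The routing decisions listed above can be sketched as a single function that the proxy consults for every request. Everything here is an assumption made for illustration: the backend URLs, the path prefixes and the `Route` and `route_request` names are hypothetical, and a real proxy (or a gateway product) would express the same rules in its own configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Route:
    backend: str     # which upstream handles the request
    cacheable: bool  # whether the proxy may serve it from its cache


# Hypothetical upstreams, for illustration only.
LEGACY_API = "http://legacy-api.internal"
FAST_REPORTS_API = "http://reports-api.internal"


def route_request(method, path):
    """Decide where the proxy sends a request and whether it may be cached."""
    # Split and versioning: the performance-critical v2 part goes to a new service.
    if path.startswith("/v2/reports/"):
        return Route(FAST_REPORTS_API, cacheable=(method == "GET"))
    # Custom caching: only read-only catalog requests are cached.
    if method == "GET" and path.startswith("/v1/catalog/"):
        return Route(LEGACY_API, cacheable=True)
    # Everything else passes through to the original API uncached.
    return Route(LEGACY_API, cacheable=False)


print(route_request("GET", "/v2/reports/daily"))
print(route_request("POST", "/v1/orders"))
```

Keeping the rules in one place makes it easier to move further paths to the new service later without touching the clients.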

Services coupling

Coupling is not a recommended practice, as we all know, and this solution suggests exactly that. Still, in many cases it is worth taking a look and considering it, and in special cases it might make sense, for example when you have very chatty services and the latency is slowing you down.
Here is a simplified diagram of the change:

Before coupling


After coupling

In practice it means moving all of the called service B’s domain data and logic into the caller A and making it available in A’s runtime environment.

Pros:

  • No IO calls
  • It’s fast

Cons:

  • Coupling of the services
  • More expensive to maintain and extend
  • Splitting and sharing responsibilities between the teams can sometimes be tricky
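
To make the trade-off concrete, the sketch below contrasts a chatty remote lookup with the same logic moved into the caller's process. It is a simplified illustration under assumptions: the pricing domain, the `PRICES` data and all class and function names are hypothetical, and the network call is only simulated by a counter.

```python
# Hypothetical domain data that service B used to own.
PRICES = {"apple": 3, "pear": 5}


class RemotePricingClient:
    """Before coupling: service A asks service B for every price (simulated)."""

    def __init__(self):
        self.round_trips = 0

    def price_for(self, sku):
        self.round_trips += 1  # each call would be one network round trip
        return PRICES[sku]


def price_for(sku):
    """After coupling: B's logic runs inside A, so this is a plain function call."""
    return PRICES[sku]


def order_total_remote(client, skus):
    return sum(client.price_for(sku) for sku in skus)


def order_total_local(skus):
    return sum(price_for(sku) for sku in skus)


client = RemotePricingClient()
basket = ["apple", "pear", "apple"]
remote_total = order_total_remote(client, basket)
local_total = order_total_local(basket)
```

Both versions compute the same total, but the remote one pays one round trip per item while the local one pays none. The price is that A now has to ship and maintain B's data and logic.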

Data Denormalization / NoSQL

When your database is the API bottleneck, it’s logical to consider data denormalization or a NoSQL database. Depending on the nature of the APIs, moving to another database can bring a lot of benefits. It also takes a change of mindset in how you think about the data, and it may slow down the development team for a short time until it gets used to the new way of manipulating data.

The data denormalization should be planned to reduce the number of queries and to provide rich responses. For example, for a user profile page, a single call (inside the API) should return not only the user’s first name, last name and profile picture but also a wider set of information. For a banking service, you would serve the current balance, the latest invoices … together with the user information. That means the storage should hold all this data in a single document.
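
The banking example can be sketched as follows. This is a minimal illustration with assumed names and data: the in-memory dictionaries stand in for relational tables and for a document store, and `build_profile_document` and `get_profile` are hypothetical helpers.

```python
# Hypothetical normalized data, as it might sit in separate relational tables.
users = {1: {"first_name": "Ana", "last_name": "Petrova", "profile_pic": "ana.png"}}
balances = {1: 250.0}
invoices = [
    {"user_id": 1, "number": "INV-1", "amount": 40.0},
    {"user_id": 1, "number": "INV-2", "amount": 15.5},
    {"user_id": 2, "number": "INV-3", "amount": 99.0},
]


def build_profile_document(user_id, last_n=5):
    """Join the normalized pieces once, at write time, into one rich document."""
    user_invoices = [
        {"number": inv["number"], "amount": inv["amount"]}
        for inv in invoices
        if inv["user_id"] == user_id
    ]
    return {
        "_id": user_id,
        **users[user_id],
        "current_balance": balances[user_id],
        "last_invoices": user_invoices[-last_n:],
    }


# Stand-in for a document store: one document per user.
document_store = {1: build_profile_document(1)}


def get_profile(user_id):
    """Read path: a single lookup instead of several queries and joins."""
    return document_store[user_id]


profile = get_profile(1)
```

The joins are paid once when the document is built, so the read path stays a single key lookup.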

The change of the data storage can trigger additional changes depending on the API design:

  • Refactoring of the service. With a poor persistence layer, you’ll probably need to plan for more development effort.
  • Taking care of data updates. All incoming updates need to be applied in a denormalized fashion. Synchronization with the source-of-truth data is a typical case.
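
A rough sketch of what taking care of updates can look like: every write goes to the source of truth and is mirrored into the denormalized document, and a periodic check looks for documents that drifted. The stores, the `apply_payment` and `find_stale_documents` names and the data are assumptions for illustration. In a real system the two writes would not be atomic, which is exactly why a synchronization check like the second function is usually needed.

```python
# Hypothetical stores: a normalized source of truth and denormalized documents.
source_of_truth = {"balances": {1: 250.0}}
profile_documents = {1: {"_id": 1, "first_name": "Ana", "current_balance": 250.0}}


def apply_payment(user_id, amount):
    """Update the source of truth, then mirror the change into the document."""
    new_balance = source_of_truth["balances"][user_id] - amount
    source_of_truth["balances"][user_id] = new_balance
    profile_documents[user_id]["current_balance"] = new_balance
    return new_balance


def find_stale_documents():
    """Return ids whose denormalized balance no longer matches the source."""
    return [
        user_id
        for user_id, doc in profile_documents.items()
        if doc["current_balance"] != source_of_truth["balances"][user_id]
    ]


balance_after_payment = apply_payment(1, 50.0)
stale_after_payment = find_stale_documents()
```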


Finally, I would like to note that you are not bound to a single way of doing it. Feel free to combine different approaches and adapt them to your specifics. Most of the time that works best. You are also welcome to share your own way of speeding up APIs, just drop a note below the post.
