Cache Strategies For Developers

Şerifhan Işıklı
8 min read · Nov 1, 2024


This article identifies five of the most important caching strategies and discusses them from a practitioner's perspective.

Caching is an effective way to reduce latency and improve system performance. It becomes crucial in systems that handle large data flows or serve a high volume of requests.

The right strategy varies with the needs of the system. For example, if a system is read-intensive, optimizing read operations should be the priority, while write-intensive systems should prioritize write operations. Data consistency requirements are another key factor in the choice of strategy.

Here, we will discuss five basic caching approaches that come up regularly in system design, along with professional advice drawn from practical implementation.

1. Read Through Caching


Description: In the Read Through strategy, the application first checks the cache for the requested data. If the data is found in the cache (a cache hit), the result is returned to the application. If it is not available (a cache miss), the cache itself fetches the data from the database, stores it, and then returns it to the application. This strategy simplifies application logic because the cache is responsible for reading data from the database.

Professional Insight: This strategy is very useful in CDN implementations and social media applications. For example, in a CDN for a social network where many users from different regions request the same media content, the Read Through approach can be very effective in reducing server load and delivering content quickly. However, it is crucial to set the TTL (Time to Live) correctly to avoid serving stale data, and in more elaborate system architectures, managing TTL dynamically can become complex. AWS ElastiCache and Redis are commonly used to apply this strategy, as shown below.

Code Example:

// Simple in-memory stores shared by the examples in this article
// (sample data for illustration)
const cache = {};
const database = { user_1: 'Alice', user_2: 'Bob', user_3: 'Charlie' };

function readThrough(cache, database, key) {
  if (cache[key]) {
    console.log(`Cache hit: ${key} -> ${cache[key]}`);
    return cache[key];
  } else {
    console.log(`Cache miss. Fetching from database: ${key}`);
    if (database[key]) {
      cache[key] = database[key]; // Load data into cache
      return cache[key];
    } else {
      console.log(`Key not found in database: ${key}`);
      return null;
    }
  }
}

// Test Read Through
console.log("Read Through Test:");
readThrough(cache, database, 'user_1'); // Cache miss, fetch from DB
readThrough(cache, database, 'user_1'); // Cache hit

Steps:

Check Cache: Look for data in the cache.

Fetch from Database on Miss: If the data is not in the cache, it is obtained from the database (the cache miss scenario).

Store in Cache: Then store the data in the cache so it can be retrieved later from the cache rather than the database.

Best Practices:

TTL (Time To Live): Cache entries should be set with a TTL so they expire automatically and stale data is not served.

Minimize Cache Lookups: The cache lookup itself should be fast, since a slow lookup adds overhead to every request.
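The TTL best practice above can be sketched by extending the readThrough function with an expiry time. This is a minimal illustration; the `ttlMs` parameter and the `expiresAt` field are assumptions for this sketch, not part of any particular cache library.

```javascript
// Read Through with a TTL: an expired entry is treated as a cache miss
// and refreshed from the database. Sketch with illustrative sample data.
function readThroughWithTTL(cache, database, key, ttlMs) {
  const entry = cache[key];
  const now = Date.now();
  // Cache hit only if the entry exists and has not expired yet
  if (entry && entry.expiresAt > now) {
    return entry.value;
  }
  const value = database[key];
  if (value === undefined) return null;
  // Store the value together with its expiry timestamp
  cache[key] = { value, expiresAt: now + ttlMs };
  return value;
}

const cache = {};
const database = { user_1: 'Alice' };
readThroughWithTTL(cache, database, 'user_1', 60000); // miss: loads from DB
readThroughWithTTL(cache, database, 'user_1', 60000); // hit: served from cache
```

In a real deployment the store would be Redis or ElastiCache rather than a plain object, and the TTL would typically be set via the store's native expiry support (for example, Redis `SET key value EX seconds`).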

2. Cache Aside (Lazy Loading)

Description: Cache Aside, also referred to as "Lazy Loading," is a strategy in which the application itself is responsible for loading data into the cache. The application first checks the cache, and if the data is not there (a cache miss), it fetches the data from the database and writes it into the cache as well.

Professional Insight: I have often applied this strategy in e-commerce applications where product specifications and prices are read far more often than they are written. Cache Aside keeps only the frequently accessed data in the cache, which improves its effectiveness. TTL can also be used to manage stale data. This strategy gives the application finer control over what gets cached and is therefore well suited to managing large data sets.

Code Example:

function cacheAside(cache, database, key) {
  // Step 1: Check if data exists in cache
  if (cache[key]) {
    console.log(`Cache hit: ${key} -> ${cache[key]}`);
    return cache[key]; // Cache hit, return cached data
  } else {
    console.log(`Cache miss. Fetching from database: ${key}`);
    // Step 2: Fetch data from database
    if (database[key]) {
      const value = database[key];
      // Step 3: Manually store data in cache after fetching from database
      cache[key] = value;
      return value;
    } else {
      console.log(`Key not found in database: ${key}`);
      return null;
    }
  }
}

// Test Cache Aside
console.log("\nCache Aside Test:");
cacheAside(cache, database, 'user_2'); // Cache miss, fetch from DB
cacheAside(cache, database, 'user_2'); // Cache hit

Steps:

Check Cache: Look for the data in the cache.

Fetch from Database on Miss: If the cache does not contain the data, retrieve it from the database.

Manually Store in Cache: The application saves the data in the cache itself so it can be reused on later requests.

Best Practices:

Manual Cache Control: The application decides exactly what gets cached, which makes this well suited to applications that need to manage cache size.

TTL on Cache: To keep data fresh and eliminate the possibility of serving wrong values, entries should be stored with a TTL.
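Since the application owns cache management in this strategy, the "manage cache size" point above can be sketched with a simple size-capped cache. The LRU-style eviction via Map insertion order and the `MAX_ENTRIES` cap are illustrative choices for this sketch, not requirements of the pattern.

```javascript
// Cache Aside with a size cap: when the cache is full, the least recently
// used entry is evicted before a new one is stored.
const MAX_ENTRIES = 2; // illustrative cap

function cacheAsideBounded(cache, database, key) {
  if (cache.has(key)) {
    // Refresh recency: re-insert so this key becomes most recently used
    const value = cache.get(key);
    cache.delete(key);
    cache.set(key, value);
    return value;
  }
  const value = database[key];
  if (value === undefined) return null;
  if (cache.size >= MAX_ENTRIES) {
    // Evict the least recently used entry (first key in insertion order)
    cache.delete(cache.keys().next().value);
  }
  cache.set(key, value);
  return value;
}

const cache = new Map();
const database = { a: 1, b: 2, c: 3 };
cacheAsideBounded(cache, database, 'a');
cacheAsideBounded(cache, database, 'b');
cacheAsideBounded(cache, database, 'c'); // cache is full, so 'a' is evicted
```

A JavaScript Map is used here because it preserves insertion order, which gives a cheap approximation of LRU ordering.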

3. Write Through Caching

Description: In the Write Through strategy, the cache and the database are written in parallel and synchronously. This guarantees that the cache is always up to date, so users can read data with low latency while the cache and the database remain synchronized.

Professional Insight: This strategy is particularly helpful in financial applications. Consider a payment processing system where user balances or transaction histories must always be up to date. In such systems, the Write Through strategy guarantees that the cache and the database stay synchronized. Nevertheless, this approach increases write latency because each write operation hits both the cache and the database. This overhead can be reduced by optimizing disk I/O or using faster storage.

Code Example:

function writeThrough(cache, database, key, value) {
  // Step 1: Write to both cache and database simultaneously
  database[key] = value; // Write to database
  cache[key] = value; // Write to cache
  console.log(`Written to both cache and database: ${key} -> ${value}`);
}

// Test Write Through
console.log("\nWrite Through Test:");
writeThrough(cache, database, 'user_4', 'David');
readThrough(cache, database, 'user_4'); // Cache hit after write through

Steps:

Write to Database and Cache: When data changes, update the cache and the database at the same time.

Best Practices:

Consistency: This ensures that the cache and the database are synchronized at all times.

Monitor Write Latency: Because each write hits both the cache and the database, write operations can be slower; monitor latency and optimize if necessary.
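The latency-monitoring advice above can be sketched by timing the write path. The timing approach (Date.now deltas) and the idea of returning the measurement are illustrative; a real system would record this metric in a monitoring backend rather than return it.

```javascript
// Write Through with a latency measurement around the dual write.
function writeThroughTimed(cache, database, key, value) {
  const start = Date.now();
  database[key] = value; // synchronous write to the database
  cache[key] = value;    // synchronous write to the cache
  const latencyMs = Date.now() - start;
  // In a real system this metric would be sent to a monitoring backend
  return latencyMs;
}

const cache = {};
const database = {};
const latency = writeThroughTimed(cache, database, 'user_4', 'David');
console.log(`Write completed in ${latency} ms`);
```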

4. Write Back Caching

Description: In the Write Back strategy, data is first written to the cache, and the database is updated separately in the background. This greatly reduces write latency because the application writes only to the cache, and the database is updated later.

Professional Insight: This strategy is particularly suitable for workloads with a high frequency of write operations, for example news feeds or logging systems, where the database does not need to be updated immediately. However, there is a risk of data loss if the cache crashes after a write but before the data reaches the database. Persistent caching solutions such as Redis with Append Only File (AOF) help mitigate this risk by ensuring data persistence even if the cache goes down. Write Back is ideal where many small writes are required but do not need to be immediately synchronized with the underlying database.

Code Example:

function writeBack(cache, database, key, value) {
  // Step 1: Write to cache first
  cache[key] = value;
  console.log(`Written to cache: ${key} -> ${value}`);

  // Step 2: Asynchronously write to the database (simulated with setTimeout)
  setTimeout(() => {
    database[key] = value;
    console.log(`Database updated asynchronously: ${key} -> ${value}`);
  }, 1000); // Simulating a delayed DB update
}

// Test Write Back
console.log("\nWrite Back Test:");
writeBack(cache, database, 'user_6', 'Frank');
readThrough(cache, database, 'user_6'); // Cache hit before DB is updated

Steps:

Write to Cache: Data is written to the cache first.

Asynchronous Database Update: The database is updated asynchronously in the background, outside the application's critical path.

Best Practices:

Durability and Risk Mitigation: Use a persistent cache (for example, Redis with AOF) so that data is not lost if the cache dies before it can synchronize with the database.

Batch Writes: In high-load systems, write to the database in batches rather than issuing single writes, in order to reduce the performance overhead.
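The batching idea above can be sketched with a pending-writes buffer: writes go to the cache immediately and are marked dirty, and the database is updated in one batch flush. The explicit flush call is an assumption for this sketch; real systems typically flush on a timer or when the buffer reaches a size threshold.

```javascript
// Write Back with batched flushes: fast writes to the cache, one bulk
// database update per flush.
const pendingWrites = {};

function writeBackBatched(cache, key, value) {
  cache[key] = value;          // fast path: write to cache only
  pendingWrites[key] = value;  // remember the dirty entry for the next flush
}

function flushPendingWrites(database) {
  // Apply all buffered writes to the database in one pass
  for (const [key, value] of Object.entries(pendingWrites)) {
    database[key] = value;
    delete pendingWrites[key];
  }
}

const cache = {};
const database = {};
writeBackBatched(cache, 'user_6', 'Frank');
writeBackBatched(cache, 'user_7', 'Grace');
// database is still empty at this point; a single flush persists both writes
flushPendingWrites(database);
```

Note that keeping only the latest value per key in the buffer also coalesces repeated writes to the same key, which is one of the main throughput benefits of Write Back.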

5. Write Around

Description: Write Around is a caching technique in which data is written directly to the database, bypassing the cache. The cache is only populated during read operations, when the data is actually accessed. This keeps the cache clean by storing only data that is frequently used.

Professional Insight: This strategy works well in systems with many writes and relatively few reads, such as logging systems. Here, Write Around avoids filling the cache with rarely used data, keeping it efficient. It also decreases write latency because write operations do not pass through the cache. This is a good fit for applications where writes dominate reads and data does not need to be read immediately after it is written.

Code Example:

function writeAround(database, key, value) {
  // Step 1: Write directly to the database
  database[key] = value;
  console.log(`Written only to database: ${key} -> ${value}`);
}

// Test Write Around
console.log("\nWrite Around Test:");
writeAround(database, 'user_5', 'Eve');
readThrough(cache, database, 'user_5'); // Cache miss, fetch from DB and load into cache

Steps:

Write to Database: When writing data, write only to the database.

Update Cache on Read: If the data is later requested and not in the cache, it is loaded into the cache via Read Through or Cache Aside.

Best Practices:

Use for Write-Heavy Workloads: Recommended when a system performs many more write operations than reads.

Prevent Cache Pollution: Do not put data into the cache that may not be requested again for a long time.

Conclusion and Recommendations:

Every caching strategy is appropriate for specific applications. The right choice depends on the needs of the system and on how critical data consistency is. High-traffic systems managing large data flows require special attention when it comes to caching strategies. In my experience, strategies can also be combined in complex environments; for instance, using Write Through for critical data and Write Around for less important data may be the best approach.
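The combined approach suggested above can be sketched with a single write path that dispatches per data class. The `criticalKeys` set is an illustrative way to classify data; a real system might classify by key prefix, table, or access statistics instead.

```javascript
// Mixing strategies per data class: Write Through for critical keys,
// Write Around for everything else.
const criticalKeys = new Set(['balance', 'transaction']);

function write(cache, database, key, value) {
  database[key] = value; // every write reaches the database
  if (criticalKeys.has(key)) {
    cache[key] = value;  // Write Through: keep critical data hot in the cache
  }
  // Write Around: non-critical data skips the cache until it is read
}

const cache = {};
const database = {};
write(cache, database, 'balance', 100);   // cached immediately
write(cache, database, 'log_entry', 'x'); // written to the database only
```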



Written by Şerifhan Işıklı

Senior Software Engineer @Dogus Teknoloji.
