
Asynchronous LRU Cache (CLOCK-2-hand) Implementation In NodeJS

5.00/5 (4 votes)
29 Sep 2021 · GPL3 · 19 min read
CLOCK caching (LRU approximation) with O(1) cache hit, up to N asynchronous O(1) cache misses
This is about implementation details of LRU - CLOCK algorithm and how it performs under different key access patterns.

Introduction

Caching is one of the most important optimizations for maintaining high performance in a wide range of applications. In server-side JavaScript (web apps), caching files and video blobs and providing fast access to page templates are vital. Operating systems (like Ubuntu) already have their own file caching, but when multiple web apps run on the same server and one app streams a lot of video data, the other apps' cached content may be invalidated frequently and their performance drops.

A dedicated Least Recently Used (cache eviction) algorithm for an app's resources makes the app independent of cache pollution caused by other applications on the same server.

Background

The basic idea of caching is to serve (computationally) expensive data from an already precomputed list. Without caching, throughput depends on the slowest storage the data comes from. For a file, that is usually the disk: an HDD is slow at ~100 MB/s, an SSD is still slow at ~500 MB/s, and even NVMe (~5 GB/s) is slower than the cheapest RAM module on the shelf. Caching file contents in RAM improves not only throughput but also latency. In a web app, especially with NodeJS, the latency of an operation (if in a non-async function) decides when other clients will be served. So, an LRU cache should improve performance by 2x - 100x depending on the slowest data store, and an asynchronous LRU should additionally hide the latency of a cache miss.

What are a cache miss, a cache hit, and the cache hit ratio? A cache miss is the operation of fetching data from the slow datastore and assigning it to a victim slot in the cache, in both cache-read and cache-write operations. Writing to the backing store is called "write-back". This article starts with the simplest form of read-only cache and later adds write-cache capability and optimizations. LRU means the least recently used slot is evicted to make room for fresh data. A cache hit is finding the key already in the cache (in RAM) for quick access. The cache hit ratio is the ratio of cache hits to total (miss + hit) accesses.
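As a minimal sketch of that bookkeeping (hypothetical counters, not part of the cache implementation below), the hit ratio can be computed like this:

```javascript
// Hit-ratio bookkeeping sketch (illustrative only):
// every access is either a hit or a miss, and the ratio is hits / total accesses.
let hits = 0;
let misses = 0;
function recordAccess(foundInCache) {
    if (foundInCache) hits++; else misses++;
}
// 3 hits out of 4 accesses -> hit ratio 0.75
[true, true, false, true].forEach(recordAccess);
const hitRatio = hits / (hits + misses);
```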

Using the Code

First, here is a single module file containing the LRU object constructor definition (latest version: https://github.com/tugrul512bit/LruJS):

(This is the initial version of source code, only contains basic read-caching and focuses on simplicity.)

JavaScript
'use strict';
let Lru = function(cacheSize,callbackBackingStoreLoad,elementLifeTimeMs=1000){
    let me = this;
    let maxWait = elementLifeTimeMs;
    let size = parseInt(cacheSize,10);
    let mapping = {};
    let mappingInFlightMiss = {};
    let buf = [];
    for(let i=0;i<size;i++)
    {
        let rnd = Math.random();
        mapping[rnd] = i;
        buf.push({data:"",visited:false, key:rnd, time:0, locked:false});
    }
    let ctr = 0;
    let ctrEvict = parseInt(cacheSize/2,10);
    let loadData = callbackBackingStoreLoad;
    this.get = function(key,callbackPrm){
       
        let callback = callbackPrm;
        if(key in mappingInFlightMiss)
        {
            setTimeout(function(){
                me.get(key,function(newData){
                    callback(newData);
                });
            },0);
            return;
        }

        if(key in mapping)
        {            
            // RAM speed data
            if((Date.now() - buf[mapping[key]].time) > maxWait)
            {                
                if(buf[mapping[key]].locked)
                {                                        
                    setTimeout(function(){
                        me.get(key,function(newData){
                            callback(newData);
                        });
                    },0);                    
                }
                else
                {
                    delete mapping[key];
                    
                    me.get(key,function(newData){
                        callback(newData);
                    });                    
                }                
            }
            else
            {
                buf[mapping[key]].visited=true;
                buf[mapping[key]].time = Date.now();
                callback(buf[mapping[key]].data);
            }
        }
        else
        {
            // datastore loading + cache eviction
            let ctrFound = -1;
            while(ctrFound===-1)
            {
                if(!buf[ctr].locked && buf[ctr].visited)
                {
                    buf[ctr].visited=false;
                }
                ctr++;
                if(ctr >= size)
                {
                    ctr=0;
                }

                if(!buf[ctrEvict].locked && !buf[ctrEvict].visited)
                {
                    // evict
                    buf[ctrEvict].locked = true;
                    ctrFound = ctrEvict;
                }

                ctrEvict++;
                if(ctrEvict >= size)
                {
                    ctrEvict=0;
                }
            }
            
            mappingInFlightMiss[key]=true;
            let f = function(res){
                delete mapping[buf[ctrFound].key];
                buf[ctrFound] = 
                {data: res, visited:false, key:key, time:Date.now(), locked:false};
                mapping[key] = ctrFound;
                callback(buf[ctrFound].data);
                delete mappingInFlightMiss[key];        
            };
            loadData(key,f);
        }
    };
};

exports.Lru = Lru;

File caching example:

JavaScript
let Lru = require("./lrucache.js").Lru;
let fs = require("fs");
let path = require("path");

let fileCache = new Lru(500, async function(key,callback){
  // cache-miss data-load algorithm
    fs.readFile(path.join(__dirname,key),function(err,data){
      if(err) {                                 
        callback({stat:404, data:JSON.stringify(err)});
      }
      else
      {                                
        callback({stat:200, data:data});
      }                                                        
    });
},1000 /* cache element lifetime */);

The Lru constructor gets parameters (cache size, a cache-miss callback with its own key and callback, cache element lifetime) to construct the CLOCK cache.

The async cache-miss callback works like this:

  • some get() call does not find the key in cache
  • algorithm finds a victim slot
  • runs this callback
    • inside the callback, the expensive computation is done, asynchronously
    • at the end, the callback's own callback is invoked to return the data to the LRU cache
  • next time same key is accessed, its data comes from RAM

There is only a get() method for this implementation to maintain simplicity.

JavaScript
fileCache.get("./test.js",function(dat){
     httpResponse.writeHead(dat.stat);
     httpResponse.end(dat.data);
});

Since the result is delivered through another callback, the get() call itself can also run asynchronously.

How Does It Work?

  1. In most popular LRU implementations, there is always a "mapping" object from keys to cache slots, to achieve O(1) key-search time complexity (in terms of the number of cache slots). This one is no different. In fact, JavaScript makes it very easy to do:

    Mapping object:

    JavaScript
    let mapping = {};

    Finding a (string/integer) key in mapping:

    JavaScript
    if(key in mapping)
    {
       // key found, get data from RAM
    }

    It is efficient (O(1)) and simple (at least much simpler than C++ version).
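A standalone sketch of the lookup pattern described above (illustrative only, using a plain object as the article does):

```javascript
// A plain object maps each cached key to its slot index, so membership
// tests and reads are O(1) on average.
const mapping = {};
mapping["./test.js"] = 42;               // key -> slot index
const found = "./test.js" in mapping;    // O(1) membership test
const slot = found ? mapping["./test.js"] : -1;
```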

  2. Once mapping leads to a cache slot, it is straightforward to get the data out of it:
    JavaScript
    buf[mapping[key]].visited=true; 
    buf[mapping[key]].time = Date.now(); 
    callback(buf[mapping[key]].data);

    The visited flag is to inform CLOCK hands (ctr and ctrEvict) to save this slot from possible eviction. The time field is used for life-time management of slots. Every access (with a cache-hit) renews the time field so that it can stay in cache.

    The callback function is the user-given function for get() method to retrieve the data of cache slot.

  3. Before getting data from the mapped slot, its lifetime is checked. If it has expired, the slot is invalidated by deleting its mapping and retrying with the same key (which produces a cache miss):
    JavaScript
    if((Date.now() - buf[mapping[key]].time) > maxWait)
    {
        delete mapping[key];
        me.get(key,function(newData){
            callback(newData);
        });
    }

    Since the mapping is deleted, no other asynchronous access interferes with its internal state.

  4. If the key is not found in the mapping object, the LRU eviction logic runs (actually an approximation of LRU: CLOCK with 2 hands, aka Second Chance). The search for a victim is easy:
    JavaScript
    let ctrFound = -1;
    while(ctrFound===-1)
    {
        if(!buf[ctr].locked && buf[ctr].visited)
        {
            buf[ctr].visited=false;
        }
        ctr++;
        if(ctr >= size)
        {
            ctr=0;
        }
    
        if(!buf[ctrEvict].locked && !buf[ctrEvict].visited)
        {
            // evict
            buf[ctrEvict].locked = true;
            ctrFound = ctrEvict;
        }
    
        ctrEvict++;
        if(ctrEvict >= size)
        {
            ctrEvict=0;
        }
    }

    The first "if" block checks whether the slot pointed to by the "second chance" hand (ctr) is unlocked and visited. If so, as a second chance, it does not evict but marks the slot as non-visited.

    The third "if" block checks whether the slot pointed to by the "eviction" hand (ctrEvict) is unlocked and not visited. If so, the slot is marked "locked" (to guard against concurrent asynchronous get() accesses), the eviction slot is found, and the loop ends.

    Observe that the two hands ctr and ctrEvict are initialized with 50% phase difference:

    JavaScript
    let ctr = 0;
    let ctrEvict = parseInt(cacheSize/2,10);

    and inside the "while" loop, they are incremented equally. This means one hand always follows the other circularly, constantly checking the other's results. The more cache slots there are, the better the victim search. A visited key survives at least N/2 clock-hand movements before it can be evicted.
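The two-hand search can also be studied in isolation. The following toy simulation (a hypothetical 4-slot buffer, no locking) mirrors the ctr/ctrEvict movement described above but is not the cache itself:

```javascript
// Standalone sketch of the 2-hand victim search on a tiny 4-slot buffer.
const size = 4;
const buf = [
    { visited: true }, { visited: false }, { visited: true }, { visited: true }
];
let ctr = 0;                 // second-chance hand
let ctrEvict = size >> 1;    // eviction hand, 50% phase difference
function findVictim() {
    while (true) {
        if (buf[ctr].visited) buf[ctr].visited = false;  // second chance
        ctr = (ctr + 1) % size;
        if (!buf[ctrEvict].visited) return ctrEvict;     // victim found
        ctrEvict = (ctrEvict + 1) % size;
    }
}
```

Here slot 1 was never visited, but the eviction hand starts at slot 2, so the second-chance hand clears slots ahead of it until the eviction hand circles back to a non-visited slot.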

  5. After the victim slot is found, the victim's old mapping entry is deleted (guarding against future async collisions) and the new mapping is created after the datastore load:
    JavaScript
    mappingInFlightMiss[key]=true; 
    let f = function(res){ 
        delete mapping[buf[ctrFound].key]; 
        buf[ctrFound] = {data: res, visited:false, key:key, time:Date.now(), locked:false}; 
        mapping[key] = ctrFound; 
        callback(buf[ctrFound].data); 
        delete mappingInFlightMiss[key]; 
    }; 
    
    loadData(key,f);

    Since the user-given cache-miss datastore-loading function (loadData) can be asynchronous, there can be up to N in-flight cache misses for this cache. This optimization hides up to N cache-miss latencies, which is important for high throughput (especially in web apps serving thousands of visitors per second). Issuing more than N asynchronous cache misses at once causes a deadlock, so a cache with 100 slots can serve up to 100 users asynchronously. Alternatively, accesses can be throttled to some value M lower than N and computed in K passes (where M x K = total accesses).

    Since cache misses can be hidden and cache hits run at RAM speed, the overall performance looks like O(1) for both cache miss and cache hit. Per element access, though, the clock hands may iterate multiple times, especially when the cache has only a few slots. When the number of slots is increased, it approaches O(1).

    Inside this loadData callback, the new slot data has its locked field set to false, making the slot available again to other asynchronous accesses.
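The throttling idea above (limiting in-flight accesses to some M < N) can be sketched with a hypothetical wrapper around any cache exposing the same get(key, callback) API; this is an illustration, not part of the article's code:

```javascript
// Hypothetical throttle wrapper: at most maxInFlight concurrent get() calls;
// extra requests queue up and drain as earlier ones complete, so a small
// cache cannot be overwhelmed by a burst of accesses.
function throttledGet(cache, maxInFlight) {
    let inFlight = 0;
    const queue = [];
    function pump() {
        while (inFlight < maxInFlight && queue.length > 0) {
            const [key, cb] = queue.shift();
            inFlight++;
            cache.get(key, function (data) {
                inFlight--;
                cb(data);
                pump();   // start queued requests as slots free up
            });
        }
    }
    return function (key, cb) {
        queue.push([key, cb]);
        pump();
    };
}
```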

  6. If there is a cache hit but the found slot is locked and its lifetime has expired, the access is postponed via setTimeout with a 0 ms delay (more CPU-friendly than setImmediate) towards the end of JavaScript's message queue, where it re-checks until the locked operation (a cache miss) completes. In terms of time complexity, this still counts as O(1) with a bigger latency, but that latency stays hidden behind the awaited latency of the locked operation.
    JavaScript
    if(buf[mapping[key]].locked) 
    { 
        setTimeout(function(){ 
            me.get(key,function(newData){ 
                callback(newData); 
            }); 
        },0); 
    } 
  7. Lastly, if a key is already in the in-flight cache-miss mapping, the access is postponed to a later position in the message queue by setTimeout:
    JavaScript
    if(key in mappingInFlightMiss)
    {
    
      setTimeout(function(){
         me.get(key,function(newData){
                  callback(newData);
         });
      },0);
      return;
    }

    This way, duplicate loads of the same key are avoided.
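The same per-key de-duplication can also be written without the setTimeout retry, by queueing waiters per key. A sketch of that alternative approach (a hypothetical helper, not part of the article's cache):

```javascript
// Coalesce concurrent loads of the same key: the loader runs once per key,
// and every callback waiting on that key receives the same result.
function makeCoalescedLoader(load) {
    const pending = new Map();    // key -> callbacks awaiting that key
    return function (key, cb) {
        if (pending.has(key)) { pending.get(key).push(cb); return; }
        pending.set(key, [cb]);
        load(key, function (res) {
            const waiters = pending.get(key);
            pending.delete(key);
            waiters.forEach(function (w) { w(res); });
        });
    };
}
```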

Benchmarking

Asynchronous Cache-Miss Benchmark

JavaScript
"use strict";
// number of asynchronous accessors(1000 here) need to be equal to or less than 
// cache size(1000 here) or it makes dead-lock
let Lru = require("./lrucache.js").Lru;

let cache = new Lru(1000, async function(key,callback){
    // cache-miss data-load algorithm
    setTimeout(function(){
        callback(key+" processed");
    },1000);
},1000 /* cache element lifetime */);

let ctr = 0;
let t1 = Date.now();
for(let i=0;i<1000;i++)
{
    cache.get(i,function(data){
        console.log("data:"+data+" key:"+i);
        if(i.toString()+" processed" !== data)
        {
            console.log("error: wrong key-data mapping.");
        }
        if(++ctr === 1000)
        {
            console.log("benchmark: "+(Date.now()-t1)+" miliseconds");
        }
    });
}

To avoid deadlock, the LRU size is chosen as 1000, or equivalently the for loop is limited to 1000 iterations.

Output:

JavaScript
benchmark: 1127 miliseconds

Since each cache miss has 1000 milliseconds of latency, loading 1000 elements sequentially would take nearly 17 minutes, but with overlapped cache misses it is much quicker. This is especially useful in I/O-heavy workloads like streaming data from HDD or network.
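The overlap effect can be demonstrated in isolation. This sketch (a hypothetical simulateOverlapped helper) launches N independent latencies concurrently and measures the wall time, which stays close to a single latency rather than N of them:

```javascript
// N overlapped independent latencies cost roughly one latency, not N.
function simulateOverlapped(n, latencyMs, done) {
    const t0 = Date.now();
    let remaining = n;
    for (let i = 0; i < n; i++) {
        setTimeout(function () {
            if (--remaining === 0) done(Date.now() - t0);
        }, latencyMs);
    }
}
```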

Cache-Hit Ratio Benchmark

10% hit rate:

  • key generation: random, 10000 different values possible
  • 1000 slots
JavaScript
"use strict";
// number of asynchronous accessors(1000 here) need to be equal to or less than 
// cache size(1000 here) or it makes dead-lock
let Lru = require("./lrucache.js").Lru;

let cacheMiss = 0;
let cache = new Lru(1000, async function(key,callback){
    cacheMiss++;
    // cache-miss data-load algorithm
    setTimeout(function(){
        callback(key+" processed");
    },100);
},100000000 /* cache element lifetime */);

let ctr = 0;
let t1 = Date.now();
let asynchronity = 500;
let benchRepeat = 100;
let access = 0;

function test()
{
    ctr = 0;
    for(let i=0;i<asynchronity;i++)
    {
        let key = parseInt(Math.random()*10000,10); // 10% hit ratio
        cache.get(key.toString(),function(data){     
            access++;
            if(key.toString()+" processed" !== data)
            {
                console.log("error: wrong key-data mapping.");
            }
            if(++ctr === asynchronity)
            {
                console.log("benchmark: "+(Date.now()-t1)+" miliseconds");
                console.log("cache hit: "+(access - cacheMiss));
                console.log("cache miss: "+(cacheMiss));
                console.log("cache hit ratio: "+((access - cacheMiss)/access));
                if(benchRepeat>0)
                {
                    benchRepeat--;
                    test();
                }
            }
        });
    }
}

test();

Result:

benchmark: 10498 miliseconds
cache hit: 6151
cache miss: 44349
cache hit ratio: 0.1218019801980198

Since the benchmark runs in 100 steps with 100 ms latency per cache miss, it took about 10 seconds (close to 100 x 100 milliseconds). The hit ratio is close to the expected 10%.

50% Hit Ratio Test

JavaScript
let key = parseInt(Math.random()*2000,10); // 50% hit ratio

Result:

benchmark: 10418 miliseconds
cache hit: 27541
cache miss: 22959
cache hit ratio: 0.5453663366336634

Again, close to the expected 50% ratio.

99% Hit Ratio Test

JavaScript
let key = parseInt(Math.random()*1010,10); // 99% hit ratio

Result:

benchmark: 10199 miliseconds
cache hit: 49156
cache miss: 1344
cache hit ratio: 0.9733861386138614 

The randomness of the keys keeps the measured ratio at 0.9733 instead of exactly 0.99.

100% Hit Ratio Test

JavaScript
let key = parseInt(Math.random()*999,10); // 100% hit ratio

Result:

JavaScript
benchmark: 1463 miliseconds
cache hit: 49501
cache miss: 999
cache hit ratio: 0.9802178217821782

After the first pass of the benchmark (which cannot avoid cache misses), everything came from RAM, reducing the total latency a lot.

Points of Interest

It is a bit hard to see bugs in asynchronous code. To overcome this, I used a debugger object that recorded all slot values on every function call together with their Date.now() timestamps, then sorted the records by time and followed the path of every key access. This is a bit exhausting, but easier than staring at the screen and following N code paths by eye.

Edit: With an "update" method, cache items can be re-fetched from the backing store after a "write" operation. The latest version (~160 lines of code) adds some optimization for asynchronicity and solves the deadlock problem:

JavaScript
'use strict';
/*
cacheSize: number of elements in cache, constant, 
must be greater than or equal to number of asynchronous accessors / cache misses
callbackBackingStoreLoad: user-given cache-miss function to load data from datastore
elementLifeTimeMs: maximum miliseconds before an element is invalidated, 
only invalidated at next get() call with its key
reload: evicts all cache to reload new values from backing store
reloadKey: only evicts selected item (to reload its new value on next access)
*/

let Lru = function(cacheSize,callbackBackingStoreLoad,elementLifeTimeMs=1000){
    const me = this;
    
    const maxWait = elementLifeTimeMs;
    const size = parseInt(cacheSize,10);
    const mapping = {};
    const mappingInFlightMiss = {};
    const bufData = new Array(size);
    const bufVisited = new Uint8Array(size);
    const bufKey = new Array(size);
    const bufTime = new Float64Array(size);
    const bufLocked = new Uint8Array(size);
    for(let i=0;i<size;i++)
    {
        let rnd = Math.random();
        mapping[rnd] = i;
        
        bufData[i]="";
        bufVisited[i]=0;
        bufKey[i]=rnd;
        bufTime[i]=0;
        bufLocked[i]=0;
    }
    let ctr = 0;
    let ctrEvict = parseInt(cacheSize/2,10);
    const loadData = callbackBackingStoreLoad;
    let inFlightMissCtr = 0;
    // refresh all cache content
    this.reload=function(){
        for(let i=0;i<size;i++)
        {
            bufTime[i]=0;
        }
    };
    // refresh item in cache by triggering eviction
    this.reloadKey=function(key){
        if(key in mapping)
        {
            bufTime[mapping[key]]=0;
        }
    };
    this.get = function(keyPrm,callbackPrm){
        const key = keyPrm;
        const callback = callbackPrm;
        
        // stop dead-lock when many async get calls are made
        if(inFlightMissCtr>=size)
        {
            setTimeout(function(){
                me.get(key,function(newData){
                    callback(newData);
                });
            },0);
            return;
        }
        
        // delay the request towards end of the cache-miss completion
        if(key in mappingInFlightMiss)
        {

            setTimeout(function(){
                me.get(key,function(newData){
                    callback(newData);
                });
            },0);
            return;
        }

        if(key in mapping)
        {
            let slot = mapping[key];
            // RAM speed data
            if((Date.now() - bufTime[slot]) > maxWait)
            {
                
                if(bufLocked[slot])
                {                                        
                    setTimeout(function(){
                        me.get(key,function(newData){
                            callback(newData);
                        });
                    },0);                    
                }
                else
                {
                    delete mapping[key];
                    
                    me.get(key,function(newData){
                        callback(newData);
                    });                    
                }                
            }
            else
            {
                bufVisited[slot]=1;
                bufTime[slot] = Date.now();
                callback(bufData[slot]);
            }
        }
        else
        {
            // datastore loading + cache eviction
            let ctrFound = -1;
            while(ctrFound===-1)
            {
                // give slot a second chance before eviction
                if(!bufLocked[ctr] && bufVisited[ctr])
                {
                    bufVisited[ctr]=0;
                }
                ctr++;
                if(ctr >= size)
                {
                    ctr=0;
                }

                // eviction conditions
                if(!bufLocked[ctrEvict] && !bufVisited[ctrEvict])
                {
                    // evict
                    bufLocked[ctrEvict] = 1;
                    inFlightMissCtr++;
                    ctrFound = ctrEvict;
                }

                ctrEvict++;
                if(ctrEvict >= size)
                {
                    ctrEvict=0;
                }
            }
            
            mappingInFlightMiss[key]=1;
            let f = function(res){
                delete mapping[bufKey[ctrFound]];

                bufData[ctrFound]=res;
                bufVisited[ctrFound]=0;
                bufKey[ctrFound]=key;
                bufTime[ctrFound]=Date.now();
                bufLocked[ctrFound]=0;

                mapping[key] = ctrFound;
                callback(bufData[ctrFound]);
                inFlightMissCtr--;
                delete mappingInFlightMiss[key];        
            };
            loadData(key,f);
        }
    };
};

exports.Lru = Lru;

With this version, deadlock no longer occurs: surplus requests are merely postponed until the number of in-flight item fetches falls back below N. Refreshing an item:

JavaScript
cache.reloadKey(someKey);

does not immediately write the new value into the cache. The refresh happens only when the item is next requested by the get method, asynchronously with respect to all other in-flight cache misses. This hides latency better than refreshing on demand, but the overall latency per item may increase because the refresh is postponed until the next "get" access.

What About Callback-Hell?

Since this is a callback-based cache, requesting multiple values at once from a common synchronization point leads to code like this:

JavaScript
cache.get(key1, function(data1){
    cache.get(key2, function(data2){
         cache.get(key3, function(data3){
              ...
         });
    });
});

It becomes even worse when a pixel (in an image-processing app) requires its 25 neighboring pixel values. Even minified code would look pretty compared to this. The solution to this problem is simple:

JavaScript
this.getMultiple = function(callback, ... keys){
    let result = [];
    let ctr = keys.length;
    for(let i=0;i<ctr;i++)
        result.push(0);
    let ctr2 = 0;
    keys.forEach(function(key){
        let ctr3=ctr2++; // keep order of outputs

        // async operation
        me.get(key,function(data){
            result[ctr3] = data;
            ctr--;
            if(ctr==0)
            {
                callback(result);
            }
        });
    });
};

Here, a variadic function lets the user request as many values as needed, passing comma-separated keys after the callback. Since there are no atomic operations or multiple threads as in C++, checking that all operations are complete only requires decrementing a counter.

Each operation fetches the value for one key and decrements the counter. When the last asynchronous result arrives, the ctr counter reaches zero and the callback is called with the result array. Usage looks like this:

JavaScript
cache.getMultiple(function(dataArray){
    console.log(dataArray); // [.., .., .., .., ..]
},"a key",5,"another key",someKeyVariable,"my last key");

This can be optimized further, if only for nicer-looking user code. Promises in JavaScript are good for this, and can be constructed as simply as:

JavaScript
this.getAwaitable = function(key){
    return new Promise( (success,fail)=>{
         me.get(key, function(data){ success(data); });
    });
};

Then it looks like this from user's space:

JavaScript
let mySynchronizedValue = await cache.getAwaitable(myKey);

but this works against the aim of asynchronous caching. Await makes JavaScript's (single) thread wait for the value, so no new cache-get operations can be launched in the same scope. To optimize a bit, the user can write:

JavaScript
// all asynchronous, starts working as soon as their Promise is created
let myAsync1 = cache.getAwaitable(myKey1);
let myAsync2 = cache.getAwaitable(myKey2);
let myAsync3 = cache.getAwaitable(myKey3);

// synchronized between each other, only waits completion of all 3 operations
let mySync1 = await myAsync1;
let mySync2 = await myAsync2;
let mySync3 = await myAsync3;
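The three awaits can also be collapsed with Promise.all; a sketch assuming a cache exposing the getAwaitable() wrapper shown above (the helper name getAll is hypothetical):

```javascript
// Resolve several keys concurrently: all gets start immediately,
// and we await once for the whole batch.
async function getAll(cache, keys) {
    return Promise.all(keys.map(k => cache.getAwaitable(k)));
}
```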

Also, allocating and freeing Promises in a tight loop is slower than pure callbacks, because Promises do more work in the background and are usually "await"ed right after construction (which decreases concurrency compared to free-running callbacks). The following code is the latest source from GitHub, with more optimizations and features:

JavaScript
'use strict';
/*
cacheSize: number of elements in cache, constant, 
must be greater than or equal to number of asynchronous accessors / cache misses
callbackBackingStoreLoad: user-given cache-miss function to load data from datastore
elementLifeTimeMs: maximum miliseconds before an element is invalidated, 
only invalidated at next get() call with its key
reload: evicts all cache to reload new values from backing store
reloadKey: only evicts selected item (to reload its new value on next access)
*/

let Lru = function(cacheSize,callbackBackingStoreLoad,
                   elementLifeTimeMs=1000,callbackBackingStoreSave){
    const me = this;
    const aTypeGet = 0;
    const aTypeSet = 1;
    const maxWait = elementLifeTimeMs;
    const size = parseInt(cacheSize,10);
    const mapping = new Map();
    const mappingInFlightMiss = new Map();
    const bufData = new Array(size);
    const bufVisited = new Uint8Array(size);

    const bufEdited = new Uint8Array(size);

    const bufKey = new Array(size);
    const bufTime = new Float64Array(size);
    const bufLocked = new Uint8Array(size);
    for(let i=0;i<size;i++)
    {
        let rnd = Math.random();
        mapping.set(rnd,i);
        
        bufData[i]="";
        bufVisited[i]=0;
        bufEdited[i]=0;
        bufKey[i]=rnd;
        bufTime[i]=0;
        bufLocked[i]=0;
    }
    let ctr = 0;
    let ctrEvict = parseInt(cacheSize/2,10);
    const loadData = callbackBackingStoreLoad;
    const saveData = callbackBackingStoreSave;
    let inFlightMissCtr = 0;
    // refresh all items time-span in cache
    this.reload=function(){
        for(let i=0;i<size;i++)
        {
            bufTime[i]=0;
        }
    };
    // refresh item time-span in cache by triggering eviction
    this.reloadKey=function(key){
        if(mapping.has(key))
        {
            bufTime[mapping.get(key)]=0;
        }
    };

    // get value by key
    this.get = function(keyPrm,callbackPrm){
        // aType=0: get
        access(keyPrm,callbackPrm,aTypeGet);
    };

    // set value by key (callback returns same value)
    this.set = function(keyPrm,valuePrm,callbackPrm){
        // aType=1: set
        access(keyPrm,callbackPrm,aTypeSet,valuePrm);
    };
    
    // aType=0: get
    // aType=1: set
    function access(keyPrm,callbackPrm,aType,valuePrm){
        
        const key = keyPrm;
        const callback = callbackPrm;
        const value = valuePrm;
        // stop dead-lock when many async get calls are made
        if(inFlightMissCtr>=size)
        {
            setTimeout(function(){
                // get/set
                access(key,function(newData){
                    callback(newData);
                },aType,value);
            },0);
            return;
        }
        
        // if key is busy, then delay the request towards end of the cache-miss completion
        if(mappingInFlightMiss.has(key))
        {
            
            setTimeout(function(){
                // get/set
                access(key,function(newData){
                    callback(newData);
                },aType,value);
            },0);
            return;
        }

        if(mapping.has(key))
        {
            // slot is an element in the circular buffer of CLOCK algorithm
            let slot = mapping.get(key);

            // RAM speed data
            if((Date.now() - bufTime[slot]) > maxWait)
            {
                
                // if slot is locked by another operation, postpone the current operation
                if(bufLocked[slot])
                {                                        
                    setTimeout(function(){
                        access(key,function(newData){
                            callback(newData);
                        },aType,value);
                    },0);
                    
                }
                else // slot is not locked and its lifespan has ended
                {
                    // if it was edited, update the backing-store first
                    if(bufEdited[slot] == 1)
                    {
                        bufLocked[slot] = 1;
                        bufEdited[slot]=0;
                        mappingInFlightMiss.set(key,1); // lock key
                        inFlightMissCtr++;
                        // update backing-store, this is async
                        saveData(bufKey[slot],bufData[slot],function(){
                            mappingInFlightMiss.delete(key);    // unlock key
                            bufLocked[slot] = 0;
                            inFlightMissCtr--;

                            mapping.delete(key); // disable mapping for current key
                            
                            // re-simulate the access, async
                            access(key,function(newData){
                                callback(newData);
                            },aType,value);

                        });
                    }
                    else
                    {
                        mapping.delete(key); // disable mapping for current key
                        access(key,function(newData){
                            
                            callback(newData);
                        },aType,value);
                    }
                }
                
            }
            else    // slot life span has not ended
            {
                bufVisited[slot]=1;
                bufTime[slot] = Date.now();

                // if it is a "set" operation
                if(aType == aTypeSet)
                {    
                    bufEdited[slot] = 1; // later used when data needs to be written 
                                         // to data-store (write-cache feature)
                    bufData[slot] = value;
                }
                callback(bufData[slot]);
            }
        }
        else
        {
            // datastore loading + cache eviction
            let ctrFound = -1;
            let oldVal = 0;
            let oldKey = 0;
            while(ctrFound===-1)
            {
                // give slot a second chance before eviction
                if(!bufLocked[ctr] && bufVisited[ctr])
                {
                    bufVisited[ctr]=0;
                }
                ctr++;
                if(ctr >= size)
                {
                    ctr=0;
                }

                // eviction conditions
                if(!bufLocked[ctrEvict] && !bufVisited[ctrEvict])
                {
                    // eviction preparations, lock the slot
                    bufLocked[ctrEvict] = 1;
                    inFlightMissCtr++;
                    ctrFound = ctrEvict;
                    oldVal = bufData[ctrFound];
                    oldKey = bufKey[ctrFound];
                }

                ctrEvict++;
                if(ctrEvict >= size)
                {
                    ctrEvict=0;
                }
            }
            
            // user-requested key is now asynchronously in-flight & 
            // locked for other operations
            mappingInFlightMiss.set(key,1);
            
            // eviction function. least recently used data is gone, 
            // newest recently used data is assigned
            let evict = function(res){

                mapping.delete(bufKey[ctrFound]);

                bufData[ctrFound]=res;
                bufVisited[ctrFound]=0;
                bufKey[ctrFound]=key;
                bufTime[ctrFound]=Date.now();
                bufLocked[ctrFound]=0;

                mapping.set(key,ctrFound);
                callback(res);
                inFlightMissCtr--;
                mappingInFlightMiss.delete(key);
            
            };

            // if old data was edited, send it to data-store first, then fetch new data
            if(bufEdited[ctrFound] == 1)
            {
                if(aType == aTypeGet)
                    bufEdited[ctrFound] = 0;

                // old edited data is sent back to data-store
                saveData(oldKey,oldVal,function(){
                    if(aType == aTypeGet)
                        loadData(key,evict);
                    else if(aType == aTypeSet)
                        evict(value);
                });
            }
            else
            {
                if(aType == aTypeSet)
                    bufEdited[ctrFound] = 1;
                if(aType == aTypeGet)
                    loadData(key,evict);
                else if(aType == aTypeSet)
                    evict(value);    
            }
        }
    };

    this.getAwaitable = function(key){
        return new Promise(function(success,fail){
            me.get(key,function(data){
                success(data);
            });
        });
    }

    this.setAwaitable = function(key,value){
        return new Promise(function(success,fail){
            me.set(key,value,function(data){
                success(data);
            });
        });
    }

    this.getMultiple = function(callback, ... keys){
        let result = [];
        let ctr1 = keys.length;
        for(let i=0;i<ctr1;i++)
            result.push(0);
        let ctr2 = 0;
        keys.forEach(function(key){
            let ctr3 = ctr2++;
            me.get(key,function(data){
                result[ctr3] = data;
                ctr1--;
                if(ctr1==0)
                {
                    callback(result);
                }
            });
        });
    };

    this.setMultiple = function(callback, ... keyValuePairs){
        let result = [];
        let ctr1 = keyValuePairs.length;
        for(let i=0;i<ctr1;i++)
            result.push(0);
        let ctr2 = 0;
        keyValuePairs.forEach(function(pair){
            let ctr3 = ctr2++;
            me.set(pair.key,pair.value,function(data){
                result[ctr3] = data;
                ctr1--;
                if(ctr1==0)
                {
                    callback(result);
                }
            });
        });
    };

    this.getMultipleAwaitable = function(... keys){
        return new Promise(function(success,fail){
            me.getMultiple(function(results){
                success(results);
            }, ... keys);
        });
    };

    this.setMultipleAwaitable = function(... keyValuePairs){
        return new Promise(function(success,fail){
            me.setMultiple(function(results){
                success(results);
            }, ... keyValuePairs);
        });
    };

    // push all edited slots to backing-store and reset all slots lifetime 
    // to "out of date"
    this.flush = function(callback){
        let ctr1 = 0;
        function waitForReadWrite(callbackW){

            // if there are in-flight cache-misses cache-write-misses 
            // or active slot locks, then wait
            if(mappingInFlightMiss.size > 0 || 
               bufLocked.reduce((e1,e2)=>{return e1+e2;}) > 0)
            {
                setTimeout(()=>{ waitForReadWrite(callbackW); },0);
            }
            else
                callbackW();
        }
        waitForReadWrite(function(){  
            // flush all slots
            for(let i=0;i<size;i++)
            {
                bufTime[i]=0;
                if(bufEdited[i] == 1)
                    ctr1++;
            }

            // nothing to flush: invoke callback immediately
            if(ctr1 == 0)
            {
                callback();
                return;
            }

            for(let i=0;i<size;i++)
            {
                if(bufEdited[i] == 1)
                {        
                    // async
                    me.set(bufKey[i],bufData[i],function(val){
                        ctr1--;
                        if(ctr1 == 0)
                        {
                            callback(); // flush complete
                        }
                    });
                }
            }
        });
    };
};

exports.Lru = Lru;

The following benchmark results are from an old version where mapping was maintained by simple { } objects.

With fully asynchronous value requests spanning thousands of files, file chunks, HTTP requests and other tasks, the performance disadvantage of the JavaScript language is minimized.

More Optimizations

After making the cache work correctly, more optimizations can be made. One of them is replacing plain objects for the key mappings. (The latest version of the complete source code is at the end of the article, after the benchmarking part.)

JavaScript
let mapping = {}; // sub-optimal performance, sometimes breaks running

A plain object cannot safely accept "__proto__" or similar JavaScript-specific strings as keys, because they collide with the prototype chain. Instead of fine-tuning the plain object against these failures, the job should be handled by the data structure that was made for it.
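The hazard is easy to reproduce. In a plain object, "__proto__" is intercepted by the prototype chain instead of being stored as an ordinary key, while a Map accepts it like any other string (a minimal, standalone demonstration):

```javascript
// plain object: "__proto__" collides with the prototype chain
const obj = {};
obj["__proto__"] = 42;                    // silently does NOT create a normal property
console.log(obj["__proto__"] === 42);     // false

// Map: any string (or even an object) works as a key
const map = new Map();
map.set("__proto__", 42);
console.log(map.get("__proto__") === 42); // true
```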

JavaScript
let mapping = new Map(); // faster than {} and even objects can be keys now

Since Map is typically implemented as a hash table, its get/set/has methods and size lookups run in amortized O(1) time. When using a Map, it is important not to mix it with plain-object usage. One may accidentally do this:

JavaScript
let mapping = new Map();
mapping[aKey] = aValue; // this sets object { } properties instead

This bypasses the Map's internal storage and sets a plain object property instead. The optimized methods must be called directly:

JavaScript
let mapping = new Map();
mapping.set(aKey,aValue); // optimized for frequent calls
mapping.get(aKey); // optimized for frequent calls

Another specialty of Map is a higher maximum cache size, owing to a smaller memory footprint than plain objects. The ability to hold more cache items makes it useful for other kinds of jobs too; managing a medium-sized database becomes feasible.

Even though Map is better for memory efficiency, the number of in-flight gets/sets increases temporary memory consumption. With a cache size of 25 elements (32-bit integer keys, 64-bit floating-point values) and a concurrency of 5, the NodeJS app consumes up to 45 MB while processing 400x400 image data. When concurrency is 800000, memory consumption starts at 105 MB and grows to 225 MB, despite only 25 elements in the cache, because 800k items are waiting in a queue to reach its 25 slots. Allocating 1 million cache slots with a concurrency of 5 (5 pixels requested at a time), memory consumption is 140 MB - 280 MB depending on garbage-collector activity. When concurrency is increased to 800000 (all loop iterations made asynchronous), the requirement grows to 370 MB. In-flight cache-miss functions cost extra memory, but the overhead is negligible when concurrency is not too high.

A 1-million-element cache requiring at least 140 MB works out to 140 bytes per cache item. This is the book-keeping cost of the asynchronous operations that hide multiple backing-store latencies (for increased performance and ease of use) while sustaining thousands of operations per millisecond of throughput.

To compare with some in-memory databases that have their own caching:

  • Redis 1million small items cost 85MB memory
  • Hazelcast 1million items cost at least 50MB memory

After exchanging plain objects for Maps, the cache's construction part starts with this:

JavaScript
const mapping = new Map();               // key --> value mapping
const mappingInFlightMiss = new Map();   // key --> busy? mapping 
const bufData = new Array(size);          // value of cache slots
const bufVisited = new Uint8Array(size);  // second chance status of cache slots
const bufEdited = new Uint8Array(size);   // edited status of cache slots
const bufKey = new Array(size);           // keys of cache slots
const bufTime = new Float64Array(size);   // life-span info of cache slots
const bufLocked = new Uint8Array(size);   // slot busy status of cache slots

In the first version, there was only a generic buf array holding all status info of all cache slots. Here, all variables of a "cache slot" (circular-buffer element) are separated into their own arrays, to avoid unnecessarily loading a whole slot every time a single 8-bit value is required. This keeps the minimum performance of low-end systems higher and pollutes the CPU cache less, which benefits all kinds of systems. The test system has only 9-10 GB/s memory bandwidth and benefits from this optimization.
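The layout difference can be sketched with a hypothetical micro-example (the array names mirror the cache's, but the functions are illustrative, not from the source): scanning a single flag over all slots touches only one byte per slot in the struct-of-arrays form, instead of dragging each whole slot object through the CPU cache.

```javascript
const size = 4;

// array-of-structs: reading one flag loads the whole slot object
const slotsAoS = new Array(size).fill(null).map(() => ({
    visited: 0, edited: 0, locked: 0, time: 0, key: null, data: null
}));

// struct-of-arrays: each status field lives in its own compact typed array
const bufLocked  = new Uint8Array(size); // 1 byte per slot
const bufVisited = new Uint8Array(size);

function anyLockedSoA() {                // touches only `size` contiguous bytes
    for (let i = 0; i < size; i++)
        if (bufLocked[i]) return true;
    return false;
}

function anyLockedAoS() {                // touches `size` scattered heap objects
    for (let i = 0; i < size; i++)
        if (slotsAoS[i].locked) return true;
    return false;
}
```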

Write-Caching

The next important feature is offloading "write" operations onto the cache. Previously, if a user had to update the backing-store, the update had to be done directly (the slow way) first, followed by a reloadKey() call on the cache so that the get() method would return the latest value. On a product-gallery web site this hardly matters, as product information, user information, page contents and attached images are not updated as frequently as clients visit the site. But once data updates need to be performant, caching them is a good option. A write-cache is useful in many applications, from forums to open-world games. In an open-world game especially, when the player changes world data, it must be remembered when the player revisits the same place, without overflowing RAM and faster than an HDD.

Adding a "set" method to the cache not only reduces extra API calls like reloadKey() but also improves performance of writes similar to the reads because it uses same LRU eviction with only few changes to the code. To add the write-cache feature into the algorithm, an extra status field called "bufEdited" is required.

The "edited" status of a cache slot is changed to 1 whenever data is written to it. It is set to 0 when the data is updated on the backing-store. Edited is equivalent to "needing an update on backing-store" or in other words, the "dirty bit".

Since both write and read operations run on the same eviction algorithm, there is no need to duplicate eviction-specific code. The get()/set() methods become wrappers around a hidden access() method, which takes extra parameters to select between the get and set algorithms:

JavaScript
function access(keyPrm,callbackPrm,aType,valuePrm){ .. }

The aType parameter selects the operation type: if it is aTypeGet (0), a get operation runs; otherwise (aTypeSet) it is a set operation. The valuePrm parameter carries the value that a set operation assigns to the data slot mapped to the keyPrm key.
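The public methods then reduce to thin wrappers around access(). The sketch below substitutes a trivial synchronous stub for access() so the wrapper pattern is runnable on its own; the real method bodies appear in the full source at the end of the article:

```javascript
const aTypeGet = 0, aTypeSet = 1;

// hypothetical standalone access() stub, standing in for the cache's real one
const store = new Map();
function access(keyPrm, callbackPrm, aType, valuePrm) {
    if (aType === aTypeSet) store.set(keyPrm, valuePrm);
    callbackPrm(store.get(keyPrm));
}

// get()/set() only choose the operation type and forward the arguments
function get(key, callback)        { access(key, callback, aTypeGet);        }
function set(key, value, callback) { access(key, callback, aTypeSet, value); }
```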

The cache also does not know what the backing-store is or how writing is done. The user supplies this, just like the cache-miss function given to the constructor:

JavaScript
let cache = new Lru(cacheSize,
                    callbackBackingStoreLoad,
                    elementLifeTimeMs,
                    callbackBackingStoreSave);

The callbackBackingStoreSave function is called by the cache whenever a slot with "edited" status set needs to update the backing-store. The cache calls it with three parameters: key, value and a callback.

JavaScript
let cache = new Lru(cacheSize,
                    callbackBackingStoreLoad,
                    elementLifeTimeMs,
                    function(key,value,callback){
                           backing_store[key] = value;
                           callback();
});

This user-defined function is the data-storing half of the eviction algorithm, just as callbackBackingStoreLoad is the data-loading half used on a read-cache-miss. Since a generic cache cannot know what to do with an SSD or a database, the user provides the necessary details through callbacks.

The first part of the implementation that writes to the backing-store is the out-of-date branch (taken when a cache item's lifetime has expired):

JavaScript
...
else // slot is not locked and its lifespan has ended
{
          // if it was edited, update the backing-store first
          if(bufEdited[slot] == 1)
          {
              bufLocked[slot] = 1;
              bufEdited[slot]=0;
              mappingInFlightMiss.set(key,1); // lock key
              inFlightMissCtr++;
              // update backing-store, this is async
              saveData(bufKey[slot],bufData[slot],function(){
                  mappingInFlightMiss.delete(key);    // unlock key
                  bufLocked[slot] = 0;
                  inFlightMissCtr--;

                  mapping.delete(key); // disable mapping for current key
                            
                  // re-simulate the access, async
                  access(key,function(newData){
                      callback(newData);
                  },aType,value);

              });
          }
          else // not edited, can be discarded
          {
              mapping.delete(key); // disable mapping for current key
              access(key,function(newData){
                            
                  callback(newData);
              },aType,value);
          }
}

What this part does:

  • stops other in-flight operations from entering the same cache slot or using the same key
    • this is important for avoiding duplicates
  • clear edited status
    • because it will be written to the backing-store soon
  • save the data to backing-store (possibly asynchronously for a database or a file)
    • user-defined function
  • after writing is complete, unlock the key and slot
  • remove the key from mapping to "free" the slot for eviction purposes
  • re-issue the same access (get/set)
    • the re-issued access finds another slot to work with

The second part where a write occurs is near the eviction:

JavaScript
// if old data was edited, send it to data-store first, then fetch new data
if(bufEdited[ctrFound] == 1)
{
       if(aType == aTypeGet)
            bufEdited[ctrFound] = 0;

       // old edited data is sent back to data-store
       saveData(oldKey,oldVal,function(){
            if(aType == aTypeGet)
                  loadData(key,evict);
            else if(aType == aTypeSet)
                  evict(value);
       });
}
else
{
      if(aType == aTypeSet)
            bufEdited[ctrFound] = 1;
      if(aType == aTypeGet)
            loadData(key,evict);
      else if(aType == aTypeSet)
            evict(value);    
}

Eviction itself is simple: finding a victim slot has not changed, but the data-updating part has.

  • if slot was edited,
    • clear the edited status if it is a get() operation
    • save data to backing-store
    • load new data if it is a get() operation, then evict
    • just evict if it is a set() operation
  • else
    • set the edited status if it is a set() operation
    • load data if its get(), then evict
      • just evict if its set()

Since the cache is responsible for the most recently written data as well as older writes, a flush() method is required to finally write out all slots whose edited status is set but which have not been evicted yet. This cache implementation is passive: it only does work when get() and set() are called. For that reason, data from old set() calls may never reach the backing-store on its own; a flush() call takes care of it.

JavaScript
    // push all edited slots to backing-store and reset all slots lifetime to "out of date"
    this.flush = function(callback){
      
        function waitForReadWrite(callbackW){

            // if there are in-flight cache-misses cache-write-misses or active slot locks, then wait
            if(mappingInFlightMiss.size > 0 || bufLocked.reduce((e1,e2)=>{return e1+e2;}) > 0)
            {
                setTimeout(()=>{ waitForReadWrite(callbackW); },0);
            }
            else
                callbackW();
        }
        waitForReadWrite(async function(){  

            for(let i=0;i<size;i++)
            {
                bufTime[i]=0;
                if(bufEdited[i] == 1)
                {        
                    // synced flushing = better stability when cache is big
                    await me.setAwaitable(bufKey[i],bufData[i]);
                }
            }
            callback(); // flush complete
        });
    };
};
  • waits for all in-flight cache-misses (of all set and get calls)
  • simulates set operations on all edited slots to indirectly update the backing-store

When the cache is put in front of a database (so that the database is accessed less frequently for reads and writes), the latest unwritten data should be flushed to the database before closing the application, or at certain intervals. All operations issued before the flush are guaranteed to reach the backing-store, but anything issued during a flush is not.
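A typical place for flush() is a shutdown path, after the last set() has been issued. To keep the sketch self-contained, it uses a minimal stand-in cache with the same set()/flush() contract instead of the article's Lru module:

```javascript
// minimal stand-in with the same set()/flush() contract as the article's Lru,
// so the shutdown pattern is runnable on its own
function TinyWriteCache(save) {
    const dirty = new Map();
    this.set = function(key, value, callback) { dirty.set(key, value); callback(value); };
    this.flush = function(callback) {
        let pending = dirty.size;
        if (pending === 0) return callback();
        dirty.forEach(function(value, key) {
            save(key, value, function() { if (--pending === 0) callback(); });
        });
        dirty.clear();
    };
}

const backing_store = {};
const cache = new TinyWriteCache(function(key, value, cb) {
    backing_store[key] = value; cb();
});

cache.set(5, 42, function() {
    cache.flush(function() {
        // after flush(), the write is guaranteed to be in the backing-store
        console.log(backing_store[5]); // 42
    });
});
```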

Benchmark

Test system: FX8150 @ 2.1GHz, 1-channel DDR3 1600 MHz 4GB RAM, Ubuntu 18.04 LTS 64bit.

Peak Read-Cache-Hit Performance Test

  • Number of reads: 1million
  • Concurrency: 5000
  • Cache-size: 1000 elements
  • Access: constant index (5) to force cache-read-hit on all except first access
JavaScript
"use strict";

let N_bench_repeats = 10;
let N_access =      1000000;
let N_concurrency = 5000;

let Lru = require("./lrucache.js").Lru;

let cache_elements = 1000;
let cache_element_expire_ms = 100000;
let cache = new Lru(cache_elements, async function(key,callback){

    // simulated slow access
    setTimeout(function(){
        callback(Math.random() /* database[key] */);
    },100);
   
},cache_element_expire_ms, async function(key,value,callback){
    // simulated slow access
    setTimeout(function(){
        // database[key]=value;
        callback();
    },100);
});

async function benchmark(){

    let request_keys = [];
    for(let j=0;j<N_concurrency;j++)
    {
        request_keys.push(5);
    }

    let bench_time = Date.now();

    for(let i=0;i<N_access;i+=N_concurrency)
    {
        /* constant key = only 1 database-read should happen */
        let values = await cache.getMultipleAwaitable(... request_keys);    
    }

    console.log("run-time: "+(Date.now() - bench_time)+"ms");
    console.log("------------------------------------------");
}

function repeat(){
    N_bench_repeats--;
    if(N_bench_repeats>0)
        benchmark().then(repeat);
}

repeat();

Output:

run-time: 926ms
------------------------------------------
run-time: 705ms
------------------------------------------
run-time: 676ms
------------------------------------------
run-time: 682ms
------------------------------------------
run-time: 679ms
------------------------------------------
run-time: 681ms
------------------------------------------
run-time: 682ms
------------------------------------------
run-time: 685ms
------------------------------------------
run-time: 679ms
------------------------------------------

680 milliseconds for 1 million reads is equivalent to roughly 1470 reads per millisecond.

Peak Write-Cache-Hit Performance Test

Same as the read version with the difference being setMultipleAwaitable method call and key-value pair array given to the method:

JavaScript
async function benchmark(){

    let pairs = [];
    for(let j=0;j<N_concurrency;j++)
    {
        pairs.push({key:5, value:100});
    }

    let bench_time = Date.now();

    for(let i=0;i<N_access;i+=N_concurrency)
    {
        /* constant key = writes stay in the cache after the first access */
        let values = await cache.setMultipleAwaitable(... pairs);    
    }

    console.log("run-time: "+(Date.now() - bench_time)+"ms");
    console.log("------------------------------------------");
}

Output:

run-time: 717ms
------------------------------------------
run-time: 662ms
------------------------------------------
run-time: 672ms
------------------------------------------
run-time: 662ms
------------------------------------------
run-time: 649ms
------------------------------------------
run-time: 645ms
------------------------------------------
run-time: 648ms
------------------------------------------
run-time: 649ms
------------------------------------------
run-time: 643ms
------------------------------------------

This is 1550 writes per millisecond.

~1.5 million reads/writes per second is fast enough to feed some open-world video-games with textures, geometries, any world data (as long as player does not go out of the cached content) or some artificial intelligence programs with thousands of training data that may not fit into RAM.

Peak Write-Cache-Miss Performance Test

To test write-cache-miss performance, the benchmark app uses 100% unique keys, so every access goes through the backing-store. In other words, 0% cache hit.

  • Number of writes: 1 million
  • Concurrency: 5000
  • Cache-size: 1000 elements
  • Access: unique keys to force a cache-write-miss on every access
JavaScript
"use strict";

let N_bench_repeats = 10;
let N_access =      1000000;
let N_concurrency = 5000;

let Lru = require("./lrucache.js").Lru;

let cache_elements = 1000;
let cache_element_expire_ms = 100000;
let cache = new Lru(cache_elements, async function(key,callback){

    // simulated slow access
    setTimeout(function(){
        callback(Math.random() /* database[key] */);
    },10);
   
},cache_element_expire_ms, async function(key,value,callback){
    // simulated slow access
    setTimeout(function(){
        // database[key]=value;
        callback();
    },10);
});

let unique_id = 0;
async function benchmark(){

    let pairs = [];
    for(let j=0;j<N_concurrency;j++)
    {
        pairs.push({key:unique_id++, value:Math.random()});
    }

    let bench_time = Date.now();

    for(let i=0;i<N_access;i+=N_concurrency)
    {
        /* 5000 unique keys vs 1000 slots = accesses keep missing the cache */
        let values = await cache.setMultipleAwaitable(... pairs);    
    }

    console.log("run-time: "+(Date.now() - bench_time)+"ms");
    console.log("------------------------------------------");
}

function repeat(){
    N_bench_repeats--;
    if(N_bench_repeats>0)
        benchmark().then(repeat);
}

repeat();

10 millisecond backing-store latency:

run-time: 17768ms
------------------------------------------
run-time: 17540ms
------------------------------------------
run-time: 17753ms
------------------------------------------

1 millisecond backing-store latency:

run-time: 6092ms
------------------------------------------
run-time: 5786ms
------------------------------------------
run-time: 5656ms
------------------------------------------
run-time: 5660ms
------------------------------------------

Zero-latency backing-store (assuming it can still serve data concurrently):

run-time: 6010ms
------------------------------------------
run-time: 5683ms
------------------------------------------
run-time: 5704ms
------------------------------------------
run-time: 5663ms
------------------------------------------

Synchronous backing-store (removing the setTimeout in the cache miss parameters):

run-time: 1348ms
------------------------------------------
run-time: 1186ms
------------------------------------------
run-time: 1181ms
------------------------------------------
run-time: 1177ms
------------------------------------------
run-time: 1175ms
------------------------------------------
run-time: 1171ms
------------------------------------------

Write-cache-miss performance depends mostly on the backing-store; its peak here is 1171 milliseconds per 1 million operations, or about 853 cache-misses per millisecond.

Peak Read-Cache-Miss Performance Test

JavaScript
"use strict";

let N_bench_repeats = 10;
let N_access =      1000000;
let N_concurrency = 5000;

let Lru = require("./lrucache.js").Lru;

let cache_elements = 1000;
let cache_element_expire_ms = 100000;
let cache = new Lru(cache_elements, async function(key,callback){

    // simulated slow access
    setTimeout(function(){
        callback(Math.random() /* database[key] */);
    },10);
   
},cache_element_expire_ms, async function(key,value,callback){
    // simulated slow access
    setTimeout(function(){
        // database[key]=value;
        callback();
    },10);
});

let unique_id = 0;
async function benchmark(){

    let keys = [];
    for(let j=0;j<N_concurrency;j++)
    {
        keys.push(unique_id++);
    }

    let bench_time = Date.now();

    for(let i=0;i<N_access;i+=N_concurrency)
    {
        /* 5000 unique keys vs 1000 slots = accesses keep missing the cache */
        let values = await cache.getMultipleAwaitable(... keys);    
    }

    console.log("run-time: "+(Date.now() - bench_time)+"ms");
    console.log("------------------------------------------");
}

function repeat(){
    N_bench_repeats--;
    if(N_bench_repeats>0)
        benchmark().then(repeat);
}

repeat();

10 millisecond backing-store latency:

run-time: 18255ms
------------------------------------------
run-time: 18033ms
------------------------------------------
run-time: 18248ms
------------------------------------------

1 millisecond backing-store latency:

run-time: 6039ms
------------------------------------------
run-time: 5759ms
------------------------------------------
run-time: 5786ms
------------------------------------------

Zero latency backing-store:

run-time: 5998ms
------------------------------------------
run-time: 5789ms
------------------------------------------
run-time: 5818ms
------------------------------------------

Synchronous backing-store (removing the setTimeout in the cache miss parameters):

run-time: 1357ms
------------------------------------------
run-time: 1209ms
------------------------------------------
run-time: 1191ms
------------------------------------------

Peak cache-read-miss performance is 839 reads per millisecond.

When backing-store latency is 10 milliseconds, 1 million serial operations would accumulate ~10,000 seconds (nearly 3 hours) of total latency. Asynchronous caching lowered that 3-hour serial reading session down to 18 seconds, a latency-hiding ratio of roughly 10,000 s / 18 s ≈ 550.

Despite the concurrency setting of 5000, the effective concurrency is therefore only about 550 for this specific benchmark code (with the 10-millisecond setTimeout setting). This could be a limitation of NodeJS (v14), the operating system, or something else (single-channel memory = only 9-10 GB/s).

Random Read+Write Combined Performance Test

  • Number of reads: 1million, on a 1 million length plain Js array
  • Number of writes: 1million, on same array using same cache
  • Concurrency: 5000 x2
  • Cache-size: 500k elements
  • Access: random keys between 0 and 999999, yielding roughly 50% cache-hit ratio on both reads and writes
JavaScript
"use strict";

let N_bench_repeats = 10;
let N_access =      1000000;
let N_concurrency = 5000;
let backing_store = {};
let cache_read_miss = 0;
let cache_write_miss = 0;
for(let i=0;i<N_access;i++)
{
    backing_store[i]=i;
}

let Lru = require("./lrucache.js").Lru;

let cache_elements = 500000;
let cache_element_expire_ms = 100000;
let cache = new Lru(cache_elements, async function(key,callback){
    // simulated slow access
        cache_read_miss++;
        callback(backing_store[key]);
   
},cache_element_expire_ms, async function(key,value,callback){
    // simulated slow access
        cache_write_miss++;
        backing_store[key]=value;
        callback();
});

async function benchmark(){
    cache_read_miss = 0;
    cache_write_miss = 0;

    let bench_time = Date.now();

    for(let i=0;i<N_access;i+=N_concurrency)
    {
        let ctr = 0;
        let res = new Promise((success,fail)=>{
                for(let j=0;j<N_concurrency;j++)
                {
                    cache.get(Math.floor(Math.random()*N_access),               
                          function(data){ ctr++; if(ctr==N_concurrency) success(1);});
                    cache.set(Math.floor(Math.random()*N_access), Math.random(),
                          function(data){ ctr++; if(ctr==N_concurrency) success(1);});
                }
            });
        res = await res;        
    }        

    console.log("run-time: "+(Date.now() - bench_time)+"ms");
    console.log("cache-read-hit-ratio: "+
               (100*(N_access - cache_read_miss)/(N_access))+"%");
    console.log("cache-write-hit-ratio: "+
               (100*(N_access - cache_write_miss)/(N_access))+"%");
    console.log("total-hit-ratio: "+
       (100*(2*N_access - cache_write_miss - cache_read_miss)/(2*N_access))+"%");
    console.log("------------------------------------------");
}

function repeat(){
    N_bench_repeats--;
    if(N_bench_repeats>0)
        benchmark().then(repeat);
}

repeat();

Output:

run-time: 4392ms
cache-read-hit-ratio: 42.3823%
cache-write-hit-ratio: 59.5218%
total-hit-ratio: 50.95205%
------------------------------------------
run-time: 3712ms
cache-read-hit-ratio: 49.9964%
cache-write-hit-ratio: 33.3312%
total-hit-ratio: 41.6638%
------------------------------------------
run-time: 3909ms
cache-read-hit-ratio: 49.9807%
cache-write-hit-ratio: 33.296%
total-hit-ratio: 41.63835%
------------------------------------------
run-time: 4123ms
cache-read-hit-ratio: 49.9926%
cache-write-hit-ratio: 33.2964%
total-hit-ratio: 41.6445%
------------------------------------------
run-time: 3943ms
cache-read-hit-ratio: 50.0449%
cache-write-hit-ratio: 33.3457%
total-hit-ratio: 41.6953%
------------------------------------------
run-time: 4102ms
cache-read-hit-ratio: 49.9443%
cache-write-hit-ratio: 33.3026%
total-hit-ratio: 41.62345%
------------------------------------------
run-time: 3816ms
cache-read-hit-ratio: 50.0713%
cache-write-hit-ratio: 33.2516%
total-hit-ratio: 41.66145%
------------------------------------------

3816 milliseconds for 2 million operations = 524 operations per millisecond. This is the performance of mixed reads and writes in the same loop when the cache is half the size of the dataset. Read operations can still force eviction of edited slots (causing write-misses), but write operations do not cause any extra cache-read-misses; that is why the cache-write-hit-ratio decreases.

Cache size = 800k (80% of the dataset):

run-time: 3724ms
cache-read-hit-ratio: 56.1437%
cache-write-hit-ratio: 95.1107%
total-hit-ratio: 75.6272%
------------------------------------------
run-time: 3143ms
cache-read-hit-ratio: 79.9688%
cache-write-hit-ratio: 70.1968%
total-hit-ratio: 75.0828%
------------------------------------------
run-time: 3406ms
cache-read-hit-ratio: 79.9668%
cache-write-hit-ratio: 66.9695%
total-hit-ratio: 73.46815%
------------------------------------------
run-time: 3078ms
cache-read-hit-ratio: 80.0162%
cache-write-hit-ratio: 66.612%
total-hit-ratio: 73.3141%
------------------------------------------
run-time: 3379ms
cache-read-hit-ratio: 80.0395%
cache-write-hit-ratio: 66.5492%
total-hit-ratio: 73.29435%
------------------------------------------
run-time: 3536ms
cache-read-hit-ratio: 79.9812%
cache-write-hit-ratio: 66.5147%
total-hit-ratio: 73.24795%
------------------------------------------
run-time: 3089ms
cache-read-hit-ratio: 79.9878%
cache-write-hit-ratio: 66.5399%
total-hit-ratio: 73.26385%
------------------------------------------

Cache size = 990k (99% of the dataset):

run-time: 4558ms
cache-read-hit-ratio: 56.7343%
cache-write-hit-ratio: 100%
total-hit-ratio: 78.36715%
------------------------------------------
run-time: 2916ms
cache-read-hit-ratio: 94.1668%
cache-write-hit-ratio: 100%
total-hit-ratio: 97.0834%
------------------------------------------
run-time: 2757ms
cache-read-hit-ratio: 98.8937%
cache-write-hit-ratio: 98.9428%
total-hit-ratio: 98.91825%
------------------------------------------
run-time: 2708ms
cache-read-hit-ratio: 98.9939%
cache-write-hit-ratio: 98.1394%
total-hit-ratio: 98.56665%
------------------------------------------
run-time: 2749ms
cache-read-hit-ratio: 99.0013%
cache-write-hit-ratio: 98.1331%
total-hit-ratio: 98.5672%
------------------------------------------

Sequential Write+Read Combined Performance Test

  • Number of reads: 1 million, on a 1-million-element plain JS array
  • Number of writes: 1 million, on the same array using the same cache and keys, each write issued just before its read
  • Concurrency: 5000 x2
  • Cache size: 500k elements
  • Access: sequential keys between 0 and 999999
JavaScript
"use strict";

let N_bench_repeats = 10;
let N_access =      1000000;
let N_concurrency = 5000;
let backing_store = {};
let cache_read_miss = 0;
let cache_write_miss = 0;
for(let i=0;i<N_access;i++)
{
    backing_store[i]=i;
}

let Lru = require("./lrucache.js").Lru;

let cache_elements = 500000;
let cache_element_expire_ms = 100000;
let cache = new Lru(cache_elements, async function(key,callback){
    // simulated slow access
        cache_read_miss++;
        callback(backing_store[key]);
   
},cache_element_expire_ms, async function(key,value,callback){
    // simulated slow access
        cache_write_miss++;
        backing_store[key]=value;
        callback();
});

async function benchmark(){
    cache_read_miss = 0;
    cache_write_miss = 0;

    let bench_time = Date.now();

    for(let i=0;i<N_access;i+=N_concurrency)
    {
        let ctr = 0;
        let res = new Promise((success,fail)=>{
                for(let j=0;j<N_concurrency;j++)
                {
                    cache.set(i+j, Math.random(),function(data)
                             { ctr++; if(ctr==N_concurrency) success(1);});
                    cache.get(i+j,               function(data)
                             { ctr++; if(ctr==N_concurrency) success(1);});
                }
            });
        res = await res;        
    }        

    console.log("run-time: "+(Date.now() - bench_time)+"ms");
    console.log("cache-read-hit-ratio: "+
               (100*(N_access - cache_read_miss)/(N_access))+"%");
    console.log("cache-write-hit-ratio: "+
               (100*(N_access - cache_write_miss)/(N_access))+"%");
    console.log("total-hit-ratio: "+
               (100*(2*N_access - cache_write_miss - cache_read_miss)/(2*N_access))+"%");
    console.log("------------------------------------------");
}

function repeat(){
    N_bench_repeats--;
    if(N_bench_repeats>0)
        benchmark().then(repeat);
}

repeat();

Output:

run-time: 3160ms
cache-read-hit-ratio: 100%
cache-write-hit-ratio: 50%
total-hit-ratio: 75%
------------------------------------------
run-time: 2862ms
cache-read-hit-ratio: 100%
cache-write-hit-ratio: 0%
total-hit-ratio: 50%
------------------------------------------
run-time: 2869ms
cache-read-hit-ratio: 100%
cache-write-hit-ratio: 0%
total-hit-ratio: 50%
------------------------------------------
run-time: 2820ms
cache-read-hit-ratio: 100%
cache-write-hit-ratio: 0%
total-hit-ratio: 50%

Read operations always benefited from the write operations on the same keys, but from the second repeat onward every write caused a write miss, because the randomly generated values give the writes no data re-use.
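This set-before-get effect can be seen in miniature with any map-backed cache: the write installs the key, so the read that immediately follows is guaranteed to hit. A minimal sketch (a plain Map stand-in, not the article's Lru):

```javascript
// Minimal sketch: a set() of a key immediately before a get() of the
// same key turns every read into a cache hit, matching the 100%
// cache-read-hit-ratio above.
const cache = new Map();
let readMiss = 0;
function cacheGet(key) {
    if (!cache.has(key)) { readMiss++; cache.set(key, null); } // simulated slow load
    return cache.get(key);
}
function cacheSet(key, value) { cache.set(key, value); } // write installs the key

for (let i = 0; i < 1000; i++) {
    cacheSet(i, Math.random()); // write first ...
    cacheGet(i);                // ... so this read never misses
}
console.log("read misses: " + readMiss); // read misses: 0
```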

Latest version source code of cache:

JavaScript
'use strict';
/*
cacheSize: number of elements in cache, constant, 
must be greater than or equal to number of asynchronous accessors / cache misses
callbackBackingStoreLoad: user-given cache-miss function to load data from datastore
elementLifeTimeMs: maximum milliseconds before an element is invalidated,
only invalidated at the next get() call with its key
reload: invalidates all cached items so new values reload from the backing store
reloadKey: invalidates only the selected item (reloads its new value on next access)
*/

let Lru = function(cacheSize,callbackBackingStoreLoad,elementLifeTimeMs=1000,
                   callbackBackingStoreSave){
    const me = this;
    const aTypeGet = 0;
    const aTypeSet = 1;
    const maxWait = elementLifeTimeMs;
    const size = parseInt(cacheSize,10);
    const mapping = new Map();
    const mappingInFlightMiss = new Map();
    const bufData = new Array(size);
    const bufVisited = new Uint8Array(size);
    const bufEdited = new Uint8Array(size);
    const bufKey = new Array(size);
    const bufTime = new Float64Array(size);
    const bufLocked = new Uint8Array(size);
    for(let i=0;i<size;i++)
    {
        let rnd = Math.random();
        mapping.set(rnd,i);        
        bufData[i]="";
        bufVisited[i]=0;
        bufEdited[i]=0;
        bufKey[i]=rnd;
        bufTime[i]=0;
        bufLocked[i]=0;
    }
    let ctr = 0;
    let ctrEvict = parseInt(cacheSize/2,10);
    const loadData = callbackBackingStoreLoad;
    const saveData = callbackBackingStoreSave;
    let inFlightMissCtr = 0;
    // refresh all items time-span in cache
    this.reload=function(){
        for(let i=0;i<size;i++)
        {
            bufTime[i]=0;
        }
    };
    // refresh item time-span in cache by triggering eviction
    this.reloadKey=function(key){
        if(mapping.has(key))
        {
            bufTime[mapping.get(key)]=0;
        }
    };

    // get value by key
    this.get = function(keyPrm,callbackPrm){
        // aType=0: get
        access(keyPrm,callbackPrm,aTypeGet);
    };

    // set value by key (callback returns same value)
    this.set = function(keyPrm,valuePrm,callbackPrm){
        // aType=1: set
        access(keyPrm,callbackPrm,aTypeSet,valuePrm);
    };
    
    // aType=0: get
    // aType=1: set
    function access(keyPrm,callbackPrm,aType,valuePrm){
        
        const key = keyPrm;
        const callback = callbackPrm;
        const value = valuePrm;
        // stop dead-lock when many async get calls are made
        if(inFlightMissCtr>=size)
        {
            setTimeout(function(){
                // get/set
                access(key,function(newData){
                    callback(newData);
                },aType,value);
            },0);
            return;
        }
        
        // if key is busy, then delay the request towards end of the cache-miss completion
        if(mappingInFlightMiss.has(key))
        {
            
            setTimeout(function(){
                // get/set
                access(key,function(newData){
                    callback(newData);
                },aType,value);
            },0);
            return;
        }

        if(mapping.has(key))
        {
            // slot is an element in the circular buffer of CLOCK algorithm
            let slot = mapping.get(key);

            // RAM speed data
            if((Date.now() - bufTime[slot]) > maxWait)
            {
                
                // if slot is locked by another operation, postpone the current operation
                if(bufLocked[slot])
                {                                        
                    setTimeout(function(){
                        access(key,function(newData){
                            callback(newData);
                        },aType,value);
                    },0);
                    
                }
                else // slot is not locked and its lifespan has ended
                {
                    // if it was edited, update the backing-store first
                    if(bufEdited[slot] == 1)
                    {
                        bufLocked[slot] = 1;
                        bufEdited[slot]=0;
                        mappingInFlightMiss.set(key,1); // lock key
                        inFlightMissCtr++;
                        // update backing-store, this is async
                        saveData(bufKey[slot],bufData[slot],function(){
                            mappingInFlightMiss.delete(key);    // unlock key
                            bufLocked[slot] = 0;
                            inFlightMissCtr--;

                            mapping.delete(key); // disable mapping for current key
                            
                            // re-simulate the access, async
                            access(key,function(newData){
                                callback(newData);
                            },aType,value);

                        });
                    }
                    else
                    {
                        mapping.delete(key); // disable mapping for current key
                        access(key,function(newData){
                            
                            callback(newData);
                        },aType,value);
                    }
                }
                
            }
            else    // slot life span has not ended
            {
                bufVisited[slot]=1;
                bufTime[slot] = Date.now();

                // if it is a "set" operation
                if(aType == aTypeSet)
                {    
                    bufEdited[slot] = 1; // later used when data needs to be 
                                         // written to data-store (write-cache feature)
                    bufData[slot] = value;
                }
                callback(bufData[slot]);
            }
        }
        else
        {
            // datastore loading + cache eviction
            let ctrFound = -1;
            let oldVal = 0;
            let oldKey = 0;
            while(ctrFound===-1)
            {
                // give slot a second chance before eviction
                if(!bufLocked[ctr] && bufVisited[ctr])
                {
                    bufVisited[ctr]=0;
                }
                ctr++;
                if(ctr >= size)
                {
                    ctr=0;
                }

                // eviction conditions
                if(!bufLocked[ctrEvict] && !bufVisited[ctrEvict])
                {
                    // eviction preparations, lock the slot
                    bufLocked[ctrEvict] = 1;
                    inFlightMissCtr++;
                    ctrFound = ctrEvict;
                    oldVal = bufData[ctrFound];
                    oldKey = bufKey[ctrFound];
                }

                ctrEvict++;
                if(ctrEvict >= size)
                {
                    ctrEvict=0;
                }
            }
            
            // user-requested key is now asynchronously in-flight & 
            // locked for other operations
            mappingInFlightMiss.set(key,1);
            
            // eviction function. least recently used data is gone, 
            // newest recently used data is assigned
            let evict = function(res){

                mapping.delete(bufKey[ctrFound]);

                bufData[ctrFound]=res;
                bufVisited[ctrFound]=0;
                bufKey[ctrFound]=key;
                bufTime[ctrFound]=Date.now();
                bufLocked[ctrFound]=0;

                mapping.set(key,ctrFound);
                callback(res);
                inFlightMissCtr--;
                mappingInFlightMiss.delete(key);
            
            };

            // if old data was edited, send it to data-store first, then fetch new data
            if(bufEdited[ctrFound] == 1)
            {
                if(aType == aTypeGet)
                    bufEdited[ctrFound] = 0;

                // old edited data is sent back to data-store
                saveData(oldKey,oldVal,function(){
                    if(aType == aTypeGet)
                        loadData(key,evict);
                    else if(aType == aTypeSet)
                        evict(value);
                });
            }
            else
            {
                if(aType == aTypeSet)
                    bufEdited[ctrFound] = 1;
                if(aType == aTypeGet)
                    loadData(key,evict);
                else if(aType == aTypeSet)
                    evict(value);    
            }
        }
    };

    this.getAwaitable = function(key){
        return new Promise(function(success,fail){
            me.get(key,function(data){
                success(data);
            });
        });
    };

    this.setAwaitable = function(key,value){
        return new Promise(function(success,fail){
            me.set(key,value,function(data){
                success(data);
            });
        });
    };

    // as many keys as required can be given, separated by commas
    this.getMultiple = function(callback, ... keys){
        let result = [];
        let ctr1 = keys.length;
        for(let i=0;i<ctr1;i++)
            result.push(0);
        let ctr2 = 0;
        keys.forEach(function(key){
            let ctr3 = ctr2++;
            me.get(key,function(data){
                result[ctr3] = data;
                ctr1--;
                if(ctr1==0)
                {
                    callback(result);
                }
            });
        });
    };

    // as many key-value pairs ( in form of { key:foo, value:bar } ) can be given, 
    // separated by commas
    this.setMultiple = function(callback, ... keyValuePairs){
        let result = [];
        let ctr1 = keyValuePairs.length;
        for(let i=0;i<ctr1;i++)
            result.push(0);
        let ctr2 = 0;
        keyValuePairs.forEach(function(pair){
            let ctr3 = ctr2++;
            me.set(pair.key,pair.value,function(data){
                result[ctr3] = data;
                ctr1--;
                if(ctr1==0)
                {
                    callback(result);
                }
            });
        });
    };

    // as many keys as required can be given, separated by commas
    this.getMultipleAwaitable = function(... keys){
        return new Promise(function(success,fail){
            me.getMultiple(function(results){
                success(results);
            }, ... keys);
        });
    };

    // as many key-value pairs ( in form of { key:foo, value:bar } ) can be given, 
    // separated by commas
    this.setMultipleAwaitable = function(... keyValuePairs){
        return new Promise(function(success,fail){
            me.setMultiple(function(results){
                success(results);
            }, ... keyValuePairs);
        });
    };

    // push all edited slots to backing-store and reset all slots 
    // lifetime to "out of date"
    this.flush = function(callback){

        function waitForReadWrite(callbackW){

            // if there are in-flight cache-read-misses, cache-write-misses,
            // or active slot locks, then wait
            if(mappingInFlightMiss.size > 0 || 
               bufLocked.reduce((e1,e2)=>{return e1+e2;}) > 0)
            {
                setTimeout(()=>{ waitForReadWrite(callbackW); },10);
            }
            else
                callbackW();
        }
        waitForReadWrite(async function(){  
            for(let i=0;i<size;i++)
            {
                bufTime[i]=0;
                if(bufEdited[i] == 1)
                {        
                    // less concurrency pressure, less failure
                    await me.setAwaitable(bufKey[i],bufData[i]);
                }
            }
            callback(); // flush complete
        });
    };
};

exports.Lru = Lru;
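For reference, the calling pattern of the cache looks like the following. TinyLru here is a hypothetical, trivially simplified stand-in with the same public call signatures, so the snippet runs on its own; with the real module, only the construction would change to require("./lrucache.js").Lru with the same four constructor arguments:

```javascript
// Hypothetical stand-in: same constructor/get/set signatures as the
// article's Lru, but with no eviction, expiry, or deferred write-back.
function TinyLru(cacheSize, onReadMiss, lifeTimeMs, onWriteMiss) {
    const data = new Map();
    this.get = (key, cb) => data.has(key)
        ? cb(data.get(key))                                   // cache hit
        : onReadMiss(key, v => { data.set(key, v); cb(v); }); // cache-read-miss
    this.set = (key, value, cb) => { data.set(key, value); cb(value); };
}

const backingStore = { a: 1, b: 2 };
const cache = new TinyLru(100,
    (key, callback) => callback(backingStore[key]),                        // load on read miss
    1000,
    (key, value, callback) => { backingStore[key] = value; callback(); }); // write-back

const seen = [];
cache.get("a", v => seen.push(v));     // miss: loads 1 from the backing store
cache.set("b", 42, v => seen.push(v)); // write lands in the cache
cache.get("b", v => seen.push(v));     // hit: served from RAM
console.log(seen); // [ 1, 42, 42 ]
```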

Cache Hit Ratio of the Image-Softening Sample (From GitHub)

It applies a 5-point smoothing kernel to an 8x8 image using a cache of 25 elements, writes every result pixel to an output buffer through a second cache of the same size, and then prints the cache-hit ratios. Since every row of pixels re-uses the rows before it (each pixel reads its closest 4 neighbors), the data re-use ratio is greater than 1. This access pattern lets just 25 cache elements effectively serve ~75% of the 384 read/write accesses.

Output:

run-time: 5704 ms          320 reads 64 writes
cache read hit ratio=72.91666666666667
cache write hit ratio=77.08333333333333
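The 320/64 counts follow directly from the stencil shape: 64 output pixels, 5 reads each. A counting sketch of the access pattern (a hypothetical reconstruction; border pixels are assumed to clamp their out-of-range neighbors to the edge):

```javascript
// Counts the accesses of a 5-point smoothing kernel over an 8x8 image,
// to show the data re-use that lets a 25-element cache serve ~75% of them.
const W = 8, H = 8;
let reads = 0, writes = 0;
const touched = new Set(); // distinct keys actually read
for (let y = 0; y < H; y++) {
    for (let x = 0; x < W; x++) {
        // 5-point stencil: center + 4 closest neighbors, clamped at borders
        for (const [px, py] of [[x, y], [x - 1, y], [x + 1, y], [x, y - 1], [x, y + 1]]) {
            const cx = Math.min(W - 1, Math.max(0, px));
            const cy = Math.min(H - 1, Math.max(0, py));
            reads++;                  // one cache.get() per stencil point
            touched.add(cy * W + cx);
        }
        writes++;                     // one cache.set() for the output pixel
    }
}
console.log(reads + " reads, " + writes + " writes, " +
            (reads / touched.size) + "x average re-use per key");
// 320 reads, 64 writes, 5x average re-use per key
```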

History

  • 8th April, 2021
    • Basic coverage of algorithm, some samples
  • 19th September, 2021
    • Added item-refresh capability (as a postponed "write" op) and optimized asynchrony by solving the deadlock problem
  • 25th September, 2021
    • Added methods for multiple key-value requests to avoid callback-hell, plus some benchmarking results
  • 29th September, 2021
    • Added write-cache feature, different optimizations and new benchmarks
    • Updated title with the word "asynchronous"
    • Changed tags to a better set

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)