This article explores the implementation details of an LRU cache (approximated by the CLOCK algorithm) and how it performs under different key access patterns.
Introduction
Caching is one of the most important optimizations for maintaining high performance in a wide range of applications. In server-side JavaScript (web apps), caching of files and video blobs and fast access to page templates are vital. Operating systems (like Ubuntu) already have their own file caching, but when multiple web apps run on the same server and one app streams a lot of video data, the cache contents of the other apps may be invalidated frequently and their performance drops.
A dedicated Least Recently Used (cache eviction) algorithm for an app's own resources makes the app independent of cache pollution from other applications on the same server.
Background
The basic idea of caching is to serve (computationally) expensive data from an already computed pool of data. Without caching, throughput depends on the slowest storage the data comes from. For a file, that is often an HDD. An HDD is slow at ~100 MB/s, an SSD is still slow at ~500 MB/s, and even NVMe (~5 GB/s) is slower than the cheapest RAM module on the shelf. Caching file contents in RAM improves not only throughput but also latency. In a web app, especially with NodeJS, the latency of an operation (if in a non-async function) decides when other clients will be served. So, having an LRU cache should improve performance by 2x - 100x depending on the slowest data store's performance. An asynchronous LRU, on the other hand, should hide the latency of a cache-miss.
What are cache miss, cache hit, and cache hit ratio? A cache miss is the operation of getting data from the slow datastore and assigning it to a victim slot in the cache, in both cache-read and cache-write operations. Writing to the backing-store is called "write-back". This article starts with the simplest form of read-only cache and later adds write-cache capability and optimizations. LRU means the least recently used slot is evicted to make room for fresh data. A cache hit is the existence of a key in the cache (and in RAM) for quick access. The cache hit ratio is the ratio of cache hits to total (miss+hit) accesses.
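As a minimal sketch of these definitions (the counters and the plain Map used as a stand-in "cache" here are illustrative, not the cache developed below):

'use strict';
// Illustrative hit/miss bookkeeping around a plain Map used as the "cache".
let hits = 0, misses = 0;
const store = new Map();
function lookup(key, loadFromSlowStore){
    if(store.has(key)){
        hits++;                              // cache hit: served from RAM
        return store.get(key);
    }
    misses++;                                // cache miss: go to the slow store
    const data = loadFromSlowStore(key);
    store.set(key, data);                    // fill a slot (no eviction in this sketch)
    return data;
}
lookup("a", k => k + "-data");               // miss
lookup("a", k => k + "-data");               // hit
console.log("hit ratio:", hits / (hits + misses)); // 0.5 = 1 hit / 2 accesses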
Using the Code
First, a single module file for the LRU object constructor definition (latest version: https://github.com/tugrul512bit/LruJS)
(This is the initial version of the source code; it contains only basic read-caching and focuses on simplicity.)
'use strict';
let Lru = function(cacheSize,callbackBackingStoreLoad,elementLifeTimeMs=1000){
    let me = this;
    let maxWait = elementLifeTimeMs;
    let size = parseInt(cacheSize,10);
    let mapping = {};             // key --> slot index
    let mappingInFlightMiss = {}; // keys with an unfinished cache-miss
    let buf = [];                 // circular buffer of cache slots
    for(let i=0;i<size;i++)
    {
        let rnd = Math.random();
        mapping[rnd] = i;
        buf.push({data:"",visited:false, key:rnd, time:0, locked:false});
    }
    let ctr = 0;                              // "second chance" hand
    let ctrEvict = parseInt(cacheSize/2,10);  // eviction hand
    let loadData = callbackBackingStoreLoad;
    this.get = function(key,callbackPrm){
        let callback = callbackPrm;
        // another access is already loading this key: retry later
        if(key in mappingInFlightMiss)
        {
            setTimeout(function(){
                me.get(key,function(newData){
                    callback(newData);
                });
            },0);
            return;
        }
        if(key in mapping)
        {
            // lifetime expired?
            if((Date.now() - buf[mapping[key]].time) > maxWait)
            {
                if(buf[mapping[key]].locked)
                {
                    setTimeout(function(){
                        me.get(key,function(newData){
                            callback(newData);
                        });
                    },0);
                }
                else
                {
                    delete mapping[key];
                    me.get(key,function(newData){
                        callback(newData);
                    });
                }
            }
            else
            {
                // cache hit
                buf[mapping[key]].visited=true;
                buf[mapping[key]].time = Date.now();
                callback(buf[mapping[key]].data);
            }
        }
        else
        {
            // cache miss: find a victim slot with the two CLOCK hands
            let ctrFound = -1;
            while(ctrFound===-1)
            {
                if(!buf[ctr].locked && buf[ctr].visited)
                {
                    buf[ctr].visited=false;
                }
                ctr++;
                if(ctr >= size)
                {
                    ctr=0;
                }
                if(!buf[ctrEvict].locked && !buf[ctrEvict].visited)
                {
                    buf[ctrEvict].locked = true;
                    ctrFound = ctrEvict;
                }
                ctrEvict++;
                if(ctrEvict >= size)
                {
                    ctrEvict=0;
                }
            }
            mappingInFlightMiss[key]=true;
            let f = function(res){
                delete mapping[buf[ctrFound].key];
                buf[ctrFound] =
                    {data: res, visited:false, key:key, time:Date.now(), locked:false};
                mapping[key] = ctrFound;
                callback(buf[ctrFound].data);
                delete mappingInFlightMiss[key];
            };
            loadData(key,f);
        }
    };
};
exports.Lru = Lru;
File caching example:
let Lru = require("./lrucache.js").Lru;
let fs = require("fs");
let path = require("path");
let fileCache = new Lru(500, async function(key,callback){
    fs.readFile(path.join(__dirname,key),function(err,data){
        if(err) {
            callback({stat:404, data:JSON.stringify(err)});
        }
        else
        {
            callback({stat:200, data:data});
        }
    });
},1000 );
The Lru constructor takes its parameters (cache size, a cache-miss callback that itself receives a key and a callback, cache element lifetime) to construct the CLOCK cache.
The async cache-miss callback works like this:
- some get() call does not find the key in cache
- the algorithm finds a victim slot
- the algorithm runs this callback
- inside the callback, the expensive computation is done, asynchronously
- at the end of the callback, the callback's own callback is called to return the data back to the LRU cache
- the next time the same key is accessed, its data comes from RAM
There is only a get() method in this implementation, to maintain simplicity.
fileCache.get("./test.js",function(dat){
    httpResponse.writeHead(dat.stat);
    httpResponse.end(dat.data);
});
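For completeness, a hedged sketch of where httpResponse could come from, assuming Node's built-in http module and the fileCache instance above (the port number is arbitrary):

let http = require("http");
http.createServer(function(httpRequest, httpResponse){
    fileCache.get("./test.js", function(dat){
        httpResponse.writeHead(dat.stat);   // 200 or 404 from the loader above
        httpResponse.end(dat.data);         // cached file bytes
    });
}).listen(8080);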
Since the result data is delivered through another callback, the call itself can also run asynchronously.
How Does It Work?
- In most popular LRU implementations, there is always a "mapping" object from keys to cache slots, to achieve O(1) key-search time complexity (in terms of the number of cache slots). This one is no different. In fact, JavaScript makes it very easy to do:
Mapping object:
let mapping = {};
Finding a (string/integer) key in mapping:
if(key in mapping)
{
}
It is efficient (O(1)) and simple (at least much simpler than the C++ version).
- Once mapping leads to a cache slot, it is straightforward to get the data out of it:
buf[mapping[key]].visited=true;
buf[mapping[key]].time = Date.now();
callback(buf[mapping[key]].data);
The visited flag informs the CLOCK hands (ctr and ctrEvict) to save this slot from possible eviction. The time field is used for life-time management of slots. Every access (with a cache-hit) renews the time field so that the slot can stay in cache. The callback function is the user-given function of the get() method that receives the data of the cache slot.
- Before getting data from the mapped slot directly, its lifetime is measured. If it is over, the slot is invalidated by deleting its mapping and retrying with the same key (which produces a cache-miss):
if((Date.now() - buf[mapping[key]].time) > maxWait)
{
    delete mapping[key];
    me.get(key,function(newData){
        callback(newData);
    });
}
Since the mapping is deleted, no other asynchronous access interferes with its internal state.
- If the key is not found in the mapping object, the LRU eviction logic runs (actually its approximation: CLOCK with 2 hands, aka Second Chance). The search for a victim is easy:
let ctrFound = -1;
while(ctrFound===-1)
{
    if(!buf[ctr].locked && buf[ctr].visited)
    {
        buf[ctr].visited=false;
    }
    ctr++;
    if(ctr >= size)
    {
        ctr=0;
    }
    if(!buf[ctrEvict].locked && !buf[ctrEvict].visited)
    {
        buf[ctrEvict].locked = true;
        ctrFound = ctrEvict;
    }
    ctrEvict++;
    if(ctrEvict >= size)
    {
        ctrEvict=0;
    }
}
The first "if
" block checks if a slot pointed by "second chance" hand (ctr
) is not locked and is visited. If it is, then as a second chance, it does not evict but marks it as non-visited.
The third "if
" block checks if a slot pointed by "eviction" hand (ctrEvict
) is not locked and is not visited. If it is, then the slot is marked as "locked
" (for protection against asynchronous accesses to get()
method) and the eviction slot is found and the loop ends.
Observe that the two hands ctr and ctrEvict are initialized with a 50% phase difference:
let ctr = 0;
let ctrEvict = parseInt(cacheSize/2,10);
and in the "while
" loop, they are incremented equally. This means that one always follows the other circularly, to constantly check each other's results. The more cache slots, the better victim slot search. A key has at least N/2 clock hand movements to be saved from eviction.
- After the victim slot is found, the mapping is deleted (against future async collisions) and recreated after the datastore loading:
mappingInFlightMiss[key]=true;
let f = function(res){
    delete mapping[buf[ctrFound].key];
    buf[ctrFound] = {data: res, visited:false, key:key, time:Date.now(), locked:false};
    mapping[key] = ctrFound;
    callback(buf[ctrFound].data);
    delete mappingInFlightMiss[key];
};
loadData(key,f);
Since the user-given cache-miss datastore-loading function (loadData) can be asynchronous, there can be at most N in-flight cache-misses for this cache (N = number of slots). This is an optimization that hides up to N cache-miss latencies. Hiding latency is important for high throughput (especially in web apps serving thousands of visitors per second). Having more than N asynchronous cache misses/accesses causes deadlock, so a cache with 100 slots can serve up to 100 users asynchronously. Alternatively, accesses can be throttled to a value (M) lower than N and computed in multiple (K) passes (where M x K = total accesses).
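A hedged sketch of that throttling idea, with M chosen below the slot count N (getThrottled and its parameters are hypothetical helpers, not part of the cache):

// Issue at most M concurrent get()s against an N-slot cache, in K passes.
function getThrottled(cache, keys, M, done){
    const results = new Array(keys.length);
    let offset = 0;
    function pass(){
        const batch = keys.slice(offset, offset + M);
        if(batch.length === 0) return done(results);
        let pending = batch.length;
        batch.forEach(function(key, i){
            cache.get(key, function(data){
                results[offset + i] = data;
                if(--pending === 0){ offset += M; pass(); } // next pass
            });
        });
    }
    pass();
}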
Since cache misses can be hidden and cache hits run at RAM speed, the overall performance looks like O(1) for both cache-miss and cache-hit. Per element access, though, there may be multiple clock-hand iterations, especially when there are only a few slots in the cache. As the number of slots is increased, it approaches O(1).
Inside this loadData callback, the new slot data has its locked field set to false, and this makes the slot available for other asynchronous accesses.
- If there is a cache hit and the found slot is locked and its lifetime has expired, the access operation is delayed with setTimeout with a 0 time parameter (more CPU-friendly than setImmediate) towards the end of JavaScript's message queue. The locked operation (a cache-miss) practically always completes before the postponed retry runs; in terms of time complexity, this still counts as O(1) with a bigger latency, hidden behind the awaited latency of the locked operation.
if(buf[mapping[key]].locked)
{
    setTimeout(function(){
        me.get(key,function(newData){
            callback(newData);
        });
    },0);
}
- Lastly, if a key is in the in-flight cache-miss mapping, the access is postponed to a later position in the message queue by setTimeout:
if(key in mappingInFlightMiss)
{
    setTimeout(function(){
        me.get(key,function(newData){
            callback(newData);
        });
    },0);
    return;
}
This way, duplicate loading of the same data is avoided.
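A small sketch should make this visible: two overlapping get() calls for the same missing key trigger only one backing-store load (the loads counter is illustrative):

let Lru = require("./lrucache.js").Lru;
let loads = 0;
let cache = new Lru(10, function(key,callback){
    loads++;                                      // count backing-store loads
    setTimeout(function(){ callback(key+"-data"); },100);
},100000);
cache.get("x", function(data){ console.log("first :", data); });
cache.get("x", function(data){
    console.log("second:", data);
    console.log("loads :", loads);                // expected: 1, not 2
});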
Benchmarking
Asynchronous Cache-Miss Benchmark
"use strict";
let Lru = require("./lrucache.js").Lru;
let cache = new Lru(1000, async function(key,callback){
setTimeout(function(){
callback(key+" processed");
},1000);
},1000 );
let ctr = 0;
let t1 = Date.now();
for(let i=0;i<1000;i++)
{
cache.get(i,function(data){
console.log("data:"+data+" key:"+i);
if(i.toString()+" processed" !== data)
{
console.log("error: wrong key-data mapping.");
}
if(++ctr === 1000)
{
console.log("benchmark: "+(Date.now()-t1)+" miliseconds");
}
});
}
To not cause any deadlock, the LRU size is chosen as 1000, or equivalently, the for loop is allowed to iterate only 1000 times.
Output:
benchmark: 1127 milliseconds
Since each cache-miss has 1000 milliseconds of latency, loading 1000 elements one after another would take nearly 17 minutes, but with the overlapped cache-misses it is much quicker. This is useful especially in I/O-heavy workloads like streaming data from an HDD or the network.
Cache-Hit Ratio Benchmark
10% hit rate:
- key generation: random, 10000 different values possible
- 1000 slots
"use strict";
let Lru = require("./lrucache.js").Lru;
let cacheMiss = 0;
let cache = new Lru(1000, async function(key,callback){
cacheMiss++;
setTimeout(function(){
callback(key+" processed");
},100);
},100000000 );
let ctr = 0;
let t1 = Date.now();
let asynchronity = 500;
let benchRepeat = 100;
let access = 0;
function test()
{
ctr = 0;
for(let i=0;i<asynchronity;i++)
{
let key = parseInt(Math.random()*10000,10);
cache.get(key.toString(),function(data){
access++;
if(key.toString()+" processed" !== data)
{
console.log("error: wrong key-data mapping.");
}
if(++ctr === asynchronity)
{
console.log("benchmark: "+(Date.now()-t1)+" miliseconds");
console.log("cache hit: "+(access - cacheMiss));
console.log("cache miss: "+(cacheMiss));
console.log("cache hit ratio: "+((access - cacheMiss)/access));
if(benchRepeat>0)
{
benchRepeat--;
test();
}
}
});
}
}
test();
Result:
benchmark: 10498 milliseconds
cache hit: 6151
cache miss: 44349
cache hit ratio: 0.1218019801980198
Since the benchmark is made in 100 steps with 100 ms of latency per cache-miss wave, it took about 10 seconds (close to 100 x 100 milliseconds). The hit ratio is close to the expected value of 10%.
50% Hit Ratio Test
let key = parseInt(Math.random()*2000,10);
Result:
benchmark: 10418 milliseconds
cache hit: 27541
cache miss: 22959
cache hit ratio: 0.5453663366336634
Again, close to the expected 50% ratio.
99% Hit Ratio Test
let key = parseInt(Math.random()*1010,10);
Result:
benchmark: 10199 milliseconds
cache hit: 49156
cache miss: 1344
cache hit ratio: 0.9733861386138614
The randomness of the keys keeps the measured ratio (0.9733) slightly below the expected 99%.
100% Hit Ratio Test
let key = parseInt(Math.random()*999,10);
Result:
benchmark: 1463 milliseconds
cache hit: 49501
cache miss: 999
cache hit ratio: 0.9802178217821782
After the first step of the benchmark (which cannot avoid cache misses), everything came from RAM, which reduced the total latency a lot.
Points of Interest
It is a bit hard to spot bugs in asynchronous code. To overcome this, I had to use a debugger object that recorded all slot values in every function call with their respective Date.now() times, then sort the record array on the time field and follow the path of every key access. This is a bit exhausting, but easier than just looking at the screen and following N paths by eye.
Edit: With "update
" method, cache items can be re-fetched from backing store in case of a "write
" operation. Latest version (~160 lines of code) with some optimization on asynchronity ( and solves deadlock problem):
'use strict';
let Lru = function(cacheSize,callbackBackingStoreLoad,elementLifeTimeMs=1000){
const me = this;
const maxWait = elementLifeTimeMs;
const size = parseInt(cacheSize,10);
const mapping = {};
const mappingInFlightMiss = {};
const bufData = new Array(size);
const bufVisited = new Uint8Array(size);
const bufKey = new Array(size);
const bufTime = new Float64Array(size);
const bufLocked = new Uint8Array(size);
for(let i=0;i<size;i++)
{
let rnd = Math.random();
mapping[rnd] = i;
bufData[i]="";
bufVisited[i]=0;
bufKey[i]=rnd;
bufTime[i]=0;
bufLocked[i]=0;
}
let ctr = 0;
let ctrEvict = parseInt(cacheSize/2,10);
const loadData = callbackBackingStoreLoad;
let inFlightMissCtr = 0;
this.reload=function(){
for(let i=0;i<size;i++)
{
bufTime[i]=0;
}
};
this.reloadKey=function(key){
if(key in mapping)
{
bufTime[mapping[key]]=0;
}
};
this.get = function(keyPrm,callbackPrm){
const key = keyPrm;
const callback = callbackPrm;
if(inFlightMissCtr>=size)
{
setTimeout(function(){
me.get(key,function(newData){
callback(newData);
});
},0);
return;
}
if(key in mappingInFlightMiss)
{
setTimeout(function(){
me.get(key,function(newData){
callback(newData);
});
},0);
return;
}
if(key in mapping)
{
let slot = mapping[key];
if((Date.now() - bufTime[slot]) > maxWait)
{
if(bufLocked[slot])
{
setTimeout(function(){
me.get(key,function(newData){
callback(newData);
});
},0);
}
else
{
delete mapping[key];
me.get(key,function(newData){
callback(newData);
});
}
}
else
{
bufVisited[slot]=1;
bufTime[slot] = Date.now();
callback(bufData[slot]);
}
}
else
{
let ctrFound = -1;
while(ctrFound===-1)
{
if(!bufLocked[ctr] && bufVisited[ctr])
{
bufVisited[ctr]=0;
}
ctr++;
if(ctr >= size)
{
ctr=0;
}
if(!bufLocked[ctrEvict] && !bufVisited[ctrEvict])
{
bufLocked[ctrEvict] = 1;
inFlightMissCtr++;
ctrFound = ctrEvict;
}
ctrEvict++;
if(ctrEvict >= size)
{
ctrEvict=0;
}
}
mappingInFlightMiss[key]=1;
let f = function(res){
delete mapping[bufKey[ctrFound]];
bufData[ctrFound]=res;
bufVisited[ctrFound]=0;
bufKey[ctrFound]=key;
bufTime[ctrFound]=Date.now();
bufLocked[ctrFound]=0;
mapping[key] = ctrFound;
callback(bufData[ctrFound]);
inFlightMissCtr--;
delete mappingInFlightMiss[key];
};
loadData(key,f);
}
};
};
exports.Lru = Lru;
With this version, deadlock does not occur at all: accesses beyond capacity are simply postponed until the number of in-flight item fetches drops below N. Refreshing an item:
cache.reloadKey(someKey);
does not immediately write the new value into the cache. The re-fetch happens only when the item is next requested through the get method, asynchronously to all other in-flight cache-misses. This hides latency better than refreshing on demand, but overall latency per item may increase because the refresh is postponed to the next "get" access.
What About Callback-Hell?
Since this is a callback-based cache, requesting multiple values at once at a common synchronization point would lead to code like this:
cache.get(key1, function(data1){
    cache.get(key2, function(data2){
        cache.get(key3, function(data3){
            ...
        });
    });
});
It becomes even worse when, for example, a pixel value (in an image-processing app) requires its 25 neighboring pixel values. Even minified code would look pretty compared to this. The solution to this problem is simple:
this.getMultiple = function(callback, ... keys){
    let result = [];
    let ctr = keys.length;
    for(let i=0;i<ctr;i++)
        result.push(0);
    let ctr2 = 0;
    keys.forEach(function(key){
        let ctr3=ctr2++;
        me.get(key,function(data){
            result[ctr3] = data;
            ctr--;
            if(ctr==0)
            {
                callback(result);
            }
        });
    });
};
Here, a variable-argument function lets the user request as many values as needed with comma-separated keys after the callback. Since there are no atomic operations or multiple threads as in C++, checking that all operations are complete only requires decrementing a counter. Each operation gets the value for one key and decrements the counter; when the last asynchronous datum arrives, the ctr counter reaches zero and the callback is called with the result array. Usage looks like this:
cache.getMultiple(function(dataArray){
console.log(dataArray);
},"a key",5,"another key",someKeyVariable,"my last key");
This can be polished further for user ergonomics. Promises in JavaScript are good for this. They can be constructed as simply as this:
this.getAwaitable = function(key){
    return new Promise( (success,fail)=>{
        me.get(key, function(data){ success(data); });
    });
};
Then it looks like this from the user's side:
let mySynchronizedValue = await cache.getAwaitable(myKey);
but this works against the aim of asynchronous caching. await makes the (single) JavaScript thread running the app wait for the value, so no new cache-get operations can be launched in the same scope. To optimize a bit, the user can write them like this:
let myAsync1 = cache.getAwaitable(myKey1);
let myAsync2 = cache.getAwaitable(myKey2);
let myAsync3 = cache.getAwaitable(myKey3);
let mySync1 = await myAsync1;
let mySync2 = await myAsync2;
let mySync3 = await myAsync3;
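The same effect can be written more idiomatically with Promise.all, which launches all three cache-gets first and awaits them at one synchronization point (this is plain JavaScript, not part of the cache's API):

let [mySync1, mySync2, mySync3] = await Promise.all([
    cache.getAwaitable(myKey1),   // all three gets are in flight together,
    cache.getAwaitable(myKey2),   // so their cache-miss latencies overlap
    cache.getAwaitable(myKey3)
]);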
Also, the use (allocation / freeing of resources) of Promises in a tight loop is slower than pure callbacks, because Promises do more work than callbacks in the background and they are usually "await"ed right after construction (which decreases concurrency compared to free-running callbacks). The following code is the latest source from github and contains more optimizations and features:
'use strict';
let Lru = function(cacheSize,callbackBackingStoreLoad,
elementLifeTimeMs=1000,callbackBackingStoreSave){
const me = this;
const aTypeGet = 0;
const aTypeSet = 1;
const maxWait = elementLifeTimeMs;
const size = parseInt(cacheSize,10);
const mapping = new Map();
const mappingInFlightMiss = new Map();
const bufData = new Array(size);
const bufVisited = new Uint8Array(size);
const bufEdited = new Uint8Array(size);
const bufKey = new Array(size);
const bufTime = new Float64Array(size);
const bufLocked = new Uint8Array(size);
for(let i=0;i<size;i++)
{
let rnd = Math.random();
mapping.set(rnd,i);
bufData[i]="";
bufVisited[i]=0;
bufEdited[i]=0;
bufKey[i]=rnd;
bufTime[i]=0;
bufLocked[i]=0;
}
let ctr = 0;
let ctrEvict = parseInt(cacheSize/2,10);
const loadData = callbackBackingStoreLoad;
const saveData = callbackBackingStoreSave;
let inFlightMissCtr = 0;
this.reload=function(){
for(let i=0;i<size;i++)
{
bufTime[i]=0;
}
};
this.reloadKey=function(key){
if(mapping.has(key))
{
bufTime[mapping.get(key)]=0;
}
};
this.get = function(keyPrm,callbackPrm){
access(keyPrm,callbackPrm,aTypeGet);
};
this.set = function(keyPrm,valuePrm,callbackPrm){
access(keyPrm,callbackPrm,aTypeSet,valuePrm);
};
function access(keyPrm,callbackPrm,aType,valuePrm){
const key = keyPrm;
const callback = callbackPrm;
const value = valuePrm;
if(inFlightMissCtr>=size)
{
setTimeout(function(){
access(key,function(newData){
callback(newData);
},aType,value);
},0);
return;
}
if(mappingInFlightMiss.has(key))
{
setTimeout(function(){
access(key,function(newData){
callback(newData);
},aType,value);
},0);
return;
}
if(mapping.has(key))
{
let slot = mapping.get(key);
if((Date.now() - bufTime[slot]) > maxWait)
{
if(bufLocked[slot])
{
setTimeout(function(){
access(key,function(newData){
callback(newData);
},aType,value);
},0);
}
else
{
if(bufEdited[slot] == 1)
{
bufLocked[slot] = 1;
bufEdited[slot]=0;
mappingInFlightMiss.set(key,1);
inFlightMissCtr++;
saveData(bufKey[slot],bufData[slot],function(){
mappingInFlightMiss.delete(key);
bufLocked[slot] = 0;
inFlightMissCtr--;
mapping.delete(key);
access(key,function(newData){
callback(newData);
},aType,value);
});
}
else
{
mapping.delete(key);
access(key,function(newData){
callback(newData);
},aType,value);
}
}
}
else
{
bufVisited[slot]=1;
bufTime[slot] = Date.now();
if(aType == aTypeSet)
{
bufEdited[slot] = 1;
bufData[slot] = value;
}
callback(bufData[slot]);
}
}
else
{
let ctrFound = -1;
let oldVal = 0;
let oldKey = 0;
while(ctrFound===-1)
{
if(!bufLocked[ctr] && bufVisited[ctr])
{
bufVisited[ctr]=0;
}
ctr++;
if(ctr >= size)
{
ctr=0;
}
if(!bufLocked[ctrEvict] && !bufVisited[ctrEvict])
{
bufLocked[ctrEvict] = 1;
inFlightMissCtr++;
ctrFound = ctrEvict;
oldVal = bufData[ctrFound];
oldKey = bufKey[ctrFound];
}
ctrEvict++;
if(ctrEvict >= size)
{
ctrEvict=0;
}
}
mappingInFlightMiss.set(key,1);
let evict = function(res){
mapping.delete(bufKey[ctrFound]);
bufData[ctrFound]=res;
bufVisited[ctrFound]=0;
bufKey[ctrFound]=key;
bufTime[ctrFound]=Date.now();
bufLocked[ctrFound]=0;
mapping.set(key,ctrFound);
callback(res);
inFlightMissCtr--;
mappingInFlightMiss.delete(key);
};
if(bufEdited[ctrFound] == 1)
{
if(aType == aTypeGet)
bufEdited[ctrFound] = 0;
saveData(oldKey,oldVal,function(){
if(aType == aTypeGet)
loadData(key,evict);
else if(aType == aTypeSet)
evict(value);
});
}
else
{
if(aType == aTypeSet)
bufEdited[ctrFound] = 1;
if(aType == aTypeGet)
loadData(key,evict);
else if(aType == aTypeSet)
evict(value);
}
}
};
this.getAwaitable = function(key){
return new Promise(function(success,fail){
me.get(key,function(data){
success(data);
});
});
}
this.setAwaitable = function(key,value){
return new Promise(function(success,fail){
me.set(key,value,function(data){
success(data);
});
});
}
this.getMultiple = function(callback, ... keys){
let result = [];
let ctr1 = keys.length;
for(let i=0;i<ctr1;i++)
result.push(0);
let ctr2 = 0;
keys.forEach(function(key){
let ctr3 = ctr2++;
me.get(key,function(data){
result[ctr3] = data;
ctr1--;
if(ctr1==0)
{
callback(result);
}
});
});
};
this.setMultiple = function(callback, ... keyValuePairs){
let result = [];
let ctr1 = keyValuePairs.length;
for(let i=0;i<ctr1;i++)
result.push(0);
let ctr2 = 0;
keyValuePairs.forEach(function(pair){
let ctr3 = ctr2++;
me.set(pair.key,pair.value,function(data){
result[ctr3] = data;
ctr1--;
if(ctr1==0)
{
callback(result);
}
});
});
};
this.getMultipleAwaitable = function(... keys){
return new Promise(function(success,fail){
me.getMultiple(function(results){
success(results);
}, ... keys);
});
};
this.setMultipleAwaitable = function(... keyValuePairs){
return new Promise(function(success,fail){
me.setMultiple(function(results){
success(results);
}, ... keyValuePairs);
});
};
this.flush = function(callback){
let ctr1 = 0;
function waitForReadWrite(callbackW){
if(mappingInFlightMiss.size > 0 ||
bufLocked.reduce((e1,e2)=>{return e1+e2;}) > 0)
{
setTimeout(()=>{ waitForReadWrite(callbackW); },0);
}
else
callbackW();
}
waitForReadWrite(function(){
for(let i=0;i<size;i++)
{
bufTime[i]=0;
if(bufEdited[i] == 1)
ctr1++;
}
for(let i=0;i<size;i++)
{
if(bufEdited[i] == 1)
{
me.set(bufKey[i],bufData[i],function(val){
ctr1--;
if(ctr1 == 0)
{
callback();
}
});
}
}
});
};
};
exports.Lru = Lru;
The benchmark results below are from an old version in which the mapping was maintained by plain { } objects.
With fully asynchronous value requests on thousands of files, file chunks, HTTP requests, and other tasks, the performance disadvantage of the JavaScript language is minimized.
More Optimizations
After making the cache work correctly, more optimizations can be made. One of them is to stop using plain objects for the mapping. (The last version of the complete source code is at the end of the article, after the benchmarking part.)
let mapping = {};
A plain object cannot safely take "__proto__" or similar JavaScript-specific strings as keys. Instead of fine-tuning the basic object against these failures, the job should be handled by the data structure that was made for it:
let mapping = new Map();
Since Map is typically implemented as a hash table (or a comparable fast associative structure), its get/set/has methods (and size property) run in amortized O(1) time. When using a Map, it is important not to mix it with normal object usage. One may accidentally do this:
let mapping = new Map();
mapping[aKey] = aValue;
This doesn't use its optimized methods. They must be called directly, like this:
let mapping = new Map();
mapping.set(aKey,aValue);
mapping.get(aKey);
Another specialty of Map is a higher maximum cache size, due to a smaller memory footprint than plain objects. The ability to hold more cache items makes it useful for different kinds of jobs too; managing a medium-sized database is possible with this.
Even though Map is better for memory efficiency, the number of in-flight gets/sets increases temporary memory consumption. With a cache size of 25 elements (32-bit integer keys, 64-bit floating-point values) and a concurrency of 5, the Node.js app consumes up to 45 MB while processing 400x400 image data. When concurrency is 800000, memory consumption starts at 105 MB and grows to 225 MB, despite having only 25 elements in cache, because 800k items are waiting in the queue to reach the 25 slots. Allocating 1 million cache slots with a concurrency of 5 (5 pixels requested at a time), memory consumption is 140 MB-280 MB depending on garbage-collector activity. When concurrency is increased to 800000 (all loop iterations made asynchronous), the requirement grows to 370 MB. In-flight cache-miss functions cause extra memory requirements, but this is negligible when concurrency is not too high.
A 1-million-element cache requiring at least 140 MB is equivalent to 140 bytes per cache item. This is the cost of the book-keeping for asynchronous operations that hide multiple backing-store latencies (for increased performance / ease of use) while sustaining thousands of operations per millisecond of throughput.
To compare with some in-memory databases that have their own caching:
- Redis 1million small items cost 85MB memory
- Hazelcast 1million items cost at least 50MB memory
After exchanging plain objects for Maps, the cache's construction part starts with this:
const mapping = new Map();
const mappingInFlightMiss = new Map();
const bufData = new Array(size);
const bufVisited = new Uint8Array(size);
const bufEdited = new Uint8Array(size);
const bufKey = new Array(size);
const bufTime = new Float64Array(size);
const bufLocked = new Uint8Array(size);
In the first version, there was only a generic buf array holding all the status info of all cache slots. Here, the fields of a "cache slot" (circular-buffer element) are separated into their own arrays, to stop unnecessarily loading a whole slot object every time an 8-bit value is required. This helps low-end systems keep their minimum performance at a higher point and pollutes the CPU cache less, which is good for all kinds of systems. The test system has only 9-10 GB/s of memory bandwidth and benefits from this optimization.
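The difference, side by side (illustrative only):

// Array-of-structures (first version): reading one flag drags in whole slots.
let buf = [{data:"", visited:false, key:0, time:0, locked:false}];
let v1 = buf[0].visited;

// Structure-of-arrays (this version): flags sit 1 byte apart, so the clock
// hands scan through only a few CPU cache lines.
const bufVisited = new Uint8Array(1024);
let v2 = bufVisited[0];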
Write-Caching
The next important thing is to offload "write" operations onto the cache. Before this, if a user had to update the backing-store, it had to be done directly (the slow way) first, followed by a reloadKey() call on the cache so that the get() method would see the latest information. On a product-gallery web-site this matters little, since product information / user information / web page contents / attached images are not updated as frequently as clients visit the site. But once data updates need to be performant, caching them is a good option. A write-cache can be useful in many applications, from forums to open-world games. Especially in open-world games, when the player changes world data, it needs to be remembered when the player visits the same place again, without overflowing RAM and faster than an HDD.
Adding a "set
" method to the cache not only reduces extra API calls like reloadKey()
but also improves performance of writes similar to the reads because it uses same LRU eviction with only few changes to the code. To add the write-cache feature into the algorithm, an extra status field called "bufEdited
" is required.
The "edited" status of a cache slot is changed to 1
whenever data is written to it. It is set to 0
when the data is updated on the backing-store. Edited is equivalent to "needing an update on backing-store" or in other words, the "dirty bit".
Since both write and read operations run on the same eviction algorithm, there is no need to duplicate eviction-specific code. The get()/set() methods become wrappers of a hidden access() method, which takes more parameters to select the get or set algorithm:
function access(keyPrm,callbackPrm,aType,valuePrm){ .. }
The aType parameter selects the operation type: if it is aTypeGet (0), it runs a get operation; otherwise (aTypeSet) it is a set operation. The valuePrm parameter of the set operation is the value assigned to the data slot with the keyPrm key.
Also, the cache does not know what the backing-store is or how writing is done. That is supplied by the user, similar to the cache-miss function given to the constructor:
let cache = new Lru(cacheSize,
callbackBackingStoreLoad,
elementLifeTimeMs,
callbackBackingStoreSave);
The callbackBackingStoreSave function is called by the cache whenever a cache slot has its "edited" status set and needs to update the backing-store. The cache calls it with 3 parameters: key, value, and a callback.
let cache = new Lru(cacheSize,
    callbackBackingStoreLoad,
    elementLifeTimeMs,
    function(key,value,callback){
        backing_store[key] = value;
        callback();
    });
This user-defined function is the data-storing part of the eviction algorithm, just as callbackBackingStoreLoad is the data-loading part of a read-cache-miss eviction. Since a generic cache cannot know what to do with an SSD or a database, the user supplies the necessary details through callbacks.
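Putting the pieces together, a hedged end-to-end sketch (backing_store is an illustrative plain object standing in for a real datastore):

let Lru = require("./lrucache.js").Lru;
let backing_store = {};
let cache = new Lru(100, function(key,callback){
    callback(backing_store[key]);                 // load on read-miss
},100000, function(key,value,callback){
    backing_store[key] = value;                   // store a dirty slot
    callback();
});
cache.set("a", 1, function(){});                  // marks slot "a" as edited
cache.get("a", function(v){ console.log(v); });   // 1, straight from cache
cache.flush(function(){                           // push edited slots out
    console.log(backing_store["a"]);              // 1, written back by flush
});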
The first part of the implementation that does a "write" on the backing-store is the out-of-date branch (when a cache item's time has expired):
...
else
{
    if(bufEdited[slot] == 1)
    {
        bufLocked[slot] = 1;
        bufEdited[slot]=0;
        mappingInFlightMiss.set(key,1);
        inFlightMissCtr++;
        saveData(bufKey[slot],bufData[slot],function(){
            mappingInFlightMiss.delete(key);
            bufLocked[slot] = 0;
            inFlightMissCtr--;
            mapping.delete(key);
            access(key,function(newData){
                callback(newData);
            },aType,value);
        });
    }
    else
    {
        mapping.delete(key);
        access(key,function(newData){
            callback(newData);
        },aType,value);
    }
}
What this part does:
- stops other in-flight operations from entering the same cache slot or using the same key (important for avoiding duplicates)
- clears the edited status (because it will be written to the backing-store soon)
- saves the data to the backing-store (possibly asynchronously, for a database or a file)
- after writing is complete, unlocks the key and the slot
- removes the key from the mapping to "free" the slot for eviction purposes
- re-issues the same access (get/set); the re-issued access finds another slot to work with
The second part where a write occurs is near the eviction:
if(bufEdited[ctrFound] == 1)
{
    if(aType == aTypeGet)
        bufEdited[ctrFound] = 0;
    saveData(oldKey,oldVal,function(){
        if(aType == aTypeGet)
            loadData(key,evict);
        else if(aType == aTypeSet)
            evict(value);
    });
}
else
{
    if(aType == aTypeSet)
        bufEdited[ctrFound] = 1;
    if(aType == aTypeGet)
        loadData(key,evict);
    else if(aType == aTypeSet)
        evict(value);
}
Eviction is simple. Finding a victim slot has not changed, but the data-updating part has:
- if the slot was edited:
  - clear the edited status if it is a get() operation
  - save the data to the backing-store
  - load new data if it is a get() operation, then evict
  - just evict if it is a set() operation
- else:
  - set the edited status if it is a set() operation
  - load data if it is a get() operation, then evict
Since the cache is responsible for the most recently written bits of data as well as very old ones, a flush() method is required to finally write out all slots whose edited status is still set (but which have not been evicted yet). This cache implementation is passive: it only works when get and set are called. For this reason, data from old set() calls may never reach the backing-store on its own; this is handled by a flush() call.
this.flush = function(callback){
    function waitForReadWrite(callbackW){
        if(mappingInFlightMiss.size > 0 || bufLocked.reduce((e1,e2)=>{return e1+e2;}) > 0)
        {
            setTimeout(()=>{ waitForReadWrite(callbackW); },0);
        }
        else
            callbackW();
    }
    waitForReadWrite(async function(){ // async: setAwaitable is awaited below
        for(let i=0;i<size;i++)
        {
            bufTime[i]=0;
            if(bufEdited[i] == 1)
            {
                await me.setAwaitable(bufKey[i],bufData[i]);
            }
        }
        callback();
    });
};
};
- waits for all in-flight cache-misses (of all set and get calls)
- simulates set operations on all edited slots to indirectly update the backing-store
When the cache is put in front of a database (to make the database be accessed less frequently for writes/reads), the latest unwritten data should be flushed to the database before closing the application, or at certain intervals. All operations issued before the flush are guaranteed to be sent to the backing-store, but anything issued during a flush is not.
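For example, a hedged shutdown hook (assuming a cache instance is in scope; signal-handling specifics vary by platform):

process.on("SIGINT", function(){
    cache.flush(function(){
        console.log("all edited slots reached the backing-store");
        process.exit(0);                   // safe to stop after the flush
    });
});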
Benchmark
Test system: FX8150 @ 2.1GHz, 1-channel DDR3 1600 MHz 4GB RAM, Ubuntu 18.04 LTS 64bit.
Peak Read-Cache-Hit Performance Test
- Number of reads: 1million
- Concurrency: 5000
- Cache-size: 1000 elements
- Access: constant index (5) to force cache-read-hit on all except first access
"use strict";
let N_bench_repeats = 10;
let N_access = 1000000;
let N_concurrency = 5000;
let Lru = require("./lrucache.js").Lru;
let cache_elements = 1000;
let cache_element_expire_ms = 100000;
let cache = new Lru(cache_elements, async function(key,callback){
setTimeout(function(){
callback(Math.random() );
},100);
},cache_element_expire_ms, async function(key,value,callback){
setTimeout(function(){
callback();
},100);
});
async function benchmark(){
let request_keys = [];
for(let j=0;j<N_concurrency;j++)
{
request_keys.push(5);
}
let bench_time = Date.now();
for(let i=0;i<N_access;i+=N_concurrency)
{
let values = await cache.getMultipleAwaitable(... request_keys);
}
console.log("run-time: "+(Date.now() - bench_time)+"ms");
console.log("------------------------------------------");
}
function repeat(){
N_bench_repeats--;
if(N_bench_repeats>0)
benchmark().then(repeat);
}
repeat();
Output:
run-time: 926ms
------------------------------------------
run-time: 705ms
------------------------------------------
run-time: 676ms
------------------------------------------
run-time: 682ms
------------------------------------------
run-time: 679ms
------------------------------------------
run-time: 681ms
------------------------------------------
run-time: 682ms
------------------------------------------
run-time: 685ms
------------------------------------------
run-time: 679ms
------------------------------------------
680 milliseconds for 1million reads is equivalent to 1470 reads per millisecond.
Peak Write-Cache-Hit Performance Test
Same as the read version, with the difference being the setMultipleAwaitable method call and the key-value pair array given to it:
async function benchmark(){
let pairs = [];
for(let j=0;j<N_concurrency;j++)
{
pairs.push({key:5, value:100});
}
let bench_time = Date.now();
for(let i=0;i<N_access;i+=N_concurrency)
{
let values = await cache.setMultipleAwaitable(... pairs);
}
console.log("run-time: "+(Date.now() - bench_time)+"ms");
console.log("------------------------------------------");
}
Output:
run-time: 717ms
------------------------------------------
run-time: 662ms
------------------------------------------
run-time: 672ms
------------------------------------------
run-time: 662ms
------------------------------------------
run-time: 649ms
------------------------------------------
run-time: 645ms
------------------------------------------
run-time: 648ms
------------------------------------------
run-time: 649ms
------------------------------------------
run-time: 643ms
------------------------------------------
This is 1550 writes per millisecond.
~1.5 million reads/writes per second is fast enough to feed some open-world video games with textures, geometries, or any world data (as long as the player does not leave the cached content), or some artificial-intelligence programs with thousands of training samples that do not fit into RAM.
Peak Write-Cache-Miss Performance Test
To test write-cache-miss performance, the benchmark app makes 100% unique key accesses. All accesses go through the backing-store; in other words, 0% cache hit.
- Number of writes: 1million
- Concurrency: 5000
- Cache-size: 1000 elements
- Access: unique key to force a cache-write-miss on every access
"use strict";
let N_bench_repeats = 10;
let N_access = 1000000;
let N_concurrency = 5000;
let Lru = require("./lrucache.js").Lru;
let cache_elements = 1000;
let cache_element_expire_ms = 100000;
let cache = new Lru(cache_elements, async function(key,callback){
setTimeout(function(){
callback(Math.random() );
},10);
},cache_element_expire_ms, async function(key,value,callback){
setTimeout(function(){
callback();
},10);
});
let unique_id = 0;
async function benchmark(){
let pairs = [];
for(let j=0;j<N_concurrency;j++)
{
pairs.push({key:unique_id++, value:Math.random()});
}
let bench_time = Date.now();
for(let i=0;i<N_access;i+=N_concurrency)
{
let values = await cache.setMultipleAwaitable(... pairs);
}
console.log("run-time: "+(Date.now() - bench_time)+"ms");
console.log("------------------------------------------");
}
function repeat(){
N_bench_repeats--;
if(N_bench_repeats>0)
benchmark().then(repeat);
}
repeat();
10 millisecond backing-store latency:
run-time: 17768ms
------------------------------------------
run-time: 17540ms
------------------------------------------
run-time: 17753ms
------------------------------------------
1 millisecond backing-store latency:
run-time: 6092ms
------------------------------------------
run-time: 5786ms
------------------------------------------
run-time: 5656ms
------------------------------------------
run-time: 5660ms
------------------------------------------
Zero-latency backing-store (assuming it can still serve data concurrently):
run-time: 6010ms
------------------------------------------
run-time: 5683ms
------------------------------------------
run-time: 5704ms
------------------------------------------
run-time: 5663ms
------------------------------------------
Synchronous backing-store (removing the setTimeout in the cache-miss parameters):
run-time: 1348ms
------------------------------------------
run-time: 1186ms
------------------------------------------
run-time: 1181ms
------------------------------------------
run-time: 1177ms
------------------------------------------
run-time: 1175ms
------------------------------------------
run-time: 1171ms
------------------------------------------
Write-cache-miss performance depends mostly on backing-store performance; its best value here is 1171 milliseconds per 1000000 operations, or about 853 cache-misses per millisecond.
Peak Read-Cache-Miss Performance Test
"use strict";
let N_bench_repeats = 10;
let N_access = 1000000;
let N_concurrency = 5000;
let Lru = require("./lrucache.js").Lru;
let cache_elements = 1000;
let cache_element_expire_ms = 100000;
let cache = new Lru(cache_elements, async function(key,callback){
setTimeout(function(){
callback(Math.random() );
},10);
},cache_element_expire_ms, async function(key,value,callback){
setTimeout(function(){
callback();
},10);
});
let unique_id = 0;
async function benchmark(){
let keys = [];
for(let j=0;j<N_concurrency;j++)
{
keys.push(unique_id++);
}
let bench_time = Date.now();
for(let i=0;i<N_access;i+=N_concurrency)
{
let values = await cache.getMultipleAwaitable(... keys);
}
console.log("run-time: "+(Date.now() - bench_time)+"ms");
console.log("------------------------------------------");
}
function repeat(){
N_bench_repeats--;
if(N_bench_repeats>0)
benchmark().then(repeat);
}
repeat();
10 millisecond backing-store latency:
run-time: 18255ms
------------------------------------------
run-time: 18033ms
------------------------------------------
run-time: 18248ms
------------------------------------------
1 millisecond backing-store latency:
run-time: 6039ms
------------------------------------------
run-time: 5759ms
------------------------------------------
run-time: 5786ms
------------------------------------------
Zero latency backing-store:
run-time: 5998ms
------------------------------------------
run-time: 5789ms
------------------------------------------
run-time: 5818ms
------------------------------------------
Synchronous backing-store (removing the setTimeout in the cache-miss parameters):
run-time: 1357ms
------------------------------------------
run-time: 1209ms
------------------------------------------
run-time: 1191ms
------------------------------------------
Peak cache-read-miss performance is 839 reads per millisecond.
When the backing-store latency is 10 milliseconds, 1 million operations add up to ~3 hours of total serial latency. Asynchronous caching lowered that 3-hour serial reading session down to 18 seconds, a hiding ratio of 3 hours / 18 seconds ≈ 600.
Despite setting the concurrency to 5000, the effective concurrency is ~600 for this specific benchmark code (with the setTimeout = 10 millisecond setting). This could be a limitation of NodeJs (v14), the operating system, or something else (1-channel memory = only 9-10 GB/s).
Random Read+Write Combined Performance Test
- Number of reads: 1million, on a 1 million length plain Js array
- Number of writes: 1million, on same array using same cache
- Concurrency: 5000 x2
- Cache-size: 500k elements
- Access: random keys between 0 and 999999 to force 50% cache hit&miss on both reads and writes
"use strict";
let N_bench_repeats = 10;
let N_access = 1000000;
let N_concurrency = 5000;
let backing_store = {};
let cache_read_miss = 0;
let cache_write_miss = 0;
for(let i=0;i<N_access;i++)
{
backing_store[i]=i;
}
let Lru = require("./lrucache.js").Lru;
let cache_elements = 500000;
let cache_element_expire_ms = 100000;
let cache = new Lru(cache_elements, async function(key,callback){
cache_read_miss++;
callback(backing_store[key]);
},cache_element_expire_ms, async function(key,value,callback){
cache_write_miss++;
backing_store[key]=value;
callback();
});
async function benchmark(){
cache_read_miss = 0;
cache_write_miss = 0;
let bench_time = Date.now();
for(let i=0;i<N_access;i+=N_concurrency)
{
let ctr = 0;
let res = new Promise((success,fail)=>{
for(let j=0;j<N_concurrency;j++)
{
cache.get(Math.floor(Math.random()*N_access),
function(data){ ctr++; if(ctr==N_concurrency) success(1);});
cache.set(Math.floor(Math.random()*N_access), Math.random(),
function(data){ ctr++; if(ctr==N_concurrency) success(1);});
}
});
res = await res;
}
console.log("run-time: "+(Date.now() - bench_time)+"ms");
console.log("cache-read-hit-ratio: "+
(100*(N_access - cache_read_miss)/(N_access))+"%");
console.log("cache-write-hit-ratio: "+
(100*(N_access - cache_write_miss)/(N_access))+"%");
console.log("total-hit-ratio: "+
(100*(2*N_access - cache_write_miss - cache_read_miss)/(2*N_access))+"%");
console.log("------------------------------------------");
}
function repeat(){
N_bench_repeats--;
if(N_bench_repeats>0)
benchmark().then(repeat);
}
repeat();
Output:
>
run-time: 4392ms
cache-read-hit-ratio: 42.3823%
cache-write-hit-ratio: 59.5218%
total-hit-ratio: 50.95205%
------------------------------------------
run-time: 3712ms
cache-read-hit-ratio: 49.9964%
cache-write-hit-ratio: 33.3312%
total-hit-ratio: 41.6638%
------------------------------------------
run-time: 3909ms
cache-read-hit-ratio: 49.9807%
cache-write-hit-ratio: 33.296%
total-hit-ratio: 41.63835%
------------------------------------------
run-time: 4123ms
cache-read-hit-ratio: 49.9926%
cache-write-hit-ratio: 33.2964%
total-hit-ratio: 41.6445%
------------------------------------------
run-time: 3943ms
cache-read-hit-ratio: 50.0449%
cache-write-hit-ratio: 33.3457%
total-hit-ratio: 41.6953%
------------------------------------------
run-time: 4102ms
cache-read-hit-ratio: 49.9443%
cache-write-hit-ratio: 33.3026%
total-hit-ratio: 41.62345%
------------------------------------------
run-time: 3816ms
cache-read-hit-ratio: 50.0713%
cache-write-hit-ratio: 33.2516%
total-hit-ratio: 41.66145%
------------------------------------------
3816 milliseconds for 2million operations = 524 operations per millisecond. This is the performance of mixed reads and writes in the same loop when the cache size is half the size of the dataset. Read operations can still force eviction of edited pages (causing write-misses), but write operations do not cause any extra cache-read-misses; that's why the cache-write-hit ratio decreases.
Cache size = 800k (80% of the dataset):
run-time: 3724ms
cache-read-hit-ratio: 56.1437%
cache-write-hit-ratio: 95.1107%
total-hit-ratio: 75.6272%
------------------------------------------
run-time: 3143ms
cache-read-hit-ratio: 79.9688%
cache-write-hit-ratio: 70.1968%
total-hit-ratio: 75.0828%
------------------------------------------
run-time: 3406ms
cache-read-hit-ratio: 79.9668%
cache-write-hit-ratio: 66.9695%
total-hit-ratio: 73.46815%
------------------------------------------
run-time: 3078ms
cache-read-hit-ratio: 80.0162%
cache-write-hit-ratio: 66.612%
total-hit-ratio: 73.3141%
------------------------------------------
run-time: 3379ms
cache-read-hit-ratio: 80.0395%
cache-write-hit-ratio: 66.5492%
total-hit-ratio: 73.29435%
------------------------------------------
run-time: 3536ms
cache-read-hit-ratio: 79.9812%
cache-write-hit-ratio: 66.5147%
total-hit-ratio: 73.24795%
------------------------------------------
run-time: 3089ms
cache-read-hit-ratio: 79.9878%
cache-write-hit-ratio: 66.5399%
total-hit-ratio: 73.26385%
------------------------------------------
Cache size = 990k (99% of the dataset):
run-time: 4558ms
cache-read-hit-ratio: 56.7343%
cache-write-hit-ratio: 100%
total-hit-ratio: 78.36715%
------------------------------------------
run-time: 2916ms
cache-read-hit-ratio: 94.1668%
cache-write-hit-ratio: 100%
total-hit-ratio: 97.0834%
------------------------------------------
run-time: 2757ms
cache-read-hit-ratio: 98.8937%
cache-write-hit-ratio: 98.9428%
total-hit-ratio: 98.91825%
------------------------------------------
run-time: 2708ms
cache-read-hit-ratio: 98.9939%
cache-write-hit-ratio: 98.1394%
total-hit-ratio: 98.56665%
------------------------------------------
run-time: 2749ms
cache-read-hit-ratio: 99.0013%
cache-write-hit-ratio: 98.1331%
total-hit-ratio: 98.5672%
------------------------------------------
Sequential Write+Read Combined Performance Test
- Number of reads: 1million, on a 1 million length plain Js array
- Number of writes: 1million, on same array using same cache and same key but just before the read
- Concurrency: 5000 x2
- Cache-size: 500k elements
- Access: sequential keys between 0 and 999999
"use strict";
let N_bench_repeats = 10;
let N_access = 1000000;
let N_concurrency = 5000;
let backing_store = {};
let cache_read_miss = 0;
let cache_write_miss = 0;
for(let i=0;i<N_access;i++)
{
backing_store[i]=i;
}
let Lru = require("./lrucache.js").Lru;
let cache_elements = 500000;
let cache_element_expire_ms = 100000;
let cache = new Lru(cache_elements, async function(key,callback){
cache_read_miss++;
callback(backing_store[key]);
},cache_element_expire_ms, async function(key,value,callback){
cache_write_miss++;
backing_store[key]=value;
callback();
});
async function benchmark(){
cache_read_miss = 0;
cache_write_miss = 0;
let bench_time = Date.now();
for(let i=0;i<N_access;i+=N_concurrency)
{
let ctr = 0;
let res = new Promise((success,fail)=>{
for(let j=0;j<N_concurrency;j++)
{
cache.set(i+j, Math.random(),function(data)
{ ctr++; if(ctr==N_concurrency) success(1);});
cache.get(i+j, function(data)
{ ctr++; if(ctr==N_concurrency) success(1);});
}
});
res = await res;
}
console.log("run-time: "+(Date.now() - bench_time)+"ms");
console.log("cache-read-hit-ratio: "+
(100*(N_access - cache_read_miss)/(N_access))+"%");
console.log("cache-write-hit-ratio: "+
(100*(N_access - cache_write_miss)/(N_access))+"%");
console.log("total-hit-ratio: "+
(100*(2*N_access - cache_write_miss - cache_read_miss)/(2*N_access))+"%");
console.log("------------------------------------------");
}
function repeat(){
N_bench_repeats--;
if(N_bench_repeats>0)
benchmark().then(repeat);
}
repeat();
Output:
run-time: 3160ms
cache-read-hit-ratio: 100%
cache-write-hit-ratio: 50%
total-hit-ratio: 75%
------------------------------------------
run-time: 2862ms
cache-read-hit-ratio: 100%
cache-write-hit-ratio: 0%
total-hit-ratio: 50%
------------------------------------------
run-time: 2869ms
cache-read-hit-ratio: 100%
cache-write-hit-ratio: 0%
total-hit-ratio: 50%
------------------------------------------
run-time: 2820ms
cache-read-hit-ratio: 100%
cache-write-hit-ratio: 0%
total-hit-ratio: 50%
Read operations always benefited from the write operation issued with the same key just before them, but from the second repeat onward every write caused a write-miss, because there is no data re-use between the writes.
Latest version source code of cache:
'use strict';
let Lru = function(cacheSize,callbackBackingStoreLoad,elementLifeTimeMs=1000,
callbackBackingStoreSave){
const me = this;
const aTypeGet = 0;
const aTypeSet = 1;
const maxWait = elementLifeTimeMs;
const size = parseInt(cacheSize,10);
const mapping = new Map();
const mappingInFlightMiss = new Map();
const bufData = new Array(size);
const bufVisited = new Uint8Array(size);
const bufEdited = new Uint8Array(size);
const bufKey = new Array(size);
const bufTime = new Float64Array(size);
const bufLocked = new Uint8Array(size);
for(let i=0;i<size;i++)
{
let rnd = Math.random();
mapping.set(rnd,i);
bufData[i]="";
bufVisited[i]=0;
bufEdited[i]=0;
bufKey[i]=rnd;
bufTime[i]=0;
bufLocked[i]=0;
}
let ctr = 0;
let ctrEvict = parseInt(cacheSize/2,10);
const loadData = callbackBackingStoreLoad;
const saveData = callbackBackingStoreSave;
let inFlightMissCtr = 0;
this.reload=function(){
for(let i=0;i<size;i++)
{
bufTime[i]=0;
}
};
this.reloadKey=function(key){
if(mapping.has(key))
{
bufTime[mapping.get(key)]=0;
}
};
this.get = function(keyPrm,callbackPrm){
access(keyPrm,callbackPrm,aTypeGet);
};
this.set = function(keyPrm,valuePrm,callbackPrm){
access(keyPrm,callbackPrm,aTypeSet,valuePrm);
};
function access(keyPrm,callbackPrm,aType,valuePrm){
const key = keyPrm;
const callback = callbackPrm;
const value = valuePrm;
if(inFlightMissCtr>=size)
{
setTimeout(function(){
access(key,function(newData){
callback(newData);
},aType,value);
},0);
return;
}
if(mappingInFlightMiss.has(key))
{
setTimeout(function(){
access(key,function(newData){
callback(newData);
},aType,value);
},0);
return;
}
if(mapping.has(key))
{
let slot = mapping.get(key);
if((Date.now() - bufTime[slot]) > maxWait)
{
if(bufLocked[slot])
{
setTimeout(function(){
access(key,function(newData){
callback(newData);
},aType,value);
},0);
}
else
{
if(bufEdited[slot] == 1)
{
bufLocked[slot] = 1;
bufEdited[slot]=0;
mappingInFlightMiss.set(key,1);
inFlightMissCtr++;
saveData(bufKey[slot],bufData[slot],function(){
mappingInFlightMiss.delete(key);
bufLocked[slot] = 0;
inFlightMissCtr--;
mapping.delete(key);
access(key,function(newData){
callback(newData);
},aType,value);
});
}
else
{
mapping.delete(key);
access(key,function(newData){
callback(newData);
},aType,value);
}
}
}
else
{
bufVisited[slot]=1;
bufTime[slot] = Date.now();
if(aType == aTypeSet)
{
bufEdited[slot] = 1;
bufData[slot] = value;
}
callback(bufData[slot]);
}
}
else
{
let ctrFound = -1;
let oldVal = 0;
let oldKey = 0;
while(ctrFound===-1)
{
if(!bufLocked[ctr] && bufVisited[ctr])
{
bufVisited[ctr]=0;
}
ctr++;
if(ctr >= size)
{
ctr=0;
}
if(!bufLocked[ctrEvict] && !bufVisited[ctrEvict])
{
bufLocked[ctrEvict] = 1;
inFlightMissCtr++;
ctrFound = ctrEvict;
oldVal = bufData[ctrFound];
oldKey = bufKey[ctrFound];
}
ctrEvict++;
if(ctrEvict >= size)
{
ctrEvict=0;
}
}
mappingInFlightMiss.set(key,1);
let evict = function(res){
mapping.delete(bufKey[ctrFound]);
bufData[ctrFound]=res;
bufVisited[ctrFound]=0;
bufKey[ctrFound]=key;
bufTime[ctrFound]=Date.now();
bufLocked[ctrFound]=0;
mapping.set(key,ctrFound);
callback(res);
inFlightMissCtr--;
mappingInFlightMiss.delete(key);
};
if(bufEdited[ctrFound] == 1)
{
if(aType == aTypeGet)
bufEdited[ctrFound] = 0;
saveData(oldKey,oldVal,function(){
if(aType == aTypeGet)
loadData(key,evict);
else if(aType == aTypeSet)
evict(value);
});
}
else
{
if(aType == aTypeSet)
bufEdited[ctrFound] = 1;
if(aType == aTypeGet)
loadData(key,evict);
else if(aType == aTypeSet)
evict(value);
}
}
};
this.getAwaitable = function(key){
return new Promise(function(success,fail){
me.get(key,function(data){
success(data);
});
});
}
this.setAwaitable = function(key,value){
return new Promise(function(success,fail){
me.set(key,value,function(data){
success(data);
});
});
}
this.getMultiple = function(callback, ... keys){
let result = [];
let ctr1 = keys.length;
for(let i=0;i<ctr1;i++)
result.push(0);
let ctr2 = 0;
keys.forEach(function(key){
let ctr3 = ctr2++;
me.get(key,function(data){
result[ctr3] = data;
ctr1--;
if(ctr1==0)
{
callback(result);
}
});
});
};
this.setMultiple = function(callback, ... keyValuePairs){
let result = [];
let ctr1 = keyValuePairs.length;
for(let i=0;i<ctr1;i++)
result.push(0);
let ctr2 = 0;
keyValuePairs.forEach(function(pair){
let ctr3 = ctr2++;
me.set(pair.key,pair.value,function(data){
result[ctr3] = data;
ctr1--;
if(ctr1==0)
{
callback(result);
}
});
});
};
this.getMultipleAwaitable = function(... keys){
return new Promise(function(success,fail){
me.getMultiple(function(results){
success(results);
}, ... keys);
});
};
this.setMultipleAwaitable = function(... keyValuePairs){
return new Promise(function(success,fail){
me.setMultiple(function(results){
success(results);
}, ... keyValuePairs);
});
};
this.flush = function(callback){
function waitForReadWrite(callbackW){
if(mappingInFlightMiss.size > 0 ||
bufLocked.reduce((e1,e2)=>{return e1+e2;}) > 0)
{
setTimeout(()=>{ waitForReadWrite(callbackW); },10);
}
else
callbackW();
}
waitForReadWrite(async function(){
for(let i=0;i<size;i++)
{
bufTime[i]=0;
if(bufEdited[i] == 1)
{
await me.setAwaitable(bufKey[i],bufData[i]);
}
}
callback();
});
};
};
exports.Lru = Lru;
Cache Hit Ratio By Image Softening Sample (From Github)
It applies a 5-point smoothing kernel on an 8x8 image using a cache size of 25 and writes every result pixel onto an output buffer through another cache of the same size, then outputs the cache hit ratios. Since every row of pixels reuses the rows before it (every pixel uses its closest 4 neighbors), the data-reuse ratio is greater than 1. This pattern lets just 25 cache elements effectively serve ~75% of the 384 read/write accesses.
Output:
run-time: 5704 ms 320 reads 64 writes
cache read hit ratio=72.91666666666667
cache write hit ratio=77.08333333333333
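A condensed sketch of that access pattern (not the exact Github sample; the image size, 5-point kernel, and two 25-slot caches follow the description above, while the miss counters are illustrative):

'use strict';
let Lru = require("./lrucache.js").Lru;
const W = 8, H = 8;
const input = new Float64Array(W*H).map(() => Math.random());
const output = new Float64Array(W*H);
let readMiss = 0, writeMiss = 0;
const readCache = new Lru(25, function(k,cb){ readMiss++; cb(input[k]); },100000);
const writeCache = new Lru(25, function(k,cb){ cb(output[k]); },100000,
    function(k,v,cb){ writeMiss++; output[k]=v; cb(); });
async function smooth(){
    let reads = 0, writes = 0;
    for(let y=1;y<H-1;y++)
    for(let x=1;x<W-1;x++){
        // 5-point kernel: center pixel + its 4 nearest neighbors
        const idx = [y*W+x, y*W+x-1, y*W+x+1, (y-1)*W+x, (y+1)*W+x];
        const vals = await readCache.getMultipleAwaitable(...idx);
        reads += 5;
        await writeCache.setAwaitable(y*W+x, vals.reduce((a,b)=>a+b)/5);
        writes++;
    }
    console.log("cache read hit ratio=", 100*(reads-readMiss)/reads);
    console.log("cache write hit ratio=", 100*(writes-writeMiss)/writes);
}
smooth();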
History
- 8th April, 2021
- Basic coverage of algorithm, some samples
- 19th September, 2021
- Added item-refresh capability (as a postponed "write" op) and optimized asynchronicity by solving the deadlock problem
- 25th September, 2021
- Added a method for multiple key-value requests to avoid callback-hell, and some benchmarking results
- 29th September, 2021
- Added write-cache feature, different optimizations and new benchmarks
- Updated the title with the word "asynchronous"
- Changed tags to a better set