Introduction
Having known all my years as a programmer what caching is and what it is used for, I have never once been in a situation where it was critically necessary. We all know how caching can dramatically boost performance in certain situations, but more often than not, developers do not use caching in those situations.
In this article I will take you through some code for developing a simple but generic caching object. Each item in the cache is stored as an object
. The cache will use the Least-Recently-Used (LRU) replacement method to expire cached items. I plan writing a follow up article on a more advanced cache with more features and more replacement methods supported.
A Quick Overview of the Process
Very generally speaking, a user (usually a piece of software) requires some resource. This resource can really be anything including large files, datasets/datareaders or a resource over the web to name a few. Whenever the acquisition of such a resource is an expensive operation in terms of time and/or CPU cycles, the resource becomes a candidate for caching.
The user will typically request this resource from the cache first to avoid having to undergo the expensive acquisition of it. If the resource is contained in the cache, it is returned to the requestor, otherwise, it is acquired elsewhere and handed to the cache to store for future retrieval.
The cache explained in this article works on this model. The user or requestor is an application or service. It will reference the GenericCache class library. The generic cache provides a callback to acquire a resource from the user and add it to the cache. When the user wants to acquire the resource, the Fetch()
method is called and the item is retrieved. If the item is not in the cache, the GenericCache
object will use the user's implementation of the callback to acquire the resource, add it to the cache and return the cached resource.
Using the code
The source code provided in the downloads section is a complete Visual Studio 2003 project with the solution file.
Note: It is assumed that the reader is familiar with the Dictionary
data type and the key-value pair model that a dictionary follows. This data type is the basis for this caching library.
Containing the Cached Items
In my search to find a suitable container class for the cached items, I came across the NameObjectCollectionBase
abstract class provided by the .NET Framework in the System.Collections.Specialized
namespace. This class provides functionality to store items in a hash table fashion as well as in an indexed ArrayList
. This suited the need I had to quickly access items in a FIFO manner and also access the items with unique keys. Hash table keys are, by their nature, not unique but in the derived class I did not allow duplicate keys to be used.
Let's have a look at the class:
using System;
using System.Collections.Specialized;
namespace GenericCache
{
internal class Dictionary : NameObjectCollectionBase
{
public Dictionary( int p_initialCapacity ) : base( p_initialCapacity )
{
}
public object Remove( out string p_key )
{
object toReturn;
if ( this.Count < 1 )
{
p_key = null;
return null;
}
toReturn = this.BaseGet( 0 );
p_key = this.BaseGetKey( 0 );
this.BaseRemoveAt( 0 );
return toReturn;
}
public object this[ string p_key ]
{
get
{
return ResetItem( p_key );
}
}
public void Add( string p_key, ref object p_item )
{
this.BaseAdd( p_key, p_item );
}
object ResetItem( string p_key )
{
object tempItem;
tempItem = this.BaseGet( p_key );
if ( tempItem != null )
{
this.BaseRemove( p_key );
this.BaseAdd( p_key, tempItem );
}
return tempItem;
}
public void Clear()
{
this.BaseClear();
}
}
}
Most of the code is self explanatory but for clarity, I will explain what some of the methods do:
public object Remove( out string p_key );
The replacement method used for expired cache items, is the Least-Recently-Used method. This method expires cache items, from the ones that have been accessed long ago, to the ones that have been accessed most recently. The NameObjectCollectionBase
class uses an ArrayList
underlying, this provides us with the functionality to remove the oldest items first. Later on we'll see that when a cached item is accessed, it is removed from the container and then added again to reset its age.
First this method checks that the container contains items. If not, it simply returns null
and the out
parameter is set to null
as well. If it does contain items, the item in the underlying ArrayList
with index 0 is removed. This ensures that the least recently used items are removed first. The method returns the removed item and sets the out
p_key
to the removed item's key.
object ResetItem( string p_key );
This private method resets an item's age by removing it from the underlying ArrayList
(from a specific index) and adding it again at the end of the underlying ArrayList
.
Two Events Defined
In our main Cache
object (discussed a little further down), two events and two delegates are defined:
public delegate void CacheItemExpiredEventHandler( object p_source, CacheItemExpiredEventArgs p_e );
public event CacheItemExpiredEventHandler CacheItemExpired;
The CacheItemExpiredEventHandler
is a delegate to handle the event of a cache item expiring. This event is fired whenever an item has been removed from the cache as a result of the capacity being reached. The following is the definition for the CacheItemExpiredEventArgs
class:
using System;
namespace GenericCache
{
public class CacheItemExpiredEventArgs
{
string key;
object item;
public CacheItemExpiredEventArgs( string p_key, ref object p_item )
{
key = p_key;
item = p_item;
}
public object Item
{
get
{
return item;
}
}
public string Key
{
get
{
return key;
}
}
}
}
It simply holds a reference to the object that has expired and the key that was used for access to that object.
public delegate object FetchItemEventHandler( object p_source, FetchItemEventArgs p_e );
public event FetchItemEventHandler FetchItem;
The FetchItemEventHandler
is a delegate to handle the event of a cache item being fetched from the user for caching. This event is fired whenever an item has been requested from the cache but was not found. The cache then asks the user to provide the item as a returned object in the event implementation. The following is the definition for the FetchItemEventArgs
class:
using System;
namespace GenericCache
{
public class FetchItemEventArgs
{
string key;
public FetchItemEventArgs( string p_key )
{
key = p_key;
}
public string Key
{
get
{
return key;
}
}
}
}
It simply holds a reference to the key that was used in acquiring the item.
This event has a return type of object
. The user "catching" this event provides the requested item by returning it from the event implementation.
The Cache Class
Here is the code and an explanation for some of the methods following it:
using System;
using System.Threading;
namespace GenericCache
{
public delegate void CacheItemExpiredEventHandler( object p_source,
CacheItemExpiredEventArgs p_e );
public delegate object FetchItemEventHandler( object p_source,
FetchItemEventArgs p_e );
public class Cache
{
public event CacheItemExpiredEventHandler CacheItemExpired;
public event FetchItemEventHandler FetchItem;
Dictionary dictionary;
int capacity;
ReaderWriterLock dictionaryLock;
public Cache( int p_initialSize, int p_capacity ) : base()
{
if ( p_initialSize < 1 )
p_initialSize = 1;
if ( p_capacity < 0 )
p_capacity = 0;
dictionary = new Dictionary( p_initialSize );
capacity = p_capacity;
dictionaryLock = new ReaderWriterLock();
}
public object Fetch( string p_key )
{
object tempItem;
dictionaryLock.AcquireReaderLock( -1 );
try
{
tempItem = dictionary[ p_key ];
}
finally
{
dictionaryLock.ReleaseReaderLock();
}
if ( tempItem != null )
{
return tempItem;
}
if ( FetchItem == null )
return null;
tempItem = FetchItem( this, new FetchItemEventArgs( p_key ) );
if ( tempItem == null )
return null;
RemoveItems();
dictionaryLock.AcquireWriterLock( -1 );
try
{
dictionary.Add( p_key, ref tempItem);
}
finally
{
dictionaryLock.ReleaseWriterLock();
}
return tempItem;
}
public int Count
{
get
{
dictionaryLock.AcquireReaderLock( -1 );
try
{
return dictionary.Count;
}
finally
{
dictionaryLock.ReleaseReaderLock();
}
}
}
public void Clear()
{
dictionaryLock.AcquireWriterLock( -1 );
dictionary.Clear();
dictionaryLock.ReleaseWriterLock();
}
void RemoveItems()
{
string tempKey;
object tempItem;
dictionaryLock.AcquireWriterLock( -1 );
try
{
if ( capacity == 0 )
return;
while ( capacity - 1 < dictionary.Count )
{
tempItem = dictionary.Remove( out tempKey );
if ( CacheItemExpired != null )
CacheItemExpired( this,
new CacheItemExpiredEventArgs( tempKey,
ref tempItem ) );
}
}
finally
{
dictionaryLock.ReleaseWriterLock();
}
}
public int Capacity
{
get
{
return capacity;
}
set
{
capacity = value;
if ( capacity < 0 )
capacity = 0;
}
}
}
}
This class is the only one that will be publicly exposed to the user. All the other classes so far have been internal
classes as it is used only by the Cache
class. Cache
is completely thread-safe.
The Cache
class holds only the two events discussed earlier and two private variables namely:
Dictionary dictionary;
int capacity;
The dictionary
variable holds an instance of the Dictionary
class also discussed earlier. The capacity
variable holds the maximum number of items that can be held by the cache before it starts expiring items. This capacity can be set to 0 which will indicate that the user does not want items to be expired ever.
The following methods are discussed:
public Cache( int p_initialSize, int p_capacity ) : base()
- The only constructor.
The two if
statements force p_initialSize
to be of minimum 1 and p_capacity
to be of minimum 0. The local variable capacity
is set to p_capacity
and an instance of a Dictionary
is created and stored in dictionary
.
void RemoveItems();
The RemoveItems
private method will remove items from the dictionary until the number of items is 1 less than the capacity of the cache. It is called before an item is inserted into the dictionary and ensures that the item can be inserted without further capacity checking. With each removal of an item, the CacheItemExpired
event is raised to inform the user that an item has expired.
public object Fetch( string p_key );
A user would use this public method to acquire a cached item from the cache. The key for the required object is provided through the p_key
parameter.
The dictionary
indexer is called by tempItem = dictionary[ p_key ];
. In this method, the item's age is automatically reset if it exists in the dictionary. If the item is found in the cache, it is simply returned.
If the item is not found, tempItem
would be null and the item must be fetched from the user. The FetchItem
event is raised, and what is returned as a result is the item to be cached. Since the cache does not store nulls, null
is returned if the event yielded a null value and nothing is added to the cache. RemoveItems()
is called to ensure that there is space in the dictionary to add a new item. The newly acquired item is added to the cache and returned to be used by the user.
Conclusion
Although there are many more (advanced) features one can add to such a cache implementation, I felt that what I have discussed in this article, is the bare minimum one would need to add efficiency to expensive resource acquisition. Readers of this article can contact me on the forum message board below, should they have any questions or comments regarding this article.
History
- 2005-10-17 - Used a
ReaderWriterLock
instead of a normal lock. - 2005-10-06 - Article submitted.