Pfz.Caching - ViewIds instead of ViewStates

Paulo Zemek


10 May 2010


A framework for caching data that can also store ViewStates in files, reusing identical files to avoid wasting disk space

Introduction

This is a framework for caching serializable objects in general. Its generic Cache class caches any object in memory and on disk. If the object is no longer in memory, it is read back from disk.

It also has a CacheDictionary that does the same, but adds some optimizations for small data so that it does not generate too many "buffer files".

And, finally, the best part: by adding the App_Browsers folder to your project, you can make all ViewStates use this technology, so only a small ViewId is sent to the client instead of the full ViewState. It also deletes unused files after some time and reuses identical files, so disk space is not wasted.

Background

I didn't really face many problems when working with the web. I started programming when computers had only 1 MB or 2 MB of memory. But I see that people in general are a bit lost about where to store their data. The documentation on ViewState and Session is also confusing: the two look interchangeable when they are not.

I first created a caching technology for paginating recordsets.

Then, I created the cache technology for any data.

And, finally, I created the module that does it for ViewStates. It works really well and has been running for over a year without problems.

How It Works (Basic)

The implementation can be very complex, as the code is thread-safe and has to deal with garbage collection properly, but the principle is very simple:

1. Every object is serialized into a "buffer", and a hashcode (checksum) of that buffer is computed. If a directory for that buffer already exists, I look for a file with the same length and, if there is one, check whether it is identical. If it is, the existing file is reused instead of creating a new one. Otherwise, I create a new unique id and save the file. Either way, the id and hashcode of the buffer are returned. Buffers are shared among all sessions but, as the data is read-only, this is not a problem.
2. The id from item 1 is the BufferId, not the ViewId. With the BufferId, I check whether the current session already has ViewState information in memory that contains that BufferId. If none does, a new ViewId is generated. If one exists, its ViewId is returned. Every time I reuse a file, I do a keep-alive, both on disk (updating the date/time of the file) and in memory (calling GCUtils.KeepAlive).
3. Every buffer is also kept in memory through weak references but, to keep recent buffers from being collected, I call GCUtils.KeepAlive. GCUtils.KeepAlive (not GC.KeepAlive) guarantees that an object will survive the next collection.
4. ViewIds are created per SessionId, so there is no risk of one user getting the ViewState of another, even if the internal buffer is the same (in that case, each user has a different ViewId, but they all point to the same buffer). Also, every page generates a NEW ViewState, so I try to get the id of an existing ViewState if possible, or else I create a new file.
5. By default, FileCachePersister runs a cleanup every 30 minutes, deleting files that are more than 4 hours old. This has nothing to do with session-expiration times.
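
The buffer-reuse step from item 1 can be sketched roughly as follows. This is only an illustration and not the Pfz.Caching code: the BufferStore class, the one-directory-per-hash layout, the use of SHA-256 as the checksum and the GUID file names are all assumptions, and the sketch is not thread-safe.

```csharp
using System;
using System.IO;
using System.Linq;
using System.Security.Cryptography;

// Hypothetical sketch; names and file layout are illustrative, not the Pfz.Caching API.
public static class BufferStore
{
    // Saves a buffer under root/<hash>/<id>, reusing an identical file if one exists.
    // Returns the hash and the id that together locate the buffer.
    public static (string Hash, string Id) Save(string root, byte[] buffer)
    {
        string hash;
        using (var sha = SHA256.Create())
            hash = BitConverter.ToString(sha.ComputeHash(buffer)).Replace("-", "");

        string dir = Path.Combine(root, hash);
        Directory.CreateDirectory(dir); // does nothing if the directory already exists

        foreach (string path in Directory.GetFiles(dir))
        {
            // Cheap length check first, full comparison only when lengths match.
            if (new FileInfo(path).Length != buffer.Length) continue;
            if (!File.ReadAllBytes(path).SequenceEqual(buffer)) continue;

            File.SetLastWriteTimeUtc(path, DateTime.UtcNow); // keep-alive on disk
            return (hash, Path.GetFileName(path));
        }

        // No identical file found: store the buffer under a new unique id.
        string id = Guid.NewGuid().ToString("N");
        File.WriteAllBytes(Path.Combine(dir, id), buffer);
        return (hash, id);
    }

    // Returns the stored bytes, or null when the file no longer exists.
    public static byte[] Load(string root, string hash, string id)
    {
        string path = Path.Combine(root, hash, id);
        return File.Exists(path) ? File.ReadAllBytes(path) : null;
    }
}
```

Saving the same bytes twice returns the same hash and id, so only one file is ever written for identical content.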

How It Works (Advanced)

For now, I will not explain the details of the CacheDictionary (which is effectively a dictionary of Cache objects, with some optimizations), nor will I explain the WeakDictionary, because its explanation alone would be longer than this entire article. The important thing to know is that it is a dictionary that allows its items to be collected by the garbage collector, but keeps its recently used values alive.
I will start with the CacheManager. The CacheManager class is responsible for loading and saving the buffers (bytes), as it knows nothing about the real type of the object. Its most important member is a WeakDictionary where the keys are the hashcodes of the buffers and the values are dictionaries that map ids to the serialized bytes. Its internal functions try to find a value in memory using the hashcode and the buffer id. If none is found, they ask the persister to load it and then store the loaded value (if any) in these weak dictionaries, doing a KeepAlive on it. The Save function follows a similar process: it tries to find a compatible buffer in memory in order to reuse its id or, if none is found, calls the persister to save the buffer and returns the generated id.
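
A minimal sketch of that load path, under stated assumptions: BufferIndex, IBufferPersister and InMemoryPersister are hypothetical names, a plain Dictionary of WeakReference<byte[]> stands in for the WeakDictionary, and the KeepAlive step is left out. The real CacheManager is certainly more elaborate.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical persister contract, named here only for illustration.
public interface IBufferPersister
{
    byte[] Load(int hashCode, long id); // returns null when nothing is stored
}

// Trivial persister so that the sketch can run; counts how often it is asked to load.
public sealed class InMemoryPersister : IBufferPersister
{
    public readonly Dictionary<long, byte[]> Stored = new Dictionary<long, byte[]>();
    public int LoadCount;

    public byte[] Load(int hashCode, long id)
    {
        LoadCount++;
        byte[] bytes;
        return Stored.TryGetValue(id, out bytes) ? bytes : null;
    }
}

public sealed class BufferIndex
{
    // hash code -> (buffer id -> weakly referenced bytes)
    private readonly Dictionary<int, Dictionary<long, WeakReference<byte[]>>> _byHash =
        new Dictionary<int, Dictionary<long, WeakReference<byte[]>>>();
    private readonly object _lock = new object();
    private readonly IBufferPersister _persister;

    public BufferIndex(IBufferPersister persister) { _persister = persister; }

    public byte[] Load(int hashCode, long id)
    {
        lock (_lock) // simplification: the lock is held even while the persister loads
        {
            Dictionary<long, WeakReference<byte[]>> byId;
            if (!_byHash.TryGetValue(hashCode, out byId))
            {
                byId = new Dictionary<long, WeakReference<byte[]>>();
                _byHash[hashCode] = byId;
            }

            WeakReference<byte[]> weak;
            byte[] bytes;
            if (byId.TryGetValue(id, out weak) && weak.TryGetTarget(out bytes))
                return bytes; // still in memory, no persister call needed

            bytes = _persister.Load(hashCode, id);
            if (bytes != null)
                byId[id] = new WeakReference<byte[]>(bytes); // remember it, but only weakly
            return bytes;
        }
    }
}
```

As long as some caller still holds the returned array, a second Load for the same hashcode and id is answered from memory and the persister is not touched.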

The generic Cache class is the one that can serialize and deserialize buffers. It uses the CacheManager to read and write these buffers, but the Cache itself has its own WeakReference to the effective object. This is done because, when deserializing a cache object, only the id of the object is needed, not the real object. If too many "identical" objects are put in the cache, all the cache objects can have their effective objects collected while the buffer needed to regenerate them is still in memory. It may look redundant but, in my experience, it is not.
So, you create a Cache for an object. The cache serializes the object and calls the CacheManager, which will try to reuse the id of some identical buffer or will ask for the buffer to be saved... but where will it be saved?
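
The idea of a cache object that keeps only an id plus a weak reference can be sketched like this. It is an assumption-laden illustration, not the real Cache class: WeakCache, the save/load delegates (standing in for the CacheManager) and the ReleaseTarget method are hypothetical, and System.Text.Json is used only to keep the sketch short.

```csharp
using System;
using System.Text.Json;

// Hypothetical sketch; not the real Pfz.Caching Cache<T>.
public sealed class WeakCache<T> where T : class
{
    private readonly string _id;                  // all that must stay in memory
    private readonly Func<string, byte[]> _load;  // stands in for the CacheManager
    private readonly WeakReference<T> _target;

    public WeakCache(T value, Func<byte[], string> save, Func<string, byte[]> load)
    {
        // Serialize once, hand the bytes to the store and keep only the returned id.
        _id = save(JsonSerializer.SerializeToUtf8Bytes(value));
        _load = load;
        _target = new WeakReference<T>(value);
    }

    public T Target
    {
        get
        {
            T value;
            if (_target.TryGetTarget(out value)) return value; // object still alive

            // The object was collected: rebuild it from the stored buffer.
            value = JsonSerializer.Deserialize<T>(_load(_id));
            _target.SetTarget(value);
            return value;
        }
    }

    // For demonstration only: simulates the garbage collector reclaiming the object.
    public void ReleaseTarget() { _target.SetTarget(null); }
}
```

While the original object is alive, Target returns that same instance. After it is released, Target returns an equal object rebuilt from the stored bytes.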

That's the job of the persister. Within the framework, the only persister that exists and is ready to use is the FileCachePersister. When loading, it simply receives the parameters and tries to load the file, if one exists, or returns null. When saving, it receives the name of the file and either updates its date/time to keep it alive or saves the file. It is done this way so that you can easily create a persister that stores data in a database or uses a remote server, which can have its own caching and keeps concurrent processes from accessing the same files at the same time. This is important, as the FileCachePersister works very well with many threads, but only ONE process may use the directory.
OK, the FileCachePersister also has a thread that deletes the old files, but that is nothing really complex.
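
A cleanup pass of that kind might look like the sketch below. The OldFileCleaner name and its parameters are hypothetical, and a System.Threading.Timer is used here instead of a dedicated thread purely for brevity. File age is judged by the last write time, which matches the keep-alive that updates the date/time of a file.

```csharp
using System;
using System.IO;
using System.Threading;

// Hypothetical sketch; not the FileCachePersister implementation.
public static class OldFileCleaner
{
    // Deletes every file under root whose last write time is older than maxAge.
    // Returns the number of files deleted.
    public static int DeleteOlderThan(string root, TimeSpan maxAge)
    {
        int deleted = 0;
        DateTime cutoff = DateTime.UtcNow - maxAge;

        // GetFiles takes a snapshot, so deleting while iterating is safe.
        foreach (string path in Directory.GetFiles(root, "*", SearchOption.AllDirectories))
        {
            try
            {
                if (File.GetLastWriteTimeUtc(path) < cutoff)
                {
                    File.Delete(path);
                    deleted++;
                }
            }
            catch (IOException) { /* file in use; try again on the next pass */ }
            catch (UnauthorizedAccessException) { /* no access; skip this file */ }
        }
        return deleted;
    }

    // Runs the cleanup periodically. Dispose the returned timer to stop it.
    public static Timer Start(string root, TimeSpan interval, TimeSpan maxAge)
    {
        return new Timer(_ =>
        {
            try { DeleteOlderThan(root, maxAge); }
            catch (Exception) { /* e.g. directory missing; never crash the timer thread */ }
        }, null, interval, interval);
    }
}
```

With the defaults described in this article, the call would be something like OldFileCleaner.Start(directory, TimeSpan.FromMinutes(30), TimeSpan.FromHours(4)).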

The ViewState

The ViewState solution is very similar to the Cache solution, but it also carries additional security information. The class responsible for loading and saving ViewStates is the PfzPageStatePersister. Similar to the CacheManager class, it has a dictionary keyed by SessionId, so the ViewIds are exclusive to the current session. The values are dictionaries keyed by ViewId, and the values of those dictionaries are the type of the page that generated the ViewState (so copying the ViewId to another page will not work) and the effective information of the ViewState.
Or, better, a cache of that value. Why? Because if you go from one page to another, generating identical ViewStates, only a new "reference" to the buffer is generated, while the buffer, which can be very large, stays the same. It looks a little more complicated, as it has a Pair, but that is because of the way PageStatePersisters work in general, as they only generate two objects whose only purpose is to be serialized. Not very friendly, to be honest.

But the idea is the same:
1. Check whether the ViewState is in memory. If it is not, ask the persister to load it.
2. When saving, search for an identical ViewState in memory, or create a new one and call the persister to save it.
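
The nested lookup (session, then ViewId, then page type plus buffer) can be sketched without any ASP.NET types. ViewStateIndex and its members are hypothetical names, the linear scan is a simplification, and requiring the page type to match when reusing a ViewId is an assumption made for this sketch.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; not the PfzPageStatePersister implementation.
public sealed class ViewStateIndex
{
    private sealed class Entry
    {
        public Type PageType;   // page that generated the ViewState
        public string BufferId; // reference to the shared, read-only buffer
    }

    // session id -> (view id -> entry)
    private readonly Dictionary<string, Dictionary<string, Entry>> _sessions =
        new Dictionary<string, Dictionary<string, Entry>>();
    private readonly object _lock = new object();

    // Returns the existing ViewId for this session, page type and buffer, or registers a new one.
    public string GetOrAddViewId(string sessionId, Type pageType, string bufferId)
    {
        lock (_lock)
        {
            Dictionary<string, Entry> views;
            if (!_sessions.TryGetValue(sessionId, out views))
            {
                views = new Dictionary<string, Entry>();
                _sessions[sessionId] = views;
            }

            foreach (KeyValuePair<string, Entry> pair in views)
                if (pair.Value.PageType == pageType && pair.Value.BufferId == bufferId)
                    return pair.Key; // identical ViewState already known in this session

            string viewId = Guid.NewGuid().ToString("N");
            views[viewId] = new Entry { PageType = pageType, BufferId = bufferId };
            return viewId;
        }
    }

    // Returns the buffer id, or null when the ViewId is unknown to this session
    // or was generated by a different page type.
    public string Resolve(string sessionId, string viewId, Type pageType)
    {
        lock (_lock)
        {
            Dictionary<string, Entry> views;
            Entry entry;
            if (_sessions.TryGetValue(sessionId, out views)
                && views.TryGetValue(viewId, out entry)
                && entry.PageType == pageType)
                return entry.BufferId;
            return null;
        }
    }
}
```

Two sessions that produce the same buffer get different ViewIds that resolve to the same buffer id, and a ViewId presented by another session or for another page type resolves to null.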

Using the Code

The CacheManager class is where it all starts. And, if you use the framework only to save ViewStates in files, it is also where it ends. In Global.asax, put the following (or something similar):

CacheManager.Persister = new FileCachePersister
 ("c:\\temp\\PfzCachingWebApplication_ViewStates\\");
CacheManager.Start();

Other Interesting Things

GCUtils - I already presented this class in another article, but it has been modified since. By registering to the Collected event, you are informed that a collection has just happened, which is useful if you have any additional memory you can free (for example, by calling TrimExcess). GCUtils.KeepAlive is also very useful, as a way to tell the garbage collector to avoid collecting recently used objects, even if they are only held by weak references.
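
One way such a helper can be built is sketched below. This is not the GCUtils source: KeepAliveSketch and its members are hypothetical, and detecting collections through a finalizable sentinel object is an assumption about the technique. The helper holds strong references until a collection is detected and then drops them, so an object passed to KeepAlive normally survives the next collection.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch of a "survive the next collection" helper; NOT the real GCUtils.
public static class KeepAliveSketch
{
    private static readonly object Sync = new object();
    private static List<object> _alive = new List<object>();

    // Raised after a collection was detected (on the finalizer thread when armed via Start).
    public static event Action Collected;

    // Holds a strong reference to the item until the next detected collection.
    public static void KeepAlive(object item)
    {
        lock (Sync) _alive.Add(item);
    }

    public static int AliveCount
    {
        get { lock (Sync) return _alive.Count; }
    }

    // Drops the strong references, then notifies subscribers.
    public static void OnCollected()
    {
        lock (Sync) _alive = new List<object>();
        Action handler = Collected;
        if (handler != null) handler();
    }

    // Arms detection: the sentinel is unreachable, so its finalizer runs
    // after the next collection that reclaims it.
    public static void Start()
    {
        new Sentinel();
    }

    private sealed class Sentinel
    {
        ~Sentinel()
        {
            if (Environment.HasShutdownStarted) return;
            OnCollected();
            new Sentinel(); // re-arm for the following collection
        }
    }
}
```

Handlers attached to Collected run on the finalizer thread in this sketch, so they should be short and must not throw.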

Cache class usage - You can use the Cache class even in Windows Forms, or when you really need to put some large information in the Session instead of putting it in the ViewState.

For example, instead of putting the object itself in the session, you create a cache object for it and put the cache object in the session:

Session["MyVeryLargeItem"] = new Cache<byte[]>(new byte[5000000]);

And, to read it, you do:

byte[] data = ((Cache<byte[]>)Session["MyVeryLargeItem"]).Target;

In normal code, you would still need a cast, but you would not create the cache, nor cast the value back to a cache before reading the target. This additional step, however, makes your 5 MB object take something like 32 bytes in the session, while the 5 MB are kept in files.

Future Articles on the Same Theme

In future articles, I plan to explain, with good examples, the usage of the Cache class, the CacheDictionary, the WeakDictionary and especially the StatedPage. Those are part of my personal framework and not really part of the ViewState solution, as they work independently of the ViewState and can even be used in non-web applications.

For this article, I only wanted to show the ViewState solution, which really works (it has been in production for a long time now) and does not change the way programming is done, except for the fact that you can now paginate records and store the full dataset in the ViewState, as it will not be sent to the client.

History

7th October, 2009: Initial post

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)