A Quick Performance Trick
Take a look at this peculiar thing:
public class StringCache
{
private readonly Dictionary<string, string> _cache;
public StringCache()
{
_cache = new Dictionary<string, string>();
}
public virtual string Get(string value)
{
if (value == null) return null;
string result;
if (!_cache.TryGetValue(value, out result))
{
result = value;
_cache[result] = result;
}
return result;
}
}
As you can see, it has a Dictionary
mapping one string
to another with both key and value being identical, which seems fairly pointless. But it is useful, as when you call the Get
method, it will either return a reference to what you supply if it's not seen it before, or a reference to a previous instance of the same thing if it has.
I created this as an experiment to use in the Data Access Layer of a project I was working on. ADO was calling a stored procedure which was returning a lot of rows, and often I would see the same string
value appearing for the same field over and over again. Millions of times in some cases. You might think this is indicative of a less than perfectly normalised database, to which I would reply that it's a 'real world' example. :)
Anyway, I figured I could save some memory by using this - lots of references to the same string
instance rather than lots of references to individual instances of the same string
. At the start of the read loop, I'd create one of these, pass the string
s through it and let it drop out of scope at the end of the loop.
It reduced the memory footprint of the service nicely, but I was expecting to take a hit in the overall read time, after all, each string
needs a hashcode calculating and some dictionary 'stuff' doing additionally. In fact, it speeded the read up. Not really sure why, there's more to do but perhaps there was room for that based on the speed of the IO being the bottleneck. That could only make it the same speed, not faster, so I assume it's something to do with garbage collection.
Anyway who cares. I've retrofitted these little caches in lots of DAL functions which return endless records and the result is faster times and less memory consumption.
Simple, and a little bit beautiful.