Introduction
Unlike memcached, Redis can be used for persistent storage, and not just as a volatile cache. As it happens, Redis is a blazingly fast database, and can give your application strikingly better performance if used correctly. As a caveat, there are risks associated with using Redis as a primary data store, and these risks grow significantly if it is configured incorrectly; do your research before deciding to scrap your "regular" SQL database in favor of Redis.
For all the performance boost that it gives to your application owing to its amazing speed, the fact remains that Redis is fundamentally a key/value store, and does not support indexes. It can therefore be a bit of a challenge when you need to index your values so that you can later search and retrieve them through those indexes. As it turns out, we can work our way around this limitation by using Redis' remarkably useful, natively-provided data types. In this article, we explore how we can use sets and sorted sets to index records, sort them by date, and retrieve the ones that fall within a date range.
Background
I have been using MySQL for a while now, and have grown to love it over the years. I use it for all kinds of enterprise applications, and also for .NET applications whenever I am free to choose the database myself. I'm not about to trash MySQL any time soon; nonetheless, I mention it here so that we have a proper frame of reference for what follows.
I had long dismissed Redis as just an equivalent (and perhaps faster) product, vis-a-vis memcached. I later came across a very good article explaining that Redis can also be used for persistent storage, and that got me thinking. So one weekend I found myself, more out of curiosity than anything else, doing a quick spike with Redis, which turned out quite satisfactorily. Following this, I decided to give it a real try: I have been working on a .NET application and, since the project is still in its nascent stages, decided to give Redis a shot as the primary persistent storage engine, doing away entirely with MySQL in the deal. I ran some unit tests on both the Redis version of my code and the (previous) MySQL version, and the results were astonishing: the Redis version was easily faster by a factor of 5! Of course, this was in my particular case, and your mileage may vary.
Having said this, the "problem" with Redis is that, basically being a key/value store, it does not support indexes. Coming from the SQL world, this is a very critical requirement for me, and this too was easily solved using Redis' rich data type system. Redis does not just support simple scalar values (simplistically called "string," which is really a misnomer, since it can store a whole lot of things beyond just strings); it also supports compound types like lists, sets, hashes, and yes, sorted sets, one of the data types that I will be talking about in this article.
Prior to this, I was quite flummoxed as to how to index dates, especially when it comes to retrieving records by date ranges. As it turns out, this can easily be accomplished using the sorted sets data type in Redis.
About Redis Complex Types
Redis supports quite a few complex types. As mentioned in the previous section, we have lists, sets, sorted sets, and hashes, though we won't be talking about lists and hashes in this article. However, we do use the other two types, that is, sets and sorted sets, extensively here.
Sets are exactly that: a collection of unique values with no notion of order amongst them. Sorted sets are slightly different beasts though. Every member is saved with a numeric score, and the members can be accessed at any time by score or by a range of scores. As it turns out, this can be quite handy while storing dates. We convert every date to be stored into its 'tick' value (in .NET, DateTime.Ticks, the number of 100-nanosecond intervals since 1 January 0001) and use it as the score, thereby getting a set ordered by date. We can then use Redis' ZRANGEBYSCORE command to retrieve values that fall within a particular range.
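One thing worth knowing before you use ticks as scores: Redis stores sorted set scores as double-precision floating point numbers, which represent integers exactly only up to 2^53, and tick counts for present-day dates are larger than that. The following is a minimal sketch (DateScore and its methods are names I made up for illustration; they are not part of the attached archive) showing that midnight values, like the dates of birth used in this article, survive the conversion, while an arbitrary timestamp may get rounded by a few microseconds. If that matters to you, whole seconds since the Unix epoch are one alternative that fits comfortably.

```csharp
using System;

// Hypothetical helper, for illustration only.
static class DateScore {
    // Score used in this article: the tick count, cast to double.
    public static double ToScore (DateTime date) {
        return (double) date.Ticks;
    }

    // True when the tick count comes back unchanged after the cast to double.
    public static bool SurvivesRoundTrip (DateTime date) {
        return (long) ToScore (date) == date.Ticks;
    }

    // Alternative score: whole seconds since 1970-01-01.
    // Assumption: the date is treated as UTC, whatever its Kind says.
    public static double ToUnixScore (DateTime date) {
        DateTime utc = DateTime.SpecifyKind (date, DateTimeKind.Utc);
        return new DateTimeOffset (utc).ToUnixTimeSeconds ();
    }
}
```

Here DateScore.SurvivesRoundTrip (new DateTime (1971, 5, 5)) is true, whereas the same date with a single tick added no longer round-trips.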
Redis, as a key-value store, can store just about any type of value, as long as it has a unique key defined for it, regardless of whether the value is a string type or a complex type.
Using the Code
The code is not terribly efficient: for one thing, we are saving the entire object as is to all the indexes. A "real world" approach could be to save the object once as a scalar "string" and store only its id in the various indexes. However, I could not have done this without increasing the length of my code, so I just skipped it. This has been left as an exercise for the reader.
The attached archive contains several files. We have a Person class that allows us to save Person class objects, with details like the person's name, gender, country of origin, and date of birth. I have deliberately kept the class simple so that we do not have to deal with irrelevant details. There is also a static RedisAdaptor class that has helper functions for both saving Person objects and retrieving them.
It is recommended that you have a look at the attached archive for a more complete understanding of the code.
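For readers who want a head start on the exercise mentioned at the beginning of this section, here is a rough sketch of the id-based layout. It is my own illustration and not part of the attached archive: the "person:" key prefix, the IdIndexSketch class, and the idea that every record has a string id are all assumptions (the Person class in this article has no id). The JSON is stored once under its own key, and the index set holds only the id.

```csharp
using System.Collections.Generic;
using System.Linq;
using StackExchange.Redis;

// Hypothetical helper, for illustration only.
static class IdIndexSketch {
    // Assumed key layout: one string key per record, e.g. "person:42".
    public static string RecordKey (string id) {
        return "person:" + id;
    }

    // Store the JSON once, and add only the id to the index set.
    public static void Store (IDatabase db, string id, string json, string indexKey) {
        db.StringSet (RecordKey (id), json);
        db.SetAdd (indexKey, id);
    }

    // Read the ids from the index set, then fetch all records in one call.
    public static List<string> Load (IDatabase db, string indexKey) {
        RedisKey[] keys = db.SetMembers (indexKey)
            .Select (id => (RedisKey) RecordKey (id.ToString ()))
            .ToArray ();
        if (keys.Length == 0) {
            return new List<string> ();   // nothing indexed, nothing to fetch
        }
        RedisValue[] values = db.StringGet (keys);
        // Skip ids whose record has been deleted in the meantime.
        return values.Where (v => v.HasValue).Select (v => v.ToString ()).ToList ();
    }
}
```

The price of this layout is a second round trip on every read; what you gain is that each record's JSON lives in exactly one place.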
External Libraries
As for external libraries, the Redis website maintains a comprehensive list of client libraries for different languages and platforms. I have used the StackExchange.Redis library, since it is open source and has very friendly licensing terms, as against some of the other entries in the C# section. I have also used the Newtonsoft JSON library (Json.NET). This latter library is also free for commercial use, though you can purchase a license should you be inclined to do so.
The Main Program
In the main program, we have a loop that runs through every single day of the arbitrarily-chosen year, 1971, and creates a new Person object with that day as the date of birth. As for the gender, odd days of the month get female and even days get male; for the country, since only three countries have been specified in this contrived example, every single Person is from India, the USA, or Great Britain. For the name, we use a randomly generated string, which is actually a GUID stripped of all non-alphanumeric chars.
    // Needs: using System; using System.Collections.Generic;
    static void Main (string[] args) {
        const int YEAR = 1971;
        // Create and store one Person for every calendar day of YEAR.
        for (int month = 1; month <= 12; ++month) {
            // Ask for the real month length instead of catching invalid dates.
            int daysInMonth = DateTime.DaysInMonth (YEAR, month);
            for (int day = 1; day <= daysInMonth; ++day) {
                string name = Util.GetAnyName ();
                DateTime dob = new DateTime (YEAR, month, day);
                // Even days of the month are male, odd days female.
                Gender gender = (day % 2 == 0) ? Gender.MALE : Gender.FEMALE;
                // The country depends on the day of the month, modulo 3.
                Country country = Country.INDIA;
                if (day % 3 == 1) {
                    country = Country.USA;
                } else if (day % 3 == 2) {
                    country = Country.GB;
                }
                Person person = new Person (name, gender, country, dob);
                RedisAdaptor.StorePersonObject (person);
            }
        }
        // Retrieve everyone born from 5 May to 7 May (both inclusive).
        // The constructor avoids culture-dependent date parsing.
        DateTime fromDate = new DateTime (YEAR, 5, 5);
        DateTime toDate = new DateTime (YEAR, 5, 7);
        List<Person> persons = RedisAdaptor.RetrievePersonObjects (fromDate, toDate);
        Console.WriteLine ("Retrieved values in specified date range:");
        foreach (Person person in persons) {
            Console.WriteLine (person);
        }
        // Retrieve everyone who is both female and from the USA.
        List<Person> personsSelection = RedisAdaptor.RetrieveSelection (Gender.FEMALE, Country.USA);
        Console.WriteLine ("Retrieved values in selection:");
        foreach (Person person in personsSelection) {
            Console.WriteLine (person);
        }
    }
The RedisAdaptor class has a single function for storing and indexing the value passed to it, a function each for retrieving values by date range, by gender, and by country, and one more for retrieving by gender and country combined. We also have a few static fields and constants for this.
Note that the "indexing" is done during the storage itself (naturally). In this case, we index by gender, by country, and by date of birth.
    // Needs: using System; using System.Collections.Generic;
    //        using Newtonsoft.Json; using StackExchange.Redis;
    static class RedisAdaptor {
        const string REDIS_HOST = "127.0.0.1";
        private static ConnectionMultiplexer _redis;

        // Keys of the Redis sets (and one sorted set) that serve as indexes.
        const string REDIS_DOB_INDEX = "REDIS_DOB_INDEX";
        const string REDIS_MALE_INDEX = "REDIS_MALE_INDEX";
        const string REDIS_FEMALE_INDEX = "REDIS_FEMALE_INDEX";
        const string REDIS_C_IN_INDEX = "REDIS_C_IN_INDEX";
        const string REDIS_C_USA_INDEX = "REDIS_C_USA_INDEX";
        const string REDIS_C_GB_INDEX = "REDIS_C_GB_INDEX";

        static RedisAdaptor () {
            // A single multiplexer is shared by every call.
            _redis = ConnectionMultiplexer.Connect (REDIS_HOST);
        }

        // Maps a gender to the key of its index set.
        private static string GenderKey (Gender gender) {
            return gender == Gender.MALE ? REDIS_MALE_INDEX : REDIS_FEMALE_INDEX;
        }

        // Maps a country to the key of its index set (India is the default).
        private static string CountryKey (Country country) {
            if (country == Country.USA) {
                return REDIS_C_USA_INDEX;
            } else if (country == Country.GB) {
                return REDIS_C_GB_INDEX;
            }
            return REDIS_C_IN_INDEX;
        }

        // Deserializes the JSON members returned by Redis into Person objects.
        private static List<Person> ToPersons (RedisValue[] vals) {
            List<Person> opList = new List<Person> (vals.Length);
            foreach (RedisValue val in vals) {
                opList.Add (JsonConvert.DeserializeObject<Person> (val.ToString ()));
            }
            return opList;
        }

        // Saves the person (as JSON) into the gender, country, and date indexes.
        public static void StorePersonObject (Person person) {
            string personJson = JsonConvert.SerializeObject (person);
            IDatabase db = _redis.GetDatabase ();
            db.SetAdd (GenderKey (person.Gender), personJson);
            db.SetAdd (CountryKey (person.Country), personJson);
            // The score is the tick count of the date of birth.
            double dateTicks = (double) person.DoB.Ticks;
            db.SortedSetAdd (REDIS_DOB_INDEX, personJson, dateTicks);
        }

        // Retrieves everyone born between fromDate and toDate, both inclusive.
        public static List<Person> RetrievePersonObjects (DateTime fromDate, DateTime toDate) {
            double fromTicks = fromDate.Ticks;
            double toTicks = toDate.Ticks;
            IDatabase db = _redis.GetDatabase ();
            return ToPersons (db.SortedSetRangeByScore (REDIS_DOB_INDEX, fromTicks, toTicks));
        }

        public static List<Person> RetrievePersonObjects (Gender gender) {
            IDatabase db = _redis.GetDatabase ();
            return ToPersons (db.SetMembers (GenderKey (gender)));
        }

        public static List<Person> RetrievePersonObjects (Country country) {
            IDatabase db = _redis.GetDatabase ();
            return ToPersons (db.SetMembers (CountryKey (country)));
        }

        // Retrieves everyone who matches both the gender AND the country,
        // by intersecting the two index sets on the server.
        public static List<Person> RetrieveSelection (Gender gender, Country country) {
            IDatabase db = _redis.GetDatabase ();
            RedisKey[] keys = new RedisKey[] { GenderKey (gender), CountryKey (country) };
            return ToPersons (db.SetCombine (SetOperation.Intersect, keys));
        }
    }
Each of the const keys defined in this static class, like REDIS_DOB_INDEX, REDIS_MALE_INDEX, REDIS_FEMALE_INDEX, and so forth, is actually the key of an individual set (or, in the case of REDIS_DOB_INDEX, a sorted set) in the Redis store.
Once we have saved the values and also created indexed sets for them in Redis, we can retrieve them using the various versions of the overloaded function, RetrievePersonObjects, with the date range, gender, or country parameters.
Retrieving by gender is pretty straightforward: based on the specified gender, we dip into one of the two gender sets and retrieve the required values. The same goes for retrieval by country, where we have one set for each of the three countries specified.
To retrieve values by date range, we use the SortedSetRangeByScore method in the Redis database object. We pass it three arguments: the first is the key of the sorted set itself, while the others are the min and max scores, both of which are inclusive by default. (The Redis documentation for ZRANGEBYSCORE has more on this.)
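Since the bounds are inclusive, a query from 5 May to 7 May only picks up those 7 May records whose score is exactly midnight. That is fine for the dates of birth in this article, but if your scores carry a time of day, a half-open range may be closer to what you want. A minimal sketch, assuming StackExchange.Redis; DayRangeSketch, Bounds, and Query are hypothetical names of my own and not part of the attached archive:

```csharp
using System;
using StackExchange.Redis;

// Hypothetical helper, for illustration only.
static class DayRangeSketch {
    // Half-open bounds [start, stop): from midnight of firstDay up to,
    // but not including, midnight of the day after lastDay.
    public static void Bounds (DateTime firstDay, DateTime lastDay, out double start, out double stop) {
        start = firstDay.Date.Ticks;
        stop = lastDay.Date.AddDays (1).Ticks;
    }

    // Returns every member whose score falls on firstDay through lastDay.
    public static RedisValue[] Query (IDatabase db, RedisKey indexKey, DateTime firstDay, DateTime lastDay) {
        Bounds (firstDay, lastDay, out double start, out double stop);
        // Exclude.Stop makes the upper bound exclusive.
        return db.SortedSetRangeByScore (indexKey, start, stop, Exclude.Stop);
    }
}
```

With this, a record stamped 7 May at 18:30 is still returned for a 5 May to 7 May query, while one stamped 8 May at midnight is not.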
One of the more interesting features of working with sets in Redis is its splendid use of set semantics. You can specify two or more sets, and have Redis compute their union, intersection, or difference. In the RedisAdaptor class above, have a look at the function towards the bottom, RetrieveSelection, which declares two parameters: Gender and Country. This function returns values based on both the Person's gender AND their country of origin. To this end, we use the SetCombine method in the Redis database object, with SetOperation.Intersect.
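If you would like to sanity-check what the intersection should return, the same set logic can be reproduced locally with a HashSet. The sketch below is my own addition and does not talk to Redis at all (SelectionCheck is a hypothetical helper); it merely rebuilds the odd-day (female) and day % 3 == 1 (USA) rules from the main program and intersects the two sets, much as an intersection on the server would.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, for illustration only; no Redis involved.
static class SelectionCheck {
    // Dates of birth that the main program tags as both FEMALE and USA.
    public static HashSet<DateTime> FemaleAndUsa (int year) {
        HashSet<DateTime> female = new HashSet<DateTime> ();
        HashSet<DateTime> usa = new HashSet<DateTime> ();
        for (int month = 1; month <= 12; ++month) {
            int daysInMonth = DateTime.DaysInMonth (year, month);
            for (int day = 1; day <= daysInMonth; ++day) {
                DateTime dob = new DateTime (year, month, day);
                if (day % 2 != 0) {
                    female.Add (dob);   // odd days are female
                }
                if (day % 3 == 1) {
                    usa.Add (dob);      // day % 3 == 1 means USA
                }
            }
        }
        // Keep only the dates present in both sets.
        female.IntersectWith (usa);
        return female;
    }
}
```

For 1971 this yields 67 dates, so on a freshly flushed database, RetrieveSelection (Gender.FEMALE, Country.USA) should return 67 records.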
Cleaning up Redis
Since every run generates fresh random names, each run of the program adds a new batch of records to your Redis instance. In case you would like to get rid of the values already stored in Redis, I would recommend that you call flushdb from the Redis CLI to delete all the keys in the current database (the Redis documentation has instructions for installing Redis and using the CLI). You could also use the more perilous flushall call; do note, however, that flushall deletes all keys across all databases! So I would ask you to use flushall like a loaded gun: very, very carefully.
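If you would rather not flush anything at all, a gentler option is to delete just the six index keys that this program creates. A minimal sketch, assuming StackExchange.Redis and the key names from the RedisAdaptor class (CleanupSketch is a hypothetical helper, not part of the attached archive):

```csharp
using StackExchange.Redis;

// Hypothetical helper, for illustration only.
static class CleanupSketch {
    // The index keys defined as constants in RedisAdaptor.
    public static readonly RedisKey[] IndexKeys = new RedisKey[] {
        "REDIS_DOB_INDEX",
        "REDIS_MALE_INDEX",
        "REDIS_FEMALE_INDEX",
        "REDIS_C_IN_INDEX",
        "REDIS_C_USA_INDEX",
        "REDIS_C_GB_INDEX"
    };

    // Deletes the index keys and returns how many of them actually existed.
    public static long DeleteIndexes (IDatabase db) {
        return db.KeyDelete (IndexKeys);
    }
}
```

Any other keys in the same database are left untouched.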
Points of Interest
Hey, if you're new to Redis, let me add that we haven't even scratched the surface of what Redis can do for you. If this article has piqued your curiosity, I would recommend that you also check out the pub/sub functionality that comes out of the box.
Another very nifty feature is HyperLogLog, available in Redis for a while now, for estimating the cardinality (that is, the number of distinct elements) of a collection. Adding an element is an O(1) operation, and yet another advantage is that it uses a small, bounded amount of space (at most 12 KB per key), regardless of how many elements you add. Of course, the tradeoff is that the count is approximate (the Redis documentation quotes a standard error of 0.81%), albeit something that you should be able to live with.
You may also want to check out the Redis documentation, which covers each data type in a very minimal format, and a lot more.
History