Some tips and tricks for Azure Table storage

21 Oct 2013 | CPOL | 2 min read
This part of the article walks you through tips for fetching entities quickly.

Introduction

Do you work with your application's diagnostic logs, and do those logs take an eternity to show up?

Searching by partition key and row key does the trick, of course, but what about WADLogsTable, the center of the storm? Doesn't fetching the last five minutes of logs take painfully longer as the number of logs keeps growing? We'll try to solve that problem here.

In this article we work with WADLogsTable, the table that receives all of the application's diagnostic data.

Viewing and deleting these logs is very time consuming, isn't it? In fact, depending on the number of logs written by the application, deleting an hour of logs can itself take an hour.

First we have to fetch the logs for the time period in question, and then issue a delete command for each entity. The image below shows a sample entity in WADLogsTable.

Image 1

Now, if I want to see all logs from today, should I filter on Timestamp, which looks like the obvious choice? Let's see what happens when we query entities by timestamp.

Since the Azure Table service is derived from WCF Data Services, all the query string filtering features are available here as well. So, to fetch today's logs, my request URL should look like this:

{your azure account root url}/WADLogsTable()?$filter=Timestamp ge datetime'2013-10-06T00:00:00'

Below is a screenshot from Fiddler. It shows that querying by timestamp results in a complete table scan to match the filter. You can see that multiple GET requests were triggered, each hitting a separate partition key with the given filter (today's logs).

Image 2

Below is the code snippet used:

if (!String.IsNullOrEmpty(TableQuery))
{
    // Filtering on Timestamp makes the service compare every entity in the table.
    string since = DateTime.Today.ToString("yyyy-MM-ddTHH:mm:ss", DateTimeFormatInfo.InvariantInfo);
    entities = (from entity in tableServiceContext.CreateQuery<GenericEntity>(SelectedTableName)
                    .AddQueryOption("$filter", "Timestamp ge datetime'" + since + "'")
                select entity);
}

It took over 7 minutes to fetch the day's logs. The time varies with the number of log entries the application has written, because this query compares every entity in the table against the filter.

A closer look at the partition key reveals that, in WADLogsTable, it is nothing but the number of "ticks" of the time when the log was generated. It has one-minute precision, so all logs generated within the same minute stay in the same partition. If we query our entities by partition, it is dramatically faster, because the service no longer has to compare every entity in the table. Let us query today's logs based on partitions.
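To make the relationship between partition key and time concrete, here is a small sketch that turns a partition key back into the time it stands for. The class and method names are made up for illustration, and it assumes the key is simply a UTC tick count with leading zeros, as in the sample entity.

```csharp
using System;
using System.Globalization;

public static class WadPartitionKeyReader
{
    // Assumption: the partition key is a UTC tick count padded with leading zeros.
    public static DateTime ToUtcDateTime(string partitionKey)
    {
        // long.Parse accepts the leading zero, so no trimming is needed.
        long ticks = long.Parse(partitionKey, CultureInfo.InvariantCulture);
        return new DateTime(ticks, DateTimeKind.Utc);
    }
}
```

For example, WadPartitionKeyReader.ToUtcDateTime("0635166144000000000") gives midnight UTC on 6 October 2013.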

How to find which partition my logs are in?

Just convert today's DateTime (midnight, in UTC) into its Ticks and prefix the number with "0". That will be the first partition where the logs generated today at midnight will be stored.
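The conversion can be sketched as a small helper. The names are hypothetical; it assumes keys are 19 characters long (a "0" followed by the 18-digit tick count) and aligned to the minute, which matches the one-minute precision described above.

```csharp
using System;
using System.Globalization;

public static class WadPartitionKey
{
    // Assumption: WADLogsTable keys are the UTC tick count, truncated to the
    // minute and zero-padded to 19 characters.
    public static string FromUtc(DateTime utc)
    {
        // Drop seconds and fractions so the key sits on a minute boundary.
        long ticks = utc.Ticks - (utc.Ticks % TimeSpan.TicksPerMinute);
        // D19 pads with zeros on the left, which produces the leading "0".
        return ticks.ToString("D19", CultureInfo.InvariantCulture);
    }
}
```

Calling WadPartitionKey.FromUtc(DateTime.UtcNow.Date) gives the first partition of the current UTC day. For midnight UTC on 6 October 2013 the result is 0635166144000000000.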

As you can see in the screen capture below, querying by partition key issued only three GET requests, and the entities were fetched in less than a minute.

Image 3

Code snippet:

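if (!String.IsNullOrEmpty(TableQuery))
{
    // Partition keys are "0" + UTC ticks; this fetches the last five minutes.
    // Use DateTime.UtcNow.Date.Ticks instead to start from today's midnight.
    entities = (from entity in tableServiceContext.CreateQuery<GenericEntity>(SelectedTableName)
                    .AddQueryOption("$filter", "PartitionKey ge '0" + DateTime.UtcNow.AddMinutes(-5).Ticks + "'")
                select entity);
}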
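The same idea extends to a closed time window, which is what you need before deleting, say, one hour of logs. The sketch below only builds the $filter string; the class and method names are made up for illustration, and it assumes the same 19-character partition key format as above.

```csharp
using System;
using System.Globalization;

public static class WadLogFilter
{
    // Assumption: partition keys are the UTC tick count zero-padded to 19 digits.
    private static string Key(DateTime utc)
    {
        return utc.Ticks.ToString("D19", CultureInfo.InvariantCulture);
    }

    // Builds a $filter that selects the partitions in [fromUtc, toUtc).
    public static string ForWindow(DateTime fromUtc, DateTime toUtc)
    {
        if (toUtc <= fromUtc)
            throw new ArgumentException("toUtc must be later than fromUtc.");

        return "PartitionKey ge '" + Key(fromUtc) + "' and PartitionKey lt '" + Key(toUtc) + "'";
    }
}
```

The returned string can be passed to AddQueryOption("$filter", ...) in place of the hand-built filter. Because both bounds are on PartitionKey, the query stays within the partitions for that window.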

I took the base code from CodePlex and improved the fetching and deleting algorithms.

Credits

It's worth mentioning Neudesic, who donated the base tool.

Also, the research done by the Cerebrata Cloud Storage Studio team has been immensely helpful.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)