Introduction
Do you work with your application's diagnostic logs, only to find that they take an eternity to show up?
Searching by partition key and row key does the trick, of course,
but what about the center of the storm, WADLogsTable? Doesn’t fetching
the last five minutes of logs take painfully longer as the number of logs keeps increasing? We’ll try to solve that problem here.
So, in this article we are working with WADLogsTable, where all the diagnostic data from the application ends up.
Viewing and deleting these logs is very time consuming,
isn’t it? In fact, depending on the number of logs written
by the application, deleting an hour of logs can itself take an hour.
First we have to fetch the logs for the specific time period, and then
issue a delete command for each entity. The image below shows a
sample entity in WADLogsTable.
Now, if I want to see all logs from today,
should I filter on the timestamp, which looks like the obvious choice? Let’s see what happens
when we query entities by timestamp.
Since Azure Table Service is derived from
WCF Data Services, all the query string filtering features are available
here as well. So, to fetch today’s logs, my request URL should look like this:
{your azure account root url}/WADLogsTable()?$filter=Timestamp ge datetime'2013-10-06T00:00:00'
Below is a screenshot from Fiddler, which
shows that querying by timestamp simply results in a complete table scan to
match the filter. You can see that multiple GET requests were triggered, each
hitting a separate partition key with the given filter (today’s logs).
Below is the code snippet used:
if (!String.IsNullOrEmpty(TableQuery))
{
    // Filtering on Timestamp forces the service to compare every entity in the table.
    entities = (from entity in tableServiceContext.CreateQuery<GenericEntity>(SelectedTableName)
                    .AddQueryOption("$filter", "Timestamp ge datetime'" + DateTime.Today.ToString("yyyy-MM-ddTHH:mm:ss", DateTimeFormatInfo.InvariantInfo) + "'")
                select entity);
}
And it took over 7 minutes to fetch the
day’s logs. This will vary with the amount of logging done by the
application, because the filter is compared against each entity in the table.
Now, a closer look at the partition
key reveals that the partition key in WADLogsTable is nothing but the
number of “ticks” of the time when the log was generated. It has one-minute
precision, so all logs generated within the same minute stay in the same partition. If we query our entities by partition, it’s
dramatically faster. Let us query today’s logs based on partitions.
How to find which partition my logs are in?
Just convert today’s DateTime into its
Ticks. That gives the first partition, where the logs generated today at
midnight are stored (the key carries a leading 0, as the snippet below shows).
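The conversion described above can be sketched as a small helper. This is only an illustration: the class and method names (WadPartitionKey, FromUtc) are made up for this example, the "0" prefix is copied from the query snippet in this article, and rounding down to the whole minute is an assumption based on the one-minute precision mentioned earlier. The caller is expected to pass a UTC time.

```csharp
using System;
using System.Globalization;

// Hypothetical helper, not part of any Azure library.
public static class WadPartitionKey
{
    // Builds a WADLogsTable-style partition key for the given UTC time.
    // Assumption: key = "0" + ticks, with ticks rounded down to the minute.
    public static string FromUtc(DateTime utc)
    {
        // Drop seconds and fractions so the key points at the start of the minute.
        long ticks = utc.Ticks - (utc.Ticks % TimeSpan.TicksPerMinute);

        // Invariant culture keeps the digits free of culture-specific formatting.
        return "0" + ticks.ToString(CultureInfo.InvariantCulture);
    }
}
```

For example, WadPartitionKey.FromUtc(new DateTime(2013, 10, 6, 0, 0, 0, DateTimeKind.Utc)) gives "0635166144000000000", which is the first partition for 6 October 2013.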
As you can see in the screen capture below,
querying by partition key issued only three GET requests, and the entities were
fetched in less than a minute.
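Code snippet:
if (!String.IsNullOrEmpty(TableQuery))
{
    // Partition keys are "0" + ticks; this fetches the last five minutes of logs.
    // Use DateTime.UtcNow.Date instead to start from today's midnight (UTC).
    entities = (from entity in tableServiceContext.CreateQuery<GenericEntity>(SelectedTableName)
                    .AddQueryOption("$filter", "PartitionKey ge '0" + DateTime.UtcNow.AddMinutes(-5).Ticks + "'")
                select entity);
}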
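When the goal is a specific time period (for example, the hour of logs to be deleted), the same idea extends to a filter with both a lower and an upper bound. The sketch below only builds the filter string. The names WadLogFilters and ForWindow are hypothetical, and the key format ("0" + ticks) is taken from the snippet above rather than from any official specification.

```csharp
using System;
using System.Globalization;

// Hypothetical helper that builds a $filter value for a UTC time window.
public static class WadLogFilters
{
    // Assumption: partition key = "0" + ticks, as in the article's snippet.
    private static string Key(DateTime utc)
    {
        return "0" + utc.Ticks.ToString(CultureInfo.InvariantCulture);
    }

    // Matches partitions in the half-open window [fromUtc, toUtc).
    public static string ForWindow(DateTime fromUtc, DateTime toUtc)
    {
        if (toUtc <= fromUtc)
        {
            throw new ArgumentException("toUtc must be later than fromUtc");
        }

        return "PartitionKey ge '" + Key(fromUtc) + "' and PartitionKey lt '" + Key(toUtc) + "'";
    }
}
```

The resulting string can be passed to AddQueryOption("$filter", ...) in place of the open-ended filter, so the query stops at the end of the window instead of running up to the newest partition.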
I have taken the base code from CodePlex and improved the fetching and deleting algorithms.
Credits
It’s worth mentioning Neudesic, who donated the base tool.
Also, the research done by the team behind Cerebrata Cloud Storage Studio has been immensely helpful.