Some Tips and Tricks for Azure Table Storage

21 Oct 2013, CPOL
Leveraging the .NET Framework to delete Azure entities faster

Introduction

So far, this was all about fetching the data quickly. Now let's walk through the options available for deleting data just as fast.

When we have a never-ending list of entities and want to delete most of them, going sequentially can be really expensive. That issues one HTTP request with the DELETE verb for each entity, and each delete request takes around 17 sec to respond successfully, so it could take more than an hour to delete an hour's logs.

What we can do is divide the whole list of entities into smaller lists, each holding the entities that belong to the same partition.

// Distinct partition keys present in the list (GenericEntityComparer compares entities by partition).
var partitions = entities.Distinct(new GenericEntityComparer()).Select(p => p.PartitionKey);
foreach (string partition in partitions)
{
    // All entities that share this partition key.
    var ThisPartitionEntities = entities.Where(en => en.PartitionKey == partition).ToList();
}
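As an alternative to filtering the list once per partition key, LINQ's GroupBy can do the split in a single pass. The following is only a minimal sketch, not the code used in this article: LogEntity and PartitionGrouping are hypothetical names, with LogEntity standing in for GenericEntity.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the article's GenericEntity.
public class LogEntity
{
    public string PartitionKey { get; private set; }
    public string RowKey { get; private set; }

    public LogEntity(string partitionKey, string rowKey)
    {
        PartitionKey = partitionKey;
        RowKey = rowKey;
    }
}

public static class PartitionGrouping
{
    // Returns one list per distinct PartitionKey, built in a single pass.
    // Groups appear in the order in which each key is first seen.
    public static List<List<LogEntity>> SplitByPartition(IEnumerable<LogEntity> entities)
    {
        return entities
            .GroupBy(e => e.PartitionKey)
            .Select(g => g.ToList())
            .ToList();
    }
}
```

Each inner list can then be chunked exactly like ThisPartitionEntities in the loop above.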

Then, split each of these partition-specific lists into chunks of at most 100 entities. Why 100? Because that's the upper limit on the number of operations allowed per batch, and every operation in a batch must target the same partition. Rules of the game.

var partitions = entities.Distinct(new GenericEntityComparer()).Select(p => p.PartitionKey);
// Start with an empty sequence so the loop needs no null check.
IEnumerable<IEnumerable<GenericEntity>> ChunksOfWork = Enumerable.Empty<IEnumerable<GenericEntity>>();
foreach (string partition in partitions)
{
    var ThisPartitionEntities = entities.Where(en => en.PartitionKey == partition).ToList();
    // Concat simply appends; Union would run an unnecessary duplicate check on the chunks.
    ChunksOfWork = ChunksOfWork.Concat(ThisPartitionEntities.Chunk(100));
}

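public static class IEnumerableExtension
{
    // Splits a sequence into consecutive chunks of at most chunksize items.
    // The source is enumerated only once, and each chunk is a materialized list.
    public static IEnumerable<IEnumerable<T>> Chunk<T>(this IEnumerable<T> source, int chunksize)
    {
        var chunk = new List<T>(chunksize);
        foreach (T item in source)
        {
            chunk.Add(item);
            if (chunk.Count == chunksize)
            {
                yield return chunk;
                chunk = new List<T>(chunksize);
            }
        }
        // Return the last, partially filled chunk, if any.
        if (chunk.Count > 0)
            yield return chunk;
    }
}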
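To get a feel for how much chunking reduces the number of requests, the batch count can be worked out per partition as the entity count divided by 100, rounded up. The sketch below is illustrative only; BatchMath and its members are hypothetical names, and the partition sizes in the note that follows are made-up sample numbers.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class BatchMath
{
    // Upper limit on operations per batch, as described in the text.
    public const int MaxBatchSize = 100;

    // Batches needed for one partition: entityCount / MaxBatchSize, rounded up.
    public static int BatchesForPartition(int entityCount)
    {
        return (entityCount + MaxBatchSize - 1) / MaxBatchSize;
    }

    // Total batch requests when every partition is chunked separately.
    public static int TotalBatches(IEnumerable<int> entitiesPerPartition)
    {
        return entitiesPerPartition.Sum(n => BatchesForPartition(n));
    }
}
```

For example, three partitions holding 250, 100 and 30 entities need 3 + 1 + 1 = 5 batch requests, compared with 380 requests when every entity is deleted on its own.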

Create a context, attach all the entities of a single chunk to it, mark each one for deletion, and send the delete requests as one batch.

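TableServiceContext tsContext = CreateTableServiceContext(tableClient);
foreach (GenericEntity entity in chunk)
{
    // The "*" ETag matches any version of the entity, so the delete is unconditional.
    tsContext.AttachTo(SelectedTableName, entity, "*");
    tsContext.DeleteObject(entity);
}
// Send all pending deletes of this chunk as a single batch request.
tsContext.SaveChangesWithRetries(SaveChangesOptions.Batch);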

To make it faster, we can send these batches in parallel using the .NET Framework's Parallel class. This works because each batch operates on a single partition and is independent of the other batches.

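// To force sequential execution (for example while debugging), uncomment these two lines
// and pass options as the second argument to Parallel.ForEach:
//   const bool forceNonParallel = true;
//   var options = new ParallelOptions { MaxDegreeOfParallelism = forceNonParallel ? 1 : -1 };
Parallel.ForEach(ChunksOfWork, chunk =>
{
    // Each chunk gets its own context, so no context is shared between threads.
    TableServiceContext tsContext = CreateTableServiceContext(tableClient);
    foreach (GenericEntity entity in chunk)
    {
        tsContext.AttachTo(SelectedTableName, entity, "*");
        tsContext.DeleteObject(entity);
    }
    tsContext.SaveChangesWithRetries(SaveChangesOptions.Batch);
});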
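The same pattern can be tried without touching storage at all. The sketch below is a hedged illustration rather than part of the original code: ParallelBatchRunner and its parameters are hypothetical names, and the deleteBatch delegate stands in for the attach, delete and save steps shown above. It also shows how a ParallelOptions instance caps the number of batches in flight.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelBatchRunner
{
    // Runs deleteBatch once per chunk, with at most maxParallel chunks in flight,
    // and returns how many chunks completed.
    // maxParallel must be -1 (no limit) or a positive number.
    public static int Run<T>(
        IEnumerable<IEnumerable<T>> chunks,
        Action<IEnumerable<T>> deleteBatch,
        int maxParallel)
    {
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };
        int completed = 0;
        Parallel.ForEach(chunks, options, chunk =>
        {
            // Stand-in for the per-chunk context + batch save.
            deleteBatch(chunk);
            // The counter is shared between threads, so update it atomically.
            Interlocked.Increment(ref completed);
        });
        return completed;
    }
}
```

If any deleteBatch call throws, Parallel.ForEach surfaces the failures as an AggregateException, so a real caller would want to catch that and decide which chunks to retry.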

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)