Understanding Block Blobs

21 Apr 2012
A tip about block blobs

Introduction

The Azure blob storage service provides two types of blobs:

  1. Block blob
  2. Page blob

In this tip, I focus on block blobs because they are so widely used. I'll go over the block blob concepts, followed by an example for clarification.

Background

Block vs Page Blob

A block blob lets you upload a large amount of data quickly. Why? Because the blocks can be uploaded in parallel and the size of each block can vary (up to 4 MB at the time of writing).
In contrast, a page blob is optimized for random access: you can specify an offset in the blob for reading or writing.
Each page in a page blob has a fixed size of 512 bytes, and reads and writes are aligned to page boundaries (so the storage service can optimize the access internally).
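
To make the block idea concrete, here is a minimal sketch of how a payload might be divided into blocks before uploading them in parallel. BlockPlanner, MakeBlockId and Plan are hypothetical names used only for illustration and are not part of the StorageClient library. The sketch assumes the 4 MB limit mentioned above and relies on two rules of the blob service: block IDs are Base64 strings, and all block IDs within one blob must have the same length.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration; not part of any Azure SDK.
public static class BlockPlanner
{
    // 4 MB, the per-block limit quoted in this tip (assumption: adjust for your service version).
    public const int MaxBlockSize = 4 * 1024 * 1024;

    // Encoding a fixed-width 4-byte integer always yields an 8-character Base64 string,
    // so every ID generated here has the same length.
    public static string MakeBlockId(int index)
    {
        return Convert.ToBase64String(BitConverter.GetBytes(index));
    }

    // Returns (blockId, offset, length) for every block needed to cover totalLength bytes.
    public static List<Tuple<string, long, int>> Plan(long totalLength, int blockSize = MaxBlockSize)
    {
        if (totalLength < 0) throw new ArgumentOutOfRangeException("totalLength");
        if (blockSize <= 0 || blockSize > MaxBlockSize) throw new ArgumentOutOfRangeException("blockSize");

        var plan = new List<Tuple<string, long, int>>();
        int index = 0;
        for (long offset = 0; offset < totalLength; offset += blockSize, index++)
        {
            // The last block may be shorter than blockSize.
            int length = (int)Math.Min(blockSize, totalLength - offset);
            plan.Add(Tuple.Create(MakeBlockId(index), offset, length));
        }
        return plan;
    }
}
```

For a 10 MB payload, Plan returns three entries: two 4 MB blocks and one 2 MB block. Each entry could then be handed to a separate upload task, since the blocks do not depend on one another.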

Block Blob Concepts

If you have gone over the Storage Service REST APIs or the Microsoft.WindowsAzure.StorageClient namespace, you have probably noticed that there isn't any way to delete a single block. Why?
To answer this question, let's first go over some concepts:

The blocks in the storage service are organized into 2 lists:

  1. Committed
  2. Uncommitted

Let's look at this from the Azure Storage Service perspective:

Any time you upload a block (by calling PutBlock) to the storage service, it goes directly to the Uncommitted block list.
Blocks on the Uncommitted list are kept by the storage service for up to one week; if they are not committed within that time (that is, included in the Committed list), they are discarded.
To commit blocks, call PutBlockList and specify the block IDs.

So how do I delete a block?
Simply omit the ID of the block to be deleted from the list when calling PutBlockList.

How do I update a block's content?
By using the same block ID concept as for deleting. Upload a new block that has the same ID as the old one, then call PutBlockList again (the blob service will take the latest block).
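
The commit, delete and update rules above can be illustrated with a small in-memory model. This is only a sketch of the semantics as described in this tip: BlockBlobModel is a hypothetical class, not an Azure API, and it simplifies the real service (for example, it discards leftover uncommitted blocks immediately instead of after a retention period).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;

// Hypothetical in-memory stand-in for one block blob; not part of any Azure SDK.
public class BlockBlobModel
{
    private readonly Dictionary<string, byte[]> uncommitted = new Dictionary<string, byte[]>();
    private readonly List<KeyValuePair<string, byte[]>> committed = new List<KeyValuePair<string, byte[]>>();

    // Like PutBlock: the block is only staged; the readable content does not change.
    public void PutBlock(string blockId, string text)
    {
        uncommitted[blockId] = Encoding.UTF8.GetBytes(text);
    }

    // Like PutBlockList: the committed list becomes exactly blockIds, in that order.
    // For each ID the latest upload wins: a staged block is preferred over the committed one.
    public void PutBlockList(IEnumerable<string> blockIds)
    {
        var next = new List<KeyValuePair<string, byte[]>>();
        foreach (string id in blockIds)
        {
            byte[] data;
            if (!uncommitted.TryGetValue(id, out data))
            {
                // Throws InvalidOperationException if the ID was never uploaded.
                data = committed.First(b => b.Key == id).Value;
            }
            next.Add(new KeyValuePair<string, byte[]>(id, data));
        }
        committed.Clear();
        committed.AddRange(next);
        uncommitted.Clear(); // simplification: unreferenced staged blocks are dropped at once
    }

    public string ReadAllText()
    {
        return string.Concat(committed.Select(b => Encoding.UTF8.GetString(b.Value)));
    }

    // Walks through commit, delete and update; returns the final blob content.
    public static string Demo()
    {
        var blob = new BlockBlobModel();
        blob.PutBlock("AAAA", "one ");
        blob.PutBlock("AAAB", "two ");
        blob.PutBlock("AAAC", "three");
        blob.PutBlockList(new[] { "AAAA", "AAAB", "AAAC" }); // content: "one two three"

        blob.PutBlockList(new[] { "AAAA", "AAAC" });         // delete "AAAB": "one three"

        blob.PutBlock("AAAA", "ONE ");                       // staged, not yet visible
        blob.PutBlockList(new[] { "AAAA", "AAAC" });         // update: "ONE three"
        return blob.ReadAllText();
    }
}
```

Note that the updated block only becomes visible after the second PutBlockList call; until then a reader of the model still sees the old content.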

Using the Code

The following simple example demonstrates how to delete and update blocks in a blob.
I used the Microsoft.WindowsAzure.StorageClient APIs.

C#
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text;
using Microsoft.WindowsAzure;
using Microsoft.WindowsAzure.StorageClient;

class Program
{
    static void Main()
    {
        const string storageName = "[Put_Storage__Name]";
        const string storageKey = "[Put_Private_Key]";
        CloudStorageAccount storage = CloudStorageAccount.Parse(String.Format(
            "DefaultEndpointsProtocol=https;AccountName={0};AccountKey={1}", storageName, storageKey));
        CloudBlobClient cloudBlobClient = storage.CreateCloudBlobClient();

        CloudBlockBlob blob = cloudBlobClient.GetBlockBlobReference("mycontainer/testfile.txt");

        // Delete blocks: keep only the IDs of the first 100 committed blocks
        List<string> committedBlocks = blob.DownloadBlockList(BlockListingFilter.Committed)
            .Take(100).Select(block => block.Name).ToList();
        // Note: blocks that are not in the list are removed from the blob
        // and garbage collected by the storage service
        blob.PutBlockList(committedBlocks);

        // Update the first block. This assumes the block was originally uploaded with this ID;
        // the ID must appear in committedBlocks for the new content to be committed.
        const string newBlockContent = "testblock";
        string blockID = Convert.ToBase64String(BitConverter.GetBytes(0));
        // Upload the content; the block stays uncommitted until PutBlockList is called
        using (var content = new MemoryStream(Encoding.UTF8.GetBytes(newBlockContent)))
        {
            blob.PutBlock(blockID, content, null);
        }
        // Commit the blocks
        blob.PutBlockList(committedBlocks);
    }
}

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)