Introduction
This is the second article on using the ATL OLE DB Consumer Templates on the Pocket PC. In the first article, I described how the ATL OLE DB Consumer Templates can be adapted and used on Pocket PC platforms. The article provided a simple example of managing data in a SQL Server CE 2.0 database, and is the foundation for the material provided here.
Simple examples are not what you find in everyday business. Apart from all the other shortcomings you might find in that article's code, one was flagrant: there was no provision for handling large data types, also known as blobs (binary large objects).
The Need for Large Data
We cannot just make them go away: we actually need blobs in our databases. Customers will require that their databases handle notes, pictures, or other arbitrary binary data. While you can think of other means of handling this, like a mixed scenario of database and file store, you will lose the flexibility and simplicity of replicating the data, especially when using Microsoft's data replication mechanisms. So, it's better to have the blobs in the database.
But there is another, less obvious reason for having to handle blobs: database limitations and data replication conversions. When you replicate a desktop database to SQL Server CE, some data conversions are enforced, namely ANSI-to-Unicode conversions and, most notably, the workaround for the nvarchar size limit of 255 characters. If the desktop database table has a column declared as, say, nvarchar(512), it will be automatically converted to ntext (a blob) on the Pocket PC. This applies to SQL Server CE 2.0, but may change in the future.
So, if we have to live with them, how do we manage blobs?
Managing Large Data
Typically, large data fields are managed by OLE DB through storage objects. Data is not presented to you in one complete block, as it is with the smaller data types. Instead, the OLE DB provider gives you a storage object interface pointer that you use to manage the large data. With this object, you can manipulate the blob data, reading it into your application's variables or writing from the application to the blob. Let's see how it is done.
Storage Object Types
SQL Server CE supports two different storage objects: ISequentialStream and ILockBytes. As its name suggests, the first storage object reads and writes blob data in a sequential fashion. Data is read and written in sequential chunks, allowing you to conserve program memory: you don't have to create a full blob image in memory in order to use it. The ILockBytes storage object allows you to do the same thing but uses a random access approach.
In this article, I will use only the ISequentialStream object. Using ILockBytes is left as an exercise for the reader.
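To give an idea of what sequential access looks like, here is a minimal sketch of draining a provider-created stream in fixed-size chunks. The pStream pointer and the chunk size are illustrative assumptions, not code from the sample:

// Read a blob in chunks so the whole value never has to fit in memory.
// pStream is assumed to be an ISequentialStream* obtained from the provider.
BYTE buffer[1024];
ULONG cbRead = 0;
HRESULT hr;
do
{
    hr = pStream->Read(buffer, sizeof(buffer), &cbRead);
    if (FAILED(hr))
        break;
    // ... process cbRead bytes of blob data here ...
} while (cbRead == sizeof(buffer));
pStream->Release();   // the consumer must release a provider-created stream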
Binding Blobs
Before you can read from and write to blobs, they have to be bound through an accessor, just like any ordinary column. The standard implementation of CDynamicAccessor will do this for you in the BindColumns method. The method tests for the presence of a blob using the following condition:
if (m_pColumnInfo[i].ulColumnSize > 1024 ||
    m_pColumnInfo[i].wType == DBTYPE_IUNKNOWN)
Generally, blobs will be reported as having a size much larger than 1024 bytes, so the type test is rarely the deciding factor. If this test is true, the method will create a DBOBJECT structure specifying that data will be managed through an ISequentialStream pointer.
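For reference, a blob binding at the raw OLE DB level looks roughly like the following sketch. The column ordinal and access mode are illustrative assumptions, and only the value part is bound to keep the sketch short:

DBOBJECT objBlob;
objBlob.dwFlags = STGM_READ;              // access mode requested from the provider
objBlob.iid = IID_ISequentialStream;      // interface the consumer wants back

DBBINDING binding = { 0 };
binding.iOrdinal = 1;                     // hypothetical: the blob column's ordinal
binding.wType = DBTYPE_IUNKNOWN;          // the buffer will hold an interface pointer
binding.pObject = &objBlob;               // request a storage object, not inline data
binding.dwPart = DBPART_VALUE;            // value part only, for brevity
binding.obValue = 0;                      // offset of the pointer in the accessor buffer
binding.cbMaxLen = sizeof(IUnknown*);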
This means that instead of holding a full and explicit representation of the data, as is the case for non-blob columns, the accessor buffer will store an ISequentialStream pointer. When reading data, this pointer is automatically created by the provider, leaving to the consumer the responsibility of releasing it. When writing data, the situation is the exact opposite: the consumer creates an ISequentialStream object and provides a pointer to it, and the provider uses this pointer to read the data and then releases it.
So, the blueprint for reading and writing blobs does seem to be quite straightforward. Instead of accessing a copy of the data directly, you go through a storage object pointer that is created either by the provider or by the consumer, depending on whether the data is being read or written. But, as usual, life is not so simple. This scheme does not work if we have to manage more than one blob per table or query. Why? Read on.
SQL Server CE 2.0 Provider Limitations
The above method for binding blobs does not work with the SQL Server CE 2.0 OLE DB provider because it is only able to serve one storage object at a time. So, if you use the default BindColumns behavior, you will not be able to bind more than one blob column per rowset. This limitation is revealed by the DBPROP_MULTIPLESTORAGEOBJECTS property which, on this particular provider, is read-only and set to false.
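You can confirm this at run time by reading the property from an open rowset. Here is a minimal sketch, assuming a pIRowsetInfo pointer obtained from the rowset:

DBPROPID propId = DBPROP_MULTIPLESTORAGEOBJECTS;
DBPROPIDSET propIdSet = { &propId, 1, DBPROPSET_ROWSET };
ULONG cPropSets = 0;
DBPROPSET* pPropSets = NULL;
HRESULT hr = pIRowsetInfo->GetProperties(1, &propIdSet, &cPropSets, &pPropSets);
if (SUCCEEDED(hr) && cPropSets)
{
    // On the SQL Server CE 2.0 provider this comes back VARIANT_FALSE.
    VARIANT_BOOL bMultiple = V_BOOL(&pPropSets[0].rgProperties[0].vValue);
    CoTaskMemFree(pPropSets[0].rgProperties);
    CoTaskMemFree(pPropSets);
}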
My first attempt at working around this problem was to bind the blob fields by reference. This method is used in the newer (7.1) version of the desktop ATL and does work with most providers (SQL Server 2000 and Jet 4.0 both support it). And you know what? This type of binding is not supported by SQL Server CE 2.0 either.
So, we need to work around this. After a little research, the solution became apparent, although not necessarily simple.
Using Multiple Accessors
The solution for binding multiple blobs per rowset is to use multiple accessors. If you read the code for CDynamicAccessor carefully, you will see that it uses only one accessor handle, regardless of the circumstances. My idea is to allow for more accessor handles per rowset, using a simple distribution: the first accessor binds all of the non-blob columns, and each subsequent accessor binds one blob column. So, for instance, if we have a rowset with five regular columns and two blob columns, we will have to use three accessor handles when binding.
These changes are easily achieved by changing the BindColumns behavior in the derived CSmartAccessor class. If you look at the code (included in the sample; too long to reproduce here), you will see two column-binding loops. The first binds all non-blob columns using the first accessor handle. The second binds each blob column using its own accessor handle. Note that none of the blob accessors is marked as an auto-accessor, meaning that when fetching a new row (using MoveNext, for instance), data for the blob columns will not be automatically retrieved. You will have to do it on a column-by-column basis, storing the blob data in application-provided memory.
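In outline, the binding logic amounts to something like this. This is a simplified sketch: IsBlob and BindColumnToAccessor are hypothetical helpers standing in for the size/type test shown earlier and for the DBBINDING bookkeeping done by the real sample code:

ULONG iAccessor = 0;
// First loop: every non-blob column goes to accessor 0 (the auto-accessor).
for (i = 0; i < m_nColumns; ++i)
    if (!IsBlob(i))
        BindColumnToAccessor(i, 0);
// Second loop: each blob column gets its own, non-auto accessor.
for (i = 0; i < m_nColumns; ++i)
    if (IsBlob(i))
        BindColumnToAccessor(i, ++iAccessor);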
So let’s see all of this in action.
Reading
Whenever the row position is moved, the underlying data is automatically fetched from the provider into the accessor buffer. This is the default implementation found in the Consumer Templates which, allow me to remind you, uses only one accessor handle for the whole row.
Our solution implies the use of extra accessor handles, one for each blob column, whose data will have to be loaded on demand due to the limitation of the SQL CE provider. So, while non-blob data will be immediately available after the row position is updated, blob data will have to be explicitly loaded through a storage object, and this object must be explicitly disposed before any other blob is read.
Now, this poses a very simple operational problem. In order to explicitly load a blob column's data, we have to know which accessor handle to use. The BindColumns method has done all of the accessor allocation work for us, so let us see how it was done and how we can find the accessor handle for any given blob column.
Accessor Allocation
Before the allocation process starts, the method counts the number of blob columns and accessor handles. Here is the code (please follow this discussion using the sample code provided with this article):
nblobs = 0;
for (i = 0; i < m_nColumns; ++i)
    if (m_pColumnInfo[i].ulColumnSize > m_nBlobSize ||
        m_pColumnInfo[i].wType == DBTYPE_IUNKNOWN)
        ++nblobs;
nAccessors = nblobs + 1;
Below this code, you can see that a new DBBINDING array is allocated just for the blob columns. After this, there is a loop that binds all non-blob columns to the first accessor handle. This code is pretty much what you can find in the default implementation in CDynamicAccessor.
The code that follows binds each blob column to its own accessor handle. Accessor handles are stored in an array whose index is kept in the bPrecision member of the DBCOLUMNINFO structure. This is where we come looking for the accessor handle index:
m_pColumnInfo[i].wType = DBTYPE_IUNKNOWN;
m_pColumnInfo[i].ulColumnSize = sizeof(IUnknown*);
m_pColumnInfo[i].bPrecision = ++iAccessor;
m_pColumnInfo[i].bScale = 0;
To make life easier for the developer, I have included two methods named GetBlobAccessor that return the accessor handle index for any given blob column. Seems complex? Not really; let us see a code sample.
Code Sample
The following code shows how a blob text field is loaded into a CString variable:
if (table.GetBlobAccessor(_T("Description"), &nAccessor))
{
    HRESULT hr;
    hr = table.GetData(nAccessor);
    if (FAILED(hr))
        return hr;
}
table.Get(_T("Description"), m_strDesc);
The storage object is released inside the Get method.
Inserting
When it comes to inserting new data or updating existing data, we face a new and interesting problem. In the previous discussion, I said that the storage objects used to transfer data around are created by the provider when data is read, and by the consumer when data is written. So we need to create a COM storage object, feed it the blob data, and hand its pointer to the provider. Not an easy task, right?
Thankfully, I had the help of a Microsoft sample to sort this one out.
The CBlobStream Class
When I was researching this article, I came across an MSDN sample named AOTBLOB. This small program shows how you can easily solve the writing problem using a helper class: CISSHelper. I adapted most of that code when I wrote the CBlobStream class, but added a little feature: the Release method now works as expected, deleting the object with one of my favorite lines of C++ code:
ULONG CBlobStream::Release()
{
    ULONG nRef = m_nRef;
    if (nRef)
    {
        nRef = --m_nRef;
        if (nRef == 0)
            delete this;  // self-destruct on the last release
    }
    return nRef;          // never touch members after "delete this"
}
If you look at the sample code, you will see that the reference counting mechanism is in place and the first increment is done right inside the constructor.
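From that description, the constructor and AddRef would look something like this. This is a sketch inferred from the text, not copied from the sample:

CBlobStream::CBlobStream() : m_nRef(1)   // first reference taken here
{
    // ... initialize the internal buffer members ...
}

ULONG CBlobStream::AddRef()
{
    return ++m_nRef;
}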
As you can see, the class itself derives from ISequentialStream, so you can actually feed an object pointer to the provider and it will work. The provider will even call Release for you, so you should be careful not to allocate such an object statically.
Now, let us see how the class works. The following code is taken from the CSmartAccessor class, namely from the blob part of the string version of _set_value:
IStream*     pStream;
DBSTATUS     dbStatus;
ULONG        nLength, nActual;
CBlobStream* pBlob = NULL;

// Release any storage object the provider may have left in the buffer.
dbStatus = _get_status(nColumn);
if (dbStatus == DBSTATUS_S_OK)
{
    pStream = *(IStream**)_GetDataPtr(nColumn);
    if (pStream)
        pStream->Release();
}

// Create a consumer-side stream and fill it with the string data.
nLength = wcslen(pszText) * sizeof(TCHAR);
pBlob = new CBlobStream;
if (pBlob)
{
    HRESULT hr;
    hr = pBlob->Write(pszText, nLength, &nActual);
    if (SUCCEEDED(hr))
    {
        // Store the pointer in the accessor buffer for the provider to use.
        *(CBlobStream**)_GetDataPtr(nColumn) = pBlob;
        _set_status(nColumn, DBSTATUS_S_OK);
        _set_length(nColumn, nLength);
        bOk = true;
    }
    else
    {
        _set_status(nColumn, DBSTATUS_S_ISNULL);
        _set_length(nColumn, 0);
        delete pBlob;
    }
}
First of all, we have to make sure that the storage object created by the provider is released. Note how its presence is tested (status and pointer).
Next, we create the CBlobStream object and populate it with the string data using the Write method. Finally, if that operation succeeded, we write the pointer into the accessor buffer, thus handing it over to the provider. Note that the object is neither deleted nor released, because that is the provider's responsibility.
One thing is missing, though. How do we insert a row with blobs?
The Insert Procedure
Inserting a row is a bit different when multiple blobs (and accessors) are used. Essentially, we have to use the Insert method on the first accessor (the one that binds the non-blob columns) and request the new row handle. Using this row handle, we can then set all blob columns. Let's see some code:
hr = table.Insert(0, true);
for (i = 1; i < table.GetNumAccessors() && SUCCEEDED(hr); ++i)
    hr = table.SetData(i);
It’s that easy.
Updating
I kept the best part for the end. Updating data does not require any changes to the code, because the SetData method uses all accessor handles. We are done.
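For illustration, an update ends up looking just like any other update. This sketch assumes the Set counterpart to the Get call shown earlier, with a hypothetical column value, and takes the article's word that this version of SetData walks every accessor handle:

table.Set(_T("Description"), _T("A new description"));  // goes through _set_value
hr = table.SetData();  // per the text, covers blob accessors as well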
Sample Application
The sample application included is an extension of the one presented in the first article. It now allows you to edit the categories table, where each row has two blobs: the description and the image.
During the development of this sample, some small changes were made to the atldbcli_ce.h file. These changes are described below.
Note: As I mentioned in the first article, this file is not made available here due to possible Microsoft copyright restrictions.
Code Changes
Open the atldbcli_ce.h file that you created using the instructions given in the first article, and look for CDynamicAccessor. In the member variable declarations, add:
ULONG m_nBlobSize;
Now, go to the constructor and add the following line:
m_nBlobSize = 1024;
Now, if you like, you can replace the occurrence of 1024 in the BindColumns method with m_nBlobSize, although that replacement is not a requirement. Adding the member variable, however, is necessary to make the new version of CSmartAccessor compile correctly.
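After both edits, the relevant fragment of CDynamicAccessor in atldbcli_ce.h should look roughly like this (surrounding members elided):

class CDynamicAccessor /* ... */
{
public:
    CDynamicAccessor()
    {
        // ... existing initialization ...
        m_nBlobSize = 1024;   // new: default blob size threshold
    }
    // ...
    ULONG m_nBlobSize;        // new member variable
};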