(untagged)

Implementation of a B-Tree Database Class

_oti

0.00/5 (No votes)

29 Jun 2006

An article and source code regarding the implmentation of B-Trees in C++.

Download demo project - 22.7 Kb

Introduction

This source code (as part of the demo project) creates and uses a BTree database. It implements the following features:

create database
insert/update record
delete record
search for keys (exact match or sort-of wildcard)
traversal
sequential forward and reverse record access
user-specified fixed size records and fixed size keys
custom comparison and traversal callbacks

Background

A while ago, I started developing a system that would allow people to chart (Australian) stock market movements nicely and simply, and in a user-friendly fashion. Obviously, a large part of the client side of the system would be the storage of stock movements on the local machine. People need to be able to look at their charts without having to download large amounts of data every time they open a window. Programmatically, it makes sense if these movements can be retrieved in date order, since plotting stock movements on a chart window requires the data to be presented in date order. I needed to be able to retrieve them rapidly, too. Bear in mind that with the data package available with the download, some securities date back to 1980.

After a lot of mucking around, I eventually decided to write database functionality that I could use (not GPL'd, not hideously expensive). See the Points of Interest section for a full discussion of this process.

Contents and structure of BTreeDB

Before getting into the tree itself, let's have a look at what makes a tree.

Smart pointers

If you don't know about smart pointers, now would be a really good time to check them out. Look here for the article I used as a base for my smart pointers. What they do is let you copy pointers around, manage their own reference counting, and ensure that objects are deleted when the last reference to the object is released or goes out of scope.

The objects internal to the database use smart pointers a lot because ..... well, it's just so darn convenient.

The DbObj class

A DbObj is a generic chunk of data. To create one, you invoke a constructor, providing a pointer to some data and the data size. There are a few other constructors that allow you to create DbObjs of various types without having to worry about sizes. These include constructors for std::strings, numbers, and other DbObjs. Note that these objects make a copy of your data, so you can't expect to create the DbObj, change your original data, and expect the changes to be reflected in the DbObj. Note also that the DbObj doesn't contain any information about the type of data stored.

The code deals with smart pointers to DbObj objects. You create DbObjs, pass them to the BTreeDB, and forget about them. Or, get one back from the BTreeDB and don't worry about explicitly deallocating it. The reference counting infrastructure takes care of it.

DbObj objects are used to specify data that is to be inserted into the BTreeDB, and also keys that are used to look up records in the BTreeDB.

In all the sample code below, I'm working with a record structure that is made up of the following:

const size_t keyLen = 12;
const size_t dataLen = 28;

struct SampleKey
{
    unsigned char key[keyLen];
};
struct SampleData
{
    unsigned char data[dataLen];
};
struct SampleRecord
{
    SampleKey key;
    SampleData data;
};

To create a DbObj containing objects like this, you would do the following:

SampleRecord sr;
memset(&sr.key, 'K', keyLen);
memset(&sr.data, 'D', dataLen);

DbObjPtr pRec = new DbObj(&sr, sizeof(SampleRecord));

The TreeNode class

A BTree contains a multi-level hierarchical collection of tree nodes (i.e., a tree). These nodes contain a series of records, and references to child nodes. The records in the node are stored as DbObjs as mentioned above, while the children are stored as a list of smart pointers to TreeNode objects.

These are the chunks that get read from and written to the disk. They aren't loaded until they are actually required.

Working with the BTreeDB

Creating a BTreeDB

To create a BTreeDB object, you invoke the constructor (duh!):

BTreeDB db(fileName, sizeof(SampleRecord), sizeof(SampleKey), minDegree);

The first parameter (fileName) is a const char* that points at a buffer containing the name of the file. If the file doesn't exist, it's created. If the file does exist, it's going to be used as a database file containing valid data. Note that the constructor doesn't open the file. It just initializes the data members. The second parameter is the size of the record (in bytes) to be stored in the database. The third parameter allows you to specify how many of these bytes are to be interpreted as the key. The fourth parameter is the tree's minimum degree, usually referred to in the text as t. A tree node can hold as many as 2t - 1 keys, but may hold no fewer than t.

Not shown above is a fifth parameter for the constructor. You can provide your own comparison function if your records require something other than a binary comparison when it comes to sorting. The default comparison is a simple memcmp using the first keyLength bytes of the two data objects being compared.

Opening a BTreeDB

db.open()

If the file given in the constructor doesn't exist, it is created and initialized to handle the record and key sizes provided in the constructor. If the file already exists, the values provided are compared with what is already in the file, and the database doesn't open if they disagree.

This method will return false if it fails to open the database for any reason.

Reasons for failure include:

database file locked by another process.
size and minDegree parameters not provided in the constructor and file doesn't yet exist.
inability to create the file if creation is required.
inability to read header information from the file if the file already exists.

Yes, I could have made the open function return an error code, but I didn't. Maybe later.

Inserting data into a BTreeDB

This is just a case of creating DbObjs that you want to insert, and then inserting them.

for (int i = 0; i < iterLimit; i++)
{
    SampleRecord sr;
    makeRecord(&sr);    // This just fills in the sample record

    
    DbObjPtr pObj = new DbObj(sr, sizeof(SampleRecord));
    db.put(pObj);
}

You'll have to make sure that you provide a record that is the correct size. The put method will return false if this is not the case.

This method is also used for updating existing records. Note that the BTreeDB doesn't allow you to insert multiple records that have the same key. If you "put" any record with the same key as an existing record, the existing one will be overwritten.

Retrieving data from a BTreeDB

Create a DbObj containing the key, and then ask for the record.

SampleKey sk;
memset(&sk, 'K', sizeof(SampleKey));

DbObjPtr pKey = new DbObj(&sk, sizeof(SampleKey));
DbObjPtr pRec;
if (db.get(pKey, pRec))
{
    SampleRecord sr;
    assert(pRec->getSize() == sizeof(SampleRecord));
    memcpy(&sr, pRec->getData(), pRec->getSize());
}

Removing data from a BTreeDB

For all the time it took to get this functionality right, using it is quite simple.

SampleKey sk;
memset(&sk, 'K', sizeof(SampleKey));

DbObjPtr pKey = new DbObj(&sk, sizeof(SampleKey));
db.del(pKey);

Other operations on a BTreeDB

There are examples in the code about how to traverse the BTreeDB, how to do "like" searches (where the search key is smaller than the actual record key size), moving forwards and backwards through the tree, and a few other things possibly unique to my situation.

BTreeDB wrappers

I use wrappers to provide structure to a given database. (I don't know where I found the Singleton class ... [have to insert link here]).

class HistoryDB
{
    static const int _dbMinDegree = 49;
    friend class Singleton<HistoryDB>;

public:
    struct HistoryKey
    {
        byte exchange;
        char security[10];
        ushort moveDate;
        byte padding[3];
    };

    struct HistoryItem
    {
        int openPrice; // Opening price in currency units 

                       // (implied by Exchange context)

        int highPrice; // High price in currency units 

                       // (implied by Exchange context)

        int lowPrice;  // Low price in currency units 

                       // (implied by Exchange context)

        int closePrice; // Closing price in currency units 

                        // (implied by Exchange context)

        llong tradeVolume; // Volume of trades for the trading day

        byte intraday;   // 0 = this is an EOD value, 

                         // 1 = intraday progressive value.

        byte padding[7];
    };

    struct HistoryRecord
    {
        struct HistoryKey key;
        struct HistoryItem item;
    };

private:
    static int _compareHR(const DbObjPtr& pObj1, 
                          const DbObjPtr& pObj2)
    {
        HistoryKey* k1 = (HistoryKey*)pObj1->getData();
        HistoryKey* k2 = (HistoryKey*)pObj2->getData();
        int ret = k1->exchange != k2->exchange;
        if (ret == 0)
        {
            ret = strcmp(k1->security, k2->security);
            if (ret == 0)
            {
                ret = k1->moveDate - k2->moveDate;
            }
        }
        return ret;
    }

private:
    HistoryDB(void)
    {
        GlobalData& gd = GlobalDataS::instance();
        string fileName = gd.getDataDir() + "history.db";
        _db = new BTreeDB(fileName,
            sizeof(HistoryRecord),
            sizeof(HistoryKey),
            _dbMinDegree,
            _compareHR);
        if (!_db->open())
        {
            _db = (BTreeDB*)0;
        }
    }

public:
    BTreeDBPtr& getDB() { return _db; }

private:
    BTreeDBPtr _db;
};
typedef Singleton<HistoryDB> HistoryDBS;

Given the class above, I can do the following:

HistoryDB& hdb = HistoryDBS::instance();
BTreeDBPtr pdb = hdb.getDB();
HistoryDB::HistoryRecord hr;
DBOBJVECTOR dbMoves;

memset(&hr, 0, sizeof(hr));
strcpy(hr.key.security, stockCode);
DbObjPtr pKey = new DbObj(&hr, sizeof(hr));
NodeKeyLocn locn = pdb->search(pKey);
if ((TreeNode*)locn.first != 0)
{
    DbObjPtr pRec;
    while (pdb->seq(locn, pRec))
    {
        memcpy(&hr, pRec->getData(), 
               sizeof(HistoryDB::HistoryRecord));
        if (0 != memcmp(pRec->getData(),
            pKey->getData(),
            sizeof(hr.key.exchange) + sizeof(hr.key.security)))
        {
            break;
        }
        dbMoves.push_back(pRec);
    }
}

Caveats

This is not thread-safe code ... I protect access to the database using my own mutex in my application.

You need to be aware of when you open and close databases, and when they are flushed to disk. There are aspects of this that I haven't fully explored, but I know that I should be releasing memory (TreeNodes) more frequently than I do. If this is not done, you'll end up with the entire database in memory. While it's really good for performance on small databases, it may not be appropriate for larger databases.

At present, it suits what I'm trying to do.

Points of interest

My first attempt at implementing a database simply had one file per security, with the data stored sequentially. For ANZ, there was a file named ANZ.fdb; for CBA, there was a file named CBA.fdb, and so on. That was fine, since there are less than a couple of thousand securities on the ASX (Australian Stock Exchange). As the project grew, and I started incorporating options, I had to start putting these in their own subdirectories named for the underlying security. That was starting to look really ugly. I imagined that I would run into even more trouble when it came to storing warrants and intraday movements for a stock.

I looked around for alternatives. MySQL has an embedded database engine that is really fast, but this would have cost a ridiculous amount of money for a royalty-free distribution license, or a determination that I would release my entire project as open source. Hmmm ... not yet. Sleepcat has Berkeley DB, which is a very fast and efficient embedded database engine, so I had a look at that. Again, a lot of money, or open source. They do have an old version (1.8.5) that is released under the BSD distribution license and completely free. I played around with that for a bit, but as various sites state, there are some well known bugs with it. I didn't really know enough about the project to fix them. Just FYI: if you have a need for an embedded database, and you meet the open source criteria (or personal use, blah blah blah...), go with one of these packages :-)

So I was stuck. I didn't want a bazillion files floating around in the user's file system, but I wanted the ability to add many records over time, and retrieve them quickly. I needed some sort of database package, couldn't find one, so eventually I decided to do it myself. Fine. So where to start? Back to my days of University, and the second year Data Structures and Algorithms class. The text for this class was Cormen, Leiserson, and Rivest's excellent Introduction to Algorithms, 2^nd Edition, MIT Press, Mass., 1990. Although we didn't study B-Trees at the time, I remember trying to implement one a bit later in an abortive attempt to break into the entertainment industry. Long story ... don't ask. This time, though, I was determined to make it work.

Oh ... why use a B-Tree? Well, they're kind of nice to use, since they allow storage of large amounts of data on disk, indexed by key, with records accessible in roughly O(log n) time where n is the number of nodes in the tree. There are many places you can find out more about B-Trees. I'm not inclined towards reproducing large slabs of the text here, but you could look here if you are interested. It includes the basic algorithms for creating, searching, and inserting records from Cormen, Leiserson, and Rivest.

Back to the task at hand, I knew that deletion would be a problem, since no-one publishes pseudo-code for the deletion of keys from a B-Tree. It appears to be a bit more complex than the other operations, and is generally "left as an exercise for the reader". Grr. So I researched other textbooks, and I searched the net. Eventually, I found a paper entitled "Implementing Deletion in B+-Trees" by Jan Jannick of Stanford University. To quote from the abstract: "There are published algorithms and pseudo-code for searching and inserting keys, but deletion, due to its greater complexity and perceived lesser importance, is glossed over completely or left as an exercise to the reader. To remedy this situation, we provide a well documented flowchart, algorithm, and pseudo-code for deletion, their relation to search and insertion algorithms, and a reference to a freely available, complete B+-tree library written in the C programming language." Yay! So I downloaded the code referred to in the paper (you can get them starting from here). Except that, the C code didn't compile in VS7, and the C++ code broke after a couple of insertions. Bugger.

So, I built the creation, insertion, and search code, etc., using the text's pseudo-code. Unable to find any deletion pseudo-code elsewhere, I decided that I would just have to bite the proverbial bullet and write the code based on the narrative provided. After a long time, everything worked. I decided that no-one should have to go through that again, so I'm making the code available for public distribution. This may frustrate many university lecturers since there now exists code that actually does this wretched delete, but somehow ... I struggle to care. Victory is mine!

Addendum: Everything didn't work. Close, but no cylinder smoking thing. Deletion still has a problem. There are three different deletion orders in the test code: ordinary alphabetic, reverse alphabetic, and something else. The "something else" will sometimes result in not all items being deleted. I don't have the time to look at this now. Maybe, when I have a product that is keeping me and my family in the manner to which we want to become accustomed ...

History

09-Jun-2004: Initial release.
12-Jul-2004: Included timestamps in demo, fixed reverse iteration code.
28-Jun-2006: Included fixes suggested by AdamWynne and steradrian.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here