Implementing a Document Database Engine using #liveDB in Less than 30 Minutes?

Robert Friberg

14 Dec 2011 | MIT | 3 min read

A prevalence engine supports two types of transactions: commands and queries. This is a very simple and flexible generalization that can be used for persistence in several different ways. One way is to define a domain-specific business model together with a set of commands, for example PlaceOrderCommand, ChangeCustomerAddressCommand and so on. This approach works very well with object-oriented domain modeling. One major benefit is that no mapping is required between the object-oriented model and a separate database model. The in-memory object-oriented model IS the database.

Sometimes we want a more generic way of representing and accessing data. This can be achieved by creating a database-oriented model and commands specific to that type of database. Most databases are modified using a small set of operations specific to the type of database used:

Relational databases have insert, update and delete commands
Key/value stores and document databases have put and delete operations
Graph databases have commands like CreateNode, SetProperty, CreateAssociation, etc.

Any of these can be implemented on top of a prevalent system architecture. Doing so is much simpler than building a database engine from scratch because persistence and concurrency are already taken care of. You also get the performance benefit of an in-memory database model.

As a proof of concept, we designed and implemented a barebones document-oriented database engine based on #liveDB in less than an hour. It's a bit unpolished and has just the essential features, but hopefully it serves as a good demonstration. Here’s the in-memory model, which derives from LiveDomain.Core.Model:

using System;
using System.Collections.Generic;
using LiveDomain.Core;

[Serializable]
public class DocumentModel : Model
{
    private readonly Dictionary<object,object> _documents;

    public DocumentModel()
    {
        _documents = new Dictionary<object, object>();
    }

    public T Get<T>(object key)
    {
        return (T) Get(key);
    }

    public object Get(object key)
    {
        // Single lookup; returns null when the key is absent
        object document;
        return _documents.TryGetValue(key, out document) ? document : null;
    }

    public void Put(object key, object document)
    {
        _documents[key] = document;
    }

    public bool Remove(object key)
    {
        return _documents.Remove(key);
    }

    public T ExecuteQuery<T>(Func<Dictionary<object, object>, T> query)
    {
        return query.Invoke(_documents);
    }
}

A document database contains a set of documents, each identified by a unique key. This example implementation uses a Dictionary<object,object> to hold the documents. A richer implementation might use separate collections for different document types, generate unique keys when documents are added, or read the key from an ID property instead of requiring you to supply it explicitly. Remember, though, that this is just a simple proof of concept for educational purposes.

Commands

Two commands are needed to manipulate the model: one to add or replace documents and one to remove them. Commands derive from LiveDomain.Core.Command and are written to the command journal when executed, providing a complete history of events and a way to restore the in-memory model when the prevalence engine is loaded. The Put command has no return value; the Remove command returns false if no such key was found. An additional command to update documents (apply an action) based on some predicate would be very useful, but that’s for later.

[Serializable]
public class PutCommand : Command<DocumentModel>
{
    public readonly object Key;
    public readonly object Document;

    public PutCommand(object key, object document)
    {
        Key = key;
        Document = document;
    }

    protected override void Execute(DocumentModel model)
    {
        model.Put(Key, Document);
    }
}

[Serializable]
public class RemoveCommand : CommandWithResult<DocumentModel, bool>
{
    public readonly object Key;
    public RemoveCommand(object key)
    {
        Key = key;
    }

    protected override bool Execute(DocumentModel model)
    {
        return model.Remove(Key);
    }
}
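
To make the journaling idea more concrete, here is a minimal sketch of how a prevalence engine might log commands and replay them to rebuild the model. It is not #liveDB's implementation: MiniEngine, IJournaledCommand, PutEntry and RemoveEntry are hypothetical stand-ins, and the journal is kept in a plain in-memory list where a real engine would write each command to durable storage.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a journaled command; not part of #liveDB.
public interface IJournaledCommand
{
    void Apply(Dictionary<object, object> model);
}

public class PutEntry : IJournaledCommand
{
    private readonly object _key;
    private readonly object _document;

    public PutEntry(object key, object document)
    {
        _key = key;
        _document = document;
    }

    public void Apply(Dictionary<object, object> model)
    {
        model[_key] = _document;
    }
}

public class RemoveEntry : IJournaledCommand
{
    private readonly object _key;

    public RemoveEntry(object key)
    {
        _key = key;
    }

    public void Apply(Dictionary<object, object> model)
    {
        model.Remove(_key);
    }
}

public class MiniEngine
{
    // Assumption: an in-memory journal; a real engine would persist it.
    private readonly List<IJournaledCommand> _journal;
    private readonly Dictionary<object, object> _model = new Dictionary<object, object>();

    public MiniEngine() : this(new List<IJournaledCommand>())
    {
    }

    // Restores state by replaying a copy of an existing journal in order.
    public MiniEngine(List<IJournaledCommand> journal)
    {
        _journal = new List<IJournaledCommand>(journal);
        foreach (IJournaledCommand command in _journal)
        {
            command.Apply(_model);
        }
    }

    public List<IJournaledCommand> Journal { get { return _journal; } }

    public int Count { get { return _model.Count; } }

    public void Execute(IJournaledCommand command)
    {
        _journal.Add(command);  // log the command first...
        command.Apply(_model);  // ...then change the in-memory model
    }

    public object Get(object key)
    {
        object document;
        return _model.TryGetValue(key, out document) ? document : null;
    }
}

public static class JournalDemo
{
    public static void Main()
    {
        MiniEngine engine = new MiniEngine();
        engine.Execute(new PutEntry("a", "first"));
        engine.Execute(new PutEntry("b", "second"));
        engine.Execute(new RemoveEntry("a"));

        // Simulate a restart: a new engine rebuilt purely from the journal
        MiniEngine restored = new MiniEngine(engine.Journal);
        Console.WriteLine(restored.Count);              // 1
        Console.WriteLine(restored.Get("b"));           // second
        Console.WriteLine(restored.Get("a") == null);   // True
    }
}
```

Because state is derived entirely from the ordered list of commands, every command must be deterministic and serializable, which is why the commands above are marked [Serializable].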

Querying

Querying is very simple: just pass a Func<> that takes the dictionary and returns whatever you like. Querying by anything other than the key runs in time linear in the number of documents, or O(N). This won’t scale well, but you can handle a surprising number of documents before performance becomes an issue. Adding support for indexing would improve this to O(log N) or O(1), using binary trees or dictionaries respectively.
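
As a rough illustration of the indexing idea, the sketch below keeps a secondary dictionary that maps an indexed value to the keys of the matching documents. Lookups by that value become O(1) on average, at the cost of extra bookkeeping on every put and remove. IndexedStore and its indexedValue delegate are hypothetical names, not part of #liveDB or the model above, and the sketch assumes the delegate never returns null and that documents are not mutated after being stored.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical store with one dictionary-based secondary index.
public class IndexedStore<TDoc>
{
    private readonly Dictionary<object, TDoc> _documents = new Dictionary<object, TDoc>();
    private readonly Dictionary<string, HashSet<object>> _index = new Dictionary<string, HashSet<object>>();
    private readonly Func<TDoc, string> _indexedValue;

    // indexedValue extracts the value to index; assumed to be non-null.
    public IndexedStore(Func<TDoc, string> indexedValue)
    {
        _indexedValue = indexedValue;
    }

    public void Put(object key, TDoc document)
    {
        Remove(key); // drop the index entry of any document being replaced
        _documents[key] = document;

        string value = _indexedValue(document);
        HashSet<object> keys;
        if (!_index.TryGetValue(value, out keys))
        {
            keys = new HashSet<object>();
            _index[value] = keys;
        }
        keys.Add(key);
    }

    public bool Remove(object key)
    {
        TDoc existing;
        if (!_documents.TryGetValue(key, out existing)) return false;

        _documents.Remove(key);
        string value = _indexedValue(existing);
        HashSet<object> keys = _index[value];
        keys.Remove(key);
        if (keys.Count == 0) _index.Remove(value); // keep the index compact
        return true;
    }

    // Average O(1) to find the key set, then O(k) to collect k matches.
    public List<TDoc> FindByIndexedValue(string value)
    {
        List<TDoc> result = new List<TDoc>();
        HashSet<object> keys;
        if (_index.TryGetValue(value, out keys))
        {
            foreach (object key in keys) result.Add(_documents[key]);
        }
        return result;
    }
}

public static class IndexDemo
{
    public static void Main()
    {
        // Documents are (name, city) tuples indexed by city
        IndexedStore<Tuple<string, string>> store =
            new IndexedStore<Tuple<string, string>>(doc => doc.Item2);

        store.Put(1, Tuple.Create("Rob", "Oceanside"));
        store.Put(2, Tuple.Create("Ann", "Malmo"));
        store.Put(3, Tuple.Create("Bo", "Oceanside"));
        Console.WriteLine(store.FindByIndexedValue("Oceanside").Count); // 2

        store.Put(3, Tuple.Create("Bo", "Malmo")); // replacing moves the index entry
        Console.WriteLine(store.FindByIndexedValue("Oceanside").Count); // 1
        Console.WriteLine(store.FindByIndexedValue("Malmo").Count);     // 2
    }
}
```

Swapping the inner Dictionary for a SortedDictionary would give the O(log N) variant and make range queries possible.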

Wrapping It Up

Finally, we need a database engine class that encapsulates the prevalence engine and the commands, providing a simple document database interface.

public class DocumentDatabaseEngine
{
    private readonly Engine<DocumentModel> _engine;

    public DocumentDatabaseEngine(string location)
    {
        _engine = Engine.LoadOrCreate<DocumentModel>(location);
    }

    public DocumentDatabaseEngine() : this(String.Empty)
    {
    }

    public object Get(object key)
    {
        return _engine.Execute(db => db.Get(key));
    }

    public T Get<T>(object key)
    {
        return _engine.Execute(db => db.Get<T>(key));
    }

    public T Get<T>(Func<Dictionary<object, object>, T> query)
    {
        return _engine.Execute(db => db.ExecuteQuery(query));
    }

    public void Put(object key, object document)
    {
        var command = new PutCommand(key, document);
        _engine.Execute(command);
    }

    public bool Remove(object key)
    {
        var command = new RemoveCommand(key);
        return _engine.Execute(command);
    }
}

Start Your Engines!

Now we can do stuff like this:

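//Needs: using System; using System.Collections.Generic; using System.Linq;
public static int CountCustomersQuery(Dictionary<object,object> docs)
{
    //Filter the values; enumerating the dictionary itself yields key/value pairs
    return docs.Values.OfType<Customer>().Count();
}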

public static void Main(string[] args)
{
    //Loads or creates a new database at the default location
    DocumentDatabaseEngine db = new DocumentDatabaseEngine();

    //Create a document
    Customer customer = new Customer() {Name = "Rad Robs Surf Shop"};
    customer.Addresses.Add(
        new Address
        {
            Street = "404 Ocean Drive", 
            Zipcode = "92051", 
            City = "Oceanside", 
            State = "CA"
        });

    //Write it to the database
    db.Put(customer.Id, customer);

    //Query by id
    Customer c2 = db.Get<Customer>(customer.Id);
    Console.WriteLine(c2.Id + ": " + c2.Name);

    //Query using ad-hoc lambda
    foreach (var c in db.Get(docs => docs.Values.OfType<Customer>().ToArray()))
    {
        Console.WriteLine(c.Id + ": " + c.Name);    
    }

    //Query by passing a method group, which is implicitly converted to a delegate
    int results = db.Get(CountCustomersQuery);
    Console.WriteLine("Customers: " + results);

    Console.ReadLine();
}
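
The Customer and Address classes are not shown in this article. A minimal sketch that would let the example above compile might look like the following. The Guid-based Id and the exact set of properties are assumptions inferred from how Main uses them. The [Serializable] attribute is included because each document travels inside a PutCommand, which is itself marked [Serializable] and written to the journal.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical document types inferred from the usage in Main.
[Serializable]
public class Address
{
    public string Street { get; set; }
    public string Zipcode { get; set; }
    public string City { get; set; }
    public string State { get; set; }
}

[Serializable]
public class Customer
{
    public Customer()
    {
        Id = Guid.NewGuid();              // assumption: the client generates the key
        Addresses = new List<Address>();  // never null, so Add() is always safe
    }

    public Guid Id { get; private set; }
    public string Name { get; set; }
    public List<Address> Addresses { get; private set; }
}
```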

Document-Oriented vs. Domain-Specific Model

With domain-specific, task-oriented commands, the logic is executed directly within the model under an exclusive lock. Each command ensures the model is valid before altering its state. The command is also the implicit unit of work. There is no real need for explicit transactions that group multiple commands, as this can instead be achieved using a composite command pattern.
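
The composite command idea can be sketched as a command that holds a list of child commands and runs them in order inside a single Execute call, so that the whole group is applied as one unit. The IModelCommand interface, CompositeCommand and AdjustBalance below are hypothetical and only mirror the shape of the commands shown earlier; a real #liveDB version would derive from the library's own command base class. Note that this sketch does not undo earlier children if a later one throws.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical command abstraction; not the #liveDB base class.
public interface IModelCommand<TModel>
{
    void Execute(TModel model);
}

// Runs its children in order as a single unit of work.
public class CompositeCommand<TModel> : IModelCommand<TModel>
{
    private readonly List<IModelCommand<TModel>> _children;

    public CompositeCommand(params IModelCommand<TModel>[] children)
    {
        _children = new List<IModelCommand<TModel>>(children);
    }

    public void Execute(TModel model)
    {
        foreach (IModelCommand<TModel> child in _children)
        {
            child.Execute(model);
        }
    }
}

// Example child command: adds a (possibly negative) amount to an account.
public class AdjustBalance : IModelCommand<Dictionary<string, int>>
{
    private readonly string _account;
    private readonly int _delta;

    public AdjustBalance(string account, int delta)
    {
        _account = account;
        _delta = delta;
    }

    public void Execute(Dictionary<string, int> balances)
    {
        int current;
        balances.TryGetValue(_account, out current); // stays 0 if the account is new
        balances[_account] = current + _delta;
    }
}

public static class CompositeDemo
{
    public static void Main()
    {
        Dictionary<string, int> balances = new Dictionary<string, int>();
        balances["a"] = 100;

        // A transfer is two adjustments executed as one command
        IModelCommand<Dictionary<string, int>> transfer =
            new CompositeCommand<Dictionary<string, int>>(
                new AdjustBalance("a", -30),
                new AdjustBalance("b", 30));

        transfer.Execute(balances);
        Console.WriteLine(balances["a"] + " / " + balances["b"]); // 70 / 30
    }
}
```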

With a document-oriented approach, documents are pulled from the database, manipulated and then saved back. Concurrency then becomes an issue: two clients can modify the same document, and the last one to save wins. This can be remedied by adding support for optimistic concurrency. Also, because the operations are CRUD-based (put, remove), a business transaction will usually be composed of multiple operations, so support for explicit transactions is probably needed.
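
One way optimistic concurrency could be added is to store a version number alongside each document and reject a put whose expected version no longer matches the stored one. The sketch below is an assumption about how that might look, not a #liveDB feature: VersionedStore, GetVersion and TryPut are hypothetical names. In the engine above, the version check would have to run inside a command's Execute method so that it happens under the model's exclusive lock.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical store that tracks a version per document.
public class VersionedStore
{
    private class Entry
    {
        public object Document;
        public long Version;
    }

    private readonly Dictionary<object, Entry> _entries = new Dictionary<object, Entry>();

    public object Get(object key)
    {
        Entry entry;
        return _entries.TryGetValue(key, out entry) ? entry.Document : null;
    }

    // Version 0 means "no such document yet".
    public long GetVersion(object key)
    {
        Entry entry;
        return _entries.TryGetValue(key, out entry) ? entry.Version : 0;
    }

    // Writes only if the caller saw the latest version; otherwise returns false.
    public bool TryPut(object key, object document, long expectedVersion)
    {
        Entry entry;
        long current = _entries.TryGetValue(key, out entry) ? entry.Version : 0;
        if (current != expectedVersion) return false; // someone else saved first

        if (entry == null)
        {
            entry = new Entry();
            _entries[key] = entry;
        }
        entry.Document = document;
        entry.Version = current + 1;
        return true;
    }
}

public static class ConcurrencyDemo
{
    public static void Main()
    {
        VersionedStore store = new VersionedStore();
        store.TryPut("c1", "original", 0);

        // Two clients read the same version...
        long seenByA = store.GetVersion("c1");
        long seenByB = store.GetVersion("c1");

        // ...the first save succeeds, the second is rejected instead of overwriting
        Console.WriteLine(store.TryPut("c1", "edited by A", seenByA)); // True
        Console.WriteLine(store.TryPut("c1", "edited by B", seenByB)); // False
        Console.WriteLine(store.Get("c1"));                            // edited by A
    }
}
```

A rejected client would typically reload the document, reapply its change and try again.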

Thank you for stopping by. Feel free to leave a comment, send an email or follow me on Twitter (@robertfriberg). Any feedback is much appreciated!

License

This article, along with any associated source code and files, is licensed under The MIT License