C++ Object Relational Mapping (ORM)- Eating the Bun - Part 1 of N

20 Apr 2019
Creating a simple ORM for C++ on-top of SQL database

Introduction

Object Relational Mapping (ORM) is the process of mapping data types between an object-oriented language such as C++ and a relational type system such as SQL. So what is the challenge? C++ has several primitive types, such as int, char, float, and double, along with variations of each, and mapping all of them to SQL types is a real challenge. There may or may not be an SQL type that exactly matches a given C++ type. For float, for example, C++ and SQL may follow different standards. There are different tools to do this job, and there are a lot of mature libraries available too. ODB is one that is really nice.
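To make the mapping problem concrete, here is a minimal sketch of one way to pick an SQL column type from a C++ type at compile time. It is not Bun's implementation: SqlType and column_def are hypothetical names, and the INTEGER/REAL/TEXT targets are simply the SQLite column types that appear in the DDL later in this article.

```cpp
#include <string>

// Hypothetical trait, not part of Bun. The primary template is left
// undefined on purpose: using an unmapped C++ type fails to compile.
template <typename T> struct SqlType;

template <> struct SqlType<int>         { static const char* name() { return "INTEGER"; } };
template <> struct SqlType<long>        { static const char* name() { return "INTEGER"; } };
// float and double both land on REAL, so the schema alone no longer
// records which C++ type the value came from.
template <> struct SqlType<float>       { static const char* name() { return "REAL"; } };
template <> struct SqlType<double>      { static const char* name() { return "REAL"; } };
template <> struct SqlType<std::string> { static const char* name() { return "TEXT"; } };

// Builds one column definition, for example "age INTEGER".
template <typename T>
std::string column_def(const std::string& column) {
    return column + " " + SqlType<T>::name();
}
```

With these specializations, column_def<float>("height") yields the text "height REAL", while a call such as column_def<char*>("x") is rejected by the compiler because no mapping exists.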

To help me in my daily work, I have created a simple C++ library called Bun.

What's New?

  • Bun 1.5.0 can convert an object containing a vector to JSON and Msgpack, and create an object from JSON containing a vector. NOTE: It does not include vector persistence yet. It's still in development.
  • Bun 1.4.0 has support for converting objects to JSON and creating objects from JSON. It can also convert objects to Message Pack and construct objects from Message Pack.
  • Bun 1.3 has support for lazy object iteration and range-based for loops. The same is supported for the key-value store too.
  • Bun 1.2 has support for an embedded key-value store. By default, the key-value store is based on Unqlite.

Features

  • Easy to Use
  • Work with Plain Old C++ Objects (POCO)
  • Object persistence - You can persist C++ objects directly
  • Not intrusive - You do not have to modify the classes to make it persistent
  • Constraint Specification in plain C++
  • Persist Nested Objects
  • EDSL Object Query Language (No SQL Query needed)
  • Compile time EDSL syntax check for type safety - Catch bugs before the execution starts
  • Multiple database support - SQLite, Postgres, MySQL
  • Easy to use embedded key-value store
  • Convert C++ objects to JSON and create C++ objects from JSON.
  • Convert C++ objects to Message Pack and create C++ objects from Message Pack.
  • STL Friendly. These are regular C++ objects. So can be used in C++ STL algorithms.

Who is using Bun?

This section describes who is using Bun and in what context. If you find Bun useful and use it, please let me know and I will add it here.

  1. A little adventure with PI.

Background

In a lot of my tool applications, I use SQLite as the primary database. Every time I write SQL queries, I feel like I am wasting a lot of energy on a task that is not really related to my actual use case. So I thought of creating a framework for the automated mapping of these types. The criteria for the library are as follows:

  1. Free to use for any kind of project (BSD License)
  2. Easy to use (no SQL query knowledge needed)
  3. Provide constraints, like unique key constraints, for fields
  4. No SQL queries needed. EDSL query.
  5. Non intrusive
  6. Expressive
  7. Should be a DSL for C++ so the query syntax can be checked by the C++ compiler
  8. No custom compiler needed (C++11 and above)
  9. Performant
  10. Support for multiple database backends like SQLite, Postgres, MySQL
  11. Easy embedded key-value Store

Not all of these have been met so far. I will address them over time. Currently, only a basic version of the library has been developed.

Using the Code

Bun Object Store Interface

Before we get into the gory details of the internals, in this first article, let's see how to use the library.

Bun has a BSD 3-Clause License. It depends on the following opensource and free libraries:

  1. boost (I have tested on 1.61 version, Boost License)
  2. fmt (Small, safe and fast formatting library, BSD License)
  3. spdlog (Fast C++ logging, MIT License)
  4. SQLite (Self-contained, serverless, zero-configuration, transactional SQL database engine, Public domain)
  5. SOCI (C++ database layer, BSL License)
  6. JSON for modern C++ (C++ JSON and Message pack utility, MIT License)
  7. Rapid JSON (Fast C++ JSON library, See License)

The GitHub page contains all the dependencies needed. It also contains a Visual Studio 2015 solution file for ease of use. Boost and SOCI are not included. After downloading the project, put the boost headers under the "include" directory or change the include path in the solution file. Build SOCI (very easy to build using cmake) and link the libraries with Bun.

#include <iostream>
#include <string>

#include "blib/bun/bun.hpp"

namespace test {
  // Class that needs to be persisted
  struct Person {
    std::string name;
    std::string uname;
    int age;
    float height;
  };
}

/// @class Child 
struct Child {
    int cf1;
    Child(const int cf = -1) : cf1(cf) {}
    Child& operator=(const int i) {
        cf1 = i;
        return *this;
    }
};

/// @class Parent
struct Parent {
    int f1;
    std::string f2;
    // Nested object
    Child f3;
    Parent() :f1(-1), f2("-1"), f3(-1) {}
};

// Both should be persistable
SPECIALIZE_BUN_HELPER((Child, cf1));
SPECIALIZE_BUN_HELPER((Parent, f1, f2, f3));

/////////////////////////////////////////////////
/// Generate the database bindings at compile time.
/////////////////////////////////////////////////
SPECIALIZE_BUN_HELPER( (test::Person, name, uname, age, height) );

int main() {
  namespace bun = blib::bun;
  namespace query = blib::bun::query;

  // Connect the db. If the db is not there it will be created.
  // It should include the whole path
  // For SQLite
  //bun::connect( "objects.db" );
  // For PostGres
  bun::connect("postgresql://localhost/postgres?user=postgres&password=postgres");
  // Get the fields of the Person. This will be useful in specifying constraints and also
  // querying the object.
  using PersonFields = query::F<test::Person>;

  // Generate the configuration. By default it does nothing.
  blib::bun::Configuration<test::Person> person_config;
  // This is a unique key constraints that is applied.
  // Constraint are applied globally. They need to be set before the
  // execution of the create schema statement
  // The syntax is Field name = Constraint
  // We can club multiple Constraints as below in the same statement.
  // There is no need for multiple set's to be called. This is how
  // We can chain different constraints in the same statement
  person_config.set(PersonFields::name = blib::bun::unique_constraint)
                   (PersonFields::uname = blib::bun::unique_constraint);
  
  // Create the schema. We can create the schema multiple times. If its already created
  // it will be safely ignored. The constraints are applied to the table.
  // Adding constraints don't have effect if the table is already created
  bun::createSchema<test::Person>();
  
  // Start transaction
  bun::Transaction t;
  // Create some entries in the database
  for (int i = 1; i < 1000; ++i) {
    // PRef is a reference to the persistent object.
    // PRef keeps ownership of the memory and releases it when it is destroyed.
    // Internally it holds the object in a unique_ptr.
    // PRef also has an oid associated with the object.
    bun::PRef<test::Person> p = new test::Person;

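    // Assign the members values
    p->age = i + 10;
    p->height = 5.6;
    p->name = fmt::format( "Brainless_{}", i );
    // uname carries a unique constraint, so give each object a distinct value
    p->uname = fmt::format( "brainless_user_{}", i );
    // Persist the object and get an oid for the persisted object.
    const bun::SimpleOID oid = p.persist();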

    //Getting the object from db using oid.
    bun::PRef<test::Person> p1( oid );
  }
  // Commit the transaction
  t.commit();

  // To get the oids of all persisted objects of a particular type.
  // person_oids is a std::vector of oids (bun::SimpleOID)
  const auto person_oids = bun::getAllOids<test::Person>();

  // To get the objects of a particular type
  // std::vector<blib::bun::PRef<test::Person>>
  const auto person_objs = bun::getAllObjects<test::Person>();

  // EDSL QUERY LANGUAGE ----------------------
  // Powerful EDSL object query syntax that is checked at compile time.
  // An invalid query fails to compile with the message "Syntax error in Bun Query"
  using FromPerson = query::From<test::Person>;
  FromPerson fromPerson;
  // The grammar is checked for valid syntax at compile time itself.
  // Currently only &&, ||, <, <=, >, >=, ==, != are supported, with their usual meanings.
  // Below is a valid query
  auto valid_query = PersonFields::age > 10 && PersonFields::name != "Brainless_0";
  std::cout << "Valid Grammar?: " << query::IsValidQuery<decltype(valid_query)>::value << std::endl;

  // Oops, + is not valid in the query grammar
  auto invalid_query = PersonFields::age + 10 &&
                       PersonFields::name != "Brainless_0";
  std::cout << "Valid Grammar?: "
            << query::IsValidQuery<decltype(invalid_query)>::value << std::endl;

  // Now let us execute the query.
  // The where function also checks for the validity of the query, and fails at compile time
  const auto objs = fromPerson.where( valid_query ).where( valid_query ).objects();
  // We can also build the query separately or write it inline.
  // As you can see, conditions can be joined with &&.
  const auto q = PersonFields::age > 21 && PersonFields::name == "test";
  const auto objs_again = FromPerson().where( q ).objects();
  const auto objs_again_q = FromPerson().where( PersonFields::age > 21
                                                && PersonFields::name == "test" ).objects();
  // The line below does not compile if you enable it.
  // You will get the "Syntax error in Bun Query" compile time message.
  // const auto objs1 = fromPerson.where( invalid_query ).objects();

  // Check the query generated. It does not give the sql query.
  std::cout << fromPerson.query() << std::endl;

  // Support for Nested object persistence and retrieval
  blib::bun::createSchema<Child>();
  blib::bun::createSchema<Parent>();
  std::cout << "How many objects to insert? " << std::endl;
  int count = 0;
  std::cin >> count;
  for (int i = 0; i < count; ++i) {
    blib::bun::l().info("===============Start===================");
    blib::bun::PRef<Parent> p = new Parent;
    p->f1 = i;
    p->f2 = i % 2 ? "Delete Me" : "Do not Delete Me";
    p->f3 = 10 * i;
    // Persists the Parent and the Nested Child
    p.persist();
    std::cout << "Added to db: \n" << p.toJson() << std::endl;
    blib::bun::l().info("===============End===================\n");
  }

  std::cout << "Get all objects and show" << std::endl;
  auto parents = blib::bun::getAllObjects<Parent>();
  // Iterate and delete the Parent and the nested Child
  // Here p is a PRef type. We can modify the object and persist
  // the changes if needed.
  for (auto p : parents) {
    std::cout << p.toJson() << std::endl;
    p.del();
  }

  return 0;
}

So this is how we persist objects. Running this program creates the tables and rows in the database.

Now let's have a deeper look at a few elements here. The DDL for the schema is as follows:

CREATE TABLE "test::Person" (object_id INTEGER NOT NULL, name TEXT, age INTEGER, height REAL);

This schema is created internally by the library. I am just showing it here for reference.

The data is as follows:

Persistent Store

oid             name         age  height
90023498019372  Brainless_1  11   5.6
90023527619226  Brainless_2  12   5.6
90023537497149  Brainless_3  13   5.6
90023553459526  Brainless_4  14   5.6
90023562946990  Brainless_5  15   5.6

Range Based iteration

Bun also supports iterating over objects using the C++ range-based for loop. The following simple example shows how this works.

    // Iterate the parent with range based for loop
    using FromParents = query::From<Parent>;
    using ParentFields = query::F<Parent>;
    FromParents from_parents;
    // Select the query which you want to execute
    auto parents_where = from_parents.where(ParentFields::f2 == "Delete Me");
    // Fetch the objects satisfying the query. This is a lazy fetch: objects are
    // fetched only as the loop asks for them, not all at once.
    // Here v is a PRef so it can be used to modify and persist the object.
    for(auto v : parents_where) {
        std::cout << v.toJson() << std::endl;
    }
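The idea of a lazy fetch can be illustrated without a database. The sketch below is only an assumption about the general technique, not Bun's code: LazyRange, FakeTable, and select_all are hypothetical names, and the "table" is an in-memory vector that counts how many rows were actually handed out.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical lazy range: each element is produced by calling `fetch`
// only when the loop advances to it.
template <typename T>
class LazyRange {
public:
    // fetch(out) fills `out` and returns true, or returns false when exhausted.
    using Fetch = std::function<bool(T&)>;

    class Iterator {
    public:
        Iterator() : fetch_(nullptr), done_(true) {}
        explicit Iterator(const Fetch* fetch) : fetch_(fetch), done_(false) { advance(); }

        const T& operator*() const { return current_; }
        Iterator& operator++() { advance(); return *this; }
        // Only "end reached or not" is compared, which is all a range-for needs.
        bool operator!=(const Iterator& other) const { return done_ != other.done_; }

    private:
        void advance() { if (!(*fetch_)(current_)) done_ = true; }
        const Fetch* fetch_;
        T current_{};
        bool done_;
    };

    explicit LazyRange(Fetch fetch) : fetch_(std::move(fetch)) {}
    Iterator begin() const { return Iterator(&fetch_); }
    Iterator end() const { return Iterator(); }

private:
    Fetch fetch_;
};

// Simulated table: rows are handed out one at a time and each fetch is counted.
struct FakeTable {
    std::vector<std::string> rows;
    std::size_t next = 0;
    int fetches = 0;  // number of rows actually read so far
};

// Creating the range reads nothing; the table must outlive the range.
LazyRange<std::string> select_all(FakeTable& table) {
    return LazyRange<std::string>([&table](std::string& out) -> bool {
        if (table.next >= table.rows.size()) return false;
        out = table.rows[table.next++];
        ++table.fetches;
        return true;
    });
}
```

If a loop over select_all(table) breaks after the second row, table.fetches is 2 even when the table holds more rows. This is a single-pass range: calling begin() again continues from wherever the table cursor stopped.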

JSON and Message pack conversion (To Object and From Object)

Now we can convert C++ objects to JSON and create C++ objects from JSON. We can even convert C++ objects to Message Pack and create C++ objects from Message Pack.

It's very easy: just specialize the Bun helper and the rest is child's play.

namespace dbg {
    struct C1 {
        int c1;
        C1() :c1(2) {}
    };

    struct C {
        int c;
        C1 c1;
        C(const int i = 1) :c(i) {}
    };

    struct P {
        std::string p;
        C c;
        P() :p("s1"), c(1) {}
    };
}
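SPECIALIZE_BUN_HELPER((dbg::C1, c1));
SPECIALIZE_BUN_HELPER((dbg::C, c, c1));
SPECIALIZE_BUN_HELPER((dbg::P, p, c));

// bakery::B is used by the vector example in jsonTest() below. Its definition
// is not shown in the original listing; this one is assumed from how it is used
// and needs #include <vector>.
namespace bakery {
    struct B {
        std::string j;
        std::vector<int> i;
    };
}
SPECIALIZE_BUN_HELPER((bakery::B, j, i));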

int jsonTest() {
    namespace bun = blib::bun;

    blib::bun::PRef<dbg::P> p = new dbg::P;
    p->p = "s11";
    p->c.c = 10;

    p->c.c1.c1 = 12;


    blib::bun::PRef<dbg::C> c = new dbg::C;
    c->c = 666;
    // Convert the object to JSON
    const std::string json_string = p.toJson();
    // Construct the new object out of JSON
    blib::bun::PRef<dbg::P> p1;
    p1.fromJson(json_string);
    const auto msgpack = p1.toMesssagepack();
    // Construct another object out of messagepack
    blib::bun::PRef<dbg::P> p2;
    p2.fromMessagepack(p1.toMesssagepack());
    // messagepack to string
    std::string msgpack_string;
    for (auto c : msgpack) {
        msgpack_string.push_back(c);
    }
    std::cout << "1. Original object        :" << json_string << std::endl;
    std::cout << "2. Object from JSON       :" << p1.toJson() << std::endl;
    std::cout << "3. Object to Messagepack  :" << msgpack_string << std::endl;
    std::cout << "4. Object from Messagepack:" << p2.toJson() << std::endl;
    std::cout << "=== Vector JSON Conversion ===" << std::endl;
    blib::bun::PRef<bakery::B> b = new bakery::B;
    b->j = "test";
    b->i.push_back(12);
    b->i.push_back(23);
    std::cout << "5. Object with Vector: " << b.toJson() << std::endl;
    blib::bun::PRef<bakery::B> b1 = new bakery::B;
    b1.fromJson(b.toJson());
    std::cout << "6. Object copy with Vector: " << b1.toJson();
    return 1;
}

Key Value Store

Bun has an embedded key-value store. The default implementation is based on Unqlite.

/// @class KVDb
/// @brief The main class for the key value store
template<typename T = DBKVStoreUnqlite>
class KVDb {
public:
    /// @fn KVDb
    /// @param param
    /// @brief The constructor for the KV class
    KVDb(std::string const& param);

    /// @fn KVDb
    /// @param other. The other KVDb from which we can copy values.
    /// @brief The copy constructor for the KV class
    KVDb(KVDb const& other);

    /// @fn ~KVDb
    /// @brief destructor for the KV class
    ~KVDb();

    /// @fn ok
    /// @brief Returns Ok
    bool ok() const;

    std::string last_status() const;

    /// @fn put
    /// @param key The key
    /// @param value the value that needs to be stored
    /// @details Stores the key and value. Returns true if the store succeeds,
    /// else it returns false.
    ///          All primary C++ data types including std::string are supported as key and value
    template<typename Key, typename Value>
    bool put(Key const& key, Value const& value);

    /// @fn get
    /// @param key The key
    /// @param value the value is of type ByteVctorType. This carries the out value
    /// @details Gets the value corresponding to the key.
    /// If the retrieval succeeds it returns true, else it returns false.
    ///          All primary C++ data types including std::string are supported as key.
    ///          The value is a vector of bytes (std::uint8_t)
    template<typename Key>
    bool get(Key const& key, ByteVctorType& value);

    /// @fn get
    /// @param key The key
    /// @param value the out value, of a primary C++ data type
    /// @details Gets the value corresponding to the key. If the retrieval succeeds it returns true,
    /// else it returns false.
    ///          All primary C++ data types including std::string are supported as key.
    ///          The value is a primary C++ data type.
    ///          This function is a wrapper on top of the previous function
    ///          which returns the byte vector.
    template<typename Key, typename Value>
    bool get(Key const& key, Value& value);

    /// @fn del
    /// @param key The key
    /// @details Deletes the value corresponding to the key.
    /// If the delete succeeds it returns true, else it returns false.
    ///          All primary C++ data types including std::string are supported as key.
    template<typename Key>
    bool del(Key const& key);
};

Following is the way that we can use it:

/// @fn kvTest
/// @brief A test program for the key-value store
int kvTest() {
    /// @var db
    /// @brief Create the database. If the database already exists,
    /// it is opened; otherwise it is created.
    blib::bun::KVDb<> db("kv.db");
    /// @brief put a value in database.
    db.put("test", "test");
    std::string val;
    /// @brief get the value. We need to pass a variable by reference to get the value.
    db.get("test", val);
    std::cout << val << std::endl;
    
    const int size = 10000;
    for (int i = 0; i < size; ++i) {
        const std::string s = fmt::format("Value: {}", i);
        db.put(i, s);
    }

    for (int i = 0; i < size; ++i) {
        std::string val;
        db.get(i, val);
        std::cout << val << std::endl;
    }
    
    return 1;
}

Range based iteration of the Key values

Bun supports range-based iteration over the entries in the key-value store. This iteration is like the iteration of a map: the key and value are returned together as a pair. In the example below, kv is a pair; kv.first carries the key and kv.second carries the value. Both kv.first and kv.second hold their data as a vector of bytes.

    // ========= KV Store
    blib::bun::KVDb<> db("kv.db");

    const int size = 3;
    for (int i = 0; i < size; ++i) {
        const std::string s = fmt::format("storing number: {}", i);
        db.put(i, s);
    }

    std::cout << "Start iteration Via size "<< std::endl;
    for (int i = 0; i < size; ++i) {
        std::string val;
        db.get(i, val);
        std::cout << val << std::endl;
    }

    std::cout << "Start iteration via foreach "<< std::endl;
    count = 0;
    // Iterate the key value store using foreach.
    // We have both the key and value here. So we can change the value at the key
    for (auto kv : db) {
        int key = 0;
        blib::bun::from_byte_vec(kv.first, key);

        std::string value;
        blib::bun::from_byte_vec(kv.second, value);
        std::cout << count++ << ")> key: "<< key << "\n Value: " << value << std::endl;
    }
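Since keys and values come back as byte vectors, it may help to see what such a conversion can look like. The following is a sketch under my own assumptions, not the code behind blib::bun::from_byte_vec: to_bytes, from_bytes, and ByteVec are hypothetical names, fixed-size values are copied byte for byte, and strings are stored as their raw characters.

```cpp
#include <cstdint>
#include <cstring>
#include <string>
#include <type_traits>
#include <vector>

using ByteVec = std::vector<std::uint8_t>;

// Hypothetical helper: raw copy of a fixed-size value into bytes.
// The bytes are in host byte order, so they are not portable across machines.
template <typename T>
ByteVec to_bytes(const T& value) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw copy needs a trivially copyable type");
    ByteVec out(sizeof(T));
    std::memcpy(out.data(), &value, sizeof(T));
    return out;
}

// Strings are stored as their characters, without a terminator.
inline ByteVec to_bytes(const std::string& value) {
    return ByteVec(value.begin(), value.end());
}

// Hypothetical helper: returns false when the byte count does not match T.
template <typename T>
bool from_bytes(const ByteVec& bytes, T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "raw copy needs a trivially copyable type");
    if (bytes.size() != sizeof(T)) return false;
    std::memcpy(&out, bytes.data(), sizeof(T));
    return true;
}

// Any byte sequence is accepted as a string.
inline bool from_bytes(const ByteVec& bytes, std::string& out) {
    out.assign(bytes.begin(), bytes.end());
    return true;
}
```

A round trip such as from_bytes(to_bytes(42), key) restores the integer, and the size check rejects a buffer that cannot hold the requested type. Pass a std::string rather than a string literal to to_bytes, because a literal would be copied as a character array including its terminator.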

Internals

Some of the internals of the ORM are as follows.

Reflection

Bun internally uses simple reflection to take care of compile-time type information. There is a plan to extend it a little so it can be more useful.

SPECIALIZE_BUN_HELPER

This macro generates all the bindings for the objects at compile time. All the template specializations are created using this macro. It should be safe to use the macro in multiple headers or CPP files.

The following should be passed to the macro:

(<Class name, should include the namespace details too>, Members to persist ...)

The member list can be a subset of the class members. Say one of the objects holds a handle; there is no point in storing it in the DB. In this case, we can omit the handle and persist all the other members. This way, only the given fields are persisted.
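As a rough illustration of registering only some members, the sketch below describes each persisted member with a column name and a getter, and builds an INSERT statement from that list. It is not how SPECIALIZE_BUN_HELPER works internally (Bun generates template specializations at compile time; this sketch uses runtime closures), and Column, column, Session, and insert_statement are all hypothetical names. String quoting is naive and does no escaping.

```cpp
#include <cstddef>
#include <functional>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a generated binding: a column name plus a
// getter that renders the member as an SQL literal.
template <typename T>
struct Column {
    std::string name;
    std::function<std::string(const T&)> to_sql;
};

// Generic members are streamed as-is (numbers, for example).
template <typename T, typename M>
Column<T> column(const std::string& name, M T::* member) {
    return Column<T>{name, [member](const T& obj) -> std::string {
        std::ostringstream os;
        os << obj.*member;
        return os.str();
    }};
}

// String members are wrapped in single quotes (no escaping in this sketch).
template <typename T>
Column<T> column(const std::string& name, std::string T::* member) {
    return Column<T>{name, [member](const T& obj) -> std::string {
        return "'" + obj.*member + "'";
    }};
}

struct Session {
    std::string user;
    int hits;
    void* handle;  // runtime-only resource, deliberately not registered
};

// Only `user` and `hits` are listed, so only they reach the database.
std::vector<Column<Session>> session_columns() {
    return { column("user", &Session::user), column("hits", &Session::hits) };
}

template <typename T>
std::string insert_statement(const std::string& table,
                             const std::vector<Column<T>>& cols, const T& obj) {
    std::string names, values;
    for (std::size_t i = 0; i < cols.size(); ++i) {
        if (i != 0) { names += ","; values += ","; }
        names += cols[i].name;
        values += cols[i].to_sql(obj);
    }
    return "INSERT INTO '" + table + "' (" + names + ") VALUES(" + values + ")";
}
```

For a Session holding user "ann" and 3 hits, insert_statement("Session", session_columns(), s) produces a statement that mentions only the user and hits columns; the handle member never appears.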

Constraint

Applying constraint is easy in Bun. The following example explains it.

// Get the fields of the Person. This will be useful in specifying constraints and also
// querying the object.
using PersonFields = query::F<test::Person>;

// Generate the configuration. By default it does nothing.
blib::bun::Configuration<test::Person> person_config;
// This is a unique key constraint that is applied.
// Constraints are applied globally. They need to be set before the
// execution of the create schema statement
// The syntax is Field name = Constraint
// Here is how we can chain the different constraints in a single set statement
person_config.set(PersonFields::name = blib::bun::unique_constraint)
                 (PersonFields::uname = blib::bun::unique_constraint);

As you can see, it's very easy to create unique constraints. As shown above, we can combine multiple constraints using the overloaded () operator rather than calling set multiple times.

Things to remember:

  • For now, constraints can be applied only before the table is created. The statements have no effect after the table is created.
  • Only the unique key constraint is supported.

In further releases, I will be removing these limitations.
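The chained call shape, set(field = constraint)(field = constraint), relies on two small tricks: an operator= that returns a description instead of assigning, and an operator() that adds one more entry. Here is a minimal sketch of that pattern; Constraint, FieldConstraint, Field, and Config are hypothetical names and do not reflect Bun's actual classes.

```cpp
#include <string>
#include <vector>

// All names here are hypothetical; this only mimics the call shape.
enum class Constraint { Unique };

// One "field = constraint" pair, recorded for later schema creation.
struct FieldConstraint {
    std::string field;
    Constraint constraint;
};

struct Field {
    std::string name;
    // `field = constraint` builds a description instead of assigning anything.
    FieldConstraint operator=(Constraint c) const {
        return FieldConstraint{name, c};
    }
};

class Config {
public:
    // Records one constraint and returns *this so the call can be chained.
    Config& set(const FieldConstraint& fc) {
        items_.push_back(fc);
        return *this;
    }
    // Calling the returned object again adds one more constraint.
    Config& operator()(const FieldConstraint& fc) { return set(fc); }

    const std::vector<FieldConstraint>& items() const { return items_; }

private:
    std::vector<FieldConstraint> items_;
};
```

Given two fields named name and uname, the statement cfg.set(name = Constraint::Unique)(uname = Constraint::Unique); records both constraints in order, which a schema generator could later turn into UNIQUE clauses.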

PRef

PRef is one of the central elements in the library. It holds the object that needs to be persisted. It also contains the oid of the object, which is independent of the actual object. A few rules to make an object persistent:

  • The members that need to be persisted have to be public.
  • PRef maintains the ownership of the object and deletes the object when it goes out of scope.
  • If we assign a PRef to another, the former loses ownership of the object, just like a unique_ptr. In fact, PRef stores the object in a unique_ptr underneath.
  • Before persisting objects, we have to create the schema (using blib::bun::createSchema<>()) and generate the bindings (using SPECIALIZE_BUN_HELPER( (test::Person, name, age, height) );)
  • It also contains the md5 sum of the object at a particular instant. So if there is no change in the object, then it won't persist it. I have it because, in my own use, I keep a timestamp of the update and do not want to update the object every time. For this public release, I am omitting the timestamp.

Insert or Update

How does the library know if we want to insert or update the database? This is decided using the md5 of the object. If the md5 already has a value, then it is an update; otherwise it's an insert. The following query is automatically generated for the insert:

INSERT INTO 'test::Person' (object_id,name,age,height) VALUES(91340162041484,'Brainless_4',14,5.6)
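The decision described above can be sketched as a small state machine. This is an assumption about the general technique rather than Bun's code: ChangeTracker and Action are hypothetical names, the skip-when-unchanged branch follows the PRef description, and a 64-bit FNV-1a hash stands in for MD5 to keep the example dependency-free.

```cpp
#include <cstdint>
#include <string>

// FNV-1a, used here in place of MD5 purely to avoid a dependency.
inline std::uint64_t fnv1a(const std::string& data) {
    std::uint64_t h = 14695981039346656037ULL;  // FNV offset basis (64-bit)
    for (unsigned char c : data) {
        h ^= c;
        h *= 1099511628211ULL;                  // FNV prime (64-bit)
    }
    return h;
}

enum class Action { Insert, Update, Skip };

// Hypothetical tracker that remembers the digest of the last stored state.
class ChangeTracker {
public:
    // `serialized` is the object's current state, for example its JSON form.
    Action on_persist(const std::string& serialized) {
        const std::uint64_t digest = fnv1a(serialized);
        Action action;
        if (!has_digest_) {
            action = Action::Insert;   // no digest yet: never stored before
        } else if (digest == digest_) {
            action = Action::Skip;     // unchanged since the last store
        } else {
            action = Action::Update;   // stored before, and the state changed
        }
        has_digest_ = true;
        digest_ = digest;
        return action;
    }

private:
    bool has_digest_ = false;
    std::uint64_t digest_ = 0;
};
```

Persisting the same serialized state twice yields Insert and then Skip; changing a field before the next call yields Update.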

Search

Searching in Bun is quite easy. There are different mechanisms to search.

  • Oid Search: We can get all the Oids using the method:
    // The return type is a std::vector of oids
    const auto person_oids = blib::bun::getAllOids<test::Person>();
  • Search all objects of a type: We can get all the objects in the database as a vector of objects:
    // std::vector<blib::bun::PRef<test::Person>>
    const auto person_objs = blib::bun::getAllObjects<test::Person>();
  • Object EDSL: We can search through the EDSL query that Bun provides. The EDSL is implemented using the Boost.Proto library. The query is checked at compile time by the C++ compiler. When SPECIALIZE_BUN_HELPER is called, it creates some special variables.

    For example: For the Person class, SPECIALIZE_BUN_HELPER generates the following:

    bun::query::F<test::Person>::name
    bun::query::F<test::Person>::age
    bun::query::F<test::Person>::height

The bun::query::F class of Bun will be specialized with all the fields of Person class.

To apply any kind of filters, you just need to use the "where" function like:

// where(Query) is lazy; it does not query the db.
// The actual execution is done in the objects() function
const auto objs_again = bun::query::From<test::Person>().where( valid_query ).objects();
// We can also join queries or filters using the && or the || operator
const auto objs_joined = bun::query::From<test::Person>().where( valid_query && valid_query ).objects();
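To show how a query can be checked by the compiler at all, here is a minimal sketch using hand-written expression templates. Bun itself uses Boost.Proto; this sketch swaps in a much simpler technique, and every name in it (Field, Cmp, Arith, Logic, IsNode, IsValidQuery, where) is hypothetical. The result type of each operator records what kind of expression was written, and a trait decides whether that type is a valid filter.

```cpp
#include <string>
#include <type_traits>

struct Field { std::string name; };
struct Cmp   { std::string sql; };  // field <op> value: a boolean test
struct Arith { std::string sql; };  // field + value: not a boolean test
template <typename L, typename R> struct Logic { std::string sql; };  // L AND/OR R

// Marks the node types so && and || only apply to query expressions.
template <typename T> struct IsNode : std::false_type {};
template <> struct IsNode<Cmp> : std::true_type {};
template <> struct IsNode<Arith> : std::true_type {};
template <typename L, typename R> struct IsNode<Logic<L, R>> : std::true_type {};

// A query is valid if it is a comparison, or a logical combination of valid queries.
template <typename T> struct IsValidQuery : std::false_type {};
template <> struct IsValidQuery<Cmp> : std::true_type {};
template <typename L, typename R>
struct IsValidQuery<Logic<L, R>>
    : std::integral_constant<bool, IsValidQuery<L>::value && IsValidQuery<R>::value> {};

// Literal rendering (naive quoting, no escaping).
inline std::string lit(int v) { return std::to_string(v); }
inline std::string lit(const std::string& v) { return "'" + v + "'"; }
inline std::string lit(const char* v) { return lit(std::string(v)); }

template <typename V> Cmp operator>(const Field& f, const V& v)  { return Cmp{f.name + " > " + lit(v)}; }
template <typename V> Cmp operator==(const Field& f, const V& v) { return Cmp{f.name + " = " + lit(v)}; }
template <typename V> Cmp operator!=(const Field& f, const V& v) { return Cmp{f.name + " != " + lit(v)}; }
template <typename V> Arith operator+(const Field& f, const V& v) { return Arith{f.name + " + " + lit(v)}; }

template <typename L, typename R,
          typename = typename std::enable_if<IsNode<L>::value && IsNode<R>::value>::type>
Logic<L, R> operator&&(const L& l, const R& r) {
    return Logic<L, R>{"(" + l.sql + " AND " + r.sql + ")"};
}

template <typename L, typename R,
          typename = typename std::enable_if<IsNode<L>::value && IsNode<R>::value>::type>
Logic<L, R> operator||(const L& l, const R& r) {
    return Logic<L, R>{"(" + l.sql + " OR " + r.sql + ")"};
}

// Builds the statement text only; nothing is executed here.
template <typename Q>
std::string where(const std::string& table, const Q& q) {
    static_assert(IsValidQuery<Q>::value, "Syntax error in query");
    return "SELECT * FROM " + table + " WHERE " + q.sql;
}
```

With fields age and name, the expression age > 10 && name != "Brainless_0" has a type for which IsValidQuery is true, while age + 10 && name != "Brainless_0" has type Logic<Arith, Cmp>, so passing it to where() stops compilation at the static_assert.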

Discussion Forums

  • Gitter.im: here you can ask questions for quicker answers or chat with us if we are available
  • Github issues: Create an issue here

History

  • Alpha 1 (16th May 2016)
    • Initial version of the library
  • Alpha 2 (2nd July 2016)
    • Implementing the Bun EDSL
  • Alpha 3 (14th March 2018):
    • Integrated SOCI as the database interaction layer. This lets the library use SQL databases such as SQLite, Postgres, and MySQL. It should mostly support the other databases that SOCI supports, but that is not tested yet.
    • Use of Boost Fusion. The code is much cleaner, with fewer preprocessor macros. The code is more debuggable.
    • Support for transaction handling using the Transaction class
    • Better error handling and error logging
    • Added a lot of comments to help users
  • Alpha 4 (5th March 2018)
    • Support for nested objects
    • SimpleOID now uses boost UUID to generate a unique identifier
    • Additional comments
    • Small performance enhancements
  • Alpha 5 (19th May 2018)
    • Support for constraint before table creation
  • Alpha 6 (18th July 2018):
    • Adding key value functionality to bun
  • Alpha 7 (11 August 2018):
    • Added range based for loop support for object iteration.
    • Added range based for loop support for key-value store iteration.
    • Both the iterations are lazy iterations.
  • Alpha 8 (19 October 2018)
    • Added support to create C++ object from JSON string
    • Added support to create Message Pack from C++ object
    • Added support to create C++ object from Message Pack
  • Alpha 9 (13 January 2018)
    • Added support to convert C++ object containing vector to JSON and Msgpack
    • Added support to convert a JSON or Msgpack containing a vector to C++ object.

Next Features

  • Adding C++ vector persistence
  • Iterator based lazy data pull
  • Custom Oid class support
  • Support for ElasticSearch
  • Improved Error handling
  • EDSL query language enhancements
  • Constraint modification after table creation
  • Support for other constraint
  • Index support
  • Support for pre and post hooks for processing objects
  • Persisting std::vector members
  • Unit test implementation
  • Support for Leveldb
  • Key Value iterator
  • Support for Composite types. (Done)

Help Needed

Hi All,
Considering the work needed to enrich this library further, I will need all the help I can get. Help is needed in the following areas:

  1. Enhancement
  2. Fix bugs.
  3. Restructure and cleanup code.
  4. Enhance documentation.
  5. Constructive criticism and feature suggestions.
  6. Write tests.
  7. Use Bun

Any contribution, small or big, is appreciated.

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
