Quick Introduction to YB.ORM for C++
This article may be interesting, in the first place, to those who develop database applications in C++. We will discuss some basics too, but Intermediate level in databases and C++ is still recommended.
YB.ORM is aiming to simplify the development of C++ applications that interact with relational databases. An Object-Relational Mapper (ORM) works by mapping database tables to classes and table rows to objects in your application. This approach may be not optimal for each and every database application, but it proved to be reasonable in applications that require complex logic and transaction handling. Although it's a work in progress, most features are ready to explore. The goals of YB.ORM project are:
- to provide a convenient API for C++ developers
- to retain high performance of C++
- to keep the source code easily portable across different platforms and compilers
- to support most major relational DataBase Management Systems (DBMS)
This tool employs many concepts explained in the book "Patterns of Enterprise Application Architecture" by Martin Fowler, i.e., patterns such as "Lazy Load", "Identity Map", "Unit of Work", etc. Also, the project development was inspired by the power of Hibernate framework for Java, and especially by the design of SQLAlchemy for Python.
About Object Relational Mapping
Relational databases are now very widespread – from Oracle clusters to embedded SQLite file-based DB. The relational databases operate on data in the form of rectangular tables having columns and rows. There are reasons for such popularity:
- They have got a simple underlying math model – so called relational algebra
- There is a standard and very powerful language SQL to interact with the databases, well, mostly standard
- Plenty of vendors and products to suit wide range of needs in data storage
But interfacing with SQL databases from application code is not that straightforward. Just look at plain old ODBC API. Typical steps to run an SQL statement against a database may include the following:
- Connect to a database, providing parameters: host, port, schema, user name, password, etc.
- Using connection handle, create a cursor and prepare an SQL statement, providing an SQL statement as text
- Using cursor handle, optionally bind output parameters to your output variables, which will receive their values upon fetch is completed
- Using cursor handle, optionally bind input parameters, either positional or named, to your input variables
- Optionally assign input variables their values
- Using cursor, execute the prepared statement, optionally proceed to step 5
- Optionally fetch next row of the result set, if OK then look at your output variables, optionally repeat step 7
- Close the cursor
- Close the connection
Data types of SQL database don't exactly match those found in C or C++. Not to mention that values stored in database may be undefined and not every language has support for undefined values.
So there should be a way to automate these steps. Such libraries as SOCI for C++ do their best to ease sending SQL statements to the DB and retrieving the results. But there is another bunch of problems with SQL statements being hard-coded into your application. Imagine that you run SELECT
from the same table with distinct filters at several locations in your code. And some day you have to add or rename some column in that table...
Object Relational Mapping (ORM) tools help create an additional abstraction layer that is meant to simplify data manipulations in RDBMS. This layer is also known as domain model. That is, an ORM-enabled app typically contains little to no in-line SQL code. Instead, all insertions, updates, deletions and data retrieval are done in terms of domain classes and objects mapped to tables and rows. Basically, an object of domain class with its members corresponds to a single row in mapped table with its columns. ORM layer takes care to issue actual SQL statements to reflect all the changes made to domain objects.
To a certain degree, ORM hides syntax of SQL behind an object-oriented facade. But it would be too naïve to expect from this technology, that it "saves" a developer from having to learn about SQL and database design. It rather saves developer's time otherwise spent on coding and debugging boilerplate code for database interaction. Still, it's convenient to have ORM library handle all those SQL dialect specifics, thus helping to create portable apps. As a bonus, this approach defends your app from SQL code injections. Also, in statically typed languages, like C++, ORM approach enforces parameter type checking at compile time, which is also a good thing.
Example Usage of the Library
Using ORM starts with model definition. It may look like an XML file with tables and relationships, or like in-lined macros in a class declaration, or a template function processing a visitor. This step may or may not require code generation or some other kind of pre-processing. Each of these variants has its own advantages and disadvantages.
Let's consider an example schema with two entities: Client
and Order
. There is an one-to-many relationship between them: one Client
may have zero or more Orders
, each Order
belongs to a Client
. Clients
are stored in table client_tbl
, while their Order
s are stored in table order_tbl
.
At the SQL level, the relationship can be expressed as a foreign key constraint on column client_id
in the child table order_tbl
referencing primary key column id
in the parent table client_tbl
. From the ORM perspective, such relationship is usually represented by the objects' properties. An instance of class Order
has an object-reference property, referencing single parent object of class Client
. From the other side of the relationship, an instance of a class Client
may have a collection-of-objects property (also known as "backref
"), which can be used to iterate all over its children Order
s.
Let's define the mapping schema along with two classes Client
and Order
.
#include "orm/domain_object.h"
#include "orm/domain_factory.h"
#include "orm/schema_decl.h"
class Order;
class Client: public Yb::DomainObject {
YB_DECLARE(Client, "client_tbl", "client_seq", "client",
YB_COL_PK(id, "id")
YB_COL_DATA(dt, "dt", DATETIME)
YB_COL_STR(name, "name", 100)
YB_COL_STR(email, "email", 100)
YB_COL_DATA(budget, "budget", DECIMAL)
YB_REL_ONE(Client, owner, Order, orders, Yb::Relation::Restrict, "client_id", 1, 1)
YB_COL_END)
public:
int get_info() const { return 42; }
};
class Order: public Yb::DomainObject {
YB_DECLARE(Order, "order_tbl", "order_seq", "order",
YB_COL_PK(id, "id")
YB_COL_FK(client_id, "client_id", "client_tbl", "id")
YB_COL(dt, "dt", DATETIME, 0, 0, Yb::Value("sysdate"), "", "", "", "")
YB_COL_STR(memo, "memo", 100)
YB_COL_DATA(total_sum, "total_sum", DECIMAL)
YB_COL_DATA(paid_sum, "paid_sum", DECIMAL)
YB_COL_DATA(paid_dt, "paid_dt", DATETIME)
YB_REL_MANY(Client, owner, Order, orders, Yb::Relation::Restrict, "client_id", 1, 1)
YB_COL_END)
public:
const Yb::Decimal to_be_paid() {
return total_sum - paid_sum.value(0);
}
};
These class declarations can be placed in a header or in a .cpp file. Two more sentences are expected in your .cpp file for the magic to work:
YB_DEFINE(Client)
YB_DEFINE(Order)
Classes Client
and Order
are automatically given a few new data members and methods. There are now mapped properties (id
, dt
, name
, …) at each object of the class. The properties can be used to access the column data in read and write modes, as well as to check for absent value (IS NULL
).
To control the instances of mapped classes, it's necessary to have an instance of class Yb::Session
, which takes care of loading/saving the objects, keeps track of changes, controls the relationships, etc. On creation of Session
, pass a database scheme to it.
int main() {
Yb::init_schema(); Yb::Session session(Yb::theSchema(), "sqlite+sqlite://./tut1.db");
session.create_schema(true);
Now you can instantly use the domain classes just like that:
Order order;
order.total_sum = Yb::Decimal("3.14");
order.paid_sum = Yb::Decimal(0);
order.save(session);
Client client;
client.name = "Some Name";
client.email = "some@email";
client.dt = Yb::now();
client.save(session);
order.owner = Client::Holder(client);
session.commit();
return 0;
}
You can compile the example, link in the libraries ybutil
and yborm
, and it's ready to run. If you like, you might turn on logging to see what's going on under the hood here:
#include "util/nlogger.h"
#include <iostream>
...
Yb::LogAppender appender(std::cerr);
Yb::init_schema(); Yb::Session session(Yb::theSchema(), "sqlite+sqlite://./tut1.db");
session.set_logger(Yb::ILogger::Ptr(new Yb::Logger(&appender)));
Here are the log messages, specific for SQLite DB engine:
14-10-27 14:19:38.489 21962/21962 DEBG sql: exec_direct: CREATE TABLE client_tbl (
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
dt TIMESTAMP,
name VARCHAR(100),
email VARCHAR(100),
budget NUMERIC
)
14-10-27 14:19:38.818 21962/21962 DEBG sql: exec_direct: CREATE TABLE order_tbl (
id INTEGER NOT NULL PRIMARY KEY AUTOINCREMENT,
client_id INTEGER NOT NULL,
dt TIMESTAMP DEFAULT CURRENT_TIMESTAMP NOT NULL,
memo VARCHAR(100),
total_sum NUMERIC,
paid_sum NUMERIC,
paid_dt TIMESTAMP
, FOREIGN KEY (client_id) REFERENCES client_tbl(id)
)
14-10-27 14:19:38.842 21962/21962 DEBG orm: flush started
14-10-27 14:19:38.843 21962/21962 DEBG sql: begin transaction
14-10-27 14:19:38.843 21962/21962 DEBG sql:
prepare: INSERT INTO client_tbl (dt, name, email, budget) VALUES (?, ?, ?, ?)
14-10-27 14:19:38.843 21962/21962 DEBG sql: bind: (DateTime, String, String, Decimal)
14-10-27 14:19:38.843 21962/21962 DEBG sql: exec prepared: p1="'2014-10-27
14:19:38'" p2="'Some Name'" p3="'some@email'" p4="NULL"
14-10-27 14:19:38.844 21962/21962 DEBG sql:
prepare: SELECT SEQ LID FROM SQLITE_SEQUENCE WHERE NAME = 'client_tbl'
14-10-27 14:19:38.844 21962/21962 DEBG sql: exec prepared:
14-10-27 14:19:38.844 21962/21962 DEBG sql: fetch: LID='1'
14-10-27 14:19:38.844 21962/21962 DEBG sql: fetch: no more rows
14-10-27 14:19:38.845 21962/21962 DEBG sql: prepare: INSERT INTO order_tbl
(client_id, dt, memo, total_sum, paid_sum, paid_dt) VALUES (?, ?, ?, ?, ?, ?)
14-10-27 14:19:38.845 21962/21962 DEBG sql: bind: (LongInt, DateTime, String, Decimal, Decimal, DateTime)
14-10-27 14:19:38.845 21962/21962 DEBG sql: exec prepared: p1="1"
p2="'2014-10-27 14:19:38'" p3="NULL" p4="3.14" p5="0" p6="NULL"
14-10-27 14:19:38.845 21962/21962 DEBG sql:
prepare: SELECT SEQ LID FROM SQLITE_SEQUENCE WHERE NAME = 'order_tbl'
14-10-27 14:19:38.846 21962/21962 DEBG sql: exec prepared:
14-10-27 14:19:38.846 21962/21962 DEBG sql: fetch: LID='1'
14-10-27 14:19:38.846 21962/21962 DEBG sql: fetch: no more rows
14-10-27 14:19:38.846 21962/21962 DEBG orm: flush finished OK
14-10-27 14:19:38.846 21962/21962 DEBG sql: commit
Note the correct order of insertion (first – parent, second – child). This is achieved by doing the topological sort on the graph of objects. The value of foreign key is assigned automatically, as well as the values of primary keys.
The same effect can be achieved if we manipulate the link between objects from the other side:
client.orders.insert(order);
The domain classes are particularly useful for constructing queries. For example, we need a pager over orders of certain client, let's fetch items from 30 till 39 inclusive:
#include <boost/foreach.hpp>
...
Yb::DomainResultSet<Order> rs = Yb::query<Order>(session)
.filter_by(Order::c.client_id == 32738)
.order_by(Order::c.dt)
.range(30, 40).all();
BOOST_FOREACH(Order order, rs) {
std::cout << order.id << ",";
}
Here, we can see a feature that is implemented differently in different SQL dialects. For example, for SQLite the following SQL code will be issued:
SQL:
SELECT order_tbl.id, order_tbl.client_id, order_tbl.dt, order_tbl.memo,
order_tbl.total_sum, order_tbl.paid_sum, order_tbl.paid_dt
FROM order_tbl WHERE (order_tbl.client_id = ?)
ORDER BY order_tbl.dt
LIMIT ? OFFSET ?
positional params: (32738, 10, 30)
For more examples, downloads, and any further information, please visit the project home page at https://sourceforge.net/projects/yborm/.
Points of Interest
It's definitely not an easy task to implement your own ORM. Just because the task of synchronization of in-memory objects with SQL tables is not trivial in itself. Also there were many accompanying routine problems that needed to be solved before the whole thing could do session.flush()
for the first time.
In 2007, I was participating in a project where simple tasks took much effort to solve, and more complex ones did not get solved at all. Switching to using ORM changed the situation radically. Of course, there is a learning curve for using any ORM tool. I hope you will benefit from using ORM concept in your projects properly.
Thanks for reading. Feedback is appreciated.
History
- 2014-10-30: Posted the initial revision after release YB.ORM 0.4.6