Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / SQL

Accelerate database performance using In Memory CSQL Cache

2.04/5 (9 votes)
3 Jun 2008CPOL8 min read 1   24  
CSQL Cache is an open-source high performance, bi-directional updateable data-caching infrastructure that sits between the clustered application process and back-end data sources to provide unprecedented high throughput to your application
C++

Download csql-src-2.0GA.zip - 1.5 MB

Latest CSQL Cache Source is here

CSQL Cache is an open-source high performance, bi-directional updateable data-caching infrastructure that sits between the clustered application process and back-end data sources to provide unprecedented high throughput to your application.

Improving Database Performance Using Database Cache

Many applications today are being developed and deployed on multi-tier environments that involve browser-based clients, web application servers and backend databases. These applications need to generate web pages on-demand by talking to backend databases because of their dynamic nature, making middle-tier database caching an effective approach to achieve high scalability and performance. Following are the advantages of database caching

Scalability: distribute query workload from backend to multiple cheap front-end systems.
Flexibility: achieve QoS, where each cache hosts different parts of the backend data, e.g., the data of Platinum customers are cached while that of ordinary customers is not.
Availability: by continued service for applications that depend only on cached tables even if the backend server is unavailable.
Performance: by potentially responding fast because of locality of data and smoothing out load peaks by avoiding round-trips between middle-tier and data-tier

In order to overcome the throughput barrier, application scales through deployment of multiple small systems. Companies have developed various homegrown solutions that involve database caching. These caching solutions can help accelerate database performance to some extent, but they are fairly ineffective as most of them support only result set caching and some are poor at dealing with the scalability. Some of these caching solutions use another heavy weight full-fledged database management system to cache the data, which yields less performance gain. These caching solutions are mostly read only or else provide tools for doing manual lazy updates. For frequently changing data, it will be holding “dirty” cached data, resulting in long latency periods that may be entirely unacceptable for applications requiring immediate access to current data.

Requirements of Caching Solution

Updateable Cache Tables

Most of the existing cache solutions are read only which limits their usage to small segment of the applications, non-real time applications.

Bi-Directional Updates

For updateable caches, updates, which happen in cache, should be propagated to the target database and any updates that happen directly on the target database should come to cache automatically.

Synchronous and Asynchronous update propagation

The updates on cache table shall be propagated to target database in two modes. Synchronous mode makes sure that after the database operation completes the updates are applied at the target database as well. In case of Asynchronous mode the updates are delayed to the target database.
Synchronous mode gives high cache consistency and is suited for real time applications. Asynchronous mode gives high throughput and is suited for near real time applications.

Multiple cache granularity: Database level, Table level and Result-set caching

Major portions of corporate databases are historical and infrequently accessed. But, there is some information that should be instantly accessible like premium customer’s data, etc

Recovery for cached tables

Incase of system or power failure, during the restart of caching platform all the committed transactions on the cached tables should be recovered.

Tools to validate the coherence of cache

In case of asynchronous mode of update propagation, cache at different cache nodes and target database may diverge. This needs to be resolved manually and the caching solution should provide tools to identify the mismatches and take corrective measures if required.

Horizontally Scalable

Clustering is employed in many solutions to increase the availability and to achieve load balancing. Caching platform should work in a clustered environment spanning to multiple nodes thereby keeping the cached data coherent across nodes.

Transparent access to non-cached tables reside in target database

Database Cache should keep track of queries and should be able to intelligently route to the database cache or to the origin database based on the data locality without any application code modification.

Transparent Fail over

There should not be any service outages, incase of caching platform failure. Client connections should be routed to the target database.

No or very minimal changes to application for the caching solution

Support for standard interfaces JDBC, ODBC etc that will make the application to work seamlessly without any application code changes. It should route all stored procedure calls to target database so that they don’t need to be migrated.

CSQL Cache

CSQL Cache is a high performance, bi-directional updateable data-caching infrastructure that sits between the clustered application process and back-end data sources to provide unprecedented high throughput to your application by offloading the computing cycles from expensive backend systems along with reduction in costly network calls, thereby enabling real time application to provide faster and predictive response time.

CSQL Cache uses the fastest Main Memory Database (CSQL MMDB) designed for high performance and high volume data computing for caching the table and provides most flexible and cost-effective way to cache and manage enterprise information without compromising on transactional and indexed access to the data. This main memory database is 10-20 times faster than traditional disk based database system as the database completely resides in main memory and developed to be used on real time high computing data platforms.

CSQL implements most of the requirements of good caching solution mentioned in the previous section except few in the latest version. For complete set of features supported by CSQL cache in the latest version, refer the data sheet on the product web site. http://www.csqlcache.com

Structure of Code

The code is mainly divided into five major modules

Storage Engine - heart of the database management system and deals with physical storage and transaction processing.

SQL Engine - parses and executes the SQL statements by interacting with storage engine

Standard Interfaces - JDBC/ODBC/ISQL to access the main memory database from application programs written in C/C++/Java

ODBC Adapter - Caching modules which acts as connector for any database which implements ODBC API.

Gateway - Caching modules which deals with routing of queries based on the locality of the table.

CSQL Caching Configuration for MySQL

Configuration file, csql.conf has five parameters associated with caching. They can
be found in the Cache section of the csql.conf file. They are CACHE_TABLE, DSN,
ENABLE_BIDIRECTIONAL_CACHE, CACHE_RECEIVER_WAIT_SECS and
TABLE_CONFIG_FILE.
• CACHE_TABLE is a boolean parameter which needs to be set to true when csql
should cache tables from MySQL
• DSN is a string parameter, which needs to be set to the data source name of the
MySQL ODBC driver specified in the odbc.ini file.
• TABLE_CONFIG_FILE is a string parameter which contains the complete path
to the file which holds the cache table information
• ENABLE_BIDIRECTIONAL_CACHE is a boolean parameter which needs to be
set to true when direct updates to MySQL needs to be brought into CSQL cache
table automatically.
• CACHE_RECEIVER_WAIT_SECS is an integer parameter, which needs to be
set to interval it waits if there are no update logs from the target database.
If csql needs to cache table t1 and t2, then entries for t1 and t2 needs to be
present in csqltable.conf file. Add the following lines to TABLE_CONFIG_FILE that is

1:t1
1:t2

A sample configuration file is present in the csql root directory. If you want to cache
tables at run time, that is when csqlserver is running, then cachetable tool shall be used
for that. Refer tool reference section for the syntax and usage.
The first field denotes the mode and it should be specified as always 1, which represents
the update propagation mode for cached tables.
1-> Synchronous mode (updates are propagated during the DML operation itself)
2-> Asynchronous mode (logs are generated and propagated later for DML
operations on cached tables).

For CSQLCache to work you need to install the MySQL server, MySQL ODBC
Connector, unixODBC packages on your system. Please make sure that these packages
are installed in your system before you proceed.
Once you install mysqlserver, start the server by using the following command after
logging in with user ‘mysql’
$/etc/init.d/mysqld start
After that you have to install MySQL ODBC connector, which contains the ODBC driver
to connect to MySQL server. This shall be downloaded from the MySQL web site.
Usually this library is named as libmyodbc3.so.
After this you need to install unixODBC package, which is the driver manager for ODBC
drivers. Once you install unixODBC, copy the following lines into ~/.odbc.ini file

[ODBC Data Sources]
myodbc3 = MyODBC 3.51 Driver DSN
[myodbc3]
Driver = /home/csql/mysql-connector-odbc-
3.51.23-linux-x86-32bit/lib/libmyodbc3.so
Description = Connector/ODBC 3.51 Driver DSN
SERVER = localhost
PORT = 3306
USER = root
Password =
Database = test
OPTION = 16
SOCKET = /var/lib/mysql/mysql.sock

The above assumes that MySQL ODBC connector is installed at location
/home/csql/mysql-connector-odbc-3.51.23-linux-x86-
32bit
You can check whether you have configured MySQL ODBC driver correctly using the
isql command

Create the tables in MySQL server so that they shall be cached in CSQL.
In the SQL prompt enter the following statements

SQL>CREATE TABLE t1 (f1 integer, f2 char (196),
primary key (f1));
SQL>CREATE TABLE t2 (f1 integer, f2 integer, primary
key (f1));
SQL>CREATE TABLE t3 (f1 integer, f2 integer, primary
key (f1));
SQL>INSERT INTO t1 (f1, f2) values (100, '100');
SQL>INSERT INTO t2 (f1, f2) values (102, 102);
SQL>INSERT INTO t3 (f1, f2) values (103, 103);
SQL>quit;

The above statements create tables namely t1, t2 and t3, which will be cached in CSQL
later.

Working with CSQL gateway

In another terminal run csql tool with -g option. This creates an isql session which acts as
gateway to csql and mysql.

$ csql -g

It will show the CSQL prompt as follows

CSQL> 

To retrieve records from table t1, enter the following statement in CSQL prompt.

CSQL>select * from t1;

It will display
---------------------------------------------------------
f1 f2
---------------------------------------------------------
100 100
It displays the values inserted into MySQL from CSQL.
You can also perform any DML operations on these cached tables, for example
CSQL>insert into t1 values (200, '200');
This will insert one record in CSQL as well as in MySQL. You can verify this by
accessing MySQL through isql tool.

$isql myodbc3;
SQL>select * from t1;

+-----------+-----------------------------------------
------------------------------------------------------
------------------------------------------------------
------------------------------------------------+
| f1 |f2
|
+-----------+-----------------------------------------
------------------------------------------------------
------------------------------------------------------
------------------------------------------------+
| 100 | 100
|
| 200 | 200
|
+-----------+-----------------------------------------
------------------------------------------------------
------------------------------------------------------
------------------------------------------------+
It displays both the records from MySQL database.
You can also retrieve records in tables, which are not cached in CSQL and are present in
MySQL

$ csql -g
CSQL>select * from t3;

It will display
24083:3086153424:DatabaseManagerImpl.cxx:599:Table not
exists t3
24083:3086153424:SelStatement.cxx:245:Unable to open
the table:Table not exists
------------------------------------------------------
---
f1 f2
------------------------------------------------------
---
103 103
First it displays that the table is not present in CSQL, then it checks with MySQL whether
the table is present there and if present it retrieves the records from MySQL.

Sample Code

Programming with CSQL gateway
You can find a JDBC sample program gwexample.java under the
examples/jdbc directory. It will demonstrate how to use the CSQL gateway through
Java programs.
Note: You shall also use gateway through JDBC and SQLAPI interfaces. The ODBC
interface does not support gateway in this release.
From SQL API, use CSqlGateway to create connection and statement objects in
SqlFactory class like below:

AbsSqlConnection *con = SqlFactory::
createConnection(CSqlGateway);
rv = con->connect("root", "manager");
if (rv != OK) return 1;
AbsSqlStatement *stmt = SqlFactory::
createStatement(CSqlGateway);
stmt->setConnection(con);
//---other statements follow---- 

Open source web site:

http://sourceforge.net/projects/csql

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)