CSQL Cache is an open-source high performance, bi-directional
updateable data-caching infrastructure that sits between the clustered
application process and back-end data sources to provide unprecedented
high throughput to your application.
Improving Database Performance Using Database Cache
Many applications today are being developed and deployed on
multi-tier environments that involve browser-based clients, web
application servers and backend databases. These applications need to
generate web pages on-demand by talking to backend databases because of
their dynamic nature, making middle-tier database caching an effective
approach to achieve high scalability and performance. Following are the
advantages of database caching
Scalability: distribute query workload from backend to multiple cheap front-end systems.
Flexibility: achieve QoS, where each cache hosts different parts of the
backend data, e.g., the data of Platinum customers are cached while
that of ordinary customers is not.
Availability: by continued service for applications that depend only on cached tables even if the backend server is unavailable.
Performance: by potentially responding fast because of locality of data
and smoothing out load peaks by avoiding round-trips between
middle-tier and data-tier
In order to overcome the throughput barrier, application scales
through deployment of multiple small systems. Companies have developed
various homegrown solutions that involve database caching. These
caching solutions can help accelerate database performance to some
extent, but they are fairly ineffective as most of them support only
result set caching and some are poor at dealing with the scalability.
Some of these caching solutions use another heavy weight full-fledged
database management system to cache the data, which yields less
performance gain. These caching solutions are mostly read only or else
provide tools for doing manual lazy updates. For frequently changing
data, it will be holding “dirty” cached data, resulting in long latency
periods that may be entirely unacceptable for applications requiring
immediate access to current data.
Requirements of Caching Solution
Updateable Cache Tables
Most of the existing cache solutions are read only which limits their
usage to small segment of the applications, non-real time applications.
Bi-Directional Updates
For updateable caches, updates, which happen in cache, should be
propagated to the target database and any updates that happen directly
on the target database should come to cache automatically.
Synchronous and Asynchronous update propagation
The updates on cache table shall be propagated to target database in
two modes. Synchronous mode makes sure that after the database
operation completes the updates are applied at the target database as
well. In case of Asynchronous mode the updates are delayed to the
target database.
Synchronous mode gives high cache consistency and is suited for real
time applications. Asynchronous mode gives high throughput and is
suited for near real time applications.
Multiple cache granularity: Database level, Table level and Result-set caching
Major portions of corporate databases are historical and infrequently
accessed. But, there is some information that should be instantly
accessible like premium customer’s data, etc
Recovery for cached tables
Incase of system or power failure, during the restart of caching
platform all the committed transactions on the cached tables should be
recovered.
Tools to validate the coherence of cache
In case of asynchronous mode of update propagation, cache at different
cache nodes and target database may diverge. This needs to be resolved
manually and the caching solution should provide tools to identify the
mismatches and take corrective measures if required.
Horizontally Scalable
Clustering is employed in many solutions to increase the availability
and to achieve load balancing. Caching platform should work in a
clustered environment spanning to multiple nodes thereby keeping the
cached data coherent across nodes.
Transparent access to non-cached tables reside in target database
Database Cache should keep track of queries and should be able
to intelligently route to the database cache or to the origin database
based on the data locality without any application code modification.
Transparent Fail over
There should not be any service outages, incase of caching platform
failure. Client connections should be routed to the target database.
No or very minimal changes to application for the caching solution
Support for standard interfaces JDBC, ODBC etc that will make the
application to work seamlessly without any application code changes. It
should route all stored procedure calls to target database so that they
don’t need to be migrated.
CSQL Cache
CSQL Cache is a high performance, bi-directional updateable
data-caching infrastructure that sits between the clustered application
process and back-end data sources to provide unprecedented high
throughput to your application by offloading the computing cycles from
expensive backend systems along with reduction in costly network calls,
thereby enabling real time application to provide faster and predictive
response time.
CSQL Cache uses the fastest Main Memory Database (CSQL MMDB)
designed for high performance and high volume data computing for
caching the table and provides most flexible and cost-effective way to
cache and manage enterprise information without compromising on
transactional and indexed access to the data. This main memory database
is 10-20 times faster than traditional disk based database system as
the database completely resides in main memory and developed to be used
on real time high computing data platforms.
CSQL implements most of the requirements of good caching solution
mentioned in the previous section except few in the latest version. For
complete set of features supported by CSQL cache in the latest version,
refer the data sheet on the product web site. http://www.csqlcache.com
Structure of Code
The code is mainly divided into five major modules
Storage Engine - heart of the database management system and deals with physical storage and transaction processing.
SQL Engine - parses and executes the SQL statements by interacting with storage engine
Standard Interfaces - JDBC/ODBC/ISQL to access the main memory database from application programs written in C/C++/Java
ODBC Adapter - Caching modules which acts as connector for any database which implements ODBC API.
Gateway - Caching modules which deals with routing of queries based on the locality of the table.
CSQL Caching Configuration for MySQL
Configuration file, csql.conf has five parameters associated with caching. They can
be found in the Cache section of the csql.conf file. They are CACHE_TABLE, DSN,
ENABLE_BIDIRECTIONAL_CACHE, CACHE_RECEIVER_WAIT_SECS and
TABLE_CONFIG_FILE.
• CACHE_TABLE is a boolean parameter which needs to be set to true when csql
should cache tables from MySQL
• DSN is a string parameter, which needs to be set to the data source name of the
MySQL ODBC driver specified in the odbc.ini file.
• TABLE_CONFIG_FILE is a string parameter which contains the complete path
to the file which holds the cache table information
• ENABLE_BIDIRECTIONAL_CACHE is a boolean parameter which needs to be
set to true when direct updates to MySQL needs to be brought into CSQL cache
table automatically.
• CACHE_RECEIVER_WAIT_SECS is an integer parameter, which needs to be
set to interval it waits if there are no update logs from the target database.
If csql needs to cache table t1 and t2, then entries for t1 and t2 needs to be
present in csqltable.conf file. Add the following lines to TABLE_CONFIG_FILE that is
1:t1
1:t2
A sample configuration file is present in the csql root directory. If you want to cache
tables at run time, that is when csqlserver is running, then cachetable tool shall be used
for that. Refer tool reference section for the syntax and usage.
The first field denotes the mode and it should be specified as always 1, which represents
the update propagation mode for cached tables.
1-> Synchronous mode (updates are propagated during the DML operation itself)
2-> Asynchronous mode (logs are generated and propagated later for DML
operations on cached tables).
For CSQLCache to work you need to install the MySQL server, MySQL ODBC
Connector, unixODBC packages on your system. Please make sure that these packages
are installed in your system before you proceed.
Once you install mysqlserver, start the server by using the following command after
logging in with user ‘mysql’
$/etc/init.d/mysqld start
After that you have to install MySQL ODBC connector, which contains the ODBC driver
to connect to MySQL server. This shall be downloaded from the MySQL web site.
Usually this library is named as libmyodbc3.so.
After this you need to install unixODBC package, which is the driver manager for ODBC
drivers. Once you install unixODBC, copy the following lines into ~/.odbc.ini file
[ODBC Data Sources]
myodbc3 = MyODBC 3.51 Driver DSN
[myodbc3]
Driver = /home/csql/mysql-connector-odbc-
3.51.23-linux-x86-32bit/lib/libmyodbc3.so
Description = Connector/ODBC 3.51 Driver DSN
SERVER = localhost
PORT = 3306
USER = root
Password =
Database = test
OPTION = 16
SOCKET = /var/lib/mysql/mysql.sock
The above assumes that MySQL ODBC connector is installed at location
/home/csql/mysql-connector-odbc-3.51.23-linux-x86-
32bit
You can check whether you have configured MySQL ODBC driver correctly using the
isql command
Create the tables in MySQL server so that they shall be cached in CSQL.
In the SQL prompt enter the following statements
SQL>CREATE TABLE t1 (f1 integer, f2 char (196),
primary key (f1));
SQL>CREATE TABLE t2 (f1 integer, f2 integer, primary
key (f1));
SQL>CREATE TABLE t3 (f1 integer, f2 integer, primary
key (f1));
SQL>INSERT INTO t1 (f1, f2) values (100, '100');
SQL>INSERT INTO t2 (f1, f2) values (102, 102);
SQL>INSERT INTO t3 (f1, f2) values (103, 103);
SQL>quit;
The above statements create tables namely t1, t2 and t3, which will be cached in CSQL
later.
Working with CSQL gateway
In another terminal run csql tool with -g option. This creates an isql session which acts as
gateway to csql and mysql.
$ csql -g
It will show the CSQL prompt as follows
CSQL>
To retrieve records from table t1, enter the following statement in CSQL prompt.
CSQL>select * from t1;
It will display
---------------------------------------------------------
f1 f2
---------------------------------------------------------
100 100
It displays the values inserted into MySQL from CSQL.
You can also perform any DML operations on these cached tables, for example
CSQL>insert into t1 values (200, '200');
This will insert one record in CSQL as well as in MySQL. You can verify this by
accessing MySQL through isql tool.
$isql myodbc3;
SQL>select * from t1;
+-----------+-----------------------------------------
------------------------------------------------------
------------------------------------------------------
------------------------------------------------+
| f1 |f2
|
+-----------+-----------------------------------------
------------------------------------------------------
------------------------------------------------------
------------------------------------------------+
| 100 | 100
|
| 200 | 200
|
+-----------+-----------------------------------------
------------------------------------------------------
------------------------------------------------------
------------------------------------------------+
It displays both the records from MySQL database.
You can also retrieve records in tables, which are not cached in CSQL and are present in
MySQL
$ csql -g
CSQL>select * from t3;
It will display
24083:3086153424:DatabaseManagerImpl.cxx:599:Table not
exists t3
24083:3086153424:SelStatement.cxx:245:Unable to open
the table:Table not exists
------------------------------------------------------
---
f1 f2
------------------------------------------------------
---
103 103
First it displays that the table is not present in CSQL, then it checks with MySQL whether
the table is present there and if present it retrieves the records from MySQL.
Sample Code
Programming with CSQL gateway
You can find a JDBC sample program gwexample.java under the
examples/jdbc directory. It will demonstrate how to use the CSQL gateway through
Java programs.
Note: You shall also use gateway through JDBC and SQLAPI interfaces. The ODBC
interface does not support gateway in this release.
From SQL API, use CSqlGateway to create connection and statement objects in
SqlFactory class like below:
AbsSqlConnection *con = SqlFactory::
createConnection(CSqlGateway);
rv = con->connect("root", "manager");
if (rv != OK) return 1;
AbsSqlStatement *stmt = SqlFactory::
createStatement(CSqlGateway);
stmt->setConnection(con);
Open source web site:
http://sourceforge.net/projects/csql