Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / ADO.NET

DataTable caching performance

5.00/5 (8 votes)
18 Apr 2012CPOL6 min read 37.3K   1K  
This project implements caching of DataTables and provides functionality for testing performance.

Performance results. Caching DataTables with more than 1000 records is not worth.

Introduction

In .NET Framework 4, there exists the class System.Runtime.Caching.MemoryCache. DataTables may be stored in that cache so that retrievals access these cached DataTables instead of the database. As will be shown, this makes sense for tables containing not more than about 1000 records. A SQL Server database is required.

Background

In many programs, the SqlDataAdapter.Fill method is used to retrieve data from a database and store in a DataSet. Tables that are accessed frequently, but are changed rarely, could be stored in memory for faster data access. The MemoryCache class provides the Add method with which any object may be added under some key. In this project, DataTables are added to the MemoryCache. When a record is to be retrieved from a table, it is checked if the table exists in the cache. If it exists, the DataTable is retrieved from the cache, otherwise it is retrieved from the database and stored in the cache. The record in the DataTable is found using DataView's RowFilter property.

The performance is tested with a number of test suites. The table sizes of the test suites range from 100 records up to 2000 records. Each test suite adds 100 random records to table "Items", and then makes successive retrievals of one record over one second without cache and then again with cache. The number of retrievals per second are counted and displayed on the grid.

The Solution

  1. Open Tests.Cache.vbproj with Visual Studio 2010, build and run it. The following form appears:
  2. Main form of Tests.Cache project

  3. In the textbox "Connection string", the connection string to the database must be defined. The database must be SQL server.
  4. Choose the menu "File/Recreate table". This will recreate the table "Items" on the database.
  5. Press the button "Prepare test suites". This will make the column headings for each of the test suites. The text box "Number randoms" defines the increment of the table size from one test suite to the next test suite. The text box "Max table size" defines the table size of the last test suite.
  6. By pressing the button "Run test suites", the test suites are executed. The number of retrievals per second are written to the grid.
  7. Open the provided Excel file CacheNotes.xlsx.
  8. Copy paste the results of the grid to the Excel file. More specifically, copy from the grid the lines having Items_Id = 1 and 2, the columns with heading "100:4" and onwards to lines 5 and 6 of the Excel file from column B onwards (Excel coordinates B5-U6).
  9. Look at the charts in the Excel file. Most probably, in the first chart, the light-blue line (with cache) is above the dark-blue line (without cache), whereas in the second chart, the opposite happens.

Troubleshooting

Saving and loading to the database

You may want to test if saving and loading to the database works.

  1. Recreate the table "Items" through the menu "File/Recreate table".
  2. Add 100 random records to the grid by choosing the menu "File/Add Random".
  3. Save the records to the database by pressing the button "Save to db".
  4. Clear the contents of the grid through the menu "File/Clear".
  5. Load the records from the database by pressing the button "Load from db".

Testing the cache

To test the cache, some data must exist on the database, for example the above steps have been completed successfully.

  1. By pressing the button "Load", data is loaded from the cache. If the data is not yet in the cache, it is loaded from the database.
  2. View the cache items by pressing the button "Cache keys".
  3. Delete data on the database by pressing the button "Truncate db".
  4. Click on button "Load", and verify that data is loaded from the cache, although the data does not exist on the database.
  5. Clear the cache by clicking on button "Clear cache".
  6. Click on button "Load", and verify that no data is loaded into the grid.

Performance testing

To test the performance, some data must exist on the database.

  1. To retrieve all the records from the database repeatedly over one second, press the button "Repeated Load from db". After one second, a messagebox will indicate the number of iterations done.
  2. To retrieve all the records from the cache repeatedly over one second, press the button "Repeated Load". After one second, a messagebox will indicate the number of iterations done.

Test suites

A test suite consists of a number of test cases. For example, a test suite tests the retrieval of a record once with cache and once without cache. But the code supports several test cases per test suite by combining three dimensions:

  1. Selection criteria: The method ModCache.GetTestCriteria() may return a number of different selection criteria, for example:
    1. Items_Id=2
    2. Name='abc'
  2. Data access classes: There may be several data access classes. These are defined in DBAccess.DataAccessClasses.
  3. Cache tests: with cache and without cache.

To prepare and run a test case:

  1. Choose the menu item "Options/Fill test suite" to prepare a test suite for all possible combinations of the above three dimensions.
  2. Press the button "Run test suite" to run the test suite. This will run repeatedly each test case for one second, and will write the number of iterations on the grid.

Code documentation

  1. Form1.vb: This is the main form with the grid, menus, and buttons.
    1. Binding: All actions except "Prepare test suites" and "Run test suites" manipulate the dataset ItemsData1, which is bound to the grid. The actions "Prepare test suites" and "Run test suites" unbind the grid from the dataset ItemsData1, and directly manipulate the grid.
    2. Parallelization: During performance testing, the repeated requests are run in parallel in as many threads as specified in the textbox "Nr threads". There are two different methods implementing parallel requests.
      • ParallelFor: uses .NET's Pararallel.For statement.
      • ParallelThreads: spawns the number of required threads.
  2. DBAccess.vb: Encapsulates all database functions of different data access classes. In this version, one data access class is provided, MySQLDataAccess. However, a bigger project may have other data access classes as well, for example a data access class that runs under COM+ and a data access class that does not run under COM+. Look in the comments for DataAccessComPlus and DataAccessNonComPlus to get an idea of how other data access classes may be integrated here.
  3. ItemsData.xsd: Is used as the underlying data structure of the user interface and for loading and saving data to the database. It has the same structure as the "Items" table that is created by the menu item "Recreate table", which in turn uses the SQL statement defined in the resource SqlCreateTableItems.
  4. ModCache.vb: Provides some functions related to the Cache. The most important one is FillDataSetFromDataTable, which searches a record in a DataTable by creating a DataView on the DataTable and setting the RowFilter and Sort properties of the DataView. This code is the reason that it is not worth caching tables with more than 1000 records, because it is slower for big tables to create a DataView and search in the DataView than search on the database.
  5. MySQLDataAccess.vb: A class that provides data access to a database. Other classes may be implemented and they can be added to DBAccess.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)