Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / desktop / Win32

Extensible Storage Engine

4.94/5 (58 votes)
1 Feb 2011CPOL47 min read 150.7K   1.7K  
A short overview of the Extensible Storage Engine technology.

Introduction

In this article, I will try to give a short overview of the Extensible Storage Engine technology. The technology was introduced three years ago, and it cannot be called a new one. It is used in the creation of such products as Active Directory and Exchange 2000, but the broad audience of developers do not know this. In a charmed circle, it is known as JET Blue.

Around the end of 2005, Microsoft decided to publish JET Blue, having given it a more attractive commercial name, Extensible Storage Engine. They fairly decided that the technology was tested enough and could be useful to many.

Here, as far as possible, I will describe the purpose of the technology, its weaknesses and strengths, and possible scope. However, before that, let's see some typical scenarios of how preservation of state happens.

Typical Scenarios - How an Application Preserves Its State

Many applications need to keep their state in a data warehouse when they are shut down. A developer selects a data warehouse (usually) according to the needs of an application (if there are no special marketing requirements).

For example, it might be enough for an application to keep just the user's settings. It might be necessary for a second application to keep the results of its work. For a third application, it might be important to keep some intermediate information. Depending on these, the programmer can choose different types of data warehouses - a file, the Windows registry, a Relational Database Management System (RDBMS), Active Directory, etc., and perhaps a combination of these.

Frequently, the amount of kept data is a basic criterion at the time of choosing the data warehouse. Besides that, the other important characteristics are speed of access and/or search, and the convenience of working with the chosen data warehouse.

Let us consider these criteria more in detail. Let us assume that I wish to create a new and improved Notepad. Notepad is a desktop application, and I should not have problems choosing a data warehouse - files will be kept in locations specified by the user. In addition, the user's settings will be stored in the Windows Registry.

Next, suppose I want to create a search system that will eclipse Google. Then, I need to think of where to keep the collected URLs. For their storage, as a back-end, I am sure I will choose a well proven (and it is possible, the most accessible) RDBMS. This decision can be named classical for a server application, and it will not require serious reflections on my part.

Now, let's imagine that I am planning to compete with popular mail clients like The Bat and Eudora, or may be even Outlook Express. Eventually, I will face the question - where the program stores emails. Whether to keep them separately: one letter per file? Alternatively, perhaps I should keep all of them in one flat file. Neither represent the optimum solution.

If we choose the scheme "one letter per file", a lot of time will be lost in opening files. As result, "full text search", together with sorting and grouping, will take a lot of time.

On the other hand, using the scheme "all in one file", I will save time in opening the file, but I lose it during a search for the beginning of each mail. Though the implementation of "full text search" becomes easier, as the beginning of each mail is anyway calculated, sorting and grouping, apparently, will demand non-trivial algorithms. In addition, the speed of search on key fields (from, to, subject) will leave much to be desired.

If my mail client will not carry out search, grouping, and sorting, even at the same speed as other mail clients, I will not have enough chances at making a competition. It looks like the brute-force approach (to use file(s) as warehouse) does not work in this situation. That leads me to the approach of using a small database. However, without a server and without a complex installation process. In other words, the relational database must be invisible to the user.

Before making any decisions, let us look at a small table. The most interesting characteristics of such data warehouses, as the Windows registry, a flat file, Active Directory, and RDBMS are reflected in the table:

  Registry File Active Directory RDBMS
The ability to save big volumes of information Not presented Presented Presented (1) Presented
Search velocity - Low High High
The data warehouse is an external service No No Yes Yes
The data warehouse should be installed first No No No (2) Yes
Permissions to write to the data warehouse Permissions to write to Windows Registry Permissions to write to file(s) Permission to write to Active Directory Permission to write to database table(s)
The data warehouse supports transactions No No No Yes
Usage complexity (3) Low Low Middle Middle
  1. The directory is capable of storing big volumes of information. However, Microsoft does not recommend using Active Directory for these purposes. As a replacement, Microsoft recommends using ADAM.
  2. Of course, provided that the application is being deployed in a domain.
  3. Usage complexity is a subjective characteristic. Nevertheless, it seems that many developers feel more confidently at work with the Windows Registry or file system, rather than with Active Directory or an RDBMS.

Here are some comments on the characteristics explained in the table:

  • The data warehouse is an external service - the given line describes dependence (or independence) of an application from the external service. The item can appear important if the viability of the application critically depends on the availability of the data warehouse.
  • The data warehouse should be installed first - this line shows the presence of potential complexities (or restrictions) which will arise at the application's deployment phase.
  • Permissions to write to the data warehouse - this line specifies possible expenses on the administration of the data warehouse which will arise during the operation of the product.
  • Complexity of use - From this line, it is visible how the cost of development increases depending on the data warehouse used.

For our "not so trivial" case, it is obvious from the table that the basic problem in using the Windows Registry and/or a flat file is low speed (or absence - in the case of the Windows registry) of search. When working with the Windows Registry, it is also impossible to store mass data.

Active Directory (if we neglect Microsoft's recommendations) can provide high speed of search, but demands a corresponding environment and permissions. The installation requirement of an RDBMS (or ADAM) brings no advantages for an RDBMS (or ADAM). Besides that, Active Directory, RDBMS, and ADAM can appear inaccessible during the working of an application. It is may be unacceptable for some applications. In addition, it is important to note that RDBMS is a unique data warehouse among the listed ones as it supports transactions.

It is impossible to tell that our "non trivial" case is too specific and idealistic. For example, Microsoft has many products that, by virtue of the architecture, use relational database features without using an RDBMS. Here are the most known of them:

  • DHCP Server
  • WINS Server
  • Exchange Server
  • And of course, Active Directory

For similar problems, Microsoft employs an engine that allows replacing relational databases. The engine refers to JET Blue. JET is an abbreviation of "Joint Engine Technology". Blue is the color of the T-shirt of the manager of this project (joke :)). Actually, I do not know the significance of the word Blue in the name of the library, but it allows distinguishing this engine from another one, which is used by Microsoft Office Access and carries the name JET Red. Except for the prefix JET, there is nothing in common between these two engines. The implementation of JET Blue differs radically from that of JET Red, and on all key parameters, surpasses it.

After this small introduction, let's get into the details of the subject.

Extensible Storage Engine

As has already been mentioned, not so long ago, Microsoft published the JET Blue interface under the name Extensible Storage Engine (ESE). The earliest version of the SDK where ESE heading and lib-files were included was the Windows Server 2003 SP1 Platform SDK. Functions for working with given interface were implemented in a single binary file (esent.dll). The interface is accessible on Windows 2000 and above.

Let us look at what the manufacturer writes about ESE. The introduction (with small edits) from MSDN is quoted below:

The Extensible Storage Engine (ESE) is an advanced indexed and sequential access method (ISAM) storage technology. ESE enables applications to store and retrieve data from tables using indexed or sequential cursor navigation. It supports denormalized schemas including wide tables with numerous sparse columns, multi-valued columns, and sparse and rich indexes. It enables applications to enjoy a consistent data state using transacted data update and retrieval. A crash recovery mechanism is provided so that data consistency is maintained even in the event of a system crash. It provides ACID (Atomic Consistent Isolated Durable) transactions over data and schema by way of a write-ahead log and a snapshot isolation model. Transactions in ESE are highly concurrent, making ESE useful for server applications. It caches data to maximize high performance access to data. [...]

ESE is for use in applications that require fast and/or light structured data storage, where raw file access or the Registry does not support the application's indexing or data size requirements.

It is used by applications that never store more than 1 megabyte of data, and has been used in applications with databases in extreme cases in excess of 1 terabyte, and commonly over 50 gigabytes.

This documentation is intended for developers who are familiar with C and C++, and basic database concepts such as tables, columns, indexes, recovery, and transactions. [...]

The Extensible Storage Engine is a Windows component that was introduced in Windows 2000. Not all features or APIs are available in all versions of the Windows Operating System.

ESE provides a user-mode storage engine that manages data inside flat, binary files that are accessible through Windows APIs. ESE is accessed through a DLL that is loaded directly in the application's process; no remote access methods are required or provided by the database engine itself. Though ESE has no remote or inter-process access method, the data files it uses can be provided remotely by using Server Message Block (SMB) through Windows APIs, but this is not recommended.

Note: Windows XP 64-Bit Edition is the same as Windows Server 2003 for the purpose of determining the ESE feature set that is supported.

There are many specific technical terms here. However, in general, it sounds not bad. We will try to divide this information into categories. For this purpose, we will expand our comparative table, adding ESE in it.

  Registry File ESE Active Directory RDBMS
The ability to save big volumes of information Not presented Presented Presented Presented (1) Presented
Search velocity - Low High High High
The data warehouse is an external service No No No Yes Yes
The data warehouse should be installed first No No No No (2) Yes
Permissions to write to the data warehouse Permissions to write to the Windows Registry Permissions to write to file(s) Permissions to write to file(s) Permission to write to Active Directory Permission to write to database table(s)
The data warehouse supports transactions No No Yes No Yes
Usage complexity (3) Low Low High Middle Middle
  1. See the first comment to the previous table.
  2. See the second comment to the previous table.
  3. See the third comment to the previous table.

JET Blue is an ideal variant from the point of view of the email client. I can store greater volumes of data, can be independent of external services, can carry out fast search and sorting, and can support transactions. What else is necessary for a mail client than a good back-end?

Thus, with the use of the library, it is possible to get the convenience of working with an RDBMS. Nevertheless, it is necessary to note two things which can be important for developers:

  1. It is important not to be under a delusion - ESE is not an RDBMS. ESE doesn't have such habitual RDBMS services as external keys, triggers, kept procedures, user types, differentiation of access rights, and many other things.
  2. Using of ESE increases the cost of development. The usual expenses connected with studying a new technology will be more because the existing documentation in MSDN is modest. Examples of using the ESE API in MSDN, unfortunately, are absent.

General Impression

In the previous section, it was already mentioned that all ESE functionality are implemented in user-mode, in a single binary file (esent.dll). The DLL is directly loaded into the address space of the client application. This lightness together with the powerful opportunities of the engine does honor to its designers.

Prototypes of all the functions are contained in the esent.h file. In the same place, there is a list of all the error messages. If you are going to work on Windows XP, then the function declarations might cause some surprise, but will not create any problems. The problem is that functions working with strings will accept chars (narrow symbols) only for Windows XP. No wchar_t functions are provided.

This does not create major problems because the only "char"-functions are those which operate on names and the placement of binary files, and functions which operate on the naming of tables and columns in tables. An application should transform paths to files from Unicode to ANSI using WideCharToMultibyte (where the CodePage parameter should be set to the current code page - CP_ACP). In addition, the application should use char symbols when naming tables and columns.

During data storage and data reading, the void* array, together with the array size, is used (in particular, during storage and reading of Unicode strings), i.e., storage and reading of Unicode strings does not cause problems.

It is interesting to look at how the engine provides indexing of Unicode strings. To support indexes in columns containing Unicode strings, ESE performs normalization of Unicode strings in these columns. You should not be worried about the "normalization" term. "Normalization" of Unicode strings is an operation of calculating a key that corresponds to a given line. The key, in turn, is also a string. However, unlike the original string, it is structured, and has (usually) length less than the length of the original string. Later, the key generated may be used for ordering and search purposes instead of the original string.

Normalization is done using the LCMapString function, which calculates a key (the exact term - "a key of sorting") using the original string and the locale identifier as parameters. The algorithm leads to an interesting by-effect that forces developers to pay attention when they create indexes for columns with Unicode strings. I will describe this effect in detail below.

The transaction mechanism is another interesting feature. Transactions are implemented using a snapshot technique and differ from the classical isolation ANSI SQL models. In short, a principle of the snapshot technique is the following: at the beginning of a transaction, the database state is frozen. No changes made in one transaction are visible in another. The snapshot technique and its difference from the classical isolation ANSI SQL models are described in the clause "A Critique of ANSI SQL Isolation Levels" (H. Berenson, P. Bernstein, J. Gray, J. Melton, E. O'Neil, and P. O'Neil). We will later return to the discussion of transactions.

It is necessary to add here that the final size of the database can reach up to 16 terabytes. From a practical point of view, it means that the size of a database is limited only by the capacity of the hard disk and by various services such as reserve copying, restoration, defragmentation, etc.

Start of Work

Let us start from something simple - database and tables creation, for example. However, before that, we should initialize the ESE subsystem.

Initialization

Before creating the database (to be more precise, before we start working with ESE), we should initialize ESE. The initialization is performed in two steps:

  1. In the first step, we create an instance of ESE (using a family of functions JetCreateInstance/JetCreateInstance2).
  2. In the second step, the created instance should be initialized (using the JetInit/JetInit2/JetInit3 family).

Let us consider these steps in detail. To create the instance of ESE, the JetCreateInstance or JetCreateInstance2 functions are used.

C++
JET_ERR JET_API JetCreateInstance(
    JET_INSTANCE *pinstance,
    const char *szInstanceName
);

JET_ERR JET_API JetCreateInstance2(
    JET_INSTANCE *pinstance,
    const char *szInstanceName,
    const char *szDisplayName,
    JET_GRBIT grbit
);

This function accepts a pointer to a JET_INSTANCE variable, creates the instance (as we can see from the function name), and returns a pointer to it. We will omit the other parameters. They are very carefully described in MSDN, and give us no essential information from the point of view of the ESE technology. We also will ignore the differences between these two functions (the difference is not important and is described in MSDN). Further, we will do the same. I should point out that the library contains many functions that end with 2 or 3. It seems these functions were created during the library development process. Functions with numbers in their names extend the abilities of the base functions without numbers or the functions with numbers that are lower. Further, for simplicity, we will also consider functions which have shorter signatures.

After the JetCreateInstance call has been performed, the JetInit call transfers the engine to the ready state. This function also accepts a pointer to the JET_INSTANCE variable.

C++
JET_ERR JET_API JetInit(
    JET_INSTANCE *pinstance
);

The pinstance variable should contain the value returned by JetCreateInstance (JetCreateInstance2) or 0 (we will discuss this value later).

If we combine both calls, the initialization code will look like this:

C++
try
{
    JET_INSTANCE instance = JET_instanceNil;
    JET_ERR err = ::JetCreateInstance(
        &instance,  // return value
        "{0A9A6617-8AE9-4c5e-AF28-01D5D4820C23}"
        // unique name of ESE instance
    );
    if(JET_errSuccess != err)
    {
        throw CError(err);
    }

    err = ::JetInit(
        &instance // created ESE instance
    );
    if(JET_errSuccess != err)
    {
        throw CError(err);
    }
}
catch(const CError& e)
{
    ::JetTerm(0);
}

To reduce the amount of code in examples, I will use the exception thrown by the CError class (my own, not described in this article). When the exception is caught, we should (of course) close all open handles (using JetCloseTable, JetEndSession, JetTerm). Also, I will skip calls of these functions in examples.

The previous code example will work fine on Windows XP and Windows 2003 Operating Systems. However, it will not work on Windows 2000. We will get a run time error. The point is that the esent.dll for this Operating System does not contain the JetCreateInstance and JetCreateInstance2 functions. This means we can use only one ("zero") instance of the ESE engine on the Windows 2000 Operating System. That is why we should pass zero to the JetInit function (JetInit2 and JetInit3 are absent on Windows 2000 too) as the value for the pinstance parameter.

Therefore, for Windows 2000, the initialization of the ESE subsystem will look like this:

C++
JET_ERR err = ::JetInit(
    0 // zero is default ESE instance
);
if(JET_errSuccess != err)
{
    throw CError(err);
}

It looks much easier, doesn't it? :)

It is necessary to note that the example above works fine on both Windows 2000 and Windows XP (Windows 2003). By passing zero to the JetInit function as a pointer to an instance, we inform ESE that we are going to operate in legacy mode (Windows 2000 compatible mode). In this mode, ESE supports only one instance of the engine per process.

The next step, once ESE has been initialized, is the creation of the session.

Creating the Session

The session is the context in which all operations with databases are performed. I can draw an analogy to an RDBMS - a session in ESE corresponds to a connection for an RDBMS.

The code for creating a session is straightforward - the session is created using the JetBeginSession call.

C++
JET_ERR JET_API JetBeginSession(
    JET_INSTANCE instance,
    JET_SESID *psesid,
    const char *szUserName,
    const char *szPassword
);

The function accepts a JET_INSTANCE variable which has been created and initialized before (or zero in the case of Windows 2000). The session identifier is returned as a result.

C++
JET_SESID sessionID = JET_sesidNil;
err = ::JetBeginSession(
    0,            // zero as ESE instance
    &sessionID,   // return value
    0,            // reserved
    0             // reserved
);
if(JET_errSuccess != err)
{
    throw CError(err);
}

As long as every session is a context in which all operations with databases are performed, every session controls transaction borders. A transaction may start and finish only within the limits of a session. An application may create more than one session to a database, possibly increasing its performance thereby.

The work related to ESE initialization and creation of the session is now done.

Creating the Database

Use JetCreateDatabase to (surprise :)) create a database. The function accepts the session identifier and the file name of the database (the name can be a full name or a relative one). The function returns the database identifier.

C++
JET_DBID dbID = JET_dbidNil;
err = ::JetCreateDatabase(
    sessionID,    // the session identifier
    "test.db",    // file name
    0,            // reserved
    &dbID,        // return value
    0             // zero flag - just create the database
);
if(JET_errSuccess != err)
{
    throw CError(err);
}

Every ESE instance inside a process (I am sure you remember that Windows 2000 can hold only one ESE instance per process) may create and use up to seven databases.

Creating Tables

The function that creates tables has the following declaration:

C++
JET_ERR JET_API JetCreateTableColumnIndex(
    JET_SESID sesid,
    JET_DBID dbid,
    JET_TABLECREATE *ptablecreate
);

All necessary information is passed using the JET_TABLECREATE structure. The structure declaration (from header file) is given below:

C++
typedef struct tagJET_TABLECREATE
{
    unsigned long cbStruct;              // size of this structure (for future expansion)
    char *szTableName;                   // name of table to create.
    char *szTemplateTableName;           // name of table from which to inherit base DDL
    unsigned long ulPages;               // initial pages to allocate for table.
    unsigned long ulDensity;             // table density.
    JET_COLUMNCREATE *rgcolumncreate;    // array of column creation info
    unsigned long cColumns;              // number of columns to create
    JET_INDEXCREATE *rgindexcreate;      // array of index creation info
    unsigned long cIndexes;              // number of indexes to create
    JET_GRBIT grbit;
    JET_TABLEID tableid;                 // returned tableid.
    unsigned long cCreated;              // count of objects created (columns+table+indexes).
} JET_TABLECREATE;

We will not examine every field of this structure. We omit the cbStruct, szTableName, rgcolumncreate, cColumns, rgindexcreate, cIndexes fields because their purposes are obvious from comments. In addition, we skip szTemplateTableName, ulDensity, and grbit because they are described in detail in MSDN (they can be easily set to null). The ulPages field will be discussed later.

There are two result values. The first one is a cursor identifier of the table (tableid) that has been created. Do not let the table ID name and its type (JET_TABLEID) confuse you. In fact, it is not a table identifier as we can judge by the name and the type, but a cursor identifier. It can be obtained after a table has been created or has been opened. MSDN uses the "cursor" term when refering to this variable.

The second return value is the amount of created objects (cCreated) in the database. The quantity of created columns, table, and indices are included in this number. If an error occurs, the value of the cCreated field is undefined. (I was not able to find a case when this field could be used, so it seems useless.)

Information related to the table's columns is passed using the JET_COLUMNCREATE structure. I list here the structure from the header file:

C++
typedef struct tag_JET_COLUMNCREATE
{
    unsigned long cbStruct;     // size of this structure (for future expansion)
    char *szColumnName;         // column name
    JET_COLTYP coltyp;          // column type
    unsigned long cbMax;        // the maximum length of this column 
                                // (only relevant for binary and text columns)
    JET_GRBIT grbit;            // column options
    void *pvDefault;            // default value (NULL if none)
    unsigned long cbDefault;    // length of default value
    unsigned long cp;           // code page (for text columns only)
    JET_COLUMNID columnid;      // returned column id
    JET_ERR err;                // returned error code
} JET_COLUMNCREATE;

The structure contains no surprises. The most interesting fields are listed here:

  • coltyp - The type of the column. ESE supports all data types required for comfortable development, i.e., integer, floating-point, string, binary data (including two data types for mass data storing, up to 2 147 483 647 bytes - JET_coltypLongBinary and JET_coltypLongText), date, and currency. The entire list can be found in the description of the JET_COLTYP type.
  • cbMax - The maximum length (for string or binary data types).
  • grbit - The flag defines the characteristics of the columns. The number of values for this flag exceeds ten. There are some necessary (and self-described) values like JET_bitColumnNotNULL, JET_bitColumnAutoincrement, and JET_bitColumnMultiValued. However, there are specific ones like JET_bitColumnUserDefinedDefault (this flag means that the value for the column is provided using a callback function implemented by the developer). I have skipped the explanation for all these flag values (they are described in detail in MSDN) except one; I am going to return to this value later.
  • pvDefault - The default value for the column.
  • cp - The code page number that will be used for the column (to be more precise - string column). Only two values are supported - English (1252) and Unicode (1200). Moreover, the English (1252) value is the default value. So, if you pass zero, the 1252 code page is used. (For non-string types, this value, of course, is ignored.)

I would like to add a few words about the grbit flag - about the JET_bitColumnTagged value of this flag. This value is used to mark a column which keeps its data separated from the table. Instead of data, the column contains references only. The data is stored in another table. I will explain what it means soon.

ESE uses the "page size" value when performing operations with a database file. The page size (expressed in bytes) defines the following aspects:

  • First, the database file can only be expanded on "page size" bytes.
  • Secondly, and more importantly, the "page size" sets the maximum size which a row can have in a table.

By default, "page size" is 4096 bytes. This value can be changed using the JetSetSystemParameter function by passing "system" parameter JET_paramDatabasePageSize. The acceptable values are 2048, 4096, and 8192. ESE has many "system" parameters, which rule different aspects of the engine's behavior, for example:

  • Database's location
  • Naming of files
  • Callback functions registration
  • Logging, etc.

All parameters are set using the JetSetSystemParameter function (mentioned above). The function accepts an identifier as parameter name and the parameter's new value - number, pointer, or string (depending on the parameter type).

Thus, to keep data which exceeds "page size", the JET_bitColumnTagged flag is used. Usually, the flag is applied for the JET_coltypLongBinary and JET_coltypLongText data types. Sometimes, it is mandatory to use this flag. For example, for multi-valued columns (JET_bitColumnMultiValued flag), the JET_bitColumnTagged flag should also be passed.

Let us now consider the return values. The JET_COLUMNCREATE structure has two return values like the JET_TABLECREATE structure. However, in this case, both return values are useful. The first return value is an identifier of the just created column (the identifier will be needed for updating the table and selecting from it). The second return value is an error code describing the reason why the column was not created (if it has not been created).

The last structure we need to create a table is JET_INDEXCREATE. The structure describes the primary key and the indices of the table.

C++
typedef struct tagJET_INDEXCREATE
{
    unsigned long cbStruct;     // size of this structure (for future expansion)
    char *szIndexName;          // index name
    char *szKey;                // index key
    unsigned long cbKey;        // length of key
    JET_GRBIT grbit;            // index options
    unsigned long ulDensity;    // index density

union
{
    // lcid for the index (if JET_bitIndexUnicode NOT specified)
    unsigned long lcid;
    // pointer to JET_UNICODEINDEX struct
    // (if JET_bitIndexUnicode specified)
    JET_UNICODEINDEX *pidxunicode;
};

union
{
    unsigned long cbVarSegMac;
    // maximum length of variable length columns
    // in index key (if JET_bitIndexTupleLimits specified)

#ifdef JET_VERSION_SERVER2003
    // pointer to JET_TUPLELIMITS struct
    // (if JET_bitIndexTupleLimits specified)
    JET_TUPLELIMITS *ptuplelimits;
#endif // ! JET_VERSION_SERVER2003
};

    // pointer to conditional column structure
    JET_CONDITIONALCOLUMN *rgconditionalcolumn;
    // number of conditional columns
    unsigned long cConditionalColumn;
    // returned error code
    JET_ERR err;
} JET_INDEXCREATE;

The structure is more interesting then its predecessors. These are the expected attributes:

  • szIndexName - index name
  • grbit - flag (the flag allows to define such properties as if the index is primary or not, if the index is unique, if a NULL value can be used during indexing etc.)
  • ulDensity - index density

The structure contains some interesting fields:

  • szKey - the key that specifies the indexing criteria: the list of fields used for the index and sorting order of these fields. The criteria are defined quite easily and elegantly - the key is a string. The string is terminated by a double null and is composed, in turn, of strings (each of them also null terminated). The sub-string format is: <sorting><column>. Thus, the key "+FirstColumn\0-SecondColum\0" creates an index which used ascending sort order for column "FirstColumn" and descending for "SecondColumn".
  • cbKey - the key's length.
  • lcid - the locale identifier that will be used during Unicode normalization. The documentation does not talk much, but it seems this identifier is passed to the LCMapString function. ESE uses this value only if the flag JET_bitIndexUnicode was not set.
  • pidxunicode - pointer to the JET_UNICODEINDEX structure (if the JET_bitIndexUnicode flag is set). The structure allows better controlling of the LCMapString behavior. It allows passing dwMapFlags, in addition to the lcid parameter, to LCMapString.
  • cbVarSegMac - the number of bytes of column length for every column in the index which will be used to build the index. 255 bytes are used for every column of index if a zero value has been specified (JET_cbPrimaryKeyMost and JET_cbSecondaryKeyMost constants). cbVarSegMac is used only if JET_bitIndexTupleLimits is not specified in grbit. The structure definition and comments have been taken from the esent.h file from the Microsoft Platform SDK for Windows Server 2003 R2. Pay attention to the comment to cbVarSegMac, it contains an error. Instead of the "maximum length of variable length columns in index key (if JET_bitIndexTupleLimits is specified)", the phrase "maximum length of variable length columns in index key (if no JET_bitIndexTupleLimits is specified)" should be written.
  • ptuplelimits - pointer to the JET_TUPLELIMITS structure. The structure allows to specify a parameter for tuple-index building. The tuple-index makes it possible for searching using substrings in a string (if no tuple-index is used, only the beginning of the string is subjected to search). The goal is archived at the cost of indexing all possible substrings of a string. A few restrictions are related with the tuple-index (they are quite logical because of tuple-index specificity). The tuple-index cannot be a primary key, the tuple-index cannot be a unique index, and finally, the tuple-index may contain only one column (the column, of course, a text column). The structure JET_TUPLELIMITS is simple, and it contains the minimum and maximum length of substrings to index, and the maximum length of the target string to index. ESE uses this structure if JET_bitIndexTupleLimits is specified.
  • rgconditionalcolumn - pointer to the JET_CONDITIONALCOLUMN structure. The structure allows creating a "conditional" index. The conditional index contains only those entries (rows) in the table which satisfy the conditional index criteria. For example, all strings which are not NULL. For the time being, only two conditions are supported (their names speak for themselves): JET_bitIndexColumnMustBeNull and JET_bitIndexColumnMustBeNonNull.

Every table should have a primary key. However, the primary key may be omitted during table creation. If the primary key is missed, ESE will create it unnoticeably for the developer. The JET_INDEXCREATE structure contains an err field (like in the JET_COLUMNCREATE structure) for returning the error code (if an error occurs).

To summarize, let us look at code which creates a "TestTable" with two columns ("PK" and "Value") and two keys (primary key - "PK_index", and ordinary index "Value_index").

C++
JET_COLUMNCREATE columnCreate[2] = { 0 };

columnCreate[0].cbStruct = sizeof(JET_COLUMNCREATE);
columnCreate[0].szColumnName = "PK";
columnCreate[0].coltyp = JET_coltypLong;
columnCreate[0].grbit = JET_bitColumnAutoincrement;
columnCreate[0].err = JET_errSuccess;

columnCreate[1].cbStruct = sizeof(JET_COLUMNCREATE);
columnCreate[1].szColumnName = "Value";
columnCreate[1].coltyp = JET_coltypLongText;
columnCreate[1].cbMax = 1024;
columnCreate[1].grbit = JET_bitColumnTagged;
columnCreate[1].cp = 1200;
columnCreate[1].err = JET_errSuccess;

JET_INDEXCREATE indexCreate[2] = { 0 };
indexCreate[0].cbStruct = sizeof(JET_INDEXCREATE);
indexCreate[0].szIndexName = "PK_index";
indexCreate[0].szKey = "+PK\0";
indexCreate[0].cbKey =
static_cast< unsigned long >(::strlen(indexCreate[0].szKey) + 2);
indexCreate[0].grbit = JET_bitIndexPrimary;
indexCreate[0].err = JET_errSuccess;

indexCreate[1].cbStruct = sizeof(JET_INDEXCREATE);
indexCreate[1].szIndexName = "Value_index";
indexCreate[1].szKey = "+Value\0";
indexCreate[1].cbKey =
static_cast< unsigned long >(::strlen(indexCreate[1].szKey) + 2);
indexCreate[1].grbit = JET_bitIndexUnique;
indexCreate[1].err = JET_errSuccess;

JET_TABLECREATE tableCreate = { 0 };
tableCreate.cbStruct = sizeof(tableCreate);
tableCreate.szTableName = "TestTable";
tableCreate.rgcolumncreate = columnCreate;
tableCreate.cColumns =
sizeof(columnCreate) / sizeof(columnCreate[0]);
tableCreate.rgindexcreate = indexCreate;
tableCreate.cIndexes =
sizeof(indexCreate) / sizeof(indexCreate[0]);
tableCreate.tableid = JET_tableidNil;

err = ::JetCreateTableColumnIndex(sessionID, dbID, &tableCreate);
if(JET_errSuccess != err)
{
    throw CError(err);
}

Base Operations

Adding, Modifying, and Deleting

Adding, like modifying existing entries (rows) in a table, is performed using the JetPrepareUpdate/JetSetColumns/JetUpdate function triplet. Before describing JetSetColumns, we will take a look at the functions which start and stop data modification: JetPrepareUpdate and JetUpdate.

Using the JetPrepareUpdate function, the user specifies the operation's type:

C++
JET_ERR JET_API JetPrepareUpdate( 
    JET_SESID sesid, 
    JET_TABLEID tableid, 
    unsigned long prep 
);

prep defines the operation's type.

Three base operations can be executed:

  • JET_prepCancel - cancels any previous operation for the current cursor
  • JET_prepInsert - is used for inserting a new entry (row)
  • JET_prepReplace - is used for editing an entry (row) at which the current cursor points

There are three other operations which are variations of the base ones:

  • JET_prepInsertCopy - allows to insert a copy of an entry (row) at which the current cursor points
  • JET_prepInsertCopyDeleteOriginal - allows to insert a copy of an entry (row) at which the current cursor points and delete the original entry (row). The flag is used to modify the primary key.
  • JET_prepReplaceNoLock - the same as JET_prepReplace, but do not lock the current entry (row).

The data gets to the table after the JetUpdate function has been called.

C++
JET_ERR JET_API JetUpdate( 
    JET_SESID sesid, 
    JET_TABLEID tableid, 
    void *pvBookmark, 
    unsigned long cbBookmark, 
    unsigned long *pcbActual 
);

The function returns a bookmark. The bookmark is a normalized form of a primary key, and can later be used to position the created entry (row) using the JetGotoBookmark function. It is possible to save the bookmark in another table to organize references between tables (ESE does not support foreign keys).

OK, it is time to analyze the function which gets the data ready to insert.

C++
JET_ERR JET_API JetSetColumns( 
    JET_SESID sesid, 
    JET_TABLEID tableid, 
    JET_SETCOLUMN *psetcolumn, 
    unsigned long csetcolumn 
);

All necessary information is passed using the JET_SETCOLUMN structure.

C++
typedef struct 
{ 
    JET_COLUMNID columnid; 
    const void *pvData; 
    unsigned long cbData; 
    JET_GRBIT grbit; 
    unsigned long ibLongValue; 
    unsigned long itagSequence; 
    JET_ERR err; 
} JET_SETCOLUMN;

Along with fields whose purpose is clear, namely:

  • columned - column identifier (obtained during table creation; can be obtained using the JetGetColumnInfo function)
  • pvData - pointer to buffer
  • cbData - size of buffer
  • grbit - flag (basically, the flag applies for inserting and editing "long" data types and multi-valued columns)
  • err - error code (if an error has occurred)

There are two fields that should be explained. The first one, ibLongValue, allows reading "long" data types (JET_coltypLongBinary and JET_coltypLongText) piece by piece. This field contains the offset of the first byte in the column to be retrieved. It should be zero for other types. The second field, itagSequence, is intended for keeping the sequence number of edited values of multi-valued columns. The field should be zero if inserting into a multi-valued column has occurred. If the column is a single-valued column, the field should be zero, too.

Let us put all this together and write an example for inserting an entry (row) in our previously created table.

C++
err = ::JetPrepareUpdate(sessionID, tableCreate.tableid, JET_prepInsert); 
if(JET_errSuccess != err) 
{ 
    throw CError(err); 
} 

JET_SETCOLUMN setColumn = { 0 }; 
wchar_t szFirstRow[] = L"FirstInsertedRow"; 
setColumn.columnid = columnCreate[1].columnid; 
setColumn.pvData = szFirstRow; 
setColumn.cbData = sizeof(szFirstRow); 
setColumn.err = JET_errSuccess; 

err = ::JetSetColumns(sessionID, tableCreate.tableid, &setColumn, 1); 
if(JET_errSuccess != err) 
{ 
    throw CError(err); 
} 

err = ::JetUpdate(sessionID, tableCreate.tableid, 0, 0, 0); 
if(JET_errSuccess != err) 
{ 
    throw CError(err); 
}

To delete data, the JetDelete function is used. The function definition is very simple.

C++
JET_ERR JET_API JetDelete( 
    JET_SESID sesid, 
    JET_TABLEID tableid 
);

The function deletes the entry (row) at which the cursor points.

It is very interesting that the example above will fail. If we merge the table creation example and the entry insertion example, compile the result, and run it, the function JetSetColumns returns the JET_errNotInTransaction error code. It means - "The operation must occur within a transaction".

Let me explain why ESE demands wrapping the JetSetColumns call in a transaction. The reason is modifying the "long" data type (you remember, the "long" value is kept separately from the table). Therefore, (because of an error) without a transaction, it is possible the "long" value will not match the table state. To avoid such situations, ESE requires using transactions when "long" data types are modified. If JET_coltypText is the type of the first column (but not JET_coltypLongText, as in the example), we would avoid using transactions.

Since the conversation has turned to transactions, let us consider them in detail.

Transactions

The beginning, the end, and the transaction cancellation are expressed by the JetBeginTransaction, JetCommitTransaction, and JetRollback functions, respectively. The functions are quite easy to understand (just look at them):

C++
JET_ERR JET_API JetBeginTransaction( 
    JET_SESID sesid 
); 

JET_ERR JET_API JetCommitTransaction( 
    JET_SESID sesid, 
    JET_GRBIT grbit 
); 

JET_ERR JET_API JetRollback( 
    JET_SESID sesid, 
    JET_GRBIT grbit 
);

The functions are simple. The only interesting parameter, grbit, is described in MSDN (usually, it is zero). Nothing to add here. Therefore, we will discuss how transactions are implemented in ESE.

As I have already mentioned in the beginning of the article, the transaction mechanism is implemented using a snapshot technique. It is possible to represent a snapshot as a "freezing" of database content - when a session first enters a transaction, the database freezes. "Unfreezing" is performed when the transaction is completed (JetCommitTransaction) or canceled (JetRollback). If the transaction has been completed, all changes made during the transaction are inserted into the database.

Due to this mechanism, transaction does not use a read lock, because the session can read inside the transaction only those values which have been actually at the start of the transaction. Nevertheless, the write lock is performed all right. Suppose transaction A modifies some data. ESE takes into account this fact and does not allow transaction B to modify the same data. If transaction B tries to modify this data, it gets a JET_errWriteConflict error. If this error occurs, transaction B should be finished (completed or canceled). Then it is possible to retry data modification.

ESE holds snapshots as a set of different versions of pages modified by a transaction. We can consider this mechanism as a "copy-on-write" protection. Transaction A operates with the original pages in the database until it starts modifying data. When transaction A begins modifying data on a page, ESE replaces the page by a copy and marks the page as blocked for writing. If transaction B decides to modify the same page, it (transaction B) gets a JET_errWriteConflict error. If transaction A has been committed, ESE copies the modified pages into the database. Otherwise (if the transaction has been cancelled), ESE drops the modified pages. It is quite simple and logical. The pages amount used for the transaction can be managed by the JET_paramMaxVerPages "system" parameter.

Transactions can be nested. This fact adds nothing new: for every nested invocation of JetBeginTransaction, there should exist a nested call of JetCommitTransaction or JetRollback. A transaction is completed when the most outer call of JetBeginTransaction or JetRollback occurs. With one exception - it is possible to call JetRollback with the JET_bitRollbackAll flag. It means all inner transactions will be terminated.

I would like to add a few words concerning the use of transactions.

The first concerns thread dependence. By default, after a transaction has been started, all function calls that use the session identifier should be performed in the same thread where the JetBeginTransaction call has occurred. It seems this restriction is not essential. In addition, the restriction can be avoided using the JetSetSessionContext and JetResetSessionContext functions.

The second feature is more interesting. It appears that code with transaction may be executed faster than code without transaction. The point is that ESE wraps in the transaction any function reading or modifying data if no currently active transaction is presented in the session. Of course, this can affect application performance. Therefore, MSDN recommends using transactions when an application executes the mentioned operations.

So, back to our example. The correct version (this version does not return JET_errNotInTransaction) looks like this:

C++
err = ::JetBeginTransaction(sessionID); 
if(JET_errSuccess != err) 
{ 
    throw CError(err); 
} 

err = ::JetPrepareUpdate(sessionID, tableCreate.tableid, JET_prepInsert); 
if(JET_errSuccess != err) 
{ 
    ::JetRollback(sessionID, 0); 
    throw CError(err); 
} 

JET_SETCOLUMN setColumn = { 0 }; 
wchar_t szFirstRow[] = L"FirstInsertedRow"; 
setColumn.columnid = columnCreate[1].columnid; 
setColumn.pvData = szFirstRow; 
setColumn.cbData = sizeof(szFirstRow); 
setColumn.err = JET_errSuccess; 

err = ::JetSetColumns(sessionID, tableCreate.tableid, &setColumn, 1); 
if(JET_errSuccess != err) 
{ 
    ::JetRollback(sessionID, 0); 
    throw CError(err); 
} 

err = ::JetUpdate(sessionID, tableCreate.tableid, 0, 0, 0); 
if(JET_errSuccess != err) 
{ 
    ::JetRollback(sessionID, 0); 
    throw CError(err); 
} 

err = ::JetCommitTransaction(sessionID, 0); 
if(JET_errSuccess != err) 
{ 
    ::JetRollback(sessionID, 0); 
    throw CError(err); 
}

Similarly, I will insert three different values into the table - "SecondInsertedRow", "ThirdInsertedRow", and "FourthInsertedRow". We are going to use them in the next section.

Data Reading

Let us look at how to read data from a table. We will start from the simplest case - sequential reading of all entries (rows) from a table. To read data, we should associate the current cursor with a table index. When we do this, we are able to navigate through the table using the current cursor. The Association is performed using the JetSetCurrentIndex function:

C++
JET_ERR JET_API JetSetCurrentIndex( 
    JET_SESID sesid, 
    JET_TABLEID tableid, 
    const char *szIndexName 
);

where szIndexName should be equal to "PK_index" or "Value_index" - one of the indexes we have created before.

When the link between the index and the cursor is established, it is possible to navigate using the JetMove function:

C++
JET_ERR JET_API JetMove(
    JET_SESID sesid,
    JET_TABLEID tableid,
    long cRow,
    JET_GRBIT grbit
);

And read data using the JetRetrieveColumn function:

C++
JET_ERR JET_API JetRetrieveColumn(
    JET_SESID sesid,
    JET_TABLEID tableid,
    JET_COLUMNID columnid,
    void *pvData,
    unsigned long cbData,
    unsigned long *pcbActual,
    JET_GRBIT grbit,
    JET_RETINFO *pretinfo
);

Both functions (for moving and reading) are simple ones. I will not dwell on them. I have to point out that JetMove can navigate the cursor to both relative (for example, next entry) and absolute (first or last) positions. The JetRetrieveColumn function rules the nuances of reading with the help of the grbit flag (the nuances are too specific to discuss in detail). In addition, grbit and pretinfo are used for reading multi-valued and "long" data types.

Let us look at the result we get when we put it all together. Below is an example of a sequence reading from our table, using "Value_index":

C++
err = ::JetSetCurrentIndex(sessionID, tableCreate.tableid, "Value_index");
if(JET_errSuccess != err)
{
    throw CError(err);
}

for(err = ::JetMove(sessionID, tableCreate.tableid, JET_MoveFirst, 0);
    JET_errSuccess == err;
    err = ::JetMove(sessionID, tableCreate.tableid, JET_MoveNext, 0))
{
    unsigned long nPK = 0;
    unsigned long nReadBytes = 0;
    err = ::JetRetrieveColumn(sessionID,
        tableCreate.tableid,
        columnCreate[0].columnid,
        &nPK,
        sizeof(nPK),
        &nReadBytes,
        0,
        0);
    assert(nReadBytes == sizeof(nPK));

    std::wcout << nPK;

    if(JET_errSuccess != err)
        break;

    wchar_t buffer[1024] = { 0 };
    err = ::JetRetrieveColumn(sessionID,
        tableCreate.tableid,
        columnCreate[1].columnid,
        buffer,
        sizeof(buffer),
        &nReadBytes,
        0,
        0);
    assert(nReadBytes < sizeof(buffer));

    std::wcout << L'\t' << buffer << std::endl;
}

if(JET_errNoCurrentRecord != err)
{
    throw CError(err);
}

If we analyze the console output when the program is compiled and executed, we may ensure the values have been read according to the increase in the "Value_index".

1 FirstInsertedRow
4 FourthInsertedRow
2 SecondInsertedRow
3 ThirdInsertedRow

Of course, just reading the rows one after another is not the most useful thing we might want to do. I believe searching is used more often. The search is supported by the JetSetCurrentIndex, JetMakeKey, JetSeek and JetSetIndexRange functions.

We have already met the first one. The JetSetCurrentIndex function allows to specify the name of the current index. Let us consider three other functions.

The JetMakeKey function sets a key (criteria) which is used during the data search. The key is linked with the current index and the current cursor (it means the JetMakeKey function should be called after the JetSetCurrentIndex function has been called). It is possible using this function to specify values for every column which forms the index.

C++
JET_ERR JET_API JetMakeKey(
    JET_SESID sesid,
    JET_TABLEID tableid,
    const void *pvData,
    unsigned long cbData,
    JET_GRBIT grbit
);

The parameters are simple enough:

  • The pvData buffer contains the value for the current column of the index. The value data type should exactly correspond to the column data type - ESE does not perform any type casting.
  • The buffer length is passed using the cbData parameter.
  • The grbit flag makes it possible to create a new key or continue creating the current one. Besides, the flag allows specifying the search criteria. The documentation is rather vague (and perhaps incorrect) about this flag's values. Experiments demonstrate that the JET_bitStrLimit, JET_bitSubStrLimit, JET_bitPartialColumnStartLimit, and JET_bitPartialColumnEndLimit flags affect the JetSetIndexRange function behavior (we will see this later).

The JetSeek function positions the current cursor at an entry (if exists) which satisfies the search criteria.

C++
JET_ERR JET_API JetSeek(
    JET_SESID sesid,
    JET_TABLEID tableid,
    JET_GRBIT grbit
);

The grbit parameter indicates the way in which positioning will occur. There are a few variants; I will describe the more oftenly used ones:

  • JET_bitSeekEQ allows positioning the cursor "at the index entry closest to the start of the index that exactly matches the search key" (quotation from MSDN). In other words, this value is the most suitable when a primary key is used for search.
  • Besides the JET_bitSeekEQ value, the JET_bitSeekGE, JET_bitSeekGT, JET_bitSeekLE, and JET_bitSeekLT values (with self-describing names) can be used. Similarly, positioning is occurred at an entry closest to the start of the index that matches the criteria (Greater or Equal, Greater Than, Less or Equal, Less Than).
  • The JET_bitCheckUniqueness flag allows validating if only one entry satisfies the search key.
  • JET_bitSetIndexRange - we will discuss this flag later.

The JetSeek function returns JET_errSuccess if it is possible to seek at an entry according to the search key. The function returns JET_errNoCurrentRecord if no entry has been found.

OK, let us examine how to position at an entry of "Value_index" starting with "Fo" (i.e., at the second entry of this index).

C++
wchar_t bufferSearchCriteria[] = L"Fo";
err = ::JetMakeKey(sessionID,
    tableCreate.tableid,
    bufferSearchCriteria,
    sizeof(bufferSearchCriteria),
    JET_bitNewKey);

if(JET_errSuccess != err)
{
    throw CError(err);
}

for(err = ::JetSeek(sessionID, tableCreate.tableid, JET_bitSeekGE);
    !(err < 0);
    err = ::JetMove(sessionID, tableCreate.tableid, JET_MoveNext, 0))
{
    unsigned long nPK = 0;
    unsigned long nReadBytes = 0;
    err = ::JetRetrieveColumn(sessionID,
        tableCreate.tableid,
        columnCreate[0].columnid,
        &nPK,
        sizeof(nPK),
        &nReadBytes,
        0,
        0);
    assert(nReadBytes == sizeof(nPK));

    std::wcout << nPK;

    if(JET_errSuccess != err)
        break;

    wchar_t buffer[1024] = { 0 };
    err = ::JetRetrieveColumn(sessionID,
        tableCreate.tableid,
        columnCreate[1].columnid,
        buffer,
        sizeof(buffer),
        &nReadBytes,
        0,
        0);
    assert(nReadBytes < sizeof(buffer));

    std::wcout << L'\t' << buffer << std::endl;
}

if(JET_errNoCurrentRecord != err)
{
    throw CError(err);
}

The example will print entries according to the increase of the "Value_index" beginning from the first found:

4 FourthInsertedRow
2 SecondInsertedRow
3 ThirdInsertedRow

It is obvious that we limited the sample as we planned - the first found entry is an entry starting with "Fo".

To find nothing but only those entries in the "Value" column starting with "Fo", it is necessary to have another function call. So, it's time for the JetSetIndexRange function.

C++
JET_ERR JET_API JetSetIndexRange(
    JET_SESID sesid,
    JET_TABLEID tableid,
    JET_GRBIT grbit
);

This function forms a temporary range of index values, where one limit of the range is the current position of the current index. The second limit is the last value that satisfies the search criteria. The way in which the bounds are formed depends on the grbit flag. There are four available values described in MSDN. To avoid exact quotations from MSDN, I will describe only two of them:

  • The JET_bitRangeInclusive flag specifies if the bounds should be included in the range.
  • The JET_bitRangeUpperLimit flag should be used when the search criteria (created by the JetMakeKey function) implies finding entries closest to the end of the current index. That is what we need when trying to limit the previous sample.

The two others allow deleting a previously formed index and testing (only) if the specified range exists or not. They are described in MSDN in detail.

Once the range has been formed, it is possible to pass through the range using the JetMove function with JET_MoveNext or a positive value as the cRow parameter (if the JET_bitRangeUpperLimit parameter has been used for the JetSetIndexRange call; if it is not true, the range should be iterated in reverse order; in this case, JET_MovePrevious or a negative value should be passed as the cRow parameter.) The JetMove call using any other parameter will lead to the current cursor leaving the temporary range of index values.

C++
err = ::JetMakeKey(sessionID,
    tableCreate.tableid,
    bufferSearchCriteria,
    sizeof(bufferSearchCriteria),
    JET_bitNewKey | JET_bitPartialColumnStartLimit);

if(JET_errSuccess != err)
{
    throw CError(err);
}

// set the current cursor to the "FourthInsertedRow" entry
err = ::JetSeek(sessionID, tableCreate.tableid, JET_bitSeekGE);
if(err < 0)
{
    throw CError(err);
}

err = ::JetMakeKey(sessionID,
    tableCreate.tableid,
    bufferSearchCriteria,
    sizeof(bufferSearchCriteria),
    JET_bitNewKey | JET_bitPartialColumnEndLimit);
if(JET_errSuccess != err)
{
    throw CError(err);
}

// form the temporary range
for(err = ::JetSetIndexRange(sessionID,
                             tableCreate.tableid,
                             JET_bitRangeInclusive | JET_bitRangeUpperLimit);
    !(err < 0);
    ::JetMove(sessionID, tableCreate.tableid, JET_MoveNext, 0))
{
    unsigned long nPK = 0;
    unsigned long nReadBytes = 0;
    err = ::JetRetrieveColumn(sessionID,
        tableCreate.tableid,
        columnCreate[0].columnid,
        &nPK,
        sizeof(nPK),
        &nReadBytes,
        0,
        0);

    if(JET_errSuccess != err)
        break;
    assert(nReadBytes == sizeof(nPK));

    std::wcout << nPK;

    wchar_t buffer[1024] = { 0 };
    err = ::JetRetrieveColumn(sessionID,
        tableCreate.tableid,
        columnCreate[1].columnid,
        buffer,
        sizeof(buffer),
        &nReadBytes,
        0,
        0);
    assert(nReadBytes < sizeof(buffer));

    std::wcout << L'\t' << buffer << std::endl;
}

I would like to turn your attention to the following points:

  • It is necessary to invoke the JetMakeKey function repeatedly before the JetSetIndexRange call. (The first call has been performed before the JetSeek call). The reason is that JetSeek (JetSetIndexRange also) removes the key created by the previous call to JetMakeKey.
  • The second JetMakeKey call happens with the bitPartialColumnEndLimit parameter. This value defines the algorithm which forms the range. The documentation explains, "this option should be used when building wildcard search keys to use for finding index entries closest to the end of an index". By intuition, it is clear how this option works - after JetSeek returns control, the cursor should be positioned at the farthest entry from the beginning of the index. The same behavior we observe when JetSetIndexRange has been called - the second range bound is set closer to the end of the current index.
  • Although MSDN says that JetSetIndexRange temporary limits and the range can be iterated using sequential calls to the JetMove function, it is not correct. The range limits JetRetrieveColumn calls, not JetMove calls. In the example above, the loop is interrupted after the JetRetrieveColumn call, not after JetMove (as we might suppose).

The example above will display this line:

4 FourthInsertedRow

If you plan to use full text search, you will need the special index type - tuple-index (described above).

Now we return to the JET_bitSetIndexRange flag for the JetSeek function (as it has been promised). When we call JetSeek using the JET_bitSetIndexRange flag, it can be considered as another way to perform search. However, there are differences from the way that uses the JetSetIndexRange function.

  • The strict search alone works. It means the JET_bitSetIndexRange flag can be combined with the JET_bitSeekEQ flag only. An attempt to use JET_bitSetIndexRange without JET_bitSeekEQ will fail. In my version of ESE, JetSeek returns a confused return code 313 (a positive return code indicates a warning; unfortunately, esent.h contains no such warning, so I am afraid we will never know what this warning means). The subsequent calls to JetMove works incorrectly.
  • We do not need the preliminary call to JetSeek anymore to set the cursor at the beginning of the sample. The JetSeek call (with the JET_bitSetIndexRange parameter) forms the range at once.
  • The JetMakeKey call should be executed without specifying the JET_bitStrLimit, JET_bitSubStrLimit, JET_bitPartialColumnStartLimit, and JET_bitPartialColumnEndLimit parameters. That is clear enough. As the search is the result of exact fit, there is no necessity to specify the range bounds.

If we gather all the pieces of the code together, compile the result, and run it, we will find a few files in the working directory of the created application. What these files are and what these files are for, we will get to know from the next section.

What Is the Database Made Of

ESE uses a few files for service. If you execute the dir command in the directory where our example has run, you will find the following files:

01.07.2006 20:27 8 192 edb.chk
01.07.2006 20:27 5 242 880 edb.log
01.07.2006 20:26 5 242 880 res1.log
01.07.2006 20:26 5 242 880 res2.log
01.07.2006 20:27 1 056 768 test.db

Database File

The only file whose purpose we may guess - the test.db file - we passed this name to the JetCreateDatabase function.

The file keeps:

  • Schema describing tables
  • Table indices
  • The data in tables

The file is one-megabyte size. It will grow as new information will be inserted. The value of increase can be specified using the JET_paramDbExtensionSize "system" parameter (the parameter is set using the JetSetSystemParameter function).

Transaction Log File

The next file we will discuss is the transaction log file. It is named edb.log in our case (the name can be changed using the JET_paramBaseName "system" parameter).

The transaction log file contains the "operations on the database files" (quotation from MSDN). The file contains "enough information to bring a database to a logically consistent state after an unexpected process termination or system shutdown" (quotation). It means, in case of a fault of the application or the Operating System, power supply, etc., the transaction file will keep a list of operations which should be replayed in order to bring the database to a consistent state. This procedure is called "soft recovery". ESE performs the procedure during the JetInit/JetInit2/JetInit3 calls.

We see a single file, but MSDN refers to it in plural. The fact is that a number of files may grow on-stream. Let us look at this process closer.

The current transactions are written to the current transaction log file - edb.log. When the current transaction log reaches five megabytes, ESE renames it to edb000001.log and creates a new transaction log file - edb.log, again. When this log file is full, ESE renames it to edb000002.log and so on. Every transaction file has a fixed size (five megabytes by default, this value can be changed using the JET_paramLogFileSize parameter).

According to this algorithm, it is impossible to say which of the edb00000*.log files contains the actual operation data (they are necessary to recover the database if a failure occurs). Or more exactly, it is hard to say which file does not contain the actual operation data. This is the reason why we ought not to move, delete, rename, or manipulate any of these files.

The transaction file can be safely removed in the following scenarios:

  1. While a full database backup is performed (using one of these functions - JetBackup, JetTruncateLog, JetTruncateLogInstance).
  2. When circular logging is enabled. If this option is turned on (JET_paramCircularLog "system" parameter), unnecessary log files are deleted auto-magically.

Reserved Transaction Log Files

The res1.log and res2.log files are transaction log files too, but they are not ordinary ones, they are reserved. During engine initialization, ESE creates these files to ensure the fulfillment of "clean shutdown".

What does it mean - "clean shutdown"? Let me describe this. It may happen that an application is starting in out of disk space conditions. ESE cannot create a new transaction log file in this case. The only thing it can do is detach the database and shut down. However, to execute this operation in a "clean" fashion, ESE may require additional disk space (for example, to write rollback for current operations). The transaction log file will be used in this situation.

Checkpoint File

The file edb.chk is the checkpoint file. Let us examine this file's purpose.

There is an interesting feature in ESE - any operation with the database is written to the transaction log file and persisted in memory. (Here is another interesting characteristic - the operations on the database may not be in the same order as they are written to the transaction log; according to MSDN, this increases efficiency).

Thus, the transaction log can contain both operations that are already in the database file and operations that are not in the database file yet. The checkpoint - it is the time label marking the exact place in the transaction log file. All information before this place are already in the database file. It is unknown if the information after this point is in the database or not.

The edb.chk file (along with the transaction log files) is used during the "soft recovery" procedure.

The path to the checkpoint file can be set using the JET_paramSystemPath "system" parameter.

Indexing of Unicode Strings

It is interesting to investigate how ESE operates indices based on Unicode strings (or just Unicode-indices, for short). As I mentioned above, there is no problem if Unicode strings are inserted into or deleted or read from a table. Are there any problems related to Unicode-indices?

However, before answering, I will mention how ESE keeps secondary indices. ESE holds them in separate structures, and allows creating and deleting a secondary index after a table has been created. This is not true regarding primary indices. Primary indices are created during table creation and cannot be deleted.

OK, now we may proceed. I would remind that ESE uses LCMapString to build a Unicode index. Although there is a side effect - the result of this function may differ for different versions of the Windows Operating System. Since ESE uses LCMapString for creating a key for sorting, it means a key created for the Windows 2000 Operating System may differ from a key created for Windows XP or Windows 2003. This also means a Unicode-index created on Windows 2000 may not work correctly on Windows XP. The most likely results of using an index on the "wrong" Operating System are impossibility of sorting by using this index and incorrect insertion in tables.

This problem would not be a big one for many applications. Nevertheless, the situation is worse than it may seem at first sight. LCMapString can return different results even on the same version of the Windows Operating System. It happens because the function behavior can be changed after a regular update or after a service pack is installed.

Does it imply we should reject using Unicode-indices? The short answer is no, we should not.

The full answer is:

  • We should not use a Unicode string as the primary key.
  • If secondary Unicode-indices are used, it is necessary to perform special actions before attaching a database.

Let us start from the second point. The developer may ask ESE to validate the NLS version (National Language Support library - this library is responsible for LCMapString) used by the Operating System. If it has been asked, ESE checks the NLS version while executing JetAttachDatabase. If the current NLS version is newer than the NLS version used for Unicode-indices creation, JetAttachDatabase returns an error. If functions return error, it is necessary to recreate the Unicode-indices using the new version of the NLS library.

Let us discuss these steps in detail. By default, JetAttachDatabase does not validate the NLS version. In order to force performing this checkup, the developer should set the JET_paramEnableIndexChecking "system" parameter to true before JetInit is called. If the parameter has been set appropriately and JetAttachDatabase finds outdated indices, it returns one of the following:

The first error code tells us we should recreate the secondary Unicode-indices. The last one informs us a primary key has become invalid, which cannot be recreated. Exactly because of this error code, there is a restriction for using Unicode strings as primary keys.

But a secondary index could be easily recreated - it is a simple operation. First, the index should be deleted by the JetDeleteIndex function and then created by JetCreateIndex2. The task of index deletion can be delegated to ESE. To archive this goal, it is necessary to pass the JET_bitDbDeleteCorruptIndexes flag during the JetAttachDatabase call. The damaged indices will be deleted, and the function will return the JET_wrnCorruptIndexDeleted value. We should just recreate the indices.

MSDN describes another way to manipulate outdated Unicode-indices. However, this is described carelessly and inconsistently. Documentation asserts that it is possible to delete damaged indices automatically if the developer has set the JET_paramEnableIndexCleanup parameter to true before the JetInit call.

The questions appear after reading the phrase "the database engine may automatically clean up indexes over Unicode key columns in JetInit". It is not clear if the phrase implies the engine may not clean up indices. The next paragraph clarifies the situation. It informs us an "incremental cleanup" is not always possible (unfortunately, it is not clear from the context what "incremental cleanup" means). If a cleanup is not possible, "the index will be handled as prescribed by JET_paramEnableIndexChecking". Does it mean if a cleanup is not possible, we should do the same thing we do when using JET_paramEnableIndexChecking? If yes, why should we use the JET_paramEnableIndexCleanup parameter? Only questions, no answers.

The commentary in esent.h clarifies the purpose of the JET_paramEnableIndexCleanup parameter. The commentary says - "If JET_paramEnableIndexCleanup is set, the internal fix up table will be used to fix up index entries. This may not fix up all index corruptions but will be transparent to the application." We may only guess what this means. My supposition is that the JET_paramEnableIndexCleanup flag is an attempt to repair the NLS problem "on the fly" and transparent to the application. The attempt, it seems, was not successful ("this may not fix up all index corruptions"), but we know that a solution exists (see the two paragraphs above).

Conclusion

In conclusion, it is possible to say that Microsoft gave us a very successful library. The main advantages of the library are:

  • Efficient engine, which supports huge amounts of data.
  • Lightness of the library, which can be used as an in-process service.
  • Support for different versions of Windows (starting from Windows 2000).
  • Full-blown mechanism for crash and error recovery, which allows the application to rely upon data consistency after an unexpected process termination or system shutdown.
  • Reliability of the library that shows itself in such products as Active Directory and Exchange 2000.
  • Availabilty to use the library in C# - the http://managedesent.codeplex.com/ project provides interop binary to operate with ESE NT (thanks to Thomas Krojer for this information).

As disadvantages, I should point out:

  • Perhaps, greater than usual cost of development (compared to using the usual RDBMS).
  • The absence of "real" RDBMS services - triggers, Stored Procedures, foreign keys, security, etc.

Objectivity requires noticing this technology is not a unique one. There are lightweight engines which may run as in-process services. For example - Firebird and SQLLite. Though I can say nothing about the first one, I know at least one example of successful deployment of the second one (i.e., SQLLite) in a commercial project. The comparison of listed engines deserves separate consideration, of course.

It seems, in the first place, ESE will awake the interest of creators of server applications which require reliable storage and which cannot use a regular RDBMS for some reasons. Nevertheless, it is possible that the technology will be interesting for desktop application developers and will be useful for them.

References

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)