Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / database

iDb2Connection - How to cancel a long running query

4.33/5 (3 votes)
7 Jun 2010CPOL12 min read 57.3K   601  
Describes how you can run multiple queries against an iSeries DB2 and cancel long running queries.

Introduction

An application we've been working on recently is using an iSeries as its datasource. The front end of the application is basically a friendly query designer, showing screens that allow the users to pick parameters from various drop downs and submit the information for retrieval. The iSeries libraries are extremely large, it's a data warehouse that records all stock and sales information going back over 10 years.

By it's very nature, some of the queries that can be submitted may take a while to run. Even though the warehouse files are indexed, denormalised, and we've tuned many of the possible SQL statements, someone may decide they want to see 'every till transaction for every store in the last 6 months'.

We wanted administrator functionality in the application that would allow the user to cancel the query running on the iSeries, freeing all database resources used by the job.

Background

Since many of the queries would return the same results (e.g., a sales rep requests to see yesterday's top 100 sellers), we decided to aggressively cache data on the application server. Data is compiled into the data warehouse files when 'end of day' jobs run on the iSeries so we know that the data is static during office hours. Once a person has run a particular query, the same results should be retrieved from the cache and served to subsequent requests.

However, we still needed to consider how to handle the initial database operation. This type of data access will have to be performed asynchronously, and some additional project requirements were also needed to be considered in our design.

  • A dedicated .NET application server will listen to query requests from clients.
  • The application server will run the SQL query asynchronously against the iSeries database.
  • A 'Query Manager' screen will be available to administrators of the system in the UI, showing them all of the running queries currently sitting in the application server. This should include associated iSeries job details and query durations.
  • Administrators should be able to cancel queries on the iSeries, freeing database resources.
  • The client will be informed when the query has finished processing or if there are any exceptions due to query cancellation.

The SqlClient provider has the BeginExecuteReader method that allows for asynchronous access, but unfortunately, this is not available in the current version of the IBM provider. So we looked at how we could implement this functionality ourselves, paying particular attention into the ability to monitor and cancel queries.

How does the iSeries process a query?

Before we go into the cancellation of a query, it's important to understand how the iSeries handles TCP/IP requests. Any of you familiar with MS SQL will know that an active connection will have an associated process; executing sp_who2 in query analyser will give you a list of all those processes.

A similar concept is used in the iSeries environment; in this case, the unit of work is processed by an iSeries job. Here is the IBM documentation.

"Each ODBC connection communicates with one system database program. This program is referred to as the host server program. The name of the database server program used with TCP/IP is QZDASOINIT."

Depending on which provider you've used to connect to an iSeries, a job associated with your connection is allocated in a particular subsytem on the iSeries. Different providers will result in different subsystem jobs, but for the scope of this article, any connection using the IBM .NET provider will be allocated jobs from the QUSRWRK subsystem and will appear as a QZDASOINIT job.

N.B. This does depend on the configuration of your iSeries. You can configure the QZDASOINIT jobs to run in different subsystems and your system admin may have chosen to do so. The most important thing to note is the name of the server program (QZDASOINIT) which will be the same regardless of the underlying subsystem.

The following screenshot shows jobs running in the QUSRWRK subsystem after issuing the WRKACTJOB command:

jobs.jpg

In our environment, these jobs are configured as prestart jobs and listen for incoming TCP/IP connections. When you open a connection from .NET, you will be connecting to a job running the QZDASOINIT program.

The following code should show you how a job is allocated to an iDB2Connection:

C#
using (iDB2Connection connection = new iDB2Connection("Your-Connection-String"))
{
   /* When we open the connection, we are allocated a job on the AS400
    * that will handle our SQL request. 
    * 
    * e.g 219418/QUSER/QZDASOINIT
    * 
    */
    try
    {
        connection.Open();
        Console.WriteLine(connection.JobName);
    }
    catch (Exception ex)
    {
        Console.WriteLine(ex.Message)
    }
}

In the above code snippet, after the call to Open(), the connection object will now have job details associated with it. Try putting a breakpoint on Console.WriteLine() so your connection object is kept active. Now you can the run the following commands on the iSeries to verify the connection while your client has the connection open.

WRKJOB JOB(QZDASONIT)

This will display all the prestart jobs currently listed in the system. A useful command, but we can't easily tell which one is ours:

WRKJOB JOB(job_number/job_user/job_name)

This allows you to view the specific job you are connected to. The parameters required to run this command are available in your .NET code. The connection.JobName property has the exact syntax, so get the details and enter them into the command. You can now view the job that is hosting your .NET TCP/IP connection.

When you run a SQL statement against the connection object, the job you have been allocated will then process the SQL as an interactive process.

The iSeries has excellent support for submitting queries to batch and choosing particular queues to process the batch jobs. You can tweak the system settings to prioritize certain jobs and queues, something I had hoped to take advantage of from the .NET environment. The iSeries also has some software available that can prevent queries running before they are executed (check out the Query Governor article).

We looked into many different approaches to this, but ran into numerous problems working with these from .NET, which I'll possibly discuss in a future article!

OK, so how can I cancel a long running query?

Once you've a handle on the iSeries job that is processing your SQL statement, you can interact with that job and issue further commands against it. Within our .NET code, the execution of the query is happening in a new thread, so we need to record the iSeries job details when we open a connection and before we start executing SQL.

Consider the following workflow:

  • UI submits a query
  • Our query manager (.NET code) spawns thread B that opens a connection
  • The name of the job assigned to the connection is noted and reported back to the caller
  • UI stores job details, it now knows that a thread has connected to the iSeries and the job details associated with that thread
  • UI now has all details required to issue a cancel query request
  • Query manager executes cancel request interactively against the job details
  • iSeries ends the specified job
  • A communication error is reported back to thread B that was processing the SQL asynchronously
  • Handle exception, and report back to UI that the query was cancelled

We can now do this quite simply by submitting an ENDJOB command to the iSeries. Submitting commands against the iSeries requires a particular type of command syntax; here's a helper function to achieve this:

C#
private string GetDb2CommandString(string commmand)
{
    string commandLength = commmand.Length.ToString("0000000000", 
                           System.Globalization.CultureInfo.CurrentCulture);   
    return String.Format("CALL QSYS.QCMDEXC('{0}', {1}.00000)", 
                         commmand, commandLength);
}

We can build a valid ENDJOB command by using the command string formatter and executing the statement against an iDb2Connection object. In the below example, iDB2Helper is a port of the old SQLHelper class that Microsoft released prior to their Data Access Application Block. I'm using it in these examples because it's easier to include in the demo project.

C#
public void CancelQuery(string jobName)
{
    string commandText = this.GetDb2CommandString("ENDJOB JOB(" + 
                         jobName + ") OPTION(*IMMED) SPLFILE(*YES)");
    iDB2Helper.ExecuteNonQuery(connectionString, 
               System.Data.CommandType.Text, commandText);
}

This command runs synchronously; the iSeries will end the job immediately (*IMMED) and terminate your SQL retrieval running on thread B. This causes an exception to be thrown back to the caller.

We've now killed the prestart job that was handling the TCP/IP SQL request. This is a good thing; we've freed up the database operation itself (maybe it contained problem SQL, perhaps it was causing issues in other jobs etc.), but it's good to understand exactly what we've done and any implications this could cause.

Prestart jobs are just that, they are automatically started when the subsystem begins. The amount of jobs you have is dependant on your system configuration. They are kept alive and reused as new TCP/IP requests are received. Like most things on the iSeries, the amount of times a job is reused is configurable by a system admin. This is because creation of a new job is a more resource intensive operation than reusing an existing job (think of connection pooling within .NET).

For further reading on this, check out tuning prestart job entries.

Killing these jobs will therefore incur an overhead at some point. The good news is the iSeries is smart enough to know how many jobs it has remaining, and only create new ones when all prestart jobs are busy processing other requests, so your overhead is delayed until some unlucky client tries to run some SQL that forces the iSeries to create a new job.

This performance hit is negligible in my option; since we're running the query in a 'fire and forget' manner anyway, the client isn't tied up and won't really notice the additional work that has gone on behind the scenes.

Important points

Because we are taking control of the data retrieval into our own thread, there are a couple of extra connection string attributes that you need to specify in order to correctly handle the exception that is reported back when you cancel the iSeries job.

DATA SOURCE=MYISERIES;USER ID=USR;PASSWORD=PWD;
     CheckConnectionOnOpen=true;EnablePreFetch=false

CheckConnectionOnOpen: Not entirely necessary, but always a useful one to use in your connection string. It addresses some connection pooling issues that you can run into when an iSeries recycles many or all jobs (e.g., during an IPL). Full details can be found here.

EnablePreFetch. You must specify this attribute; otherwise, you will encounter a fatal exception. Prefetching data is enabled by default in the iSeries provider for performance purposes. Internally, the provider will start a thread that begins the data retrieval. If you don't switch this off, when we cancel the iSeries job the exception associated with this communication error will occur in the provider instead of our custom code and cause your application to crash.

Very important!

The technique I'm demonstrating shouldn't be used as a hammer for all problematic QZDASOINIT jobs. If you're consistently seeing poor performance, then the SQL you are trying to run is probably flawed and causing unnecessary work for the iSeries.

Always tune and analyse your SQL as a first step, and look at the execution plans for the query. Are you using indexes or causing table scans? Can you make your SQL more efficient?

As mentioned, we're including the cancel query in our project as an administrator-only function in the UI.

Demonstration Project

That's pretty much it - you should now know how the iSeries is running your TCP/IP requests, how this is interpreted in the .NET code, and how we can use this information to further interact with the iSeries.

I've put together a demo project that ties together all the different concepts which you can download from the link at the top of the article.

The project is fairly straightforward; it demonstrates the ability to submit multiple queries to a 'Query Manager' that will process each SQL statement against its own connection object (and subsequently, its own QZDASONIT job). Query status is displayed in the bottom of the screen and any errors that occur will be displayed there.

Image 2

The following text is also included in the project download file as ReadMe.txt:

Asynchronous data access from iSeries

The user interface is a simple screen to allow you to submit SQL to your iSeries and monitor the results.

The screen is split into two sections:

  • Main area: This is a grid that will allow you to view the results when your query completes
  • Bottom section: This is the query monitor screen that displays your running queries and all associated details

Before you run the program

There's only one thing you should need to configure for your environment. In the App.config file, specify a connection string for your iSeries.

I am referencing the IBM.Data.DB2.iSeries provider in C:\Program Files\IBM\Client Access\IBM.Data.DB2.iSeries.dll. Check that the reference in the 'DataAccess' project has been identified OK. If you have V5R3 of the IBM iSeries installed, it should be fine.

How to submit a query

There are two ways to submit a query, both available from the 'Queries' menu.

  1. Open SQL. This allows you to open any txt\sql files that have predefined statements, a quick way to run a query.
  2. Manual SQL entry. Enter a valid SQL command and give it a brief description.

Once you've completed either of above, the query will be submitted and will appear in the query monitor section of your screen.

How to cancel a query

If you have a query running in the query monitor, right click to view the context menu and click Cancel Query.

A command will be sent to the iSeries and your query will move onto error status.

How to view query results

Once your query has completed, it will show as green in the query monitor. Double click on the item and the results will be loaded into the main screen section.

Testing multiple queries

If you want to test many queries running at once, it's best to use method (i) for submitting a query. Setup a few SQL files that will run for varying durations. E.g., create a SQL statement you know may take at least 30 seconds to run for one file. This will give you a chance to cancel.

  • Set up a few text files with some SQL you want to run, say 3 or 4 different queries.
  • Repeat method (i) to submit all the text files to the iSeries.
  • Each query will display in the query monitor with a different iSeries job allocated.
  • Try cancelling the long running query.

Issues

Careful with how many rows your SQL will return, you can use up a lot of memory. Queries are added to Set in the UI to allow us to get at the DataSet results. Use the 'Remove Query' option to clear the items from the list and then cache when you've finished with them.

Minimum requirements

  • Visual Studio 2005
  • iSeries .NET provider (V5R3 or better)
  • Access to command line on a green screen will be useful, but not required

History

  • June 7, 2010 - Initial release.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)