Ansychronous triggers and cursors

h_wiedey

4.43/5 (6 votes)

22 Nov 2013CPOL5 min read

18.3K

Asynchronous processing of triggers using cursors

Introduction

In Relational Database Management Systems (RDBMS) all triggers are synchronous in the sense that the next statement after the triggering statement is only executed after the execution of the trigger has completed.

There are however situations where an asynchronous execution can be a valid option. This article shows how cursors can effectively be used in implementing asynchronous triggers.

Cursors in Triggers

To start with, it is not required to use a cursor in a trigger to loop over all table rows affected by a statement. It is a common misconception even among professional MS-SQL (and Sybase) developers that a cursor needs to be used when more than one row is affected by a triggering statement. The below example shows how a simple set based trigger works without using a cursor.

SQL

--The below SQL implements a simple set based trigger
USE sandbox
GO
 
CREATE TABLE Numbers(
    ID int NOT NULL,
    Number int NOT NULL,
    CurrentNumber int NULL,
    PreviousNumber int NULL,
    CONSTRAINT pk_Numbers PRIMARY KEY (ID)
)
GO
 
CREATE TRIGGER tiu_Numbers ON Numbers FOR INSERT, UPDATE
AS
BEGIN
    PRINT 'TRIGGER tiu_Numbers: Updating Numbers.CurrentNumber from inserted'
    UPDATE Numbers 
    SET CurrentNumber = i.Number
    from Numbers n
    JOIN inserted i ON i.ID = n.ID
    PRINT 'Number of rows updated: ' + CONVERT(VARCHAR,@@rowcount)
 
    PRINT 'TRIGGER tiu_Numbers: Updating Numbers.PreviousNumber from deleted'
    UPDATE Numbers 
    SET PreviousNumber = d.Number
    from Numbers n
    JOIN deleted d ON d.ID = n.ID
    PRINT 'Number of rows updated: ' + CONVERT(VARCHAR,@@rowcount)
END
GO
 
INSERT Numbers (ID, Number) VALUES (1, 8), (2, 10), (3, 12)
GO

-- Output is: 

TRIGGER tiu_Numbers: Updating Numbers.CurrentNumber from inserted
 
(3 row(s) affected)
Number of rows updated: 3
TRIGGER tiu_Numbers: Updating Numbers.PreviousNumber from deleted
 
(0 row(s) affected)
Number of rows updated: 0
 
(3 row(s) affected)

-- End output

SELECT * FROM Numbers
GO

-- Output is: 
 
ID          Number      CurrentNumber PreviousNumber
----------- ----------- ------------- --------------
1           8           8             NULL
2           10          10            NULL
3           12          12            NULL
 
(3 row(s) affected)

-- End output

UPDATE Numbers SET Number = Number + 2
GO
 
-- Output is:
 
TRIGGER tiu_Numbers: Updating Numbers.CurrentNumber from inserted
 
(3 row(s) affected)
Number of rows updated: 3
TRIGGER tiu_Numbers: Updating Numbers.PreviousNumber from deleted
 
(3 row(s) affected)
Number of rows updated: 3
 
(3 row(s) affected)

-- End output
 
SELECT * FROM Numbers
GO

-- Output is: 

ID          Number      CurrentNumber PreviousNumber
----------- ----------- ------------- --------------
1           10          10            8
2           12          12            10
3           14          14            12
 
(3 row(s) affected)

-- End output

As demonstrated above, set based operations on inserted or deleted cover all rows affected. It is even the case that using cursors is usually a bad choice as they are usually slower in execution compared to set operations (i.e. joins). To implement the above example using cursors one would even need to implement two (one looping over inserted, one looping over deleted) making the code more then twice as long and slower in execution.

There are however scenarios where a cursor is a valid option to be used in cases where a set based operation is not possible or a processing row by row is desired. Some examples would be:

Invoke a stored procedure on each row affected.
Call a SQL command (e.g. PRINT, SEND) and use values from affected rows as command argument.
Executing administrative tasks (e.g. dump database, update statistics)
A set based operation is not possible due to the complexity of the operation(s) being applied

The next example shows such a situation where a PRINT statement is used:

SQL

--Simple Print cursor
CREATE TABLE PrintNumbers(
    ID int NOT NULL,
    Number int NOT NULL,
    CONSTRAINT pk_Numbers PRIMARY KEY (ID)
)
GO

DROP TRIGGER tiu_PrintNumbers
GO

CREATE TRIGGER tiu_PrintNumbers ON PrintNumbers FOR INSERT, UPDATE
AS
BEGIN
    PRINT 'TRIGGER tiu_PrintNumbers: Declaring cursor'
    DECLARE cursor_PrintNumbers CURSOR
    FOR SELECT ID, Number FROM inserted

    DECLARE @id int, @number int

    PRINT 'Opening cursor'
    OPEN cursor_PrintNumbers

    PRINT 'Fetch first row'
    FETCH NEXT FROM cursor_PrintNumbers INTO @id, @number
    WHILE (@@FETCH_STATUS = 0)
    BEGIN
        PRINT 'Number is: ' + CONVERT(varchar, @number)
        FETCH NEXT FROM cursor_PrintNumbers INTO @id, @number
    END

    PRINT 'Closing cursor and waiting'
    CLOSE cursor_PrintNumbers

    PRINT 'Deallocating cursor'
    DEALLOCATE cursor_PrintNumbers
END
GO

INSERT PrintNumbers (ID, Number) VALUES (1, 8), (2, 10), (3, 12)
GO

-- Output is: 

TRIGGER tiu_PrintNumbers: Declaring cursor
Opening cursor
Fetch first row
Number is: 12
Number is: 10
Number is: 8
Closing cursor and waiting
Deallocating cursor

(3 row(s) affected)

-- End output

In the above example a cursor is required as the PRINT statement cannot directly be executed as part of a set based operation.

While cursors are recommended not to be used in general they have some properties that make them interesting for asynchronous processing.

Asynchronous Triggers

To ensure consistency, RDBMS usually do not support asynchronous execution of triggers natively. Asynchronous processing however can be implemented either by marking the rows affected by the trigger (e.g. by using a clean/dirty flag) or have the trigger copy them to a separate table for later processing. In both cases another process is required to regularly collect the rows to be processed.

Asynchronous execution of triggers is worth considering in cases where it is more important for the statement to come back rather than waiting for the trigger completing the work it is supposed to do.

This could be the case:

when an immediate execution is not necessary (e.g. messaging)
when the execution of the operations in the trigger would take too long (e.g. complex, time consuming operations to be carried out by the trigger)
when the trigger overhead would not lead to efficient processing (e.g. cursor over only a small number of rows)

There are however risks to be considered when using asynchronous execution:

The row queued for processing might be changed between the time it was queued and the time it is actually picked up by the collection process
In the time span between reading a row for further processing and writing the result, the original row might have changed.
Data might be inconsistent or incomplete till asynchronous processing has completed.

To address the last point, extra care needs to be taken when writing SQL statements accessing data which is processed asynchronously. To address the first two points, cursor come nicely into play. Other than in regular triggers where cursors are a usually a bad option they come with some interesting features that can be enabled when declaring the cursor which make them first choice for asynchronous processing. The declare statement is as per below:

SQL

DECLARE cursor_name CURSOR [ LOCAL | GLOBAL ] 
     [ FORWARD_ONLY | SCROLL ] 
     [ STATIC | KEYSET | DYNAMIC | FAST_FORWARD ] 
     [ READ_ONLY | SCROLL_LOCKS | OPTIMISTIC ] 
     [ TYPE_WARNING ] 
     FOR select_statement 
     [ FOR UPDATE [ OF column_name [ ,...n ] ] ]
[;]

(see http://technet.microsoft.com/en-us/library/ms180169%28v=sql.105%29.aspx)

From the above the options of interest are:

KEYSET: When a row is fetched the actual row is selected from the base table using the key column of the base table. This ensures that always the latest values are fetched. The rows in scope are however defined when the cursor is opened. Subsequently inserted rows are missed.
DYNAMIC: The select is carried out each time a row is fetched. This also ensures that the latest values of the row are fetched. DYNAMIC is slower than the KEYSET option but rrecognizes subsequently inserted rows.
SCROLL_LOCKS: Rows are locked exclusively when selected preventing changes from others. When used with DYNAMIC only the row currently fetched is locked. When used with KEYSET all rows are locked when cursor is opened.
OPTIMISTIC: Before writing, using UPDATE WHERE CURRENT, a check is carried out if the row has changed since it has been fetched. Write attempts to changed rows fail and appropriate action can be taken.

These features can also be implemented manually but are less efficient compared to using the already built-in ones.

Examples

In a first example a random number should be generated for each table row till the generated random number is within an approximation specified.

SQL

--Repetitively generate random number for each row in table Random

CREATE TABLE Random(
    ID int NOT NULL,
    Number int NOT NULL CHECK (Number BETWEEN 0 AND 99),
    Tolerance int NOT NULL CHECK (Tolerance BETWEEN 0 AND 100),
    Approximation int NULL,
    Iteration int NULL,
    DirtyFlag CHAR(1) NOT NULL DEFAULT 'D' CHECK (DirtyFlag in ('C','D')),
    CONSTRAINT pk_Random PRIMARY KEY (ID)
)
GO

--Trigger to mark rows for later processing
CREATE TRIGGER tiu_Random ON Random FOR INSERT, UPDATE
AS
BEGIN
    IF NOT UPDATE (DirtyFlag)
    BEGIN
        PRINT 'Setting dirty flag.'
        UPDATE Random
        SET DirtyFlag = 'D'
        FROM Random r 
        JOIN inserted i ON i.ID = r.ID
        WHERE r.DirtyFlag <> 'D'
    END
END
GO

--Procdure to process rows in Random and updating Approximation, Iteration, DirtyFlag
CREATE PROCEDURE pr_processRandom
AS
BEGIN
    PRINT 'Declaring cursor'
    DECLARE cursor_Random CURSOR SCROLL KEYSET OPTIMISTIC
    FOR SELECT Number, Tolerance FROM Random WHERE DirtyFlag = 'D'
    FOR UPDATE OF Approximation, Iteration, DirtyFlag

    DECLARE @number int, @tolerance int, @approximation int, @iteration int, @match int

    PRINT 'Opening cursor'
    OPEN cursor_Random

    PRINT 'Fetch first row'
    FETCH NEXT FROM cursor_Random INTO @number, @tolerance

    PRINT 'Start looping'
    WHILE (@@FETCH_STATUS = 0)
    BEGIN

        SELECT @iteration = 0
        SELECT @match = 0

        PRINT 'Processing'
        WHILE (@match = 0)
        BEGIN
            SELECT @iteration = @iteration + 1
            SELECT @approximation = CONVERT(INT, RAND() * 100)
            IF @tolerance > 0
            BEGIN
                IF @approximation = @number SELECT @match = 1
            END
            ELSE 
            BEGIN
                IF @number BETWEEN (@approximation - (@number * @tolerance/100)) 
                    AND (@approximation + (@number * @tolerance/100)) 
                SELECT @match = 1 
            END
        END

        PRINT 'Update current row'
        UPDATE Random
        SET Approximation = @approximation, Iteration = @iteration, DirtyFlag = 'C'
        WHERE CURRENT OF cursor_Random

        IF @@ERROR = 16947
        BEGIN
            PRINT 'Refetching current row as it was changed outside of this cursor'
            FETCH RELATIVE 0 FROM cursor_Random INTO @number, @tolerance
        END
        ELSE
        BEGIN
            PRINT 'Fetch next row'
            FETCH NEXT FROM cursor_Random INTO @number, @tolerance
        END
    END

    PRINT 'Closing cursor'
    CLOSE cursor_Random

    PRINT 'Deallocating cursor'
    DEALLOCATE cursor_Random
END
GO

INSERT Random (ID, Number, Tolerance) VALUES (1, 8, 10), (2, 10, 20), (3, 80, 5)
GO

EXEC pr_processRandom
GO

-- Output:

Declaring cursor
Opening cursor
Fetch first row
Start looping
Processing
Update current row

(1 row(s) affected)
Fetch next row
Processing
Update current row

(1 row(s) affected)
Fetch next row
Processing
Update current row

(1 row(s) affected)
Fetch next row
Closing cursor
Deallocating cursor

-- End output

SELECT *
FROM Random
GO

-- Output:

ID          Number      Tolerance   Approximation Iteration   DirtyFlag
----------- ----------- ----------- ------------- ----------- ---------
1           8           10          8             8           C
2           10          20          10            208         C
3           80          5           80            68          C

(3 row(s) affected)

-- End output

The procedure can be scheduled to run regularly using:

An external scheduler
The MS SQL Server built-in scheduler (not available in the MS SQL Server Express version)
The system stored procedure sp_procoption

The below snippet shows how pr_processRandom can be scheduled to run using sp_procoption upon startup of the MS SQL server instance.

SQL

USE master
GO
--Wrapper procedure BackgroundTask to call pr_processRandom with a 1 minute waiting period
CREATE PROCEDURE BackgroundTask
AS
BEGIN
    WHILE OBJECT_ID ('sandbox..pr_processRandom') IS NOT NULL
    BEGIN
        EXEC sandbox..pr_processRandom
        WAITFOR DELAY '00:01' 
    END
END
GO
--Schedule BackgroundTask to be started on startup
sp_procoption   @ProcName = 'BackgroundTask',
                @OptionName = 'startup',
                @OptionValue = 'on'
GO

Points of Interest

See the below URLs:

http://technet.microsoft.com/en-us/library/ms190211%28v=sql.105%29.aspx for cursor locking,

http://sqlblog.com/blogs/hugo_kornelis/archive/2007/11/21/curious-cursor-optimization-options.aspx for cursor performance tests,

http://msdn.microsoft.com/en-us/library/windows/desktop/ms675119%28v=vs.85%29.aspx for cursor types,

http://social.msdn.microsoft.com/Forums/sqlserver/en-US/0d1bc3cd-22d6-4b8f-bf87-ecc0121ac85b/what-is-the-difference-between-static-cursordynamic-cursor-and-keyset-cursor?forum=transactsql for a quick overview of the different cursor types.

History

November 2013: Initial version.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)