(untagged)

DataTable Transaction Logger

Marc Clifton

0.00/5 (No votes)

28 Feb 2006

Undo/Redo DataTable transactions by logging row changes (insert/delete) and field changes.

Introduction

This article describes the implementation of an algorithm that logs DataTable row and column changes by hooking several events provided by the DataTable, including the new TableNewRow event in the .NET 2.0 Framework. The purpose of this work is to provide local undo/redo capability for transactions generated by a UI, and also the ability to log transactions as the application itself manipulates a DataTable, suitable for serialization and application to a mirror of the table located remotely. In this implementation, each table requires its own transaction log. This will immediately cause problems in cases where the table is part of a data set - a collection of interrelated tables. For that situation, this implementation is insufficient because there is no synchronization of transactions between tables in a data set. Frankly, that problem should only be addressed after properly understanding the issues involved with working with a single DataTable.

A couple notes about the terminology - I tend to use the word "field" and "column" interchangeably. Also, I talk about "uncommitted rows". I do not, in this context, mean rows not saved to the database. I'm referring specifically to rows that haven't yet been added to the DataTable's row collection.

Design decisions

The idea of logging DataTable transactions is much easier said than done. There are several issues that had to be considered and that drove the architecture. In some cases, compromises had to be made that seemed unavoidable. The issues are:

Deleting a DataRow: Once deleted, the DataRow fields cannot be manipulated (undo/redo issue) nor added back to the owning table.
A new row can be (and typically is) initialized before being added to the table's row collection, resulting in "change" transactions before the "add" transaction.
It is impossible to determine whether the application (or a control, like the DataGrid) has decided to discard a new row.

What follows is a more detail discussion of each issue and how it affected the architecture.

Deleting a DataRow

One idea with regards to managing a list of DataTable transactions is to save the DataRow reference and the type of change. For field changes, this is simple enough - also save the column being changed and the old and new values so that both undo (apply old value) and redo (apply new value) can be used. The problem arises if the DataRow has been deleted. You can't "undo" a data row that was deleted by adding it back to the table collection, instead, you have to create a new DataRow instance. Now let's say that you have the following transaction log:

Add Row
Change Field
Change Field

And you undo all transactions. The first transaction, "Add Row" when undone, will delete the row from the DataTable. If we now attempt to redo all transactions, the deleted row has to be added again to the DataTable, but as a new instance. Now the DataRow instances in the Change Field transactions require fixing up. Are all the transactions in the log forward of the add row transaction adjusted? Yes. Originally, I had chosen to decouple the DataRow instance from the transaction by preserving the primary keys associated with each row. This put a lot of restrictions on when the transactions were logged and when the primary keys were required. I ultimately decided that fixing up the DataRow references in the transaction log was a better implementation because it decoupled the dependency on primary keys! The primary keys are useful only when serializing the transactions, which really is an autonomous function not related to the undo/redo feature. Also keep in mind that if you undo a deleted row, all the transactions referencing the deleted rows earlier in the transaction log need to be fixed up.

New row field initialization

Once the above design decision was made, it became more obvious that .NET 2.0's TableNewRow event would be very helpful in logging a "new row" transaction. Note that I'm not calling it "add row", because that is technically incorrect - the row hasn't been added to the DataTable's row collection yet. This resolved the problem of columns being initialized and changed before the row is added to the collection, as the transaction log will always have a "new row" transaction prior to the field change transaction.

New row initialization

But the story doesn't end here. It's possible for the application to simply discard a new row. For example, using the DataGrid, click on the row with the "*". This fires a TableNewRow event. Press the Escape key. The new row is discarded, but I still have a "new row" transaction in the log! Also consider an application that has created a new row and for some reason faults while initializing it. Again, the transaction log is left with a new row that it doesn't know has been discarded.

The work around to this issue has some ramifications. First, I implemented a new row garbage collection method, which deletes all log entries for rows that were never finally added to the DataTable's row collection. This method should always be called before accepting table changes (more on this later) and before performing undo operations. Essentially, the DataTable should be stable before accept and undo actions, because neither of these actions make much sense on rows that were not actually added to the DataTable's row collection. Similarly, this method should be called before serializing the transaction log.

This design decision makes the implementation more complicated because it requires managing whether or not the row has actually been added to the DataTable. This requires an "uncommitted row list". Also we can use the RowChanged event and monitor "Add" actions to remove the row from the uncommitted row list. And finally, each entry in this row list should reference each index of the transaction log that references the row. The garbage collection can then walk through the uncommitted row list, add the indices to the transaction log to a sorted buffer, then finally remove the transactions from the log by iterating through the sorted buffer from highest index to lowest (so higher indexes aren't moved when deleting a lower index). All this to prevent having to iterate through the transaction log to find transactions referencing the row being collected!

Other considerations

I was somewhat surprised to note that it is impossible to prevent the ColumnChanged event to fire, even when setting an error for the column in a ColumnChanging event handler. In fact, even setting the ProprosedValue back to the column's original value does not prevent the ColumnChanged event from firing. In the DataGrid, if you edit a column, typing in exactly the same value, the ColumnChanging event still fires. Something to note.

My original design

As I mentioned, my initial design used primary key values and was very "smart" (hahaha) about how it handled new rows and field initialization. Basically, it required all the primary keys to be valid before processing a field change. If the application first sets all the required primary keys, it would automatically add a "add row" transaction. It even had an event the application could hook to supply the primary keys. This seemed OK and actually worked quite well in my testing. As I was writing about this design, I got to this sentence: To get the transaction ordering right, a "new row" transaction is logged even though the row hasn't yet been added to the collection. At that point I stopped and realized, that was very, very wrong!

However, it points out the compromise between the two different designs - one having to deal with the new rows never been finally added to the DataTable and the other anticipating adding the row but possibly ending up with rows that actually were never added. Really, it boils down to the same thing - uncommitted rows need to be removed from the transaction log. However, the additional side effects of requiring a "get the primary keys NOW" made me settle on a different architecture. In fact, the final architecture couldn't care less about whether your DataTable even has primary keys - the only time primary keys are useful is in serializing the transaction log, and that implementation is outside of the scope of the transaction logger.

Implementation

Event handlers

TableCleared event

This event does not support undo. It clears the internal collections and resets the logging state.

TableNewRow event

In this event handler, we log the new row and add it to our collection of uncommitted rows, initializing the transaction list with the transaction index:

protected void OnTableNewRow(object sender, DataTableNewRowEventArgs e)
{
  if (doLogging)
  {
    int idx;
    DataTableTransactionRecord record;
    idx = transactions.Count;
    record = new DataTableTransactionRecord(idx, e.Row, 
                DataTableTransactionRecord.TransactionType.NewRow);
    OnTransactionAdding(new TransactionEventArgs(record));
    transactions.Add(record);
    OnTransactionAdded(new TransactionEventArgs(record));
    List<int> rowIndices = new List<int>();
    rowIndices.Add(idx);
    uncomittedRows.Add(e.Row, rowIndices);
  }
}

RowChanged event

In this event hander, when Action==DataRowAction.Add, then the row being added is removed from the uncommitted row collection:

protected void OnRowChanged(object sender, DataRowChangeEventArgs e)
{
  if (doLogging)
  {
    if (e.Action == DataRowAction.Add)
    {
      if (!uncomittedRows.ContainsKey(e.Row))
      {
        throw new DataTableTransactionException("Attempting " + 
                  "to commit a row that doesn't exist in the " + 
                  "uncommitted row collection.");
      }
      uncomittedRows.Remove(e.Row);
    }
  }
}

ColumnChanging and ColumnChanged events

This event is hooked so that we can get at the current field value and save it in the transaction as the old value. The ColumnChanged event has already changed the field value to the proposed value by the time we see it there. However, we still need the ColumnChanged event to get the new row value in case the application changed the proposed value!

if (doLogging)
{
  object oldVal = e.Row[e.Column];
  int trnIdx;
  DataTableTransactionRecord record;

  trnIdx = transactions.Count;
  record = new DataTableTransactionRecord(
                trnIdx, e.Row, e.Column.ColumnName,
                oldVal, e.ProposedValue);
  OnTransactionAdding(new TransactionEventArgs(record));
  transactions.Add(record);
  OnTransactionAdded(new TransactionEventArgs(record));
  waitingForChangedEventList.Add(record);

  if (uncomittedRows.ContainsKey(e.Row))
  {
    uncomittedRows[e.Row].Add(trnIdx);
  }
}

In the following code, the waitForChangedEventList is reverse searched for the column that is being changed. I decided to buffer the "changing" events that are waiting for completion via the ColumnChanged event since it is possible (although I certainly wouldn't recommend the practice) that the application changes other fields (possibly in other rows) in a ColumnChanging event (or in a ColumnChanged event handler that fired first):

void OnColumnChanged(object sender, DataColumnChangeEventArgs e) 
{
 if (doLogging)
 { 
   for (int i = 0; i < waitingForChangedEventList.Count; i++)
   {
      DataTableTransactionRecord r = 
                waitingForChangedEventList[i]; 
      if ( (r.ColumnName == e.Column.ColumnName) && 
                                           (r.Row == e.Row) )
      {
        r.NewValue = e.ProposedValue;
        waitingForChangedEventList.RemoveAt(i);
        break;
      }
    }
  }
}

RowDeleting event

This event handler creates a transaction for deleting a row. You cannot delete a row that isn't associated with a table, so the uncommitted collection won't ever contain this row:

protected void OnRowDeleting(object sender, 
                    DataRowChangeEventArgs e)
{
  if (doLogging)
  {
    DataTableTransactionRecord record;
    record = new DataTableTransactionRecord(transactions.Count, e.Row, 
                 DataTableTransactionRecord.TransactionType.DeleteRow);
    transactions.Add(record);
    SaveRowFields(record, e.Row);
    OnTransactionAdded(new TransactionEventArgs(record));
  }
}

Undo / Redo

The transaction logger implements Revert (a single transaction rollback) and Apply (per transaction) that can be used to implement undo/redo capability. The transaction logger does not itself provide a complete undo/redo implementation - this is left to the application, and an example follows.

Revert (undo)

The Revert method performs the following, depending on the transaction type:

For a field change, the old value is restored.
For a row that was added, the row is deleted.
For a row that was deleted, it is re-added, and all previous transactions that reference that row are fixed up to now reference the restored row:

public void Revert(int idx)
{
  // Validate idx.

  if ((idx < 0) || (idx >= transactions.Count))
  {
    throw new ArgumentOutOfRangeException("Idx cannot be " + 
        "negative or greater than the number of transactions.");
  }

  // Turn off logging, as we don't want to record 

  // the undo transactions.

  SuspendLogging();
  DataTableTransactionRecord r = transactions[idx];
  DataRow row = r.Row;

  switch (r.TransType)
  {
    // Delete the row we added.

    case DataTableTransactionRecord.TransactionType.NewRow:
      // Only save row fields if this row is first time deleted.

      if (!r.WasDeleted)
      {
        // Save all the field values for the row being deleted.

        SaveRowFields(r, row);
      }

      // Delete the row.

      row.Delete();
      break;

    // Add the row we deleted.

    case DataTableTransactionRecord.TransactionType.DeleteRow:
      DataRow newRow = sourceTable.NewRow();

      // Restore all the field values into the new row.

      RestoreRowFields(r, newRow);
      sourceTable.Rows.Add(newRow);

      // Fixup transactions referencing the deleted data row, 

      // going backwards

      for (int n = idx; n >= 0; --n)
      {
        if (transactions[n].Row == row)
        {
          transactions[n].Row = newRow;
        }
      }

      break;

    // Undo the change to field.

    case DataTableTransactionRecord.TransactionType.ChangeField:
      row[r.ColumnName] = r.OldValue;
      break;
  }

  ResumeLogging();
}

Apply

Apply (redo) performs the following, depending on the transaction type:

For a field change, the new value is restored.
For a new row, a new row is added. All later references to that row instance are fixed up to now reference the newly added row.
For a deleted row, the row is deleted.

The code is very similar to the Revert method above.

Undo/Redo example

In the demo program, the undo/redo functionality is handled by hooking the logger's TransactionAdding event and the Click events associated with the undo and redo buttons on the UI:

void OnTransactionAdding(object sender, TransactionEventArgs e)
{
  btnRedo.Enabled = false;

  if (undoRow < tlog.Log.Count - 1)
  {
    tlog.Log.RemoveRange(undoRow + 1, 
                     tlog.Log.Count - (undoRow + 1));
    tlog.AcceptChanges(); 
  }
}

void OnUndo(object sender, EventArgs e)
{
  tlog.Revert(undoRow);
  --undoRow;
}

void OnRedo(object sender, EventArgs e)
{
  ++undoRow;
  tlog.Apply(undoRow);
}

The above code has had the dgTransactions management and the button state management removed for clarity. The point of interest here is that when a transaction occurs, any redo transactions forward of the current undo transaction index are removed. Also, the AcceptChanges is called to synchronize the DataTable with the undo changes. If we don't synchronize at this point, a RejectChanges call may redo changes that had been undone.

Accept/Reject changes

A DataTable provides AcceptChanges and RejectChanges methods. There are no events associated with these methods, so instead of using them directly, you should use the ones provided by the transaction logger. The logger's methods manage an internal index that is updated in the AcceptChanges call. If a RejectChanges call is made, all transactions after this point are removed from the transaction list. This keeps the transaction logger synchronized with the DataTable's idea of what has and has not been accepted. Important: Calling RejectChanges while the undo index is not at the top of the undo stack can cause some strange effects.

The implementation illustrates how the transaction log and internal index are managed:

public void AcceptChanges()
{
  lastAcceptedChangeIndex = transactions.Count;
  sourceTable.AcceptChanges();
}

public void RejectChanges()
{
  int numTran = transactions.Count - lastAcceptedChangeIndex;
  transactions.RemoveRange(lastAcceptedChangeIndex, numTran);
  sourceTable.RejectChanges();
}

The demo program

The demo program is a small application with a MyXaml front end (what did you expect, code???) and an event handler back end that allows you to play with undo/redo functionality, accept/reject table changes, setting column and row errors (which was part of the investigation that I did to see how the DataTable events were handled) and clearing the table.

The declarative user interface

The UI is straightforward, declaring the control object graph and the checkbox data binding:

<?xml version="1.0" encoding="utf-8"?>
<!-- (c) 2006 Marc Clifton All Rights Reserved -->
<MyXaml xmlns="System.Windows.Forms, System.Windows.Forms, 
                       Version=2.0.0000.0, Culture=neutral, 
                       PublicKeyToken=b77a5c561934e089"
  xmlns:ctd="Clifton.Tools.Data, Clifton.Tools.Data"
  xmlns:def="Definition"
  xmlns:ref="Reference">
  <Form Name="DataTableUndoRedo"
    Text="DataTable Undo/Redo Demo"
    ClientSize="570, 675"
    MinimizeBox="false" 
    MaximizeBox="false"
    StartPosition="CenterScreen">
    <Controls>
      <Label Text="Edit This Table:" Location="10, 10" 
                                       Size="100, 15"/>
      <DataGrid def:Name="dgTable" Location="10, 25" 
                   Size="280, 300" DataSource="{view}">
        <TableStyles>
          <DataGridTableStyle>
            <GridColumnStyles>
              <DataGridTextBoxColumn MappingName="LastName" 
                        HeaderText="Last Name" Width="120"/>
              <DataGridTextBoxColumn MappingName="FirstName" 
                        HeaderText="First Name" Width="120"/>
            </GridColumnStyles>
          </DataGridTableStyle>
        </TableStyles>
      </DataGrid>
      <Label Text="Transactions:" Location="10, 340" Size="100, 15"/>
      <DataGrid def:Name="dgTransactions" Location="10, 355" Size="550, 300"
                 DataSource="{transactions}" ReadOnly="true"/>
      <Button def:Name="btnUndo" Location="300, 25" Size="80, 25" Text="Undo"
                 Click="{app.OnUndo}"/>
      <Button def:Name="btnRedo" Location="300, 55" Size="80, 25" Text="Redo"
                 Click="{app.OnRedo}"/>
      <Button def:Name="btnAccept" Location="400, 25" Size="100, 25" 
                 Text="Accept Changes" Click="{app.OnAcceptChanges}"/>
      <Button def:Name="btnReject" Location="400, 55" Size="100, 25" 
                 Text="Reject Changes" 
                 Click="{app.On<CODE>RejectChanges</CODE>}"/>
      <CheckBox def:Name="ckColErr" Location="300, 100" Size="200, 20" 
                 Text="Set column error on change"/>
      <CheckBox def:Name="ckRevertCol" Location="325, 120" Size="200, 20" 
                 Text="Revert field value" Enabled="{ckColErr.Checked}"/>
      <CheckBox def:Name="ckRowErr" Location="300, 140" Size="200, 20" 
                 Text="Set row error on change"/>
      <Button Location="300, 170" Size="80, 25" Text="Clear" 
                 Click="{app.OnClear}"/>
      <Button Location="300, 200" Size="80, 25" Text="Collect" 
                 Click="{app.OnCollect}"/>
    </Controls>
    <ctd:BindHelper Source="{ckColErr}" SourceProperty="Checked" 
            Destination="{app}" DestinationProperty="ColumnErrorOnChange"/>
    <ctd:BindHelper Source="{ckRowErr}" SourceProperty="Checked" 
            Destination="{app}" DestinationProperty="RowErrorOnChange"/>
    <ctd:BindHelper Source="{ckRevertCol}" SourceProperty="Enabled" 
            Destination="{ckColErr}" DestinationProperty="Checked"/>
    <ctd:BindHelper Source="{ckRevertCol}" SourceProperty="Checked" 
            Destination="{app}" DestinationProperty="RevertFieldValue"/>
  </Form>
</MyXaml>

Using the demo program, you can see how, as you manipulate the DataGrid, the transactions are added to the log. For example:

When you click on Undo, the row index in the transaction log is moved up, so that it always shows the last applied transaction. For example, clicking on undo once:

To illustrate the new row not added to the DataTable's row collection problem, we can cursor down to the row with the "*":

If we cursor back up, the row reverts to the "*" but the NewRow transaction lingers. We can go back and forth doing this any number of times:

If we click on the Collect button, all those uncommitted new row transactions are removed:

Here's an interesting exception that gets thrown when you attempt to undo an uncommitted, empty row (this is thrown by the framework):

The solution, obviously, is to perform a Collect before the Undo. However, I've specifically omitted any "smarts" in the demo code so that you can play with the demo and see how the .NET Framework interacts with the transaction logger, rather than hiding "gotchas" like this one.

Conclusion

The original concept of logging DataTable transactions seemed like a simple task. However, it was fraught with both incorrect architectural decisions and troublesome implementation. At this point, I think I've achieved the proper balance with regards to the original goal and the complexities of the DataTable in both a UI and application driven environment. I have decided to keep another aspect of the transaction logger, the ability to serialize the transactions and apply them to separate (mirrored and/or remote) DataTable, separate from this implementation. And that implementation will require primary key fields as DataRow instances will be different!

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here