Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles
(untagged)

Simple Persistent Transactional Dictionary in VBScript

0.00/5 (No votes)
12 Jan 2010 1  
Sometimes we only have VBScript, but it would still be nice to have database like features or persistence and transactions.

Aiming for beauty through simplicity

Some time ago, I was working on a project to send emails when events happen. I took an aggressively simple approach: "Make it as simple as possible, then make it simpler."

  • Use VBScript
  • Do not use a database
  • Make it robust and restartable

Part of the solution is a persistent, semi-transactional extension of the Scripting.Dictionary object written in pure VBScript. This post explains how it works and gives the code. This approach would be very simple to implement in other languages like Python, PHP, Ruby, or even C#.

Persistence and serialization

To persist an object from memory, we need to serialize it. That means to take the spread about bits of data in memory and write them out to a file. Files are sequential bytes of data and so the process gets serialization (even though we can actually access files randomly).

One serious challenge when serializing anything from VBScript is the lack of a binary format. The easiest way to serialize and deserialize is to use text streams where a line represents a serialized datum or group of data. The easiest way to encode a group of data into a line is via some delimited character because of VBScript's high speed Join and Split operators. However, simply joining strings is no good because those strings may well have a delimiter or line end characters in them.

Escape() and Unescape() come to the rescue. These functions were originally intended to URL escape strings. URLs are in ASCII (8 bit per character) standard and only allow letters, numbers, and a few other symbols. This means that any string, including Unicode, can be escaped into a string which will never have a line end character and will never have a comma in it. So, we can escape all strings and then group data using comma separated values and put one group per line.

This piece of code demonstrates how Escape and Unescape function with Unicode characters:
' This script give the following output
' %u0100%u0200%u0400%u0800%u1000%u2000%u4000%u8000%uFFFF
' -1
' Which shows the hex escaped character String and that the String when escaped and
' the unescaped is identical to the original. IE the Escape and Unescape methods work
' for unicode.
Dim s,es
s=chrw(256) & chrw(512) & chrw(1024) & chrw(2048) & _
  chrw(4096) & chrw(8192) & chrw(16384) & chrw(32768) & chrw(65535)
es=Escape(s)
WScript.Echo es
es=Unescape(es)
WScript.Echo s=es

Persistence, restartability, and transaction logs

The aim of this development is to create a very simple way of persisting data so that:

  1. Scripts can restart 'knowing' what it did last time.
  2. If a script crashes, or there is some other failure, it can restart where it left off.
  3. Storage of data is in some format where it can be 'looked up'.
  4. We do not have to use some complex database solution like JET or MS SQL Server.

Starting with requirements 3 and 4, I chose to use a Dictionary object. These are standard in VBScript as they come with the Scripting COM object set as "Scripting.Dictionary". Dictionaries allow us to look up data by keys, and they do not require a database. Using the Escape and Unescape techniques I discussed above, we can serialize the values and keys of a Dictionary to a file. That leaves requirements 1 and 2. These, I tackled with a transaction log, coupled with rollback, roll-forward, and commit.

Typically, when a programmer sets out to store something on disk, they write code to serialize the whole of a data-structure at once. This means that if the program crashes in the process of writing the file, then the file is corrupt and none of it is any use. It also means that each time the data is changed, if that change is to be persistent, the whole object has to be serialized to disk again.

A completely different way of thinking about serialization is to store the changes to a data structure to disk. A data structure is the product of all the changes made to it. If a program stores each and every change in exactly the order they occurred, then re-running the changes will recreate the data. This is exactly how many transactional databases pull off their transactional trick. They store an out of date version of the data on disk and then the up to date information is persisted by writing out the changes. I took an even simpler approach and only stored the changes, and so only had the transaction log.

The transaction log approach is great, but it has a few drawbacks:

  1. Every time a change is made to the data, a write has to be made to disk, which is performance sapping.
  2. If several changes are part of a single restartable 'unit of work', then it is incorrect to write only some and not others. I.e., for restartability, we need to be able to choose when to persist the changes and persist them in groups.
  3. For exception handling, if an exception (error) occurs during one of these units of work, it would be nice to be able to rollback the whole unit of work.

This is where rollback, roll-forward, and commit come in. Each time a change is made to my dictionary, a record of the action required to undo the change is recorded in memory, as is a record of the action to do the change. These two records are called rollback and roll-forward. It is only when Commit is called that the roll-forward records are written to disk. If Rollback is called, then the undo actions are read in reverse order and the appropriate actions to undo all the changes in the unit of work (transaction) are taken. Because the system is not multi-threaded, I am loath to call it truly transactional. Also, there is no way to get the FileSystemObject.TextStream object to write through the disk cache, so I call my TransDictionary 'semi-transactional'.

The final issue is that after a long time and a lot of data changes, the transaction log will get rather long. Always recreating the data from the complete list of changes can become inefficient. To help with this, I added a method CreateCleanLog which writes out the minimum log file required to recreate the data. The idea being that one would write out the clean log, back up the old one, and then replace the old with the clean.

Some examples of using TransDictionary

Option Explicit
Const LogFile="C:\Logs\Log.txt"
Const CleanLogFile="C:\Logs\CleanLog.txt"

Dim transDict,rc
Set transDict = New TransDictionary
' If the log file does not exists, TransDictionary will create it
' otherwise it will read it and load the data containted into the
' dictionary
transDict.LoadLog(LogFile)
If transDict.Exists("Ran Count") Then
    rc=transDict.Item("Ran Count")
    rc=rc+1
Else
    rc=1
End If
WScript.Echo "This script had been run " & rc & " times"
' VBScript will naturally convert the number into a String
' when it is stored. This sort of thing will fail if VBScript
' does not know a good way of converting the value to a String
' in which case you will have to do this yourself
transDict.SetValue "Ran Count",rc

' Now the dictionary has the new value but it is not committed
' so we can roll it back
WScript.Echo "The dictionary has the value " & transDict.Item("Ran Count")
transDict.Rollback
WScript.Echo "After rollabck it has the value " & transDict.Item("Ran Count")
transDict.SetValue "Ran Count",rc
' This commits the change and writes it to disk
transDict.Commit
' This writes a clean log file
transDict.CreateCleanLog CleanLogFile

First run:

Output:
  The dictionary has the value 1
  After rollabck it has the value

logFile:
  R,Ran%20Count
  S,Ran%20Count,1

cleanLogFile:
  S,Ran%20Count,1

Second run:

Output:
  This script had been run 2 times
  The dictionary has the value 2
  After rollabck it has the value 1

logFile:
  S,Ran%20Count,2

cleanLogfile:
  R,Ran%20Count
  S,Ran%20Count,1
  R,Ran%20Count
  S,Ran%20Count,2

The Code

Class TransDictionary

    Dim dictionary,logFile
    Dim rollBk,rollFw,lfn,fso,clean
    
    Public Sub Class_Initialize
        Set Me.dictionary=CreateObject("Scripting.dictionary")
        Set Me.rollBk=CreateObject("Scripting.dictionary")
        Set Me.rollFw=CreateObject("Scripting.dictionary")
        Me.clean=TRUE
    End Sub
    
    ' This must be called immediately after the class
    ' is instantiated so that it can read and write its
    ' log file
    Public Sub LoadLog(logFileName)
        Me.lfn=logFileName
        Set Me.fso=CreateObject("Scripting.FileSystemObject")
        Set Me.logFile=Me.fso.OpenTextFile(Me.lfn,1,true)
        While Not Me.logFile.AtEndOfStream
            action Me.logFile.ReadLine()
        Wend
        Me.logFile.Close
        Set Me.logFile=Me.fso.OpenTextFile(Me.lfn,8,false)
    End Sub

    Private Sub Class_Terminate
        On Error Resume Next
        Me.logFile.Close
    End Sub
    
    ' This method takes the appropriate action given a row from
    ' a log file 
    Private Sub action(line)
        Dim row
        row=Split(line,",")
        If row(0)="S" Then
            internalSet Unescape(row(1)),Unescape(row(2))
        ElseIf row(0)="R" Then
            internalRemove Unescape(row(1))
        End If
    End Sub
    
    ' Adds a Remove record to the log file
    Private Sub addRemove(key)
        Me.rollBk.Add Me.rollBk.count,"S," & Escape(key) & _
                      "," & Escape(Me.dictionary.Item(key))
        Me.rollFw.Add Me.rollFw.count,"R," & Escape(key)
    End Sub
    
    ' Adds an Add record to the log file
    Private Sub addSet(key,value)
        Me.rollBk.Add Me.rollBk.count,"R," & Escape(key)
        Me.rollFw.Add Me.rollFw.count,"S," & Escape(key) & "," & Escape(value) 
    End Sub
    
    ' Sets a key,value pair in the internal dictionary. It either adds or replaces
    ' the pair according it if the key is already in the internal dictionary
    Private Sub internalSet(key,value)
        If Me.dictionary.Exists(key) Then Me.dictionary.Remove key
        Me.dictionary.Add key,value
        Me.clean=False
    End Sub
    
    ' Removes a key,value pair from the dictionary
    Private Sub internalRemove(key)
        If Me.dictionary.Exists(key) Then Me.dictionary.Remove key
        Me.clean=False
    End Sub
    
    ' This writes all the changes to the internal dictionary since the
    ' last commit to the log file.
    Public Sub Commit
        Dim i,c
        c=Me.rollFw.Count -1
        For i=0 To c
            Me.logFile.WriteLine Me.rollFw.Item(i)
            ' this is the only way to force a flush of the
            ' text stream object :(
            Me.logFile.Close
            Set Me.logFile=Me.fso.OpenTextFile(Me.lfn,8,false)
        Next 
        Me.rollFw.RemoveAll
        Me.rollBk.RemoveAll
        Me.clean=True
    End Sub
    
    ' This reverts the internal dictionary to the state it was when Commit
    ' was last called - or if Commit has never been called, to the state it
    ' was immediately after having read the log file for the first time
    Public Sub RollBack
        Dim i,c
        c=Me.rollBk.Count -1
        For i=c To 0 Step -1
            action Me.rollBk.Item(i)
        Next 
        Me.rollBk.RemoveAll
        Me.rollFw.RemoveAll
        Me.clean=true
    End Sub
    
    ' This creates a new log file which contains only records to 
    ' recreate the internal dictionary. This cannot be done unless
    ' the internal dictionary is clean (IE no changes since start
    ' or  the last commit). The resultant log file can be used as
    ' a direct replacement for the current log file and so this
    ' can be used to reduce the size and read performance hit of the
    ' log file next time the class is instantiated.
    Public Sub CreateCleanLog(newFileName)
        If Not Me.clean Then 
            Err.Raise -1,"Not Me.clean, commit or rollback first"
        End If
        Me.logFile.Close
        Dim olfn
        olfn=Me.lfn
        Me.lfn=newFileName
        Set Me.logFile=Me.fso.OpenTextFile(Me.lfn,2,true)
        Dim k
        For Each k In Me.dictionary.Keys
            addSet k,Me.dictionary.Item(k)
        Next
        Commit
        Me.logFile.Close
        Me.lfn=olfn
        Set Me.logFile=Me.fso.OpenTextFile(Me.lfn,8,false)        
    End Sub
    
    ' This method adds or replaces a key value pair in the internal
    ' dictionary. The change will not be reflected in the log file
    ' until a Commit is made. Rollback will remove the change unless
    ' a Commit is called and the change will not be persisted until a
    ' Commit is made.
    Public Sub SetValue(key,value)
        If Me.dictionary.Exists(key) Then
            addRemove key
        End If
        addSet key,value
        internalSet key,value
    End Sub

    ' This method removes all key value pairs from the internal
    ' dictionary. The change will not be reflected in the log file
    ' until a Commit is made. Rollback will remove the change unless
    ' a Commit is called and the change will not be persisted until a
    ' Commit is made.
    Public Sub RemoveAll
        Dim k
        For Each k In Me.dictionary.Keys
            Remove(k)
        Next
    End Sub
    
    ' This method removes a key value pair from the internal
    ' dictionary. The change will not be reflected in the log file
    ' until a Commit is made. Rollback will remove the change unless
    ' a Commit is called and the change will not be persisted until a
    ' Commit is made.
    Public Sub Remove(key)
        If Me.dictionary.Exists(key) Then 
            addRemove(key)
            internalRemove(key)
        End If
    End Sub
    
    ' This returns an array of all the values in the internal dictionary
    Public Function Items()
        Items=Me.dictionary.Items
    End Function
    
    ' This removes the value associated with passed key in the internal
    ' dictionary or NULL if the key is not present.
    Public Function Item(key)
        Item=Me.dictionary.Item(key)
    End Function
    
    ' This returns an array of all the keys in the internal dictionary
    Public Function Keys()
        Keys=Me.dictionary.Keys
    End Function
    
    ' This returns true if the passed key is in the internal dictionary
    ' and false otherwise.
    Public Function Exists(key)
        Exists=Me.dictionary.Exists(key)
    End Function
        
End Class

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.

A list of licenses authors might use can be found here