Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / desktop / Win32

Recursive VBA JSON Parser for Excel

4.81/5 (13 votes)
24 Oct 2014CPOL4 min read 93.7K   4.8K  
JSON parser for VBA class module, allows recursive data

Introduction

This is a VBA class that can parse recursive JSON data.

Background

I needed to parse JSON data that was highly recursive (Arrays in objects buried in other objects). I didn't find anything suitable already in VBA so I put something together. I thought the result was worthy of sharing.

The class can be used to load JSON data directly from the web, although the sample code loads data to a file first to allow the JSON data to be compared with the parsed data. Parsed data is returned in arrays.

Using the Code

The parser is passed the JSON text in a string through the "loadstring" method. The text is then parsed into two arrays; one containing keys, the other containing values. A ">" character is used to indicate a level of key recursion. Each time a "{" or "[" control character is detected, the level is increased. Each time a "}" or "]" control character is detected, the level decreases.

The following JSON string:

{
    "Category": "Famous Pets",
    "Pet": {
        "Size": "Little",
        "Type": "Lamb",
        "Coat": {
            "Fur": "Fleece",
            "Color": "White",
            "Texture": "Like snow"
        }
    }
}

Is then parsed into a set of keys and values:

Key(1) = ">Category" Value(1) = "Famous Pets"
Key(2) = ">Pet>Size" Value(2) = "Little"
Key(3) = ">Pet>Type" Value(3) = "Lamb"
Key(4) = ">Pet>Coat>Fur" Value(4) = "Fleece"
Key(5) = ">Pet>Coat>Color" Value(5) = "White"
Key(6) = ">Pet>Coat>Texture" Value(6) = "Like snow"

Once the JSON text has been parsed, a number of properties are used to read the parsed text. The number of key/value pairs is read using the "NumElements" property (in the example above, NumElements = 6). The indexed "Key" and "Value" properties then hold the parsed information. Keys and values are returned from the class as strings. Null values are returned as zero length strings.

The class also contains an "err" (status) property to give some indication as to what's going on inside the class. Err is a long value:

1 = JSON string has been successfully parsed
-1 = JSON string has not been loaded, no results are available
-2 = JSON string cannot be correctly parsed (JSON text not fully or incorrectly formed)

The following code then makes up the class:

Global variables allow the parsed JSON text to be persistent and read out through properties of the class:

VB.NET
Private strKey As Variant
Private strVal As Variant
Private intHMax As Integer
Private lngStatus As Long  

The following function initializes the class. A status variable is set to indicate that no JSON data has been parsed, this variable is then updated during the parsing of the text.

VB.NET
Private Sub Class_Initialize()
    lngStatus = -1
End Sub  

The "NumElements" property allows the user to determine the number of key/value pairs that have been extracted from the JSON text:

VB.NET
Public Property Get NumElements() As Integer
    NumElements = intHMax
End Property

The keys and values are then made available as array elements. VBA uses a variant type to pass an array value. Array elements that are 'out of bounds' are returned as a zero length string:

VB.NET
Public Property Get Key(Index As Integer) As Variant
    If Index > UBound(strKey) Or Index < LBound(strKey) Then
        Key = ""
    Else
        Key = strKey(Index)
    End If
End Property

Public Property Get Value(Index As Integer) As Variant
    If Index > UBound(strVal) Or Index < LBound(strVal) Then
        Value = ""
    Else
        Value = strVal(Index)
    End If
End Property

The heart of the class is the code that parses the JSON string. The code looks for the next available control character in the JSON text, captures the text before the control character and shortens the initial string to the remaining JSON text. The captured text is then used to populate the keys and values. The process repeats until the initial string has been reduced to nothing. The parser also allows verbose debug data sent to the immediate window by setting: blDebug = True.

VB.NET
Public Sub LoadString(JSONText As String)
'Load the JSON text into an array

    Const cLongMax = (2 ^ 31) - 1 'Maximum Value for Long type
    
    Dim lngIndex As Long
    Dim lngContLoc As Long
    Dim lngLoc As Long
    Dim lngDelimitOffset As Long
    Dim lngASize As Long
    
    Dim intNoOfChecks As Integer 'Number of different control characters in JSON
    Dim intCheck As Integer
    Dim intCtrlChr As Integer
    Dim intObJLvl As Integer
    Dim intAryElement As Integer
    Dim intLvl As Integer
    
    Dim strID As String
    Dim strChr As String
    Dim strKeyValue As String
    Dim strValue As String
    Dim strPChar As String
    Dim strFoundVal As String
    Dim strTempString As String
    Dim strAKey() As String
    Dim strAVal() As String
    Dim strALvlKey(100) As String
    
    Dim blArray As Boolean 'Flag to indicate that an array has been found
    Dim blStringArray As Boolean 'Flag to indicate that the element in the array is a string (added v1.1)
    Dim BlArrayEnd As Boolean 'Flag to indicate that the end of an array is found (added v1.1)
    Dim blValue As Boolean 'Falg to indicate that a value has been found
    Dim blKeyAndValue As Boolean 'Found a key and value pair
    Dim blDebug As Boolean
    
    'Set the flag to true if you want to see debug information
    'during the loading process
    blDebug = True
    
    On Error GoTo ErrHandler:
    
    lngASize = 10
    ReDim strAKey(lngASize)
    ReDim strAVal(lngASize)
    
    'intArrayElement = 1 'initialize value
    'initialize values
    blArray = False
    BlArrayEnd = False '(added v1.1)
    blStringArray = False '(added v1.1)
    
    'Generate a string of control characters
    'String is {[:,]}"
    strID = ""
    strID = strID & Chr(123) 'The '{' character
    strID = strID & Chr(91)  'The '[' character
    strID = strID & Chr(58)  'The ':' character
    strID = strID & Chr(44)  'The ',' character
    strID = strID & Chr(93)  'The ']' character
    strID = strID & Chr(125) 'The '}' character
    strID = strID & Chr(34)  'The '"' character
    
    intNoOfChecks = Len(strID)
    intObJLvl = 0
    lngIndex = 1 'First element in the array will be strKey(1) and strVal(1)
    
    'As we process the JSON string it becomes shorter and shorter, until
    'its all been processed
    Do While Len(JSONText) > 0

        'Set to maximum value as default
        lngContLoc = cLongMax
        
        'Find Next control character:
        'Scan the text for the closest control character
        'to the beginning of the remaining JSON text
        For intCheck = 1 To intNoOfChecks
        
       
            strChr = Mid(strID, intCheck, 1)
            lngLoc = InStr(1, JSONText, strChr, vbBinaryCompare)
        
            If (lngLoc > 0) And (lngLoc < lngContLoc) Then
                lngContLoc = lngLoc
                intCtrlChr = intCheck
                strPChar = strChr
            End If
        
        Next intCheck
        
        'When the above for next loop ends we will have found the closest control character
        'stored in intCtrlChr - an index (1 to 8) to the found character in strChr
        'stored in lngContLoc - position of the next control character
        'stored in strPChar - the closest next control character
        
        If blDebug = True Then
            Debug.Print "Parse Character: " & strPChar
        End If
        
        'A control character has been found, figure out what to do by the found character
        If lngContLoc < cLongMax Then
         'Capture the information before the control character
         strValue = Mid(JSONText, 1, lngContLoc - 1)
         'Capture everything after the control character (the remaining JSON string)
         JSONText = Mid(JSONText, lngContLoc + 1, Len(JSONText))
        Else
            'We found the end of the JSON string
            Exit Do
        End If
        
        'Found a number or boolean value or key (the comma)
        'Updated in v1.1 to handle number types in array (process value as string or number; not both)
        If (intCtrlChr = 4) Then
          If ((blValue = True) Or (blArray = True)) And (blStringArray = False) Then
            'Found a value, and we already have key
            strFoundVal = fnStringToVal(strValue)
            blKeyAndValue = True 'Set the "Key and value found" flag
          End If
          'Finding a comma resets the string found in the array
           blStringArray = False
        End If
        
        'Start of object (The "{" character)
        If intCtrlChr = 1 Then
            intObJLvl = intObJLvl + 1
            blArray = False 'An object, not an array
            blValue = False 'Need to find a key first
            If blDebug = True Then
                Debug.Print "Start of Object, Moved up to level" & intObJLvl
            End If
        End If
        
        'End of of object (The "}" character)
        If intCtrlChr = 6 Then
            'Updated in Revision 1.1
            'Numbers preceded by the "}" character
            If blValue = True Then
                'Get the found value and set a flag
                strFoundVal = fnStringToVal(strValue)
                blKeyAndValue = True 'Set the "Key and value found" flag
                'Add back a "}" character to the string so that the level can be decremented properly
                JSONText = "}" & JSONText
            Else
                'No value was found, the "}" character indicates the end of this level
                intObJLvl = intObJLvl - 1
                blValue = False 'Need to find a key first
            End If
            If blDebug = True Then
                Debug.Print "End of Object, Moved down to level" & intObJLvl
            End If
        End If
        
        'Start of array (The "[" character)
        If intCtrlChr = 2 Then
            'intObJLvl = intObJLvl + 1
            'strALvlKey(intObJLvl) = intArrayElement
            blArray = True
            blValue = True 'Next thing should be a value
            intAryElement = 1
            If blDebug = True Then
                Debug.Print "Start of Array, Moved up to level" & intObJLvl
            End If
        End If
        
        'End of of array (The "]" character)
        If intCtrlChr = 5 Then
            'Updated v1.1 parse last numeric or boolean value of an array
            If (blArray = True) And (blStringArray = False) Then
                'Get the found value and set a flag
                strFoundVal = fnStringToVal(strValue)
                blKeyAndValue = True 'Set the "Key and value found" flag
            End If
                BlArrayEnd = True 'Mark that the end of the array is found
                blArray = False
                blValue = False 'Need to find a key first
            If blDebug = True Then
                Debug.Print "End of Array, Moved down to level" & intObJLvl
            End If
        End If
        
        'Object Value start is found (The ":" character)
        If intCtrlChr = 3 Then
            blValue = True
            BlArrayEnd = False 'Added v1.1, start of an object value is not the end of an array
            If blDebug = True Then
                Debug.Print "ready to get value"
            End If
        End If
        
        'Start of a string (the quote " character)
        'Can be a key or value
        If intCtrlChr = 7 Then
        
            'The start of the key or value has been found
            'The next quote will end the key or value
            '(unless the quote has an escape character in front of it "\")
            
            lngDelimitOffset = 1
          
            Do
                'Look for the next quote character
                lngLoc = InStr(lngDelimitOffset, JSONText, Chr(34), vbBinaryCompare)
                
                'If the string is zero length "" then exit the loop
                If lngLoc = 1 Then
                    Exit Do
                End If
            
                'Check to see if there is a delimter just before the quote
                'if there is then quote is part of the string and not the end of
                'the string.
                If Mid(JSONText, lngLoc - 1, 1) = Chr(92) Then
                    ' The quote character has an escape character in front of it
                    'so this quote doesn't count.  Remove the escape character.
                    JSONText = Mid(JSONText, 1, lngLoc - 2) & Mid(JSONText, lngLoc, Len(JSONText))
                    'and move the start of the check past the delimited quote
                    lngDelimitOffset = lngLoc
                    
                    'If we have a boogered JSON string where there is no valid closing quotes
                    'the above "if" will cause an error (the MID statement will attempt to check
                    'the string starting at a position of -1) and the code will jump to the error
                    'handling section.  If this error didn't occur the do..loop would get stuck.
    
                Else
                    Exit Do
                End If
            Loop
            
            'We now have a string, find any other delimiters
            '(any delimited " characters have already been fixed)
            strTempString = fnStringFix(Mid(JSONText, 1, lngLoc - 1))
            
            If (blValue = True) Or (blArray = True) Then
                'The key has been previously found and this is the value for the key
                strFoundVal = strTempString
                blKeyAndValue = True 'Set the "Key and value found" flag
                If blArray = True Then
                    blStringArray = True 'Added v1.1, mark that the value is a string
                End If
            Else
                If lngLoc > 0 Then
                    'We've found a key
                    strALvlKey(intObJLvl) = strTempString
                    If blDebug = True Then
                        Debug.Print "Found Key:" & strALvlKey(intObJLvl) & _
                                    " for Level: " & intObJLvl
                    End If
                End If
            End If
            JSONText = Mid(JSONText, lngLoc + 1, Len(JSONText))
        End If
        
        
        'Found a key and value, move it to the array
        If blKeyAndValue = True Then
        
            If lngIndex > lngASize Then
                lngASize = lngASize + 100
                ReDim Preserve strAKey(lngASize)
                ReDim Preserve strAVal(lngASize)
            End If
        
            strAKey(lngIndex) = ""
            For intLvl = 1 To intObJLvl
                strAKey(lngIndex) = strAKey(lngIndex) & ">" & strALvlKey(intLvl)
            Next intLvl
            
            'Updated v1.1 - save last element of an array
            If (blArray = True) Or (BlArrayEnd = True) Then
                'add the array element to the key
                strAKey(lngIndex) = strAKey(lngIndex) & ">" & Trim(str(intAryElement))
                'increment the array element
                intAryElement = intAryElement + 1
                'Reset end of array flag (set again when array end is found)
                BlArrayEnd = False
            End If
            
            strAVal(lngIndex) = strFoundVal
            If blDebug = True Then
                Debug.Print "Added Key:" & strAKey(lngIndex) & _
                " Value: " & strAVal(lngIndex) & " index: " & lngIndex
            End If
            lngIndex = lngIndex + 1 'Increment the array
            blKeyAndValue = False 'Reset the "found" flag
            blValue = False 'Reset the "Value Found" flag
        End If
    DoEvents
    Loop
    
    'Number of items found
    intHMax = lngIndex - 1
    strKey = strAKey
    strVal = strAVal
    lngStatus = 1 'JSON sucessfully parsed
Exit Sub
ErrHandler:
    
    'Error handling code
    lngStatus = -2 'JSON Parse error
    'Uncomment the next line to figure out the cause of the issue
    'Debug.Print VBA.err.Number
    'Debug.Print VBA.err.Description
    'Resume
    
End Sub

The values are stripped of any non-text formatting. Values should be numbers (integer, floating point, or "null"). Values read as 'null' are further converted into a zero length string.

VB.NET
Private Function fnStringToVal(strInStr As String) As String
'Converts a string that contains formatting information into a string that only
'contains a value.  Values can be text, integer, or floating point values.
'null is passed back as a zero length string: "".

    Dim intStrPos As Integer
    Dim strTemp As String
    Dim intChar As Integer
    
    'default value
    strTemp = ""
    
    'Make sure that the string does not have a zero length
    strInStr = " " & strInStr
    
    'Loop through each character in the string and remove anything
    'that is not alphanumeric.
    For intStrPos = 1 To Len(strInStr)
        intChar = Asc(Mid(strInStr, intStrPos, 1))
        
        If ((intChar >= Asc("a")) And (intChar <= Asc("z"))) Or _
           ((intChar >= Asc("A")) And (intChar <= Asc("Z"))) Or _
           ((intChar >= Asc("0")) And (intChar <= Asc("9"))) Or _
           (intChar = Asc(".")) Or (intChar = Asc("+")) Or (intChar = Asc("-")) Then
           
           strTemp = strTemp & Chr(intChar)
        End If
    
    Next intStrPos
    
    'Values that are listed as 'null' are converted to a zero length string
    If InStr(1, "null", strTemp, vbTextCompare) > 0 Then
        strTemp = ""
    End If
    
    fnStringToVal = strTemp

End Function

Finally, JSON supports a number of escape codes. This function looks through the passed string and performs the requested escape sequence. While VBA strings support Unicode characters, other parts of Microsoft Excel are more random. Parsed text that is sent to cells or message boxes may not behave as expected and may require further processing.

VB.NET
Private Function fnStringFix(strInput As String) As String
'This function goes through a JSON string and corrects delimited characters

Dim blParseComplete As Boolean
Dim lngStartPos As Long
Dim lngCurrentPos As Long

blParseComplete = False
lngStartPos = 1

Do While blParseComplete = False
    blParseComplete = True 'If we don't find any escape sequences then allo the loop to end
    
    'Escaped sequence: replace \\ with \
    'look for the specific escape sequence
    lngCurrentPos = InStr(lngStartPos, strInput, "\\", vbTextCompare) 
    If lngCurrentPos > 0 Then
        strInput = Mid(strInput, 1, lngCurrentPos - 1) & "\" & _
                    Mid(strInput, lngCurrentPos + 2, Len(strInput))
        blParseComplete = False 'set the status to check for another escape
    End If

    'Escaped sequence: replace \/ with /
    'look for the specific escape sequence
    lngCurrentPos = InStr(lngStartPos, strInput, "\/", vbTextCompare) 
    If lngCurrentPos > 0 Then
        strInput = Mid(strInput, 1, lngCurrentPos - 1) & "/" & _
                    Mid(strInput, lngCurrentPos + 2, Len(strInput))
        blParseComplete = False 'set the status to check for another escape
    End If

    'Escaped sequence: replace \b with a backspace
    'look for the specific escape sequence
    lngCurrentPos = InStr(lngStartPos, strInput, "\b", vbTextCompare) 
    If lngCurrentPos > 0 Then
        strInput = Mid(strInput, 1, lngCurrentPos - 1) & Chr(8) & _
                    Mid(strInput, lngCurrentPos + 2, Len(strInput))
        blParseComplete = False 'set the status to check for another escape
    End If
    
    'Escaped sequence: replace \f with a formfeed
    'look for the specific escape sequence
    lngCurrentPos = InStr(lngStartPos, strInput, "\f", vbTextCompare) 
    If lngCurrentPos > 0 Then
        strInput = Mid(strInput, 1, lngCurrentPos - 1) & Chr(12) & _
                     Mid(strInput, lngCurrentPos + 2, Len(strInput))
        blParseComplete = False 'set the status to check for another escape
    End If

    'Escaped sequence: replace \n with a newline
    'look for the specific escape sequence
    lngCurrentPos = InStr(lngStartPos, strInput, "\n", vbTextCompare) 
    If lngCurrentPos > 0 Then
        strInput = Mid(strInput, 1, lngCurrentPos - 1) & Chr(10) & _
                    Mid(strInput, lngCurrentPos + 2, Len(strInput))
        blParseComplete = False 'set the status to check for another escape
    End If

    'Escaped sequence: replace \r with a carriage return
    'look for the specific escape sequence
    lngCurrentPos = InStr(lngStartPos, strInput, "\r", vbTextCompare) 
    If lngCurrentPos > 0 Then
        strInput = Mid(strInput, 1, lngCurrentPos - 1) & Chr(13) & _
                    Mid(strInput, lngCurrentPos + 2, Len(strInput))
        blParseComplete = False 'set the status to check for another escape
    End If

    'Escaped sequence: replace \t with a horizontal tab
    'look for the specific escape sequence
    lngCurrentPos = InStr(lngStartPos, strInput, "\t", vbTextCompare) 
    If lngCurrentPos > 0 Then
        strInput = Mid(strInput, 1, lngCurrentPos - 1) & Chr(9) & _
                    Mid(strInput, lngCurrentPos + 2, Len(strInput))
        blParseComplete = False 'set the status to check for another escape
    End If

    'Escaped sequence: replace \uXXXX with a unicode character
    'look for the specific escape sequence
    lngCurrentPos = InStr(lngStartPos, strInput, "\u", vbTextCompare) 
    If lngCurrentPos > 0 Then
        strInput = Mid(strInput, 1, lngCurrentPos - 1) & _
                    ChrW$(CLng("&h" & Mid(strInput, lngCurrentPos + 2, 4))) & _
                    Mid(strInput, lngCurrentPos + 6, Len(strInput))
        blParseComplete = False 'set the status to check for another escape
    End If

Loop

fnStringFix = strInput
End Function

Points of Interest

Keys for array elements are numbered when they generate a value. In the example below, the keys indicate the position of the value within the array.

{
    "prices": {
        "USD": [
            [1,"1.25"],
            [25,"1.17"],
            [50,"0.95"]
            ],
        "EUR": [
            [1,"0.98"],
            [25,"0.92"],
            [50,"0.74"]
            ]
    }
}

The JSON above is parsed into the following keys and values:

Key(1) = ">prices>USD>1" Value(1) = "1"
Key(2) = ">prices>USD>2" Value(2) = "1.25"
Key(3) = ">prices>USD>1" Value(3) = "25"
Key(4) = ">prices>USD>2" Value(4) = "1.17"
Key(5) = ">prices>USD>1" Value(5) = "50"
Key(6) = ">prices>USD>2" Value(6) = "0.95"
Key(7) = ">prices>EUR>1" Value(7) = "1"
Key(8) = ">prices>EUR>2" Value(8) = "0.98"
Key(9) = ">prices>EUR>1" Value(9) = "25"
Key(10) = ">prices>EUR>2" Value(10) = "0.92"
Key(11) = ">prices>EUR>1" Value(11) = "50"
Key(12) = ">prices>EUR>2" Value(12) = "0.74"

The downloadable demo is setup to load one of three URLs to a text file using the “URL to File” buttons. I’ve added some JSON sources as reference. The “Parse File” button then loads these text files and parses the JSON data into Sheet2 of the spreadsheet.

History

  • Rev 1.0 - Initial release
  • Rev 1.1 - Updated to correctly parse numeric values that are followed by the ']' or '}' control characters. These values are ignored in the initial release.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)