How can I scrub a collection of data?
I am working with existing VB.NET code for a Windows Application that uses StreamWriter and Serializer to output an XML document of transaction data. Code below.

Private TransactionFile As ProjectSchema.TransactionFile
Dim Serializer As New Xml.Serialization.XmlSerializer(GetType(ProjectSchema.TransactionFile))
Dim Writer As TextWriter
Dim FilePath As String
Writer = New StreamWriter(FilePath)
Serializer.Serialize(Writer, TransactionFile)
Writer.Close()


The XML document is being uploaded to another application that does not accept "crlf".

The "TransactionFile" is a collection of data in a Class named ProjectSchema.TransactionFile. It contains various data types.
Five functions create the nodes that together make up the master transaction file, TransactionFile.

I need to find CRLF characters in the collection of data and replace the CRLF characters with a space.

I am able to replace illegal characters at the field level with:

.Name = Regex.Replace((Mid(CustomerName.Name, 1, 30)), "[^A-Za-z0-9\-/]", " ")


But I need to scrub the entire collection of data.

If I try:

TransactionFile = Regex.Replace(TransactionFile, "[^A-Za-z0-9\-/]", " ")


I get a "Conversion from type 'Transaction' to type 'String' is not valid" message.
Updated 2-Jun-11 10:18am
Comments
Sergey Alexandrovich Kryukov 2-Jun-11 16:38pm    
All right, but I don't see a problem. Of course TransactionFile is not a string.
Show this class, and think about how to parse a string into an instance of it...
What are your data, design, and code, and why is it all a problem?
--SA
gspeedtech 2-Jun-11 17:54pm    
The data is customer information. It is collected periodically and transformed into an XML file. The problem is that CRLF characters occur at random throughout the data, and the application that uploads the XML file does not accept them.

I am looking for a way to remove undesirable characters from the entire TransactionFile.
gspeedtech 3-Jun-11 16:47pm    
If I try:

Dim TransactionData as datarow
For Each TransactionData In z_Transaction
Regex.Replace(TransactionData, "vbCrLf", " ")
Next
End If

I get a "datarow cannot be converted to a String" error on the "Replace" line of code.

If I try
Dim TransactionData as String

I get a similar conversion error.

Instead of attacking the entire TransactionFile, I decided to focus on each node that contributed to the TransactionFile.

For example, I decided to go through the value of each property in the custom class Location.

I was able to accomplish this through reflection:

VB
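' Requires: Imports System.Text.RegularExpressions
Public Sub ScrubData(ByVal STransaction As Object)
    ' The caller passes a ProjectSchema.Location; the properties are changed in place.
    Dim sTable As ProjectSchema.Location = DirectCast(STransaction, ProjectSchema.Location)

    For Each p As System.Reflection.PropertyInfo In sTable.GetType().GetProperties()
        ' Only readable, writable, non-indexed String properties can be scrubbed.
        If p.PropertyType Is GetType(String) AndAlso p.CanRead AndAlso p.CanWrite _
           AndAlso p.GetIndexParameters().Length = 0 Then
            Dim value1 As String = CStr(p.GetValue(sTable, Nothing))
            If Not String.IsNullOrEmpty(value1) Then
                ' Regex.Replace returns a new string, so the result must be written back.
                p.SetValue(sTable, Regex.Replace(value1, "[^A-Za-z0-9\-/]", " "), Nothing)
            End If
        End If
    Next
End Sub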
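
The pattern "[^A-Za-z0-9\-/]" replaces every character other than letters, digits, hyphens, and slashes, so periods, commas, and other punctuation are lost as well. If only the line breaks need to go, a narrower pattern may be enough. The following is a minimal sketch, not part of the original code: the module name LineBreakSketch and the helper ReplaceLineBreaks are hypothetical, and the sample address is made up for illustration.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module LineBreakSketch
    ' Hypothetical helper: replaces each CRLF pair, lone CR, or lone LF
    ' with a single space and leaves every other character untouched.
    Function ReplaceLineBreaks(ByVal value As String) As String
        If String.IsNullOrEmpty(value) Then Return value
        ' "\r\n" is listed first so a CRLF pair becomes one space, not two.
        Return Regex.Replace(value, "\r\n|\r|\n", " ")
    End Function

    Sub Main()
        ' Prints: 12 Main St. Suite 4
        Console.WriteLine(ReplaceLineBreaks("12 Main St." & vbCrLf & "Suite 4"))
    End Sub
End Module
```

A helper like this could be called in place of the Regex.Replace call inside the reflection loop, assuming the field-level length and character rules are enforced elsewhere.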
 
 
You would need to loop through each item.
The collection itself is an object, but so is each item in it, so the replacement has to be applied item by item.

A simple for each loop should do the trick.
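
As a rough illustration of such a loop, the sketch below walks a list of mixed objects and rewrites only the items that are strings. It assumes the data can be reached as an IList(Of Object), which may not match the real ProjectSchema classes; the module name ScrubCollectionSketch, the helper ScrubStrings, and the sample values are all hypothetical. An indexed For loop is used instead of For Each because a List cannot be modified while it is being enumerated.

```vb
Imports System
Imports System.Collections.Generic
Imports Microsoft.VisualBasic

Module ScrubCollectionSketch
    ' Hypothetical helper: replaces line breaks in every String item, in place.
    Sub ScrubStrings(ByVal items As IList(Of Object))
        For i As Integer = 0 To items.Count - 1
            ' TryCast yields Nothing for non-String items, so they are skipped.
            Dim text As String = TryCast(items(i), String)
            If text IsNot Nothing Then
                items(i) = text.Replace(vbCrLf, " ").Replace(vbCr, " ").Replace(vbLf, " ")
            End If
        Next
    End Sub

    Sub Main()
        Dim items As New List(Of Object) From {"Acme" & vbCrLf & "Corp", 42, #6/2/2011#}
        ScrubStrings(items)
        ' Prints: Acme Corp
        Console.WriteLine(items(0))
    End Sub
End Module
```

The TryCast check is one way to target only the String objects; items of any other type are left as they are.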
 
 
Comments
gspeedtech 3-Jun-11 12:00pm    
Thanks for the reply,

My problem is, according to my limited knowledge, "Replace" only works on String objects.
How do you For/Each through a collection of Objects and target only the String objects?

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)


