My boss is trying to split a comma-delimited string with a regex. He wants to split on a comma followed by whitespace, but keep any value surrounded by single or double quotes as one whole string, ignoring any commas between the quotes.
Here’s the regex:
string sToken = @"(?:,\s+)|(['""].+['""])(?:,\s+)";
Here’s a sample string:
var s = "1.3#, 2.99, 3\t, 4#2/2/1019#, 5, asd,, 'Howdy, Howdy, Howdy', a;sdlkf";
Results:
1.3#
2.99
3\t
4#2/2/1019#
5
asd,

'Howdy, Howdy, Howdy'
a;sdlkf
The blank line between "asd," and "'Howdy, Howdy, Howdy'" is the issue. I believe I understand why it's showing up, but I don't know what regex magic I need to prevent it. I think the regex processor finds the ", " after "asd," and splits "asd," out. It then finds another match (the "'Howdy, Howdy, Howdy'", which starts immediately after that ", ") and splits out everything between the ", " and the "'Howdy, Howdy, Howdy'" (an empty string). Note that the two commas after "asd" are not the problem: removing one of them gives the same results, except that "asd," becomes "asd" (as expected). Putting a space between the commas gives us "asd" followed by two blank lines instead of one.
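In case it helps, here is a minimal sketch that reproduces what we're seeing. I'm assuming the pattern is applied with a plain Regex.Split call; the SplitRepro class and SplitSample method are names I made up just for this repro, and the brackets are only there so the empty element is visible in the console.

```csharp
using System;
using System.Text.RegularExpressions;

static class SplitRepro
{
    // Hypothetical helper for this repro: splits the sample string with the pattern above.
    public static string[] SplitSample()
    {
        string sToken = @"(?:,\s+)|(['""].+['""])(?:,\s+)";
        var s = "1.3#, 2.99, 3\t, 4#2/2/1019#, 5, asd,, 'Howdy, Howdy, Howdy', a;sdlkf";

        // Regex.Split also returns whatever capture group 1 matched,
        // which is how the quoted value ends up as its own element.
        return Regex.Split(s, sToken);
    }

    static void Main()
    {
        foreach (string part in SplitSample())
        {
            // An empty element prints as [].
            Console.WriteLine("[" + part + "]");
        }
    }
}
```

As far as I can tell, the [] line comes from the text between the end of the ", " match and the start of the quoted match, which is zero characters long.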
I'm a rank amateur in the world of regular expressions, but my boss came to me because he had been told that I might know something about them (the squealer who told him this has been suitably punished). :)
Anyway, I would appreciate any assistance in resolving this.
To clarify, what we're looking for as output would be the following:
Desired Results:
1.3#
2.99
3\t
4#2/2/1019#
5
asd,
'Howdy, Howdy, Howdy'
a;sdlkf
The "'Howdy, Howdy, Howdy'" would be a single entry, without the blank line before it. If, however, our sample looked like this (note the space between the two commas after "asd"):
var s = "1.3#, 2.99, 3\t, 4#2/2/1019#, 5, asd, , 'Howdy, Howdy, Howdy', a;sdlkf";
Desired Results:
1.3#
2.99
3\t
4#2/2/1019#
5
asd

'Howdy, Howdy, Howdy'
a;sdlkf
(note the blank line representing the empty value between the two commas, and the lack of a comma at the end of "asd")