Introduction
I should confess: I'm a C/C++ object-oriented guy that came to SQL Server late in my career. You dudes who had followed similar path knows what it means. Having met the standard template library, programming stored procedures is like returning to the times of Basic before Visual Basic, or Pascal before Turbo Pascal... But you know, it is not your decision to change an organization's culture. Just adapt and contribute with the best you can offer from your experience.
One of the first things annoyed me from Sql Server stored procedures was the total lack of support for using collections or list as parameters. I could only remember Perl... SELECT statement in principle returns a list so why I can not input a list too? Not to mention why a stored procedure can not capture the list generated by a called stored procedure...? I know in recent versions of Sql Server stored procedures support table type as parameters. I have not tested this feature yet. But for people like me who are stuck at Sql Serveer 2005 this article can result useful. Even if you are using a post-2005 sql server version you can happen on code handling such a list, or may be you like the quick and dirty approach of the lousely-typedness of a comma delimited list.
So I came to the common workaround of using a Varchar parameter to pass a comma-separated string of the stringized items. The problem was to parse the list. First time dealing with the parsing came fast to my mind the useful strtok() function, from the good old C times. After stumbling for the second time with the problem I decided to write my own sql version of strtok()
Some ANSI C Background
strtok() is a very useful C function from the core string library. You pass as parameters the target string and the expected separator and you get the substring up to the first appareance of the separator. It is the so called "token".
s = strtok("one,two,three", ',')
will return "one". The best part comes from the fact that subsequent calls to the function return the following tokens "two", "three", and finally null to signal there is no more tokens. This make strtok() suitable for being used inside a loop
s = strtok("one,two,three", ',');
while (s != null)
{
s = strtok(null, ',');
}
Note just only in the first call you specify the input string. In the subsequent calls you pass null to indicate to use the same string and look for the next token... It sounds a little weird. Actually strtok() modifies the input string and stores internally the state of the search sequence just to know where to start the search for the next token. I dare not to say it is a bad programming practice (Oh man, it is standard ANSI C library!). Just we will change the function signature for our sql version. Anyway we can not use local static variables nor pointer trickiness in our Transact SQL code.
Our Transact SQL version
Then our trumpeted function will look like this
CREATE FUNCTION strtok(
@sStringToParse varchar(max)
,@sSeparator char(1)
,@nPosition int Output
)
RETURNS varchar(max)
It returns a substring of @sStringToParse from @nPosition up to the first appareance of @sSeparator or the end of string if no separator is found. If @nPosition is at the end or after the end of @sStringToParse it will return null. @sStringToParse will not be modified so we can use a table column as argument without the fear of it being altered by a strtok() side effect. The @nPosition will return the position after the last separator found. For the beginning you just provide 1 as argument to start scanning from the start of string (remember in Sql string start from positio 1). All this looks very good BUT... Transact Sql functions does not accepts output parameters.
Ok. We can turn our function into a stored procedure like
CREATE PROCEDURE strtok(
@sStringToParse varchar(max)
,@sSeparator char(1)
,@sToken varchar(max) Output
,@nPosition int Output
)
But functions in Transact Sql are far more flexible than stored procedures. They have the benefit that they can be used in expressions, even as part of queries or other data manipulation operations. And remember you dudes, we are C/C++ guys just transcoding a C function, we can not convert it into a procedure, it sounds like a sin.
So lets explore another solution. Our problem is the following. We need to find a way to return two values from a Transact SQL function. Then why not to use a table valued function? It can returns one row with our two needed values
CREATE Function strtok (
@sStringToParse varchar(max)
,@sSeparator char(1)
,@nStartPosition int
)
Returns @tbResult Table (
Token Varchar(max)
,NextStartPosition int
)
As
...
Then we can use it in a loop like
Declare @sList Varchar(max)
Declare @sToken Varchar(max)
Declare @nPos Int
Set @sList = 'one, two, three'
Set @nPos = 1
Set @sToken = ''
While @sToken Is Not Null
Begin
Select @sToken = Token,
@nPos = NextStartPosition
From strtok(@sList, ',', @nPos)
End
Note, it is vital to recover NextStartingPosition from the strtok call to be it reinjected as the parameter @nStartPosition in the next call.
We can also parse a two-dimensional structure
Declare @sList Varchar(max)
Declare @sRowToken Varchar(max), @sCellToken Varchar(max)
Declare @nPosX Int, @nPosY Int
Set @sList = 'one, two, three; four, five, six; seven, eight, nine'
Set @sRowToken = ''
Set @nPosY = 1
While @sRowToken Is Not Null
Begin
Select @sRowToken = Token,
@nPosY = NextStartPosition
From strtok(@sList, ';', @nPosY)
Select @sRowToken
If @sRowToken Is Null
Break;
Set @sCellToken = ''
Set @nPosX = 1
While @sCellToken Is Not Null
Begin
Select @sCellToken = Token,
@nPosX = NextStartPosition
From strtok(@sRowToken, ',', @nPosX)
Select @sCellToken
End
End
So it looks like a workable idea. Lets provide the full source code for the Transact-Sql strtok function:
CREATE Function strtok (
@sStringToParse Varchar(max)
,@sSeparator char(1)
,@nStartPosition int
)
Returns @tbResult Table (
Token Varchar(max)
,NextStartPosition int
)
AS
Begin
Declare
@nNextPos int
,@nLenStringToParse int
,@sToken varchar(max)
Select @nLenStringToParse = Len(@sStringToParse)
If @sStringToParse Is Null
Begin
Select
@nNextPos = Null
,@sToken = Null
End
Else If @sSeparator Is Null
Begin
Select
@nNextPos = @nLenStringToParse + 1
,@sToken = Substring(@sStringToParse, @nStartPosition, @nLenStringToParse)
End
Else If @nLenStringToParse = 0
Begin
If @nStartPosition = 1
Select
@nNextPos = 2
,@sToken = @sStringToParse
Else
Select
@nNextPos = @nStartPosition + 1
,@sToken = Null
End
Else If @nStartPosition = @nLenStringToParse + 1
And Substring(@sStringToParse, @nLenStringToParse, 1) = @sSeparator
Begin
Select
@nNextPos = @nStartPosition + 1
,@sToken = ''
End
Else If @nStartPosition > @nLenStringToParse
Begin
Select
@nStartPosition = Null
,@sToken = Null
End
Else
Begin
Select @nNextPos = CharIndex(@sSeparator, @sStringToParse, @nStartPosition)
If @nNextPos Is Null
Begin
Select
@sToken = Null
,@nNextPos = Null
End
Else If @nNextPos = 0
Begin
If @nStartPosition = 0
Select @sToken = @sStringToParse
Else
Select @sToken = SUBSTRING( @sStringToParse, @nStartPosition, @nLenStringToParse )
Select @nNextPos = @nLenStringToParse + 1
End
Else
Begin
Select @sToken = SUBSTRING( @sStringToParse, @nStartPosition, @nNextPos - @nStartPosition )
Select @nNextPos = @nNextPos + 1
End
Set @nStartPosition = @nNextPos
End
Insert Into @tbResult (NextStartPosition, Token)
Values (@nNextPos, @sToken)
Return
End
Enjoy!
Conclussions
We have sucessfully created a Transact-SQL function for parsing a string containing a delimited list.
References