CTokenizer class

21 Oct 2001
A simple tokenizer class that can be used on CStrings

Introduction

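Sometimes you need to use strtok in MFC projects, but it has two big limitations:

  • It can't tokenize more than one string at a time, because it keeps its current position in hidden internal state.
  • It's not easy to use safely with CStrings, because it writes terminators into the buffer it is tokenizing.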

To overcome these limitations I wrote CTokenizer.
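
The first limitation is easy to reproduce in plain C++. The following is a minimal sketch, not code from the article: nested_strtok is a hypothetical helper that tries to split groups on ';' and items on ',' with two nested strtok loops. Because the inner loop overwrites the position remembered for the outer loop, everything after the first group is silently dropped.

```cpp
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper, for illustration only: it shows the problem and is
// not meant to be used. Takes the text by value because strtok modifies it.
std::vector<std::string> nested_strtok(std::string text)
{
    std::vector<std::string> items;

    // Outer loop: split into groups on ';'.
    // strtok writes '\0' over each delimiter, so it needs a mutable buffer.
    for (char* group = std::strtok(&text[0], ";"); group != nullptr;
         group = std::strtok(nullptr, ";"))
    {
        // Inner loop: split the group on ','.
        // This call replaces the position strtok saved for the outer loop.
        for (char* item = std::strtok(group, ","); item != nullptr;
             item = std::strtok(nullptr, ","))
        {
            items.push_back(item);
        }
        // When the outer loop resumes, strtok continues from the end of the
        // first group and finds nothing more.
    }
    return items;
}
```

Calling nested_strtok("a,b;c,d") yields only "a" and "b". The same in-place writes are what make strtok awkward with a CString, whose buffer is not meant to be modified behind its back.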

Usage

It's quite easy to use, as the following code shows:

CTokenizer tok(_T("A-B+C*D-E"), _T("-+"));
CString cs;

while(tok.Next(cs))
    TRACE2("Token: '%s', Tail: '%s'\n", (LPCTSTR)cs, (LPCTSTR)tok.Tail());

The preceding code produces the following output:

Token: 'A', Tail: 'B+C*D-E'
Token: 'B', Tail: 'C*D-E'
Token: 'C*D', Tail: 'E'
Token: 'E', Tail: ''

As you can see, it's not rocket science. It's a very simple and handy class.

To summarize:

  • Create an object of type CTokenizer, passing as arguments the string to tokenize and the delimiters
  • Call Next() until it returns false
  • If you need to change the delimiters, call SetDelimiters()
  • If you need the un-tokenized part of the string, call Tail()
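
To make the behaviour of these four calls concrete, here is a rough sketch of how such a class could work. It is not the article's implementation: SketchTokenizer is a hypothetical name, it uses std::string instead of CString so that it compiles without MFC, and it assumes 8-bit characters and strtok-like skipping of consecutive delimiters.

```cpp
#include <bitset>
#include <cstddef>
#include <string>

// Hypothetical stand-in for CTokenizer (assumption: narrow characters only).
class SketchTokenizer
{
public:
    SketchTokenizer(const std::string& text, const std::string& delimiters)
        : m_text(text), m_pos(0)
    {
        SetDelimiters(delimiters);
    }

    // Replaces the delimiter set; takes effect on the next call to Next().
    void SetDelimiters(const std::string& delimiters)
    {
        m_delims.reset();
        for (char c : delimiters)
            m_delims.set(static_cast<unsigned char>(c));
    }

    // Copies the next token into 'token'; returns false when none is left.
    bool Next(std::string& token)
    {
        token.clear();
        m_pos = SkipDelimiters(m_pos);
        if (m_pos >= m_text.size())
            return false;

        const std::size_t start = m_pos;
        while (m_pos < m_text.size() && !IsDelimiter(m_text[m_pos]))
            ++m_pos;

        token = m_text.substr(start, m_pos - start);
        return true;
    }

    // Returns the part not yet tokenized, without its leading delimiters.
    std::string Tail() const
    {
        const std::size_t pos = SkipDelimiters(m_pos);
        return pos < m_text.size() ? m_text.substr(pos) : std::string();
    }

private:
    bool IsDelimiter(char c) const
    {
        return m_delims.test(static_cast<unsigned char>(c));
    }

    std::size_t SkipDelimiters(std::size_t pos) const
    {
        while (pos < m_text.size() && IsDelimiter(m_text[pos]))
            ++pos;
        return pos;
    }

    std::string      m_text;    // private copy, so the caller's string is never modified
    std::size_t      m_pos;     // per-object position, so several tokenizers can run at once
    std::bitset<256> m_delims;  // one bit per possible 8-bit character
};
```

Because the position and the delimiter set live in the object rather than in hidden global state, two tokenizers can be advanced in alternation without interfering with each other.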

Updates

  • 10/22/2001 Now uses std::bitset instead of std::vector, which reduces memory requirements and improves performance.
  • 10/22/2001 Fixed a bug related to characters with codes above 127 (outside the ASCII range). Reported by John Simpsons.
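
The article does not say what the bug was or how the lookup table is laid out, so the following is only a sketch of one common way to build a delimiter table on std::bitset. It assumes 8-bit characters; DelimiterSet, MakeDelimiterSet and IsDelimiter are hypothetical names. A frequent source of trouble with characters above 127 is that plain char is signed on many compilers, so using it directly as an index gives a negative value. Converting to unsigned char first avoids that.

```cpp
#include <bitset>
#include <string>

// One bit per possible 8-bit character value.
using DelimiterSet = std::bitset<256>;

// Builds the lookup table from a string of delimiter characters.
DelimiterSet MakeDelimiterSet(const std::string& delimiters)
{
    DelimiterSet set;
    for (char c : delimiters)
        set.set(static_cast<unsigned char>(c));  // index is always 0..255
    return set;
}

// Constant-time membership test.
bool IsDelimiter(const DelimiterSet& set, char c)
{
    // Where plain char is signed, a character above 127 is negative;
    // the cast maps it back into the 0..255 range before indexing.
    return set.test(static_cast<unsigned char>(c));
}
```

For example, a set built from "-\xE9" reports both '-' and the character with code 0xE9 as delimiters, whether or not char is signed on the target compiler.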

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
