Utilities for STL std::string
Many programmers are familiar with the usual routines for string objects, such as length, substring, find, charAt, toLowerCase, toUpperCase, trim, equalsIgnoreCase, startsWith, endsWith, parseInt, toString, split, and so on.
Now, if you are using the STL and its string class std::string, how do you do what the routines above do?
Of course, std::string supplies methods that implement some of the routines above. They are:
length(): get the length of the string.
substr(): get a substring of the string.
at() / operator[]: get the char at the specified location in the string.
find() / rfind(): search the string in a forward/backward direction for a substring.
find_first_of(): find the first character that is any of the specified characters.
find_first_not_of(): find the first character that is not any of the specified characters.
find_last_of(): find the last character that is any of the specified characters.
find_last_not_of(): find the last character that is not any of the specified characters.
Please refer to the documentation for more std::string methods.
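As a small illustration of how these methods combine, here is a minimal sketch. The helper name extensionOf is hypothetical (it is not part of the original code); it uses find_last_of and substr to pull the extension out of a path, and it assumes '/' and '\' are the only path separators.

```cpp
#include <string>

// Hypothetical helper (not from the original article): returns the text after
// the last '.' in the file name part of a path, or "" if there is none.
std::string extensionOf(const std::string& path) {
    // Find the last character that is a dot or a path separator.
    std::string::size_type pos = path.find_last_of("./\\");
    // No match, or the match is a separator: the file name has no extension.
    if (pos == std::string::npos || path[pos] != '.') {
        return std::string();
    }
    return path.substr(pos + 1);
}
```

All the search methods return std::string::npos when nothing is found, so that value should be checked before the result is used as an index.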
Some routines are not implemented as std::string methods, but we can find a way to do them with the <algorithm> header. Of course, the existing methods of std::string are also used to implement them.
Transform a string to upper/lower case
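// Needs <algorithm> and <cctype>. Taking unsigned char avoids undefined behavior for negative char values (C++11 lambdas).
std::transform(str.begin(), str.end(), str.begin(), [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
std::transform(str.begin(), str.end(), str.begin(), [](unsigned char c) { return static_cast<char>(std::toupper(c)); });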
Please refer to the documentation for details of the std::transform function.
Trim spaces around a string
Trim left spaces
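// Find the first character that is not a space.
std::string::iterator i;
for (i = str.begin(); i != str.end(); i++) {
    if (!std::isspace(static_cast<unsigned char>(*i))) {
        break;
    }
}
// Erase the leading spaces. If the string is all spaces, this empties it.
str.erase(str.begin(), i);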
Trim right spaces
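// Walk backward from the end while the previous character is a space.
// Starting at end() instead of end() - 1 keeps this safe for an empty string.
std::string::iterator i = str.end();
while (i != str.begin() && std::isspace(static_cast<unsigned char>(*(i - 1)))) {
    --i;
}
// Erase the trailing spaces. If the string is all spaces, this empties it.
str.erase(i, str.end());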
Trim spaces on both sides
Trim the left spaces, then trim the right spaces. The spaces on both sides are then gone.
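The same result can also be reached with the find_first_not_of and find_last_not_of methods listed earlier. The following is only a sketch under assumptions: the name trimCopy is hypothetical, it returns a trimmed copy instead of modifying the string in place, and it treats a fixed set of ASCII whitespace characters as spaces rather than calling isspace.

```cpp
#include <string>

// Hypothetical helper (not from the original article): returns a copy of str
// without leading and trailing ASCII whitespace.
std::string trimCopy(const std::string& str) {
    // Assumption: these six characters are what counts as whitespace.
    const char* const whitespace = " \t\n\r\f\v";
    std::string::size_type first = str.find_first_not_of(whitespace);
    if (first == std::string::npos) {
        return std::string();  // empty input, or nothing but whitespace
    }
    // A non-whitespace character exists, so this search cannot fail.
    std::string::size_type last = str.find_last_not_of(whitespace);
    return str.substr(first, last - first + 1);
}
```

Returning a copy leaves the caller's string untouched, which is convenient when the original text is still needed.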
Create string by repeating a character or substring
If you want to create a string by repeating a substring, you must use a loop to implement it.
string repeat(const string& str, int n) {
    string s;
    for (int i = 0; i < n; i++) {
        s += str;
    }
    return s;
}
But if you just need to repeat a character, std::string has a constructor for that.
string repeat(char c, int n) {
    return string(n, c);
}
Compare ignoring case
This one is a little awkward. We copy the two strings that we want to compare, transform both copies to lower case, and finally compare the two lower-case strings.
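The steps just described could look like the sketch below. The names toLowerCopy and equalsIgnoreCase are illustrative, not the article's own code. The sketch assumes C++11 (for the lambda) and single-byte characters in the current C locale, and it adds a length check up front so that strings of different lengths are rejected without copying.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical helper: takes the string by value, so it already works on a copy.
std::string toLowerCopy(std::string str) {
    std::transform(str.begin(), str.end(), str.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return str;
}

// Illustrative implementation: compare the lower-case copies of both strings.
bool equalsIgnoreCase(const std::string& a, const std::string& b) {
    // Strings of different length can never be equal, so skip the copies.
    return a.size() == b.size() && toLowerCopy(a) == toLowerCopy(b);
}
```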
StartsWith and EndsWith
StartsWith
str.find(substr) == 0;
If the result is true, str starts with substr.
EndsWith
size_t i = str.rfind(substr);
return (i != string::npos) && (i == (str.length() - substr.length()));
If the result is true, str ends with substr.
There is another way to do this: take the left or right substring and compare it. Because I don't want to check whether the string is long enough, I use find and rfind instead.
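For comparison, the substring-based alternative might be sketched as follows. This is not the article's code: it assumes the three-argument std::string::compare(pos, len, str) overload and performs the length check explicitly. The benefit is that only the relevant end of the string is examined, whereas find keeps searching the rest of the string when the prefix does not match at position 0.

```cpp
#include <string>

// Sketch: compare only the first prefix.size() characters of str.
bool startsWith(const std::string& str, const std::string& prefix) {
    return str.size() >= prefix.size() &&
           str.compare(0, prefix.size(), prefix) == 0;
}

// Sketch: compare only the last suffix.size() characters of str.
bool endsWith(const std::string& str, const std::string& suffix) {
    return str.size() >= suffix.size() &&
           str.compare(str.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```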
Parse number/bool from a string
For these routines, atoi, atol, and some other C functions are OK. But I want to do it the C++ way, so I chose std::istringstream. The class is declared in the <sstream> header.
A template function can handle most types, except literal bool values.
template<class T> T parseString(const std::string& str) {
    T value = T();  // value-initialized, so a failed parse does not return garbage
    std::istringstream iss(str);
    iss >> value;
    return value;
}
The template function can parse 0 as false and 1 as true (other numbers set the stream's failbit). But it cannot parse "false" as false and "true" as true. So I wrote a specialization for bool.
// Explicit specialization of parseString for bool.
template<> bool parseString<bool>(const std::string& str) {
    bool value = false;
    std::istringstream iss(str);
    iss >> std::boolalpha >> value;
    return value;
}
As you saw, I pass the std::boolalpha flag to the input stream, so the input stream can recognize the literal bool values.
A hex string can be parsed in a similar way. This time I pass the std::hex flag to the stream.
template<class T> T parseHexString(const std::string& str) {
    T value = T();  // value-initialized, so a failed parse does not return garbage
    std::istringstream iss(str);
    iss >> std::hex >> value;
    return value;
}
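The parsing templates above return a value even when the input is not a valid number. If the caller needs to know whether parsing succeeded, a checked variant could be sketched like this. The name tryParseString and its out-parameter signature are assumptions for illustration only; the sketch rejects input that has leftover characters after the value.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper (not from the original article): returns true and
// stores the result in `out` only if the whole string parses as a T.
template <class T>
bool tryParseString(const std::string& str, T& out) {
    std::istringstream iss(str);
    T value;
    if (!(iss >> value)) {
        return false;  // nothing parsable at the start of the string
    }
    iss >> std::ws;    // skip any trailing whitespace
    if (!iss.eof()) {
        return false;  // leftover characters, as in "12abc"
    }
    out = value;
    return true;
}
```

On failure, `out` is left unchanged, so the caller can keep a default value in it.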
To string routines
As with parsing from a string, I use std::ostringstream to get a string from other kinds of values. The class is also declared in the <sstream> header. The three related functions are shown here.
// Convert any streamable value to a string.
template<class T> std::string toString(const T& value) {
    std::ostringstream oss;
    oss << value;
    return oss.str();
}
// Convert a bool to "true" or "false".
std::string toString(const bool& value) {
    std::ostringstream oss;
    oss << std::boolalpha << value;
    return oss.str();
}
// Convert an integer to hex digits, zero-padded to width when width > 0.
template<class T> std::string toHexString(const T& value, int width) {
    std::ostringstream oss;
    oss << std::hex;
    if (width > 0) {
        oss << std::setw(width) << std::setfill('0');
    }
    oss << value;
    return oss.str();
}
Did you notice setw and setfill? They are also flags, but they need an argument. std::setw makes the next item written to the stream occupy a fixed width. If the item is shorter than that width, it is padded with spaces by default. std::setfill changes the padding character. If you want to control the alignment, there are the std::left and std::right flags.
Oh, I forgot to tell you: setw and setfill need the <iomanip> header.
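To make the width, fill, and alignment flags concrete, here is a minimal sketch. The helper name pad and its parameters are hypothetical; it pads a string to a given width with a chosen fill character, on the left or on the right.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original article): pads `text` to `width`
// columns with `fill`. The text is left-aligned when alignLeft is true,
// right-aligned otherwise.
std::string pad(const std::string& text, int width, char fill, bool alignLeft) {
    std::ostringstream oss;
    if (alignLeft) {
        oss << std::left;
    } else {
        oss << std::right;
    }
    // setw applies only to the next item written; setfill stays in effect.
    oss << std::setw(width) << std::setfill(fill) << text;
    return oss.str();
}
```

With this sketch, pad("ab", 5, '*', true) yields "ab***" and pad("ab", 5, '*', false) yields "***ab". Text longer than the width is not truncated.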
Split and tokenizer
I think the split function should be implemented with a tokenizer, so I wrote a tokenizer first. We can use the find_first_of and find_first_not_of methods to get each token. Shown below is the nextToken method of the Tokenizer class.
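bool Tokenizer::nextToken(const std::string& delimiters) {
    // Skip any delimiters in front of the next token.
    size_t i = m_String.find_first_not_of(delimiters, m_Offset);
    if (i == std::string::npos) {
        // Only delimiters (or nothing) remain: there are no more tokens.
        m_Offset = m_String.length();
        return false;
    }
    // Find the delimiter that ends the token.
    size_t j = m_String.find_first_of(delimiters, i);
    if (j == std::string::npos) {
        // The token runs to the end of the string.
        m_Token = m_String.substr(i);
        m_Offset = m_String.length();
        return true;
    }
    m_Token = m_String.substr(i, j - i);
    m_Offset = j;
    return true;
}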
The complete tokenizer is available in the source code archive. You can download it from the link above. All the other functions are also in the source code files.
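For readers without the archive at hand, a split function built on the same idea as nextToken might be sketched as a free function. This is an illustration, not the code from the archive. It assumes tokens are returned in a std::vector<std::string> and that empty tokens between consecutive delimiters are skipped, as nextToken does.

```cpp
#include <string>
#include <vector>

// Illustrative sketch: splits str on any character found in delimiters,
// skipping empty tokens.
std::vector<std::string> split(const std::string& str,
                               const std::string& delimiters) {
    std::vector<std::string> tokens;
    // Start of the first token: the first character that is not a delimiter.
    std::string::size_type i = str.find_first_not_of(delimiters);
    while (i != std::string::npos) {
        // End of the current token: the next delimiter, if there is one.
        std::string::size_type j = str.find_first_of(delimiters, i);
        if (j == std::string::npos) {
            tokens.push_back(str.substr(i));  // the last token runs to the end
            break;
        }
        tokens.push_back(str.substr(i, j - i));
        // Skip the run of delimiters to reach the next token.
        i = str.find_first_not_of(delimiters, j);
    }
    return tokens;
}
```

Because each character of the delimiters string is treated as a separate delimiter, split("a,b;;c", ",;") produces the three tokens a, b, and c.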