Quick Start for C++ TR1 Regular Expressions

26 Jun 2014 | BSD license | 3 min read
This article answers some of the first questions that come up when using regular expressions in C++ TR1.

Introduction

Regular expression syntax is fairly similar across many environments. However, the way you use regular expressions varies greatly. For example, once you've crafted your regular expression, how do you use it to find a match or replace text? It's easy to find detailed API documentation, once you know what API to look up. Figuring out where to start is often the hardest part.

This article assumes you're familiar with regular expressions and want to work with regular expressions in C++ using the Technical Report 1 (TR1) proposed extensions to the C++ Standard Library. It's a quick start guide, briefly answering some of the first questions you're likely to ask. For more details, see Getting started with C++ TR1 regular expressions or dive into the documentation that comes with your implementation.

Quick Start Questions

Q: Where Can I Get TR1?

A: Visual Studio 2008 adds support for the TR1 extensions through a feature pack. Other implementations include Boost and Dinkumware. The GNU compiler gcc added support for TR1 regular expressions in version 4.3.0.

Q: What Regular Expression Flavors are Supported?

A: It depends on your implementation. Visual Studio 2008 supports these options: basic, extended, ECMAScript, awk, grep, egrep.

Q: What Header Do I Include?

A: <regex>

Q: What Namespace are Things In?

A: std::tr1

This is the namespace for the regex class and functions such as regex_search. Flags are contained in the nested namespace std::tr1::regex_constants.

Q: How Do I Do a Match?

A: Construct a regex object and pass it to regex_search.

For example:

C++
// assert requires <cassert>
std::string str = "Hello world";
std::tr1::regex rx("ello");
assert( std::tr1::regex_search(str.begin(), str.end(), rx) );

The function regex_search returns true because str contains the pattern ello. Note that regex_match would return false in the example above because it tests whether the entire string matches the regular expression. regex_search behaves more like most people expect when testing for a match.
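
The difference between the two functions can be sketched with a pair of small helpers. This is a minimal illustration, not part of the original article: the helper names contains and matches_whole are hypothetical, and the code assumes a C++11 compiler where the regex names live in std. With a TR1-only library, point the re alias at std::tr1 instead.

```cpp
#include <cassert>  // only needed for the usage checks described below
#include <regex>
#include <string>

// Assumption: a C++11 library, where the TR1 regex names live in std.
// With a TR1-only library, change this alias to std::tr1.
namespace re = std;

// Hypothetical helper: true if the pattern occurs anywhere in the text.
bool contains(const std::string& text, const std::string& pattern) {
    re::regex rx(pattern);
    return re::regex_search(text.begin(), text.end(), rx);
}

// Hypothetical helper: true only if the pattern accounts for the entire text.
bool matches_whole(const std::string& text, const std::string& pattern) {
    re::regex rx(pattern);
    return re::regex_match(text.begin(), text.end(), rx);
}
```

With these helpers, contains("Hello world", "ello") is true while matches_whole("Hello world", "ello") is false. To make regex_match succeed, the pattern has to cover the whole string, for example ".*ello.*".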

Q: How Do I Retrieve a Match?

A: Use a form of regex_search that takes a match_results object as a parameter.

For example, the following code searches for <h> tags and prints the level and tag contents.

C++
std::string str = "<h2>Egg prices</h2>";
std::tr1::regex rx("<h(.)>([^<]+)");
std::tr1::cmatch res;
// res[0] is the whole match; res[1] and res[2] are the captured groups.
if (std::tr1::regex_search(str.c_str(), res, rx))
    std::cout << res[1] << ". " << res[2] << "\n";

This code would print 2. Egg prices. The example uses cmatch, a typedef provided by the library for match_results<const char*>.
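
When the input is already a std::string, the smatch typedef (match_results over std::string iterators) can be used instead of cmatch. The following is a sketch under stated assumptions: parse_heading is a hypothetical helper, and the re alias assumes a C++11 library where the names live in std rather than std::tr1. It also checks the return value of regex_search before reading any sub-matches.

```cpp
#include <cassert>  // only needed for the usage checks described below
#include <regex>
#include <string>

// Assumption: a C++11 library; change this alias to std::tr1 for TR1.
namespace re = std;

// Hypothetical helper: extracts the level and text of the first <hN> tag.
// Returns false, leaving the outputs untouched, when nothing matches.
bool parse_heading(const std::string& html,
                   std::string& level,
                   std::string& text) {
    re::regex rx("<h(.)>([^<]+)");
    re::smatch res;  // match_results for std::string iterators
    if (!re::regex_search(html, res, rx)) {
        return false;
    }
    level = res[1].str();  // first capture group
    text = res[2].str();   // second capture group
    return true;
}
```

Called with "<h2>Egg prices</h2>", the helper returns true and sets level to "2" and text to "Egg prices". Called with a string that contains no heading tag, it returns false.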

Q: How Do I Do a Replace?

A: Use regex_replace.

The following code will replace “world” in the string “Hello world” with “planet”. The string str2 will contain “Hello planet” and the string str will remain unchanged.

C++
std::string str = "Hello world";
std::tr1::regex rx("world");
std::string replacement = "planet";
std::string str2 = std::tr1::regex_replace(str, rx, replacement);

Note that regex_replace does not modify its arguments, unlike the Perl command s/world/planet/, which changes the string in place. Note also that the third argument to regex_replace must be a std::string object, not a string literal.

Q: How Do I Do a Global Replace?

A: The function regex_replace does global replacements by default.

Q: How Do I Keep From Doing a Global Replace?

A: Use the format_first_only flag with regex_replace.

The fully qualified name for the flag is std::tr1::regex_constants::format_first_only and would be the fourth argument to regex_replace.
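
A minimal sketch contrasting the default global replacement with format_first_only. The helper names replace_all and replace_first are hypothetical, and the re alias assumes a C++11 library where the names live in std rather than std::tr1.

```cpp
#include <cassert>  // only needed for the usage checks described below
#include <regex>
#include <string>

// Assumption: a C++11 library; change this alias to std::tr1 for TR1.
namespace re = std;

// Hypothetical helper: replaces every "o" with "0" (the default behavior).
std::string replace_all(const std::string& text) {
    re::regex rx("o");
    std::string replacement = "0";
    return re::regex_replace(text, rx, replacement);
}

// Hypothetical helper: replaces only the first "o", using the flag
// as the fourth argument to regex_replace.
std::string replace_first(const std::string& text) {
    re::regex rx("o");
    std::string replacement = "0";
    return re::regex_replace(text, rx, replacement,
                             re::regex_constants::format_first_only);
}
```

For the input "Hello world", replace_all returns "Hell0 w0rld" and replace_first returns "Hell0 world".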

Q: How Do I Make a Regular Expression Case-insensitive?

A: Use the icase flag as a parameter to the regex constructor.

The fully qualified name of the flag is std::tr1::regex_constants::icase.
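
As an illustration, the flag can be combined with a grammar flag when constructing the regex. This is a sketch, not the article's own code: the helper contains is hypothetical, the grammar is spelled out as ECMAScript for clarity, and the re alias assumes a C++11 library where the names live in std rather than std::tr1.

```cpp
#include <cassert>  // only needed for the usage checks described below
#include <regex>
#include <string>

// Assumption: a C++11 library; change this alias to std::tr1 for TR1.
namespace re = std;

// Hypothetical helper: searches for the pattern, optionally ignoring case.
bool contains(const std::string& text,
              const std::string& pattern,
              bool ignore_case) {
    // Start from the ECMAScript grammar and add icase on request.
    re::regex_constants::syntax_option_type flags =
        re::regex_constants::ECMAScript;
    if (ignore_case) {
        flags |= re::regex_constants::icase;
    }
    re::regex rx(pattern, flags);
    return re::regex_search(text, rx);
}
```

Here contains("Hello World", "hello world", true) is true, while the same call with false as the last argument finds no match.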

History

  • 22nd May, 2008: Initial post
  • 23rd May, 2008: Added examples

License

This article, along with any associated source code and files, is licensed under The BSD License.