
Smart Pointers to boost your code

peterchen

27 Sep 2004

A beginner's introduction to the smart pointers provided by the boost library.

Download source files - 45.3 Kb

Smart Pointers can greatly simplify C++ development. Chiefly, they provide automatic memory management close to more restrictive languages (like C# or VB), but there is much more they can do.

I already know smart pointers, but why should I use boost?

What are Smart Pointers?
The first: boost::scoped_ptr<T>
Reference counting pointers
- Important Features
Example: Using shared_ptr in containers
What you absolutely must know to use boost smart pointers correctly
Cyclic References
Using weak_ptr to break cycles
intrusive_ptr - lightweight shared pointer
scoped_array and shared_array
Installing Boost
- Note about the sample project
- VC6: the min/max tragedy
Resources

What are Smart Pointers?

The name should already give it away:

A Smart Pointer is a C++ object that acts like a pointer, but additionally deletes the object when it is no longer needed.

"No longer needed" is hard to define, since resource management in C++ is very complex. Different smart pointer implementations cover the most common scenarios. Of course, different tasks than just deleting the object can be implemented too, but these applications are beyond the scope of this tutorial.

Many libraries provide smart pointer implementations with different advantages and drawbacks. The samples here use the BOOST library, a high quality open source template library, with many submissions considered for inclusion in the next C++ standard.

Boost provides the following smart pointer implementations:

`shared_ptr<T>`	pointer to `T` using a reference count to determine when the object is no longer needed. `shared_ptr` is the generic, most versatile smart pointer offered by boost.
`scoped_ptr<T>`	a pointer automatically deleted when it goes out of scope. No assignment possible, but no performance penalties compared to "raw" pointers
`intrusive_ptr<T>`	another reference counting pointer. It provides better performance than `shared_ptr`, but requires the type `T` to provide its own reference counting mechanism.
`weak_ptr<T>`	a weak pointer, working in conjunction with `shared_ptr` to avoid circular references
`shared_array<T>`	like `shared_ptr`, but access syntax is for an Array of `T`
`scoped_array<T>`	like `scoped_ptr`, but access syntax is for an Array of `T`

Let's start with the simplest one:

The first: boost::scoped_ptr<T>

scoped_ptr is the simplest smart pointer provided by boost. It guarantees automatic deletion when the pointer goes out of scope.

A note on the samples:

The samples use a helper class, CSample, that prints diagnostic messages when it is constructed, assigned, or destroyed. Still, it might be interesting to step through with the debugger. The sample includes the required parts of boost, so no additional downloads are necessary - but please read the boost installation notes, below.

The following sample uses a scoped_ptr for automatic destruction:

Using normal pointers:

void Sample1_Plain()
{
  CSample * pSample(new CSample);

  if (!pSample->Query())   // just some function...
  {
    delete pSample;
    return;
  }

  pSample->Use();
  delete pSample;
}

Using scoped_ptr:

#include "boost/smart_ptr.hpp"

void Sample1_ScopedPtr()
{
  boost::scoped_ptr<CSample> samplePtr(new CSample);

  if (!samplePtr->Query())   // just some function...
    return;

  samplePtr->Use();
}

Using "normal" pointers, we must remember to delete the object at every place we exit the function. This is especially tiresome (and easily forgotten) when using exceptions. The second example uses a scoped_ptr for the same task. It automatically deletes the object when the function returns (even when an exception is thrown, a case the "raw pointer" sample does not cover at all!).

The advantage is obvious: in a more complex function, it's easy to forget to delete an object. scoped_ptr does it for you. Also, when dereferencing a NULL pointer, you get an assertion in debug mode.

use for	automatic deletion of local objects or class members¹, Delayed Instantiation, implementing PIMPL and RAII (see below)
not good for	element in an STL container, multiple pointers to the same object
performance:	`scoped_ptr` adds little (if any) overhead to a "plain" pointer, it performs

¹ For this purpose, using scoped_ptr is more expressive than the (easy to misuse and more complex) std::auto_ptr: using scoped_ptr, you indicate that ownership transfer is not intended or allowed.
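The exception case mentioned above can be made visible with a small sketch. It is not part of the sample project: Tracked is a hypothetical helper that counts live instances, and std::unique_ptr (C++11) stands in for boost::scoped_ptr only so that the code builds without a boost install.

```cpp
#include <memory>
#include <stdexcept>

// Assumption: std::unique_ptr stands in for boost::scoped_ptr here. With boost,
// include "boost/scoped_ptr.hpp" and write boost::scoped_ptr<Tracked> instead.

// Hypothetical helper that counts how many instances are currently alive.
struct Tracked
{
    static int alive;
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};
int Tracked::alive = 0;

// Throws while the scoped pointer still owns an object.
void UseAndThrow()
{
    const std::unique_ptr<Tracked> p(new Tracked);
    throw std::runtime_error("something went wrong");
    // no delete anywhere: p's destructor runs while the stack unwinds
}

// Returns the number of Tracked instances that survived the exception.
int AliveAfterException()
{
    try
    {
        UseAndThrow();
    }
    catch (const std::runtime_error &)
    {
        // the object owned by p is already gone at this point
    }
    return Tracked::alive;
}
```

With a raw pointer in UseAndThrow(), the same function would leave one Tracked instance behind.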

Reference counting pointers

Reference counting pointers track how many pointers refer to an object. When the last pointer to an object is destroyed, it deletes the object itself, too.

The "normal" reference counted pointer provided by boost is shared_ptr (the name indicates that multiple pointers can share the same object). Let's look at a few examples:

#include <cstdio>
#include "boost/shared_ptr.hpp"

void Sample2_Shared()
{
  // (A) create a new CSample instance with one reference
  boost::shared_ptr<CSample> mySample(new CSample);
  // use_count() returns a long, hence %ld
  printf("The Sample now has %ld references\n", mySample.use_count()); // should be 1

  // (B) assign a second pointer to it:
  boost::shared_ptr<CSample> mySample2 = mySample; // should be 2 refs by now
  printf("The Sample now has %ld references\n", mySample.use_count());

  // (C) set the first pointer to NULL
  mySample.reset();
  printf("The Sample now has %ld references\n", mySample2.use_count());  // 1

  // the object allocated in (A) is deleted automatically
  // when mySample2 goes out of scope
}

Line (A) creates a new CSample instance on the heap, and assigns the pointer to a shared_ptr, mySample. Things look like this:

Then, we assign it to a second pointer mySample2. Now, two pointers access the same data:

We reset the first pointer (equivalent to p=NULL for a raw pointer). The CSample instance is still there, since mySample2 holds a reference to it:

Only when the last reference, mySample2, goes out of scope, the CSample is destroyed with it:

Of course, this is not limited to a single CSample instance, or two pointers, or a single function. Here are some use cases for a shared_ptr.

use in containers
using the pointer-to-implementation idiom (PIMPL)
Resource-Acquisition-Is-Initialization (RAII) idiom
Separating Interface from Implementation

Note: If you have never heard of PIMPL (a.k.a. handle/body) or RAII, grab a good C++ book - they are important concepts every C++ programmer should know. Smart pointers are just one way to implement them conveniently in certain cases - discussing them here would go beyond the scope of this article.

Important Features

The boost::shared_ptr implementation has some important features that make it stand out from other implementations:

shared_ptr<T> works with an incomplete type:
When declaring or using a shared_ptr<T>, T may be an "incomplete type". E.g., you only write a forward declaration, class T;, but do not yet define what T really looks like. Only where you dereference the pointer does the compiler need to know "everything".
shared_ptr<T> works with any type:
There are virtually no requirements towards T (such as deriving from a base class).
shared_ptr<T> supports a custom deleter:
So you can store objects that need a different cleanup than delete p. For more information, see the boost documentation.
Implicit conversion:
If a type U * can be implicitly converted to T * (e.g., because T is a base class of U), a shared_ptr<U> can also be converted to shared_ptr<T> implicitly.
shared_ptr is thread safe:
In multithreaded builds, different shared_ptr instances that share one object can be used from different threads, because the reference count is updated safely. A single shared_ptr instance that several threads write to still needs your own synchronization. (This is a design choice rather than an advantage; however, it is a necessity in multithreaded programs, and the overhead is low.)
Works on many platforms, proven and peer-reviewed, the usual things.
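Two of these features, implicit conversion and the custom deleter, are easy to try out. The following is a minimal sketch, not code from the sample project: Shape, Square and ReleaseHandle are hypothetical names, and the C++11 std:: smart pointers stand in for the boost ones (aliased as sp) only so that it builds without a boost install. The calls used here exist with the same meaning in boost::shared_ptr.

```cpp
#include <memory>

// Assumption: std:: stands in for boost. With boost, include
// "boost/shared_ptr.hpp" and write "namespace sp = boost;" instead.
namespace sp = std;

// Hypothetical class hierarchy, not part of the sample project.
struct Shape
{
    virtual ~Shape() {}
    virtual int Corners() const { return 0; }
};

struct Square : Shape
{
    virtual int Corners() const { return 4; }
};

// Implicit conversion: a shared_ptr<Square> converts to a shared_ptr<Shape>,
// and both pointers share one reference count.
long OwnersAfterConversion()
{
    sp::shared_ptr<Square> square(new Square);
    sp::shared_ptr<Shape>  shape = square;
    return shape.use_count();
}

int CornersThroughBase()
{
    sp::shared_ptr<Square> square(new Square);
    sp::shared_ptr<Shape>  shape = square;
    return shape->Corners();   // virtual call reaches Square
}

// Custom deleter: called instead of "delete p" when the last owner goes away.
int g_released = 0;

void ReleaseHandle(int * handle)
{
    ++g_released;   // stands in for a real cleanup call
    delete handle;
}

int ReleaseCountAfterScope()
{
    {
        sp::shared_ptr<int> handle(new int(42), ReleaseHandle);
        sp::shared_ptr<int> copy = handle;   // the deleter is shared with the count
    }   // both owners are gone here, so ReleaseHandle ran exactly once
    return g_released;
}
```

The deleter is stored alongside the reference count, so every copy of the pointer releases the object the same way, no matter where the last copy is destroyed.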

Example: Using shared_ptr in containers

Many container classes, including the STL containers, require copy operations (e.g., when inserting an existing element into a list, vector, or other container). However, when these copy operations are expensive (or even unavailable), the typical solution is to use a container of pointers:

std::vector<CMyLargeClass *> vec;
vec.push_back( new CMyLargeClass("bigString") );

However, this throws the task of memory management back to the caller. We can, however, use a shared_ptr:

typedef boost::shared_ptr<CMyLargeClass>  CMyLargeClassPtr;
std::vector<CMyLargeClassPtr> vec;
vec.push_back( CMyLargeClassPtr(new CMyLargeClass("bigString")) );

Very similar, but now, the elements get destroyed automatically when the vector is destroyed - unless, of course, there's another smart pointer still holding a reference. Let's have a look at sample 3:

void Sample3_Container()
{
  typedef boost::shared_ptr<CSample> CSamplePtr;

  // (A) create a container of CSample pointers:

  std::vector<CSamplePtr> vec;

  // (B) add three elements

  vec.push_back(CSamplePtr(new CSample));
  vec.push_back(CSamplePtr(new CSample));
  vec.push_back(CSamplePtr(new CSample));

  // (C) "keep" a pointer to the second: 

  CSamplePtr anElement = vec[1];

  // (D) destroy the vector:

  vec.clear();

  // (E) the second element still exists

  anElement->Use();
  printf("done. cleanup is automatic\n");

  // (F) anElement goes out of scope, deleting the last CSample instance

}

What you absolutely must know to use boost smart pointers correctly

A few things can go wrong with smart pointers (most prominent is an invalid reference count, which deletes the object too early, or not at all). The boost implementation promotes safety, making all "potentially dangerous" operations explicit. So, with a few rules to remember, you are safe.

There are a few rules you should (or must) follow, though:

Rule 1: Assign and keep - Assign a newly constructed instance to a smart pointer immediately, and then keep it there. The smart pointer(s) now own the object, you must not delete it manually, nor can you take it away again. This helps to not accidentally delete an object that is still referenced by a smart pointer, or end up with an invalid reference count.

Rule 2: a _ptr<T> is not a T * - more correctly, there are no implicit conversions between a T * and a smart pointer to type T.

This means:

When creating a smart pointer, you explicitly have to write ..._ptr<T> myPtr(new T)
You cannot assign a T * to a smart pointer
You cannot even write ptr=NULL. Use ptr.reset() for that.
To retrieve the raw pointer, use ptr.get(). Of course, you must not delete this pointer, or use it after the smart pointer it comes from is destroyed, reset or reassigned. Use get() only when you have to pass the pointer to a function that expects a raw pointer.
You cannot pass a T * to a function that expects a _ptr<T> directly. You have to construct a smart pointer explicitly, which also makes it clear that you transfer ownership of the raw pointer to the smart pointer. (See also Rule 4.)
There is no generic way to find the smart pointer that "holds" a given raw pointer. However, the boost document "Smart Pointer Programming Techniques" illustrates solutions for many common cases.

Rule 3: No circular references - If you have two objects referencing each other through a reference counting pointer, they are never deleted. boost provides weak_ptr to break such cycles (see below).

Rule 4: No temporary shared_ptr - Do not construct a temporary shared_ptr to pass it to a function; always use a named (local) variable. (This makes your code safe in case of exceptions. See the shared_ptr best practices in the boost documentation for a detailed explanation.)
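The rules can be condensed into a few lines of code. This is a sketch under assumptions rather than part of the sample project: Widget, ReadValue and OwnersDuringCall are hypothetical, and the C++11 std:: smart pointers stand in for the boost ones (aliased as sp) so that it builds without a boost install.

```cpp
#include <memory>

// Assumption: std:: stands in for boost. With boost, include
// "boost/shared_ptr.hpp" and write "namespace sp = boost;" instead.
namespace sp = std;

// Hypothetical payload class.
struct Widget
{
    int value;
    Widget() : value(7) {}
};

// A function with a raw-pointer interface: it borrows, it does not own.
int ReadValue(const Widget * w) { return w ? w->value : -1; }

// A function that takes part in ownership while it runs.
long OwnersDuringCall(sp::shared_ptr<Widget> w) { return w.use_count(); }

int BorrowRawPointer()
{
    sp::shared_ptr<Widget> widget(new Widget);   // Rule 1: assign and keep
    // sp::shared_ptr<Widget> bad = new Widget;  // would not compile (Rule 2)
    return ReadValue(widget.get());              // get() only lends the pointer
}

long PassNamedPointer()
{
    sp::shared_ptr<Widget> widget(new Widget);   // a named variable, not a temporary
    return OwnersDuringCall(widget);             // caller + parameter = 2 owners
}

bool IsEmptyAfterReset()
{
    sp::shared_ptr<Widget> widget(new Widget);
    widget.reset();                              // the smart pointer's "= NULL"
    return widget.get() == 0;
}
```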

Cyclic References

Reference counting is a convenient resource management mechanism. It has one fundamental drawback, though: cyclic references are not freed automatically, and they are hard for the computer to detect. The simplest example is this:

struct CDad;
struct CChild;

typedef boost::shared_ptr<CDad>   CDadPtr;
typedef boost::shared_ptr<CChild> CChildPtr;

// a "thing" that holds a smart pointer to another "thing":
struct CDad : public CSample
{
  CChildPtr myBoy;
};

struct CChild : public CSample
{
  CDadPtr myDad;
};

// create the objects themselves, not the pointer types:
CDadPtr   parent(new CDad);
CChildPtr child(new CChild);

// deliberately create a circular reference:
parent->myBoy = child;
child->myDad = parent;

// resetting one ptr...
child.reset();

parent still references the CDad object, which itself references the CChild. The whole thing looks like this:

If we now call parent.reset(), we lose all "contact" with the two objects. But this leaves both with exactly one reference, and the shared pointers see no reason to delete either of them! We have no access to them anymore, but they mutually keep themselves "alive". This is a memory leak at best; in the worst case, the objects hold even more critical resources that are not released correctly.

The problem is not solvable with a "better" shared pointer implementation (or at least, only with unacceptable overhead and restrictions). So you have to break that cycle. There are two ways:

1. Manually break the cycle before you release your last reference to it.
2. When the lifetime of Dad is known to exceed the lifetime of Child, the child can use a normal (raw) pointer to Dad.
3. Use a boost::weak_ptr to break the cycle.

Solutions (1) and (2) are not perfect, but they work with smart pointer libraries that do not offer a weak_ptr like boost does. But let's look at weak_ptr in detail:

Using weak_ptr to break cycles

Strong vs. Weak References:

A strong reference keeps the referenced object alive (i.e., as long as there is at least one strong reference to the object, it is not deleted). boost::shared_ptr acts as a strong reference. In contrast, a weak reference does not keep the object alive, it merely references it as long as it lives.

Note that a raw C++ pointer in this sense is a weak reference. However, if you have just the pointer, you have no ability to detect whether the object still lives.

boost::weak_ptr<T> is a smart pointer acting as weak reference. When you need it, you can request a strong (shared) pointer from it. (This can be NULL if the object was already deleted.) Of course, the strong pointer should be released immediately after use. In the above sample, we can decide to make one pointer weak:

struct CBetterChild : public CSample
{
  boost::weak_ptr<CDad> myDad;

  void BringBeer()
  {
    boost::shared_ptr<CDad> strongDad = myDad.lock(); // request a strong pointer

    if (strongDad)                      // is the object still alive?
      strongDad->SetBeer();

    // strongDad is released when it goes out of scope.
    // the object retains the weak pointer
  }
};

See the Sample 5 for more.
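To see the effect on the reference counts, here is a self-contained sketch of the Dad/Child pair with one weak link. It is an illustration under assumptions, not Sample 5 itself: Dad, Child and CycleResult are hypothetical stand-ins for the article's classes, and the C++11 std:: smart pointers replace the boost ones (aliased as sp) so that it builds without a boost install.

```cpp
#include <memory>

// Assumption: std:: stands in for boost. With boost, include
// "boost/shared_ptr.hpp" and "boost/weak_ptr.hpp" and alias boost instead.
namespace sp = std;

struct Dad;

struct Child
{
    sp::weak_ptr<Dad> myDad;      // weak: does not keep Dad alive
};

struct Dad
{
    sp::shared_ptr<Child> myBoy;  // strong: keeps Child alive
};

// True while a strong pointer can still be obtained from the weak one.
bool DadIsAlive(const Child & c)
{
    sp::shared_ptr<Dad> strongDad = c.myDad.lock();
    return strongDad.get() != 0;
}

struct CycleResult
{
    long dadOwners;     // strong references to Dad while both are linked
    bool aliveBefore;   // Dad reachable through the weak pointer before reset
    bool aliveAfter;    // ... and after the last strong reference is gone
};

CycleResult BreakCycleWithWeakPtr()
{
    sp::shared_ptr<Dad>   dad(new Dad);
    sp::shared_ptr<Child> child(new Child);

    dad->myBoy   = child;   // strong link
    child->myDad = dad;     // weak link: Dad's count is not incremented

    CycleResult r;
    r.dadOwners   = dad.use_count();
    r.aliveBefore = DadIsAlive(*child);

    dad.reset();            // last strong reference: Dad is deleted now
    r.aliveAfter  = DadIsAlive(*child);
    return r;
}
```

Because the link back to Dad is weak, resetting dad really destroys the Dad object, and the Child notices this through lock() instead of holding a dangling pointer.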

intrusive_ptr - lightweight shared pointer

shared_ptr offers quite some services beyond a "normal" pointer. This has a little price: the size of a shared pointer is larger than a normal pointer, and for each object held in a shared pointer, there is a tracking object holding the reference count and the deleter. In most cases, this is negligible.

intrusive_ptr provides an interesting tradeoff: it provides the "lightest possible" reference counting pointer, if the object implements the reference count itself. This isn't so bad after all, when designing your own classes to work with smart pointers; it is easy to embed the reference count in the class itself, to get less memory footprint and better performance.

To use a type T with intrusive_ptr, you need to define two functions: intrusive_ptr_add_ref and intrusive_ptr_release. The following sample shows how to do that for a custom class:

#include "boost/intrusive_ptr.hpp"

// forward declarations
class CRefCounted;

namespace boost
{
  void intrusive_ptr_add_ref(CRefCounted * p);
  void intrusive_ptr_release(CRefCounted * p);
}

// My Class
class CRefCounted
{
  private:
    long    references;
    friend void ::boost::intrusive_ptr_add_ref(CRefCounted * p);
    friend void ::boost::intrusive_ptr_release(CRefCounted * p);

  public:
    CRefCounted() : references(0) {}   // initialize references to 0
};

// class specific addref/release implementation
// the two function overloads must be in the boost namespace on most compilers:
namespace boost
{
  inline void intrusive_ptr_add_ref(CRefCounted * p)
  {
    // increment reference count of object *p
    ++(p->references);
  }

  inline void intrusive_ptr_release(CRefCounted * p)
  {
    // decrement reference count, and delete object when reference count reaches 0
    if (--(p->references) == 0)
      delete p;
  }
} // namespace boost

This is the most simplistic (and not thread safe) implementation. However, this is such a common pattern, that it makes sense to provide a common base class for this task. Maybe another article ;)
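A common base class of that kind could look roughly like the sketch below. It is a guess at a reasonable design, not something boost or the sample project provides: RefCountedBase and Document are hypothetical names, the code is still not thread safe, and it assumes a compiler with argument-dependent lookup, so that the two functions can live next to the class instead of in the boost namespace.

```cpp
// Hypothetical base class: derive from it to make a type usable with
// boost::intrusive_ptr. Not thread safe, like the sample above.
class RefCountedBase
{
public:
    RefCountedBase() : references_(0) {}

    // A copy is a new object, so it starts with its own count of 0,
    // and assignment leaves the count of the target untouched.
    RefCountedBase(const RefCountedBase &) : references_(0) {}
    RefCountedBase & operator=(const RefCountedBase &) { return *this; }

    // Found through argument-dependent lookup (an assumption about the compiler).
    friend void intrusive_ptr_add_ref(RefCountedBase * p)
    {
        ++p->references_;
    }

    friend void intrusive_ptr_release(RefCountedBase * p)
    {
        if (--p->references_ == 0)
            delete p;   // virtual destructor: the derived object is destroyed
    }

protected:
    virtual ~RefCountedBase() {}   // only the release function may delete

private:
    long references_;
};

// Hypothetical client class that counts its destructions for demonstration.
struct Document : RefCountedBase
{
    static int destroyed;
    ~Document() { ++destroyed; }
};
int Document::destroyed = 0;
```

A boost::intrusive_ptr<Document> makes these two calls for you: it calls intrusive_ptr_add_ref when it takes a pointer and intrusive_ptr_release when it lets go of it.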

scoped_array and shared_array

They are almost identical to scoped_ptr and shared_ptr - only they act like pointers to arrays, i.e., like pointers that were allocated using operator new[]. They provide an overloaded operator[]. Note that neither of them knows the length initially allocated.
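A short sketch of the array variant follows. It rests on an assumption: std::unique_ptr<int[]> (C++11) stands in for boost::scoped_array<int> so that the code builds without a boost install. With boost, you would include "boost/scoped_array.hpp" and declare boost::scoped_array<int> values(new int[n]); the element access through operator[] looks the same. SumOfSquares is a hypothetical function.

```cpp
#include <cstddef>
#include <memory>

// Sums the squares 0*0 + 1*1 + ... for n elements held in a scoped array.
int SumOfSquares(std::size_t n)
{
    // Assumption: std::unique_ptr<int[]> stands in for boost::scoped_array<int>.
    // The memory is released with delete[] when "values" goes out of scope.
    std::unique_ptr<int[]> values(new int[n]);

    // The smart pointer does not know the length, so n is carried separately.
    for (std::size_t i = 0; i < n; ++i)
        values[i] = static_cast<int>(i * i);

    int sum = 0;
    for (std::size_t i = 0; i < n; ++i)
        sum += values[i];
    return sum;
}
```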

Installing Boost

Download the current boost version from boost.org, and unzip it to a folder of your choice. The unzipped sources use the following structure (using my folders):

boost\	the actual boost sources / headers
doc\	the documentation of the current version, in HTML format
libs\	libraries (not needed for
`....`	some more odd bits and ends ("more\" has some interesting stuff)

I add this folder to the common includes of my IDE:

in VC6, this is Tools/Options, Directories tab, "Show Directories for... Include files",
in VC7, this is Tools/Options, then Projects/VC++ directories, "Show Directories for... Include files".

Since the actual headers are in the boost\ subfolder, my sources use #include "boost/smart_ptr.hpp". So everybody reading the source code knows immediately that you are using boost smart pointers, not just any ones.

Note about the sample project

The sample project contains a sub folder boost\ with a selection of boost headers required. This is merely so you can download and compile the sample. You should really download the complete and most current sources (now!).

VC6: the min/max tragedy

There is a "little" problem with VC6 that makes using boost (and other libraries) a bit problematic out of the box.

The Windows header files define macros for min and max, and consequently, these respective functions are missing from the (original) STL implementation. Some Windows libraries such as MFC rely on min/max being present. Boost, however, expects min and max in the std:: namespace. To make things worse, there is no feasible min/max template that accepts different (implicitly convertible) argument types, but some libraries rely on that.

boost tries to fix that as well as it can, but sometimes you will run into problems. If this happens, here's what I do: put the following code before the first include:

#define NOMINMAX             // disable windows.h defining min and max as macros
#include "boost/config.hpp"  // include boost's compiler-specific "fixes"
using std::min;              // make them globally available
using std::max;

This solution (as any other) isn't without problems either, but it worked in all cases I needed it, and it's just one place to put it.

Resources

Not enough information? More Questions?

boost users mailing list
Questions about boost? That's the place to go.

Articles on Code Project:

An Introduction to boost by Andrew Walker
Designing Robust Objects with boost by Jim D'Agostino, a very interesting article: it seems to cover too many topics at once, but it excels at showing how different tools, mechanisms, paradigms, libraries, etc. work together.

Please note: While I am happy about (almost) any feedback, please do not ask boost-specific questions here. Simply put, boost experts are unlikely to find your question here (and I'm just a boost noob). Of course, if you have questions, complaints, or recommendations regarding the article or the sample project, you are welcome.

History

Sept 05, 2004: Initial version
Sept 27, 2004: Published to CodeProject
Sept 29, 2004: Minor Fixes

License

This article has no explicit license attached to it but may contain usage terms in the article text or the download files themselves. If in doubt please contact the author via the discussion board below.
