The Skeptical Methodologist

Software, Rants and Management

Smart Pointers for Dumb Developers

Edouard claims in his post that smart pointers, like shared_ptr and auto_ptr, are overused.  While he makes quite a few legitimate critiques, I think he misses the fundamental problem that hits nearly all of the computer scientist’s beautiful abstractions: they’re all dangerous if they’re overused.  C++’s smart pointers are no exception.  I do wish to come to their defense, though, and offer what might seem like obvious guidelines for their use.

Smart pointers are the poor man’s garbage collection.  Or, in better terms, they are the average C++ developer’s garbage collection.  C++ is a language that is a little bit too ‘expert friendly’, and dynamic allocation, similar to multithreaded programming, is hard(tm).  Smart pointers in many cases allow your average developer to turn a horrible bug (a memory leak, a double delete, or a dangling pointer) into a less heinous inefficiency.  That’s a win.  They are no complete replacement for object lifecycle analysis though, which is Edouard’s main point.  Of course, raw pointers that are intelligently managed will match or outperform smart pointers, especially shared_ptr, which pays for its reference counting.  But what if you don’t have the smartest developers in the world?  Or what if you don’t have the time or money to throw your smart developers at squeezing every last FLOP out of that processor, and instead need them to focus on implementing new features?  It’s all economics.

I’d like to offer my own taxonomy to budding developers.  Kind of a road map to memory usage in C++ to find a good middle ground between performance and safety.

Your first rule of thumb, and the first place you should go to build any object, is the stack.  The stack is well understood, putting objects on the stack is usually pretty cheap, and a good chunk of the time, the objects you need are transient: they live for a few statements and then aren’t needed any more.  For a while there, it was simply considered more ‘OO’ to put things on the heap (thank you very much, Java).  Thankfully, those days are gone.

What CAN’T be put on the stack?  Well, large objects run the risk of pushing the stack to its limits, since most of the time, the stack is much smaller than the heap.  In addition, a stack variable’s concrete type has to be known at compile time, which makes polymorphism a little more tricky.  Finally, again with large objects, passing them by value to functions or methods can be expensive, since their copy constructors are called.
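To make that last cost concrete, here is a minimal sketch that counts copy constructor calls.  BigBlob and the helper functions are names I made up for illustration, and the 4 KB size is arbitrary.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical "large" type that counts how often it gets copied.
struct BigBlob {
    static int copies;
    std::array<char, 4096> data;

    BigBlob() : data() {}
    BigBlob(const BigBlob& other) : data(other.data) { ++copies; }
};
int BigBlob::copies = 0;

// Pass by value: the argument is copied into the parameter.
std::size_t sizeByValue(BigBlob blob) { return blob.data.size(); }

// Pass by const reference: no copy, the callee sees the caller's object.
std::size_t sizeByConstRef(const BigBlob& blob) { return blob.data.size(); }

// Returns how many copies a single by-value call makes.
int copiesMadeByValue() {
    BigBlob blob;
    const int before = BigBlob::copies;
    const std::size_t n = sizeByValue(blob);
    assert(n == blob.data.size());
    return BigBlob::copies - before;
}

// Returns how many copies a single by-const-reference call makes.
int copiesMadeByConstRef() {
    BigBlob blob;
    const int before = BigBlob::copies;
    const std::size_t n = sizeByConstRef(blob);
    assert(n == blob.data.size());
    return BigBlob::copies - before;
}
```

Passing a named object by value costs one copy per call here, while the const reference version costs none, which is usually what you want for anything big.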

For a large object that you want to put on the heap, enter the scoped pointer.  The scoped pointer itself is a stack variable, so construction and cleanup follow the enclosing scope, but the object it owns is created on the heap.  You should also question why you’re putting so many large objects on the heap.
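As a rough sketch of the idea: I’m using a const std::unique_ptr from C++11 here as a stand-in for a scoped pointer such as boost::scoped_ptr, since it behaves much the same way (it cannot be copied or handed off).  FrameBuffer and its size are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hypothetical large object; `alive` tracks how many instances exist.
struct FrameBuffer {
    static int alive;
    std::vector<unsigned char> pixels;

    explicit FrameBuffer(std::size_t bytes) : pixels(bytes) { ++alive; }
    ~FrameBuffer() { --alive; }
};
int FrameBuffer::alive = 0;

// Returns the number of live FrameBuffers while the scoped pointer is in scope.
int aliveInsideScope() {
    const std::size_t bytes = 4 * 1024 * 1024;
    // The pointer lives on the stack; the FrameBuffer it owns lives on the heap.
    const std::unique_ptr<FrameBuffer> frame(new FrameBuffer(bytes));
    assert(frame->pixels.size() == bytes);
    return FrameBuffer::alive;
}   // `frame` goes out of scope here and deletes the FrameBuffer for us
```

There is no delete anywhere in the function, and an early return or an exception would clean up the same way.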

To handle polymorphism, for the most part, you’re going to have to use pointers.  An abstract base class pointer can point to a derived class, so it’s a question of what kind of management you want wrapping that pointer.  Scoped pointers don’t help much once a polymorphic object has to leave the scope that created it, but our two friends, one of which is disparaged by Edouard, come to save us.  In the case of ‘factory’ like functions, or functions that create an object on the heap and ‘pass off’ ownership to the caller, we’d like to use an auto_ptr, since this type of smart pointer enforces this pattern.  In the case where we’re handed an object but the method we called still wants to hold on to that same object, so that neither side clearly owns it, we usually want a shared_ptr.  The original poster is right, though: in many cases, through careful analysis, both of these pointer types can be done away with by simply studying the expected lifecycle of the object underneath.  But this is not always possible within the constraints of developing software.
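Here is a small sketch of both patterns.  One caveat: auto_ptr was deprecated in C++11 and removed in C++17, so the factory below uses std::unique_ptr, which enforces the same hand-off of ownership.  Shape, Circle, Square, Scene and makeShape are all hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct Shape {
    virtual ~Shape() {}
    virtual std::string name() const = 0;
};
struct Circle : Shape {
    std::string name() const override { return "circle"; }
};
struct Square : Shape {
    std::string name() const override { return "square"; }
};

// Factory: builds the object on the heap and passes ownership to the caller.
std::unique_ptr<Shape> makeShape(const std::string& kind) {
    if (kind == "circle") {
        return std::unique_ptr<Shape>(new Circle());
    }
    return std::unique_ptr<Shape>(new Square());
}

// A second party that wants to keep referring to the same object.
struct Scene {
    std::shared_ptr<Shape> selected;
};

// Returns how many owners the shape has once the scene also holds it.
long ownersAfterSelecting() {
    // A unique_ptr returned by value converts to a shared_ptr.
    std::shared_ptr<Shape> shape = makeShape("circle");
    assert(shape);
    Scene scene;
    scene.selected = shape;   // the scene and the local variable now share ownership
    return shape.use_count();
}
```

The factory’s return type tells the caller who is responsible for cleanup, and the shared_ptr keeps the shape alive until both the scene and the local variable are done with it.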

Keep in mind the const reference trick, too.  Returning a reference to a stack variable is a bad idea, since the variable has already been destroyed by the time the caller uses it.  However, the standard states that binding a temporary, such as an object returned by value, to a CONST REFERENCE extends the temporary’s lifetime to that of the reference.  That actually opens up efficient usage of some forms of polymorphism, since the memory’s allocated on the stack yet we can still get a polymorphic reference to it.  I’d use this technique with care, though.
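A minimal sketch of the trick, with Animal, Dog and makeDog as made-up names.  The `alive` counter is only there to show that the temporary really does survive past the statement that created it.

```cpp
#include <cassert>
#include <string>

struct Animal {
    virtual ~Animal() {}
    virtual std::string speak() const { return "..."; }
};

struct Dog : Animal {
    static int alive;   // number of Dog objects currently in existence

    Dog() { ++alive; }
    Dog(const Dog& other) : Animal(other) { ++alive; }
    ~Dog() { --alive; }

    std::string speak() const override { return "woof"; }
};
int Dog::alive = 0;

// Returns a Dog by value, so the caller receives a temporary.
Dog makeDog() { return Dog(); }

std::string speakThroughBaseRef() {
    // Binding the temporary to a const reference keeps it alive
    // for as long as `pet` is in scope. No heap allocation involved.
    const Animal& pet = makeDog();
    assert(Dog::alive == 1);   // the temporary has not been destroyed yet
    return pet.speak();        // virtual dispatch still reaches Dog::speak
}   // the Dog temporary is destroyed here, along with `pet`
```

Note that this only works for a temporary bound directly to the reference.  A function that returns a reference to one of its own locals still hands back a dangling reference, const or not.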

Ultimately, such low level control and the plethora of options available to the developer make memory management in C++ tricky, yet powerful and abstract.  There are different use cases for all the potential views of memory, and subtleties to each use.  shared_ptr can more or less cover almost all these uses, but there’s a cost associated with not doing any real analysis of your memory footprint.  For the average developer, though, if you want to make sure they don’t introduce any risks, giving them a poor man’s garbage collector is probably your best option.

(*Though you still need to peer review their code and check for cycles, the notorious corner case in which shared_ptr will still leak.)
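For the curious, here is a sketch of what that cycle looks like, plus the usual fix of making one direction a weak_ptr.  Node and both function names are hypothetical.  The first function breaks the cycle by hand after measuring, so that the demo itself doesn’t leak.

```cpp
#include <cassert>
#include <memory>

struct Node {
    static int alive;             // number of Node objects currently in existence
    std::shared_ptr<Node> next;   // owning link
    std::weak_ptr<Node> prev;     // non-owning back link

    Node() { ++alive; }
    ~Node() { --alive; }
};
int Node::alive = 0;

// Two nodes own each other: neither reference count can ever reach zero.
int nodesStrandedByStrongCycle() {
    const int before = Node::alive;
    std::weak_ptr<Node> observer;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;
        b->next = a;    // cycle: a owns b, b owns a
        observer = a;   // lets us reach `a` afterwards without owning it
    }
    assert(!observer.expired());   // `a` outlived its scope
    const int stranded = Node::alive - before;
    // Clean up by breaking the cycle manually.
    if (auto a = observer.lock()) {
        a->next.reset();
    }
    return stranded;
}

// Same shape, but the back link is weak, so nothing is stranded.
int nodesStrandedByWeakBackLink() {
    const int before = Node::alive;
    {
        auto a = std::make_shared<Node>();
        auto b = std::make_shared<Node>();
        a->next = b;   // a owns b
        b->prev = a;   // b only observes a
    }
    return Node::alive - before;
}
```

This is exactly the kind of thing a reviewer has to spot by eye: deciding which direction of a relationship actually owns the other is lifecycle analysis, just in a smaller dose.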

Posted September 1, 2009 | Uncategorized