Quick summary
mse::TRegisteredPointer is a smart pointer that behaves just like a raw pointer, except that its value is automatically set to nullptr when the target object is destroyed. It can be used as a general replacement for raw pointers in most situations. Like a raw pointer, it has no intrinsic thread safety. But in exchange, it has no problem targeting objects allocated on the stack (and obtaining the corresponding performance benefit). With the default run-time checks enabled, this pointer is safe from accessing invalid memory.
mse::TRegisteredFixedPointer is a derivative of mse::TRegisteredPointer that is a functional equivalent of a C++ reference. That is, it may only be constructed to point at an existing object and cannot be retargeted after construction. While these properties may make it unlikely that a C++ reference will end up being used to access invalid memory, it is, of course, not impossible. mse::TRegisteredFixedPointer, on the other hand, inherits mse::TRegisteredPointer's safety with respect to invalid memory access.
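To make the "set to nullptr when the target is destroyed" idea concrete, here is a minimal sketch of one way such a mechanism could work. It is not the mse implementation: ToyRegisteredObj, ToyRegisteredPtr and Widget are hypothetical names invented for this illustration, and the sketch ignores copying of target objects and thread safety.

```cpp
#include <algorithm>
#include <stdexcept>
#include <vector>

template <typename T> class ToyRegisteredPtr;

// Hypothetical wrapper: a T that remembers which pointers currently target it.
template <typename T>
class ToyRegisteredObj : public T {
public:
    using T::T;  // reuse T's constructors
    ToyRegisteredObj() = default;
    ToyRegisteredObj(const ToyRegisteredObj&) = delete;  // keeps the sketch simple
    ToyRegisteredObj& operator=(const ToyRegisteredObj&) = delete;
    ~ToyRegisteredObj() {
        // Null out every pointer that still targets this object.
        for (ToyRegisteredPtr<T>* p : watchers_) p->target_ = nullptr;
    }
private:
    friend class ToyRegisteredPtr<T>;
    std::vector<ToyRegisteredPtr<T>*> watchers_;
};

// Hypothetical pointer: registers itself with its target and checks on access.
template <typename T>
class ToyRegisteredPtr {
public:
    ToyRegisteredPtr() = default;
    explicit ToyRegisteredPtr(ToyRegisteredObj<T>* obj) : target_(obj) { attach(); }
    ToyRegisteredPtr(const ToyRegisteredPtr& other) : target_(other.target_) { attach(); }
    ToyRegisteredPtr& operator=(const ToyRegisteredPtr& other) {
        if (this != &other) { detach(); target_ = other.target_; attach(); }
        return *this;
    }
    ~ToyRegisteredPtr() { detach(); }

    explicit operator bool() const { return target_ != nullptr; }
    T& operator*() const {
        if (!target_) throw std::runtime_error("target no longer exists");
        return *target_;
    }
    T* operator->() const { return &**this; }
private:
    friend class ToyRegisteredObj<T>;
    void attach() { if (target_) target_->watchers_.push_back(this); }
    void detach() {
        if (!target_) return;
        auto& w = target_->watchers_;
        w.erase(std::remove(w.begin(), w.end(), this), w.end());
    }
    ToyRegisteredObj<T>* target_ = nullptr;
};

struct Widget { int value = 3; };

// True if the pointer became null once its stack-allocated target was destroyed.
bool pointer_is_nulled_on_destruction() {
    ToyRegisteredPtr<Widget> p;
    {
        ToyRegisteredObj<Widget> w;  // lives on the stack
        p = ToyRegisteredPtr<Widget>(&w);
        if (!p || p->value != 3) return false;
    }  // w is destroyed here and resets p
    return !p;
}

// True if using the stale pointer throws instead of touching invalid memory.
bool dangling_dereference_throws() {
    ToyRegisteredPtr<Widget> p;
    {
        ToyRegisteredObj<Widget> w;
        p = ToyRegisteredPtr<Widget>(&w);
    }
    try { (void)p->value; } catch (const std::runtime_error&) { return true; }
    return false;
}
```

The bookkeeping list is the source of the modest run-time cost mentioned in this article; the real library may well use a different and more efficient scheme.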
Who should use registered pointers?
Registered pointers are appropriate for use by two groups of C++ developers - those for whom safety and security are critical, and also everybody else.
Registered pointers can help eliminate many of the opportunities for inadvertently accessing invalid memory.
Using registered pointers can incur a modest performance cost. But because registered pointers behave the same as raw pointers when pointing to valid objects, they can be "disabled" (automatically replaced with the corresponding raw pointer) with a compile-time directive. That lets you use them to help catch bugs in debug/test/beta modes while incurring no overhead cost in release mode. So there is really no excuse for not using them.
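The compile-time "disable" idea can be sketched with an alias template that resolves to either a checked pointer class or a plain raw pointer. Everything named here (ToyCheckedPtr, toy_ptr, the TOY_CHECKED_POINTERS_DISABLED macro) is made up for illustration; consult the library's headers for the directive it actually uses.

```cpp
#include <stdexcept>

// Hypothetical checked pointer used only for this sketch (not the mse class).
template <typename T>
class ToyCheckedPtr {
public:
    ToyCheckedPtr(T* p = nullptr) : p_(p) {}  // implicit, like a raw pointer
    explicit operator bool() const { return p_ != nullptr; }
    T& operator*() const {
        if (!p_) throw std::runtime_error("null pointer dereference");
        return *p_;
    }
    T* operator->() const { return &**this; }
private:
    T* p_;
};

// TOY_CHECKED_POINTERS_DISABLED is an invented macro name.
#ifdef TOY_CHECKED_POINTERS_DISABLED
template <typename T> using toy_ptr = T*;                // release: no overhead
#else
template <typename T> using toy_ptr = ToyCheckedPtr<T>;  // debug: run-time checks
#endif

// Client code is written once against the alias and compiles either way.
int read_value(toy_ptr<int> p) {
    return p ? *p : -1;
}
```

This only works because the checked type mirrors the raw pointer's interface, which is the property the paragraph above relies on.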
Usage
Using registered pointers is easy. Just copy two files, mseprimitives.h and mseregistered.h, into your project (or "include" directory). There are no other dependencies. Registered pointer usage is very similar to raw pointer usage and they can generally be used as a "drop-in" substitute. Note that the target object does have to be declared as a "registered object". Because the registered object type is publicly derived from the original object's type, it remains compatible with it.
#include "mseregistered.h"
...
class A {
public:
    int b = 3;
};

A a;
mse::TRegisteredObj<A> registered_a;

// Pointing at existing (e.g. stack-allocated) objects
A* A_native_ptr1 = &a;
mse::TRegisteredPointer<A> A_registered_ptr1 = &registered_a;

// Dynamic allocation
A* A_native_ptr2 = new A();
mse::TRegisteredPointer<A> A_registered_ptr2 = mse::registered_new<A>();

delete A_native_ptr2;
mse::registered_delete<A>(A_registered_ptr2);
If you prefer to do less typing, shorter aliases are available:
#include "mseregistered.h"
using namespace mse;
...
class A {
public:
    int b = 3;
};

ro<A> registered_a;
rp<A> A_registered_ptr1 = &registered_a;
rp<A> A_registered_ptr2 = rnew<A>();
rdelete<A>(A_registered_ptr2);
The example project included with this article contains a comprehensive set of examples of registered pointers in action.
These days C++ stands out as a uniquely dangerous language, at least compared to the other modern languages. By "dangerous", I mean the ever-present, significant possibility of accessing invalid memory. The potential consequences of invalid memory access can be severe, ranging from exposure of sensitive data to complete compromise of the run-time environment.
Presumably this is the main reason C++ is not a popular language for (server side) web applications. Yet curiously, it is still the language used for critical parts of the web infrastructure. Web servers and web browsers, for example. Why is that? I suggest that it's simply because no other language is really up to the job. One issue in particular is that a lot of the other languages depend on garbage collection to achieve their language safety, which is arguably not appropriate for writing complex systems that need to be reliably responsive.
But C++ is still dangerous, and there have been countless security exploits that have taken advantage of that.
Since C++11, C++ has become a much more powerful language. Is there really still no practical way to avoid using C++'s dangerous elements? Well, let's consider the most dangerous element of all: the pointer. Experienced (older) C++ programmers know how easy it can be to unintentionally end up with a pointer pointing to invalid memory. The situation is better now that the STL provides well-tested versions of many of the commonly used dynamic data structures, so you don't have to implement your own. That eliminates much of the need to use pointers at all.
And when using dynamic allocation, std::shared_ptr can often be a great substitute for raw pointers that helps ensure you don't accidentally deallocate the target object prematurely. Using std::shared_ptr essentially gets you the safety benefits of garbage collection, but, like garbage collection, there is a performance cost. In my opinion the safety benefit is worth it in pretty much all situations, but others would disagree.
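As a small illustration of that safety benefit, the sketch below uses only the standard library. The function names are invented for this example. It shows that a std::shared_ptr keeps its target alive for as long as any owner remains, and that a std::weak_ptr can tell when the target is gone.

```cpp
#include <memory>
#include <string>

// The string stays valid because `user` still shares ownership of it.
std::string name_survives_owner_reset() {
    auto owner = std::make_shared<std::string>("Dr. Bob");
    std::shared_ptr<std::string> user = owner;  // second owner
    owner.reset();                              // first owner lets go
    return *user;                               // still safe to read
}

// A weak_ptr does not own the target, but it knows when the target is destroyed.
bool weak_ptr_detects_destruction() {
    auto owner = std::make_shared<std::string>("Dr. Dan");
    std::weak_ptr<std::string> observer = owner;
    owner.reset();              // last owner gone, string destroyed
    return observer.expired();  // observer can no longer be locked
}
```

Note that both variants require the target to be owned by a std::shared_ptr, which in practice means a heap allocation plus reference-count bookkeeping.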
The popular position in the C++ community seems to be that it is still appropriate to use raw pointers in situations where the user does not participate in the ownership (i.e. scheduling of the destruction) of the target object. More astute programmers add the condition that you must be sure that the target object will outlive the pointer. The problem is that this condition is easy to get wrong. Consider this example:
#include <string>
#include <vector>

class CNames : public std::vector<std::string> {
public:
    void addName(const std::string& name) {
        (*this).push_back(name);
    }
};

class CQuarantineInfo {
public:
    void add_quarantine_patient(const std::string* p_patient_name) {
        if (p_patient_name) {
            // Bring in a reserve doctor once each supervising doctor has three or more patients.
            if ((3 * supervising_doctors.size()) <= quarantined_patients.size()) {
                if (1 <= available_reserve_doctors.size()) {
                    // Growing (or shrinking) the vector may move its elements in memory.
                    supervising_doctors.addName(available_reserve_doctors.back());
                    supervising_doctors.shrink_to_fit();
                    available_reserve_doctors.pop_back();
                }
            }
            // Invalid access if p_patient_name pointed into supervising_doctors.
            quarantined_patients.addName(*p_patient_name);
        }
    }
    CNames quarantined_patients;
    CNames supervising_doctors;
    CNames available_reserve_doctors;
};

int main(int argc, char* argv[]) {
    CQuarantineInfo quarantine_info;
    quarantine_info.available_reserve_doctors.addName("Dr. Bob");
    quarantine_info.available_reserve_doctors.addName("Dr. Dan");
    quarantine_info.available_reserve_doctors.addName("Dr. Jane");
    quarantine_info.available_reserve_doctors.addName("Dr. Tim");

    // Named objects, since standard C++ does not allow taking the address of a temporary.
    const std::string amy("Amy");
    const std::string carl("Carl");
    const std::string earl("Earl");
    quarantine_info.add_quarantine_patient(&amy);
    quarantine_info.add_quarantine_patient(&carl);
    quarantine_info.add_quarantine_patient(&earl);

    const std::string* p_name_of_doctor_that_contracted_the_infection = &(quarantine_info.supervising_doctors.front());
    quarantine_info.add_quarantine_patient(p_name_of_doctor_that_contracted_the_infection);
}
It may never have occurred to the author of the add_quarantine_patient() function that the reference to the new patient could also be a reference to a supervising doctor, in which case the function can inadvertently cause the target of its p_patient_name parameter to be invalidated before it's finished using it.
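The underlying hazard can be shown in isolation with the standard library alone. In this sketch (function names are invented for illustration), the first function observes that a std::vector moves its elements when it has to grow, and the second shows one defensive pattern with raw pointers: copy the pointee before doing anything that might reallocate the container.

```cpp
#include <cstdint>
#include <string>
#include <vector>

// True if growing the vector relocated its first element.
bool growth_moves_elements() {
    std::vector<std::string> doctors{"Dr. Tim"};
    // Record the address as an integer so the stale pointer itself is never used.
    auto before = reinterpret_cast<std::uintptr_t>(&doctors.front());
    doctors.reserve(doctors.capacity() + 1);  // exceeds capacity, so storage is reallocated
    auto after = reinterpret_cast<std::uintptr_t>(&doctors.front());
    return before != after;
}

// Safe even when p_name points into `doctors`: the copy is taken first.
std::string read_after_copy(std::vector<std::string>& doctors, const std::string* p_name) {
    const std::string name_copy = *p_name;  // p_name is still valid here
    doctors.push_back("Dr. Jane");          // may reallocate and invalidate p_name
    return name_copy;                       // never touches p_name again
}
```

The defensive copy only helps if the function's author anticipates the aliasing, which is exactly the assumption that tends to go unexamined.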
It's a contrived example, but this kind of thing can easily happen in more complex situations. Of course, using raw pointers is perfectly safe in the vast majority of cases. The problem is that there is a minority of cases where it's easy to assume that it's safe when it really isn't. So the prudent policy is to simply not use raw pointers (unless you're going to do some very thorough testing).
Again, using std::shared_ptr in place of raw pointers everywhere would be a simple way to solve the problem, but with a performance cost. A lot of that performance cost comes from the constraint that std::shared_ptr target objects cannot (or should not) be allocated on the stack. So when considering performance, registered pointers can often be a better alternative.
Here's what the above example looks like when replacing raw pointers (and references) with registered pointers:
#include <string>
#include <vector>
#include "mseregistered.h"
using namespace mse;

class CNames : public std::vector<ro<std::string>> {
public:
    void addName(rfcp<std::string> p_name) {
        (*this).push_back(*p_name);
    }
};

class CQuarantineInfo {
public:
    void add_quarantine_patient(rcp<std::string> p_patient_name) {
        if (p_patient_name) {
            if ((3 * supervising_doctors.size()) <= quarantined_patients.size()) {
                if (1 <= available_reserve_doctors.size()) {
                    supervising_doctors.addName(&available_reserve_doctors.back());
                    supervising_doctors.shrink_to_fit();
                    available_reserve_doctors.pop_back();
                }
            }
            // If the target was destroyed above, this access is caught at run time.
            quarantined_patients.addName(&*p_patient_name);
        }
    }
    CNames quarantined_patients;
    CNames supervising_doctors;
    CNames available_reserve_doctors;
};

int main(int argc, char* argv[]) {
    CQuarantineInfo quarantine_info;
    quarantine_info.available_reserve_doctors.addName(&ro<std::string>("Dr. Bob"));
    quarantine_info.available_reserve_doctors.addName(&ro<std::string>("Dr. Dan"));
    quarantine_info.available_reserve_doctors.addName(&ro<std::string>("Dr. Jane"));
    quarantine_info.available_reserve_doctors.addName(&ro<std::string>("Dr. Tim"));
    quarantine_info.add_quarantine_patient(&ro<std::string>("Amy"));
    quarantine_info.add_quarantine_patient(&ro<std::string>("Carl"));
    quarantine_info.add_quarantine_patient(&ro<std::string>("Earl"));

    rcp<std::string> p_name_of_doctor_that_contracted_the_infection = &(quarantine_info.supervising_doctors.front());
    try {
        quarantine_info.add_quarantine_patient(p_name_of_doctor_that_contracted_the_infection);
    }
    catch (...) {
        // The stale pointer was detected; handle or log the error here.
    }

    ro<std::string> patient_fred("Fred");
    quarantine_info.add_quarantine_patient(&patient_fred);
}
By default, registered pointers will throw an exception on any attempt to access invalid memory.
So there you go, C++'s most dangerous element made safe. Without sacrificing the performance benefit of stack allocation. Used along with the rest of the "SaferCPlusPlus" library, it is now practical to write C++ code with greatly reduced risk of accessing invalid memory.
Before we finish up, every good data type plugging article needs a benchmark chart:
Pointer Type                    | Time
mse::TRegisteredPointer (stack) | 0.027 seconds
native pointer (heap)           | 0.049 seconds
mse::TRegisteredPointer (heap)  | 0.074 seconds
std::shared_ptr (heap)          | 0.087 seconds
So as we can see, mse::TRegisteredPointers targeting stack allocated objects easily outperform even native (aka raw) pointers targeting heap allocated objects.
That's it. Let's code safely out there.