Click here to Skip to main content
65,938 articles
CodeProject is changing. Read more.
Articles / Languages / C++

A Template for Polymorphs

3.67/5 (2 votes)
6 Jul 2020GPL35 min read 7.5K   71  
Registering and efficiently accessing polymorphic objects
This article describes the Registry template, which is a container for registering and efficiently accessing objects that share a common base class.

Introduction

It is often desirable to access an object using an integral identifier. If a state machine, for example, defines an identifier for each state, event, and event handler, then given a pointer to its current state and an event to be processed, it can invoke the appropriate event handler as follows:

C++
handlers_[states_[currState->id].handlerIds_[event->id]](…);

This article describes the Registry template, a container that maps integral identifiers to objects. Because it uses an array, it requires less memory and provides faster insertions and lookups than std::map, which uses a tree. I use Registry often, falling back on map when

  • the object identifiers are not integers, or
  • the array is sparse, or
  • the template needs to store actual objects rather than pointers to them, which is what Registry does.

Sometimes, I even use the template for objects that don't have fixed identifiers; the template then assigns each object the identifier of the first available slot. If the registered objects are then displayed on the console, their identifiers can be used to refer to them in CLI commands.

Background

The Registry template is taken from the Robust Services Core (RSC), an open-source framework for developing robust C++ applications. RSC contains many instances of this template, which typically serve one or more of the following purposes:

  • to look up and display documentation or other information
  • to display system capabilities
  • to track or display system resources
  • to access attributes provided by flyweights
  • to access an abstract factory that is responsible for creating application-specific objects

The framework portions of RSC are defined by the directories nb (namespace NodeBase), nw (namespace NetworkBase), and sb (namespace SessionBase). The following table lists all of the classes that define a Registry in those directories:

Registry Registrants Example of Use
AlarmRegistry Alarm display the documentation for an alarm
CliIncrement CliCommand display all commands available in a CLI increment
CliRegistry CliIncrement display all increments available in the CLI
CliText CliParm display the parameters for a CLI command
CliTextParm CliText display the strings that are options for a CLI command
DaemonRegistry Daemon recreate threads that were forced to exit
LogGroup Log display the documentation for a log
LogGroupRegistry LogGroup display the documentation for a log group
ModuleRegistry Module display all static libraries
MutexRegistry SysMutex display all mutexes
ObjectPoolRegistry ObjectPool display all object pools
PosixSignalRegistry PosixSignal access a POSIX signal's attributes
StatisticsRegistry Statistics track all statistics
StatisticsRegistry StatisticsGroup display a group of related statistics
ToolRegistry Tool display all tools
IpServiceRegistry IpService display all IP services
FactoryRegistry Factory access a factory that creates messages and state machines
InvokerPool InvokerThread track the threads that run an application
InvokerPoolRegistry InvokerPool track the thread pools that run applications
Protocol Parameter access a parameter's attributes
Protocol Signal access a signal's attributes
ProtocolRegistry Protocol access a protocol's attributes
Service EventHandler access an application's event handlers
Service State access an application's states
Service Trigger access an application's observable events
ServiceRegistry Service access one of the system's applications

Using the Code

The code in this article has been edited to remove some things (function tracing and memory types) that are not relevant to the template's central purpose. These things appear in the full version of the code in the attached .zip file, so you can remove them if you're not using RSC.

Let's start by defining a type and null value for object identifiers. I chose 0 as the null value because

  • it's a legal array index (just in case it happens to get used as such), and
  • it always has the same bit pattern, even if it's truncated to a type that is smaller than uint32_t.
C++
typedef uint32_t id_t;      // identifier for an object in a Registry
constexpr id_t NIL_ID = 0;  // nil identifier

Objects stored in a Registry usually provide a RegCell member, which is used in two ways:

  • Before the object registers with a template instance, it sets its RegCell member to its fixed identifier. The template then registers it against that identifier.
  • If the object does not have a fixed identifier, it simply registers. The template assigns it the first available identifier and updates its RegCell member accordingly.
C++
//  Tracks the index at which an object was added to a registry's array.  An
//  object that resides in a registry usually includes this as a member and
//  implements a CellDiff function that returns the distance between the top
//  of the object and its RegCell member.  However, Registry also supports
//  registrants without RegCell members.
//
class RegCell
{
   template<class T> friend class Registry;
public:
   //  Until an object is registered, it has a nil identifier and has not
   //  been bound to the registry.
   //
   RegCell() : id(NIL_ID), bound(false) { }

   //  Before an object is destroyed, it should have been removed from the
   //  registry.
   //
   ~RegCell() { if(bound) Debug::SwLog("item is still registered", id); }

   //  Deleted to prohibit copying.
   //
   RegCell(const RegCell& that) = delete;
   RegCell& operator=(const RegCell& that) = delete;

   //  Before an object is registered, this function allow its index within
   //  the registry (and therefore its identifier) to be specified.  This is
   //  important for an object whose identifier must be fixed (because it is
   //  included in an interprocessor protocol, for example).
   //
   void SetId(id_t cid)
   {
      if(bound)
         Debug::SwLog("item already registered", pack2(id, cid));
      else
         id = cid;
   }

   //  Returns the object's index (identifier) within the registry.
   //
   id_t GetId() const { return id; }

   //  Returns a string for displaying the cell.
   //
   std::string to_str() const
   {
      auto str = std::to_string(id);
      if(!bound) str += " (not bound)";
      return str;
   }
private:
   //  The object's index (identifier) within the registry's array.
   //
   id_t id;

   //  Set when the object is added to the registry.  Cleared when the
   //  object is deregistered.
   //
   bool bound;
};

Now we can look at the Registry template itself, starting with its data and special member functions:

C++
//  A registry tracks objects derived from a common base class.  It uses
//  an array to save a pointer to each object that has been added to the
//  registry.  The array index at which an object is registered also acts
//  as an identifier for the object.  The first entry in the array is not
//  used and corresponds to NIL_ID (a nil object or nullptr).
//
template<class T> class Registry
{
public:
   //  Creates an empty registry.
   //
   Registry() : size_(0), capacity_(0), max_(0),
      diff_(NilDiff), delete_(false), registry_(nullptr) { }

   //  Deletes the objects in the registry (unless this action was not
   //  desired) and the deletes the registry's array.
   //
   ~Registry()
   {
      if((delete_) && (capacity_ > 0)) Purge();
      Memory::Free(registry_);
      registry_ = nullptr;
   }

   //  Deleted to prohibit copying.
   //
   Registry(const Registry& that) = delete;
   Registry& operator=(const Registry& that) = delete;
private:
   //  For initializing diff_.
   //
   static const ptrdiff_t NilDiff = -1;

   //  The number of items currently in the registry.
   //
   id_t size_;

   //  The current size of the registry.
   //
   id_t capacity_;

   //  The maximum size allowed for the registry.
   //
   id_t max_;

   //  The distance from a pointer to an item in the registry and its RegCell
   //  member.
   //
   ptrdiff_t diff_;

   //  Set if items in the registry should be deleted when overwritten or
   //  when the registry itself is deleted.
   //
   bool delete_;

   //  The registry, which is a dynamic array of pointers to registered items.
   //
   T** registry_;
};

Before a Registry can be used, its Init function must be invoked to set a few attributes:

C++
//  Allocates memory for the registry's array.  MAX is the maximum number
//  of objects that can register.  All objects must derive from the same
//  base class, with DIFF being the distance from the top of that base
//  class to the RegCell member that tracks an object's location in the
//  registry.  DEL is true if objects should be deleted when the registry
//  is deleted.  This is typical but would be prevented, for example, if
//  the objects were also added to another registry.  In such a case, the
//  objects would require one RegCell member per registry.  Alternatively,
//  the second version of the Insert function could be used, as it does
//  not require a RegCell member.
//
bool Init(id_t max, ptrdiff_t diff, bool del = true)
{
   if(registry_ != nullptr)
   {
      Debug::SwLog("already initialized", max_);
      return false;
   }
   if(diff < 0)
   {
      Debug::SwLog("no cell offset", diff);
      return false;
   }
   max_ = (max == 0 ? 0 : max + 1);
   diff_ = diff;
   mem_ = mem;
   delete_ = del;
   if(max_ == 0) return true;
   auto size = sizeof(T*) * ((max >> 3) + 2);
   registry_ = (T**) Memory::Alloc(size);
   capacity_ = (max >> 3) + 2;
   for(id_t i = 0; i < capacity_; ++i) registry_[i] = nullptr;
   return true;
}

To manipulate an object's RegCell member, Registry has to know where it is located. That is the purpose of diff_, which is the distance (in bytes) from the beginning of the object to its RegCell member. All the objects in the registry must use the same offset, so their common base class must define that offset.

The value of diff_ is returned by a grotty function that RSC classes always name CellDiff. For example, RSC's Parameter class is subclassed to define a flyweight for each parameter that can appear in a TLV-encoded message. It therefore needs a RegCell member and a CellDiff function that allows a Protocol class to initialize a Registry where the protocol's parameters will register. The outline (omitting other members) therefore looks like this:

C++
class Parameter : public Immutable
{
public:
   static const id_t MaxId = 63;
   static ptrdiff_t CellDiff() const;
protected:
   Parameter(id_t prid, id_t pid);  // protocol and parameter identifiers
private:
   RegCell pid_;
};

Parameter::Parameter(id_t prid, id_t pid)
{
   pid_.SetId(pid);
   auto reg = Singleton<ProtocolRegistry>::Instance();
   auto pro = reg->GetProtocol(prid);
   pro->BindParameter(*this);
}   

ptrdiff_t Parameter::CellDiff()
{
   uintptr_t local;
   auto fake = reinterpret_cast<const Parameter*>(&local);
   return ptrdiff(&fake->link_, fake);
}

class Protocol : public Immutable
{
public:
   bool BindParameter(Parameter& parameter) { return parameters_.Insert(parameter); }
private:
   Registry<Parameter> parameters_;
};

parameters_.Init(Parameter::MaxId, Parameter::CellDiff());  // in Protocol's constructor

Once Init has been invoked, the Registry function Cell can find the location of each element's RegCell. This function is only used internally and is therefore private:

C++
//  Returns ELEM's cell location.
//
RegCell* Cell(const T& item) const
{
   //  Ensure that the registry has been initialized with a RegCell offset.
   //
   if(diff_ <= 0)
   {
      Debug::SwLog("no cell offset", 0);
      return nullptr;
   }
   //  Ensure that ITEM is valid.
   //
   if(&item == nullptr)
   {
      Debug::SwLog("invalid item", 0);
      return nullptr;
   }
   return (RegCell*) getptr2(&item, diff_);  // adds diff_ to &item
}

A Registry owner provides functions for adding an item to, or removing it from, the registry. These functions (named Bind… and Unbind… by RSC classes) invoke the Insert and Erase functions provided by Registry:

C++
//  Adds ITEM, which contains a RegCell member, to the registry.
//
bool Insert(T& item)
{
   //  Ensure that ITEM has not already been registered.
   //
   auto cell = Cell(item);
   if(cell == nullptr) return false;
   if(cell->bound)
   {
      Debug::SwLog("already registered", cell->id);
      if((cell->id == NIL_ID) || (cell->id >= capacity_)) return false;
      return (registry_[cell->id] == &item);
   }
   //  If the item has a nil identifier, assign it to any available slot.
   //  If no slots remain, extend the size of the array.
   //
   if(cell->id == NIL_ID)
   {
      id_t start = 1;
      if(size_ + 1 >= capacity_)
      {
         start = capacity_;
         if(!Extend(capacity_)) return false;
      }
      for(auto i = start; i < capacity_; ++i)
      {
         if(registry_[i] == nullptr)
         {
            registry_[i] = &item;
            cell->id = i;
            cell->bound = true;
            ++size_;
            return true;
         }
      }
      return false;
   }
   //  If the item has a fixed identifier, assign it to that slot.  The
   //  array may first have to be extended.  If the slot is currently
   //  occupied, generate a log and delete the occupant unless this is
   //  a re-registration.
   //
   if(cell->id >= capacity_)
   {
      if(!Extend(cell->id)) return false;
   }
   if(registry_[cell->id] != &item)
   {
      if(registry_[cell->id] != nullptr)
      {
         if(delete_)
         {
            Debug::SwLog("identifier in use", cell->id);
            delete registry_[cell->id];
         }
         else
         {
            Erase(*registry_[cell->id]);
         }
      }
   }
   else
   {
      cell->bound = true;
      return true;
   }
   registry_[cell->id] = &item;
   cell->bound = true;
   ++size_;
   return true;
}
C++
//  Removes ITEM, which contains a RegCell member, from the registry.
//
bool Erase(T& item)
{
   //  If ITEM has been registered, remove it from the registry.
   //
   auto cell = Cell(item);
   if(cell == nullptr) return false;
   if(cell->id == NIL_ID) return false;
   if(cell->id >= capacity_)
   {
      Debug::SwLog("invalid cell", cell->id);
      return false;
   }
   if(registry_[cell->id] != &item)
   {
      Debug::SwLog("incorrect item", cell->id);
      return false;
   }
   registry_[cell->id] = nullptr;
   cell->id = NIL_ID;
   cell->bound = false;
   --size_;
   return true;
}

Both Insert and Erase have a second version that is used when registrants do not define a RegCell member. When a Registry contains items of this type, diff is set to 0 when invoking Init, and registrants are added and removed using the alternate functions:

C++
//  Adds ITEM to the registry in the slot specified by ID.  This
//  function is used with ITEM does not contain a RegCell member.
//
bool Insert(T& item, id_t id);

//  Removes ITEM from the slot specified by ID.  This function is
//  used when ITEM does not contain a RegCell member.
//
bool Erase(const T& item, id_t id);

Registry also provides typical container functions for accessing, iterating through, and counting items. Iteration usually takes the form

C++
for(auto item = reg_.First(); item != nullptr; reg_.Next(item))…

These functions are straightforward, so only their signatures are shown here:

C++
//  Returns the item registered against ID.
//
T* At(id_t id) const;

//  Returns the first item in the registry.
//
T* First() const;

//  Returns the first item at ID or higher.
//
T* First(id_t& id) const;

//  Updates ITEM to the next item in the registry.
//
void Next(T*& item) const;

//  Returns the first item that follows ITEM.
//
T* Next(const T& item) const;

//  Returns the first item that follows the slot identified by ID.
//
T* Next(id_t& id) const;

//  Returns the last item in the registry.
//
T* Last() const;

//  Updates ITEM to the previous item in the registry.
//
void Prev(T*& item) const;

//  Returns the first item that precedes ITEM.
//
T* Prev(const T& item) const;

//  Returns the number of items in the registry.
//
id_t Size() const;

//  Returns true if the registry is empty.
//
bool Empty() const;

Finally, there is the equivalent of clear:

C++
void Purge()
{
   for(id_t i = capacity_ - 1; i > 0; --i)
   {
      //  Clear the RegCell's bound flag so that its destructor won't
      //  generate a log.
      //
      auto& item = registry_[i];
      if(item != nullptr)
      {
         if(diff_ > 0)
         {
            auto cell = (RegCell*) getptr2(item, diff_);
            cell->bound = false;
         }
         delete item;
         item = nullptr;
         --size_;
      }
   }
}

History

  • 18th June, 2020: Initial version

License

This article, along with any associated source code and files, is licensed under The GNU General Public License (GPLv3)