Introduction
We all know that duplicated code is generally bad, and “Don’t Repeat Yourself” is a well-intended principle that is often mentioned in this regard.
Duplication leads to bugs of the sort “I fixed it over here, but forgot to over there.” Over time, the duplication might become varied slightly such that it’s not even clear which version is the correct version. Ideally, the duplicated code can be simply extracted to one function to be called from multiple places. However, sometimes fixing the duplication is not so simple or obvious.
The purpose of this article is to demonstrate five such non-obvious fixes using C++. All the examples are contrived, but serve to demonstrate typical real world duplication.
Overview
- Functions are generally the same, but internally call different functions.
- Use function pointers or pointers to member functions
- Functions are generally the same, but have parameters of, or return different types.
- Functions contain repeated small blocks of code.
- Constructor overrides are very similar.
- Use delegating constructors
- Extracting a function would lead to an undesirably long parameter list.
Example 1: Functions are Generally the Same, But Internally Call Different Functions
Here, we have two functions that are very similar. The duplication is because the important difference is at the inner most level, so extracting a function to do the common code isn't immediately obvious.
void MenuHandler::Function1()
{
if (true) {
size_t id = 0;
int local = 0;
if (true) {
MyObject myObject = objects.at(id);
myObject.DoSomeWork1(local);
}
}
}
void MenuHandler::Function2()
{
if (true) {
size_t id = 0;
int local = 0;
if (true) {
MyObject myObject = objects.at(id);
myObject.DoSomeWork2(local);
}
}
}
The Problem
In time Function3
, 4
, 5
, etc. may be added thereby exacerbating the duplication.
A Solution
Pointers-to-member-functions are used. The code is first extracted to its own member function that takes as a parameter a pointer to the member function of MyObject
that needs to be called at the inner level. Function1
and Function2
then become thin wrappers to the extracted generic function.
void MenuHandler::GenericWorkFunc(void (MyObject::*workFunc)(int))
{
if (true) {
size_t id = 0;
int local = 0;
if (true) {
MyObject myObject = objects.at(id);
(myObject.*workFunc)(local);
}
}
}
void MenuHandler::Function1()
{
GenericWorkFunc(&MyObject::DoSomeWork1);
}
void MenuHandler::Function2()
{
GenericWorkFunc(&MyObject::DoSomeWork2);
}
More Reading
Example 2: Functions are Generally the Same, But Have Parameters Of, Or Return Different Types
Here, we have two member functions dealing with dialog class instances. The functions are duplicated because they operate on instances of different types that don't inherit a common base class. Using inheritance then is not an option, with this being an intended design choice for whatever reason.
void MyObject::InitDialog1()
{
if (m_dialogType1)
{
if (m_dialogType1->m_spreadsheet)
{
m_dialogType1->m_spreadsheet->Show();
return;
}
else
{
delete m_dialogType1;
}
}
m_dialogType1 = new CDialogType1();
}
void MyObject::InitDialog2()
{
if (m_dialogType2)
{
if (m_dialogType2->m_spreadsheet)
{
m_dialogType2->m_spreadsheet->Show();
return;
}
else
{
delete m_dialogType2;
}
}
m_dialogType2 = new CDialogType2();
}
The Problem
The member functions have the same structure and purpose, but differ only on the hardcoded type and instance they operate on. If the structure of one ever changes, the other will presumably have to change accordingly as well. This is unsatisfactory from a maintenance point of view. For example, a third and fourth type might get added in the future, or the definition changes, or needs additional checks. With the functions duplicated, this maintenance becomes onerous.
A Solution
Use template programming, specifically member function templates. The common code gets extracted into one definition with template parameter T
. As long as T
can get expanded into a class that will compile, the generic code will build. I call it generic to indicate that it's not hardcoded to any one type, but rather that the type is left until the template is instantiated.
template <typename T>
void MyObject::GenericInitDialog(T *&dialog)
{
if (dialog)
{
if (dialog->m_spreadsheet)
{
dialog->m_spreadsheet->Show();
return;
}
else
{
delete dialog;
}
}
dialog = new T();
}
More Reading
Example 3: Functions Contain Repeated Small Blocks of Code
Here, we have a class member function that has a small code block used very similarly four times. Extracting the repeated code block to a new member function might not be wanted because it's considered overkill for a simple operation not used anywhere else.
void MyObject::DrawVertexMarker(CView *pView)
{
CVertex vertex;
if (GetVertex(m_polygon1, GetIndex1(), vertex))
Utils::DrawVertexMarker(pView, vertex);
if (m_bShowSecondary)
{
if (m_method == EDrawOption1)
{
CVertex vertex2;
if (GetVertex(m_polygon2, GetIndex2(), vertex2))
Utils::DrawVertexMarker(pView, vertex2);
}
else
{
CVertex startVertex;
if (GetVertex(m_polygon2, GetStartIndex(), startVertex))
Utils::DrawVertexMarker(pView, startVertex);
CVertex endVertex;
if (GetVertex(m_polygon2, GetEndIndex(), endVertex))
Utils::DrawVertexMarker(pView, endVertex);
}
}
}
The Problem
The important differences between the four uses is obscured by the implementation details. It's at risk of further duplication if the code blocks get subsequently modified.
A Solution
Use lambda expressions (C++11). Move the duplicated code into a lambda expression to define in one place a callable object local to the function that can be reused each of the four times.
void MyObject::DrawVertexMarker(CView *pView)
{
auto drawMarker = [=](CPolygon *polygon, int selection)
{
CVertex vertex;
if (GetVertex(polygon, selection, vertex))
Utils::DrawVertexMarker(pView, vertex);
};
drawMarker(polygon1, GetIndex1());
if (m_bShowSecondary)
{
if (m_method == EDrawOption1)
{
drawMarker(m_polygon2, GetIndex2());
}
else
{
drawMarker(m_polygon2, GetStartIndex());
drawMarker(m_polygon2, GetEndIndex());
}
}
}
More Reading
Example 4: Constructor Overrides Are Very Similar
Here, we have a property page class with two constructors. They differ in whether it's default constructed or given parameters.
class MyPage : public PropertyPage
{
MyObject *m_pObject;
MyData *m_pData;
MyResource *m_pResource;
MyPage();
MyPage(MyObject *pObject, MyData *pData);
};
MyPage::MyPage()
: PropertyPage(MyPage::IDD)
, m_pObject(nullptr)
, m_pData(nullptr)
, m_pResource(nullptr)
{
}
MyPage::MyPage(MyObject *pObject, MyData *pData)
: PropertyPage(MyPage::IDD)
, m_pObject(pObject)
, m_pData(pData)
, m_pResource(nullptr)
{
}
The Problem
Two constructors that need to be kept in sync.
A Solution
Use delegating constructors (C++11). Let the simpler default constructor delegate to the two-parameter constructor with appropriate initial values. One place to modify, one place to maintain.
MyPage::MyPage()
: MyPage(nullptr, nullptr)
{}
MyPage::MyPage(MyObject *pObject, MyData *pData)
: PropertyPage(LinkMyPage::IDD)
, m_pObject(pObject)
, m_pData(pData)
, m_pResource(nullptr)
{
}
More Reading
Example 5: Extracting a Function Would Lead to an Undesirably Long Parameter List
Here, we have calls to a function where most of the parameters are the same and only one differs between the calls. This is very typical with the "Extract Function" refactoring where a repeated code block is moved into its own function for reuse. Local scope variables must get passed in as parameters and they're mostly the same for each call.
WorkFunc(1, paramB, paramC, paramD, paramE, paramF, paramG);
WorkFunc(2, paramB, paramC, paramD, paramE, paramF, paramG);
WorkFunc(3, paramB, paramC, paramD, paramE, paramF, paramG);
The Problem
Despite being a very worthwhile refactoring, it often leads to a long parameter list that is unwieldy and obscures the readability of the calling function.
A Solution
std::bind
is an option that could be considered. This binds arguments to the function to be called and returns a callable object to be invoked instead. The callable object has parameters for only the ones that differ.
using namespace std::placeholders;
auto doWork = std::bind(WorkFunc, _1, paramB, paramC, paramD, paramE, paramF, paramG);
doWork(1);
doWork(2);
doWork(3);
What can be done with std::bind
can also be done with lambdas, which is normally the better way to go. However long lambdas are inadvisable and then std::bind
might be preferable. The alternative using a lambda expression would be as follows:
auto doWork = [=](int task)
{
WorkFunc(task, paramB, paramC, paramD, paramE, paramF, paramG);
};
doWork(1);
doWork(2);
doWork(3);
More Reading
Conclusion
These are merely suggestions for minimizing duplication, techniques that I find useful. But it's opinion-based and coding style is subjective. Readability, project coding conventions, and team buy-in are equally important.
The emphasis for reducing duplication is not at all for reducing lines of code, but for reducing the opportunity for bugs, and to make maintaining easier. Don't Repeat Yourself.
History
- 15th February, 2016: Initial article