Techniques for Avoiding Code Duplication

Alasdair Craig

4.86/5 (39 votes)

15 Feb 2016CPOL5 min read

53K

191

Common ways that code gets duplicated, and techniques for avoiding or refactoring the duplication

Download source - 4.8 KB

Introduction

We all know that duplicated code is generally bad, and “Don’t Repeat Yourself” is a well-intended principle that is often mentioned in this regard.

Duplication leads to bugs of the sort “I fixed it over here, but forgot to over there.” Over time, the duplication might become varied slightly such that it’s not even clear which version is the correct version. Ideally, the duplicated code can be simply extracted to one function to be called from multiple places. However, sometimes fixing the duplication is not so simple or obvious.

The purpose of this article is to demonstrate five such non-obvious fixes using C++. All the examples are contrived, but serve to demonstrate typical real world duplication.

Overview

Functions are generally the same, but internally call different functions.
- Use function pointers or pointers to member functions
Functions are generally the same, but have parameters of, or return different types.
- Use function templates
Functions contain repeated small blocks of code.
- Use lambda expressions
Constructor overrides are very similar.
- Use delegating constructors
Extracting a function would lead to an undesirably long parameter list.
- Consider using std::bind

Example 1: Functions are Generally the Same, But Internally Call Different Functions

Here, we have two functions that are very similar. The duplication is because the important difference is at the inner most level, so extracting a function to do the common code isn't immediately obvious.

C++

void MenuHandler::Function1()
{
	if (true) // various preconditions
		{
		size_t id = 0;
		int local = 0;
		if (true) // lookup and settings
			{
			MyObject myObject = objects.at(id);
			myObject.DoSomeWork1(local);
			}
		}
	// cleanup
}

void MenuHandler::Function2()
{
	if (true) // various preconditions
		{
		size_t id = 0;
		int local = 0;
		if (true) // lookup and settings
			{
			MyObject myObject = objects.at(id);
			myObject.DoSomeWork2(local);
			}
		}
	// cleanup
}

The Problem

In time Function3, 4, 5, etc. may be added thereby exacerbating the duplication.

A Solution

Pointers-to-member-functions are used. The code is first extracted to its own member function that takes as a parameter a pointer to the member function of MyObject that needs to be called at the inner level. Function1 and Function2 then become thin wrappers to the extracted generic function.

C++

void MenuHandler::GenericWorkFunc(void (MyObject::*workFunc)(int))
{
	if (true) // various preconditions
		{
		size_t id = 0;
		int local = 0;
		if (true) // lookup and settings
			{
			MyObject myObject = objects.at(id);
			(myObject.*workFunc)(local);
			}
		}
	// cleanup
}

void MenuHandler::Function1()
{
	GenericWorkFunc(&MyObject::DoSomeWork1);
}

void MenuHandler::Function2()
{
	GenericWorkFunc(&MyObject::DoSomeWork2);
}

Example 2: Functions are Generally the Same, But Have Parameters Of, Or Return Different Types

Here, we have two member functions dealing with dialog class instances. The functions are duplicated because they operate on instances of different types that don't inherit a common base class. Using inheritance then is not an option, with this being an intended design choice for whatever reason.

C++

void MyObject::InitDialog1()
{
	if (m_dialogType1)
		{
		if (m_dialogType1->m_spreadsheet)
			{
			m_dialogType1->m_spreadsheet->Show();
			return;
			}
		else
			{
			delete m_dialogType1;
			}
		}
	m_dialogType1 = new CDialogType1();
}

void MyObject::InitDialog2()
{
	if (m_dialogType2)
		{
		if (m_dialogType2->m_spreadsheet)
			{
			m_dialogType2->m_spreadsheet->Show();
			return;
			}
		else
			{
			delete m_dialogType2;
			}
		}
	m_dialogType2 = new CDialogType2();
}

The Problem

The member functions have the same structure and purpose, but differ only on the hardcoded type and instance they operate on. If the structure of one ever changes, the other will presumably have to change accordingly as well. This is unsatisfactory from a maintenance point of view. For example, a third and fourth type might get added in the future, or the definition changes, or needs additional checks. With the functions duplicated, this maintenance becomes onerous.

A Solution

Use template programming, specifically member function templates. The common code gets extracted into one definition with template parameter T. As long as T can get expanded into a class that will compile, the generic code will build. I call it generic to indicate that it's not hardcoded to any one type, but rather that the type is left until the template is instantiated.

C++

template <typename T>
void MyObject::GenericInitDialog(T *&dialog)
{
	if (dialog)
		{
		if (dialog->m_spreadsheet)
			{
			dialog->m_spreadsheet->Show();
			return;
			}
		else
			{
			delete dialog;
			}
		}
	dialog = new T();
}

Example 3: Functions Contain Repeated Small Blocks of Code

Here, we have a class member function that has a small code block used very similarly four times. Extracting the repeated code block to a new member function might not be wanted because it's considered overkill for a simple operation not used anywhere else.

C++

void MyObject::DrawVertexMarker(CView *pView)
{
	CVertex vertex;
	if (GetVertex(m_polygon1, GetIndex1(), vertex))
		Utils::DrawVertexMarker(pView, vertex);

	if (m_bShowSecondary)
		{
		if (m_method == EDrawOption1)
			{
			CVertex vertex2;
			if (GetVertex(m_polygon2, GetIndex2(), vertex2))
				Utils::DrawVertexMarker(pView, vertex2);
			}
		else
			{
			CVertex startVertex;
			if (GetVertex(m_polygon2, GetStartIndex(), startVertex))
				Utils::DrawVertexMarker(pView, startVertex);

			CVertex endVertex;
			if (GetVertex(m_polygon2, GetEndIndex(), endVertex))
				Utils::DrawVertexMarker(pView, endVertex);
			}
		}
}

The Problem

The important differences between the four uses is obscured by the implementation details. It's at risk of further duplication if the code blocks get subsequently modified.

A Solution

Use lambda expressions (C++11). Move the duplicated code into a lambda expression to define in one place a callable object local to the function that can be reused each of the four times.

C++

void MyObject::DrawVertexMarker(CView *pView)
{
	auto drawMarker = [=](CPolygon *polygon, int selection)
		{
		CVertex vertex;
		if (GetVertex(polygon, selection, vertex))
			Utils::DrawVertexMarker(pView, vertex);
		};

	drawMarker(polygon1, GetIndex1());
	if (m_bShowSecondary)
		{
		if (m_method == EDrawOption1)
			{
			drawMarker(m_polygon2, GetIndex2());
			}
		else
			{
			drawMarker(m_polygon2, GetStartIndex());
			drawMarker(m_polygon2, GetEndIndex());
			}
		}
}

Example 4: Constructor Overrides Are Very Similar

Here, we have a property page class with two constructors. They differ in whether it's default constructed or given parameters.

C++

class MyPage : public PropertyPage
{
	MyObject *m_pObject;
	MyData *m_pData;
	MyResource *m_pResource;

	MyPage();
	MyPage(MyObject *pObject, MyData *pData);
};

MyPage::MyPage()
	: PropertyPage(MyPage::IDD)
	, m_pObject(nullptr)
	, m_pData(nullptr)
	, m_pResource(nullptr)
{
	// initialization...
}

MyPage::MyPage(MyObject *pObject, MyData *pData)
	: PropertyPage(MyPage::IDD)
	, m_pObject(pObject)
	, m_pData(pData)
	, m_pResource(nullptr)
{
	// initialization...
}

The Problem

Two constructors that need to be kept in sync.

A Solution

Use delegating constructors (C++11). Let the simpler default constructor delegate to the two-parameter constructor with appropriate initial values. One place to modify, one place to maintain.

C++

MyPage::MyPage()
	: MyPage(nullptr, nullptr)
{}

MyPage::MyPage(MyObject *pObject, MyData *pData)
	: PropertyPage(LinkMyPage::IDD)
	, m_pObject(pObject)
	, m_pData(pData)
	, m_pResource(nullptr)
{
	// initialization...
}

Example 5: Extracting a Function Would Lead to an Undesirably Long Parameter List

Here, we have calls to a function where most of the parameters are the same and only one differs between the calls. This is very typical with the "Extract Function" refactoring where a repeated code block is moved into its own function for reuse. Local scope variables must get passed in as parameters and they're mostly the same for each call.

C++

WorkFunc(1, paramB, paramC, paramD, paramE, paramF, paramG);
WorkFunc(2, paramB, paramC, paramD, paramE, paramF, paramG);
WorkFunc(3, paramB, paramC, paramD, paramE, paramF, paramG);

The Problem

Despite being a very worthwhile refactoring, it often leads to a long parameter list that is unwieldy and obscures the readability of the calling function.

A Solution

std::bind is an option that could be considered. This binds arguments to the function to be called and returns a callable object to be invoked instead. The callable object has parameters for only the ones that differ.

C++

using namespace std::placeholders;
auto doWork = std::bind(WorkFunc, _1, paramB, paramC, paramD, paramE, paramF, paramG);

doWork(1);
doWork(2);
doWork(3);

What can be done with std::bind can also be done with lambdas, which is normally the better way to go. However long lambdas are inadvisable and then std::bind might be preferable. The alternative using a lambda expression would be as follows:

C++

auto doWork = [=](int task)
{
    WorkFunc(task, paramB, paramC, paramD, paramE, paramF, paramG);
};

doWork(1); 
doWork(2); 
doWork(3);

Conclusion

These are merely suggestions for minimizing duplication, techniques that I find useful. But it's opinion-based and coding style is subjective. Readability, project coding conventions, and team buy-in are equally important.

The emphasis for reducing duplication is not at all for reducing lines of code, but for reducing the opportunity for bugs, and to make maintaining easier. Don't Repeat Yourself.

History

15^th February, 2016: Initial article

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)

Techniques for Avoiding Code Duplication

Introduction

Overview

Example 1: Functions are Generally the Same, But Internally Call Different Functions

The Problem

A Solution

More Reading

Example 2: Functions are Generally the Same, But Have Parameters Of, Or Return Different Types

The Problem

A Solution

More Reading

Example 3: Functions Contain Repeated Small Blocks of Code

The Problem

A Solution

More Reading

Example 4: Constructor Overrides Are Very Similar

The Problem

A Solution

More Reading

Example 5: Extracting a Function Would Lead to an Undesirably Long Parameter List

The Problem

A Solution

More Reading

Conclusion

History

License