Well, now it’s hard to remember the moment when I first realized that it’s actually a good idea to extract functions from the massive chunks of useful code. Either I got this knowledge from the “Complete Code” or “Clean Code” – it’s hard to recall. Actually, it does not matter so much. We all know that we should split the business logic into well named functions. The longest function I’ve seen in my life was longer than 5k lines. I’m personally acquainted with that “programmer”. I remember the time when I first faced that function. It’s not difficult to predict, that my very first reaction was – “WTF!!! Who made this piece of sh*t???” Yep, and that “programmer” is still hanging around in the office where I’m working. I don’t want to get deeper into this story, but I want to mention that the 5k-lines-function was the core of the ~150k-lines program. That program was eventually driven to a corner because of that piece of sh*t, which tremendously influenced the whole architecture of the application. In the end, the decision was made to rewrite that program completely from the ground.
This story illustrates one of the extremes of the “function-size problem”, which can lead to such unfortunate repercussions. The other extreme is to turn off your brain completely and start extracting classes everywhere with one-line functions inside. I don’t mean that one-liners are bad, what I’m talking about is to not forget to harness the power of your brain. There have to be some considerations in the first place about what you want to achieve.
Before I’ll try to investigate this problem further, I want to mention that actually there was quite an exciting episode of a battle (sort of) between Uncle Bob and Christin Gorman on the topic. Uncle Bob introduced the technique named “Extract till you drop”, which in short means – extract functions till there is nothing to extract anymore. Christin Gorman argued that this technique sounds like a no-brainer. For addition, there was a blog post by John Sonmez about refactoring of a .NET BCL-method using this technique (though the original intention was to show the comments, mostly, are bad).
Let’s consider the example of John’s refactoring. He took the following method:
internal static void SplitDirectoryFile(
string path, out string directory, out string file)
{
directory = null;
file = null;
if (path != null)
{
int length = path.Length;
int rootLength = GetRootLength(path);
if (length > rootLength && EndsInDirectorySeparator(path))
length--;
for (int pivot = length - 1; pivot >= rootLength; pivot--)
{
if (IsDirectorySeparator(path[pivot]))
{
directory = path.Substring(0, pivot);
file = path.Substring(pivot + 1, length - pivot - 1);
return;
}
}
directory = path.Substring(0, length);
}
return;
}
With the intention to make this code easier to read, John extracted a class and then extracted some functions. He ended up with the following class:
public class DirectoryFileSplitter
{
private readonly string validatedFullPath;
private int length;
private int rootLength;
private bool pivotFound;
public string Directory { get; set; }
public string File { get; set; }
public DirectoryFileSplitter(string validatedFullPath)
{
this.validatedFullPath = validatedFullPath;
length = validatedFullPath.Length;
rootLength = GetRootLength(validatedFullPath);
}
public void Split()
{
if (validatedFullPath != null)
{
IgnoreTrailingSlash();
FindPivotIndexBetweenEndOfStringAndRoot();
if(!pivotFound)
TrimDirectory();
}
}
private void TrimDirectory()
{
Directory = validatedFullPath.Substring(0, length);
}
private void FindPivotIndexBetweenEndOfStringAndRoot()
{
for (int pivot = length - 1; pivot >= rootLength; pivot--)
{
if (IsDirectorySeparator(validatedFullPath[pivot]))
{
Directory = validatedFullPath.Substring(0, pivot);
File = validatedFullPath.Substring(pivot + 1, length - pivot - 1);
pivotFound = true;
}
}
}
private void IgnoreTrailingSlash()
{
if (length > rootLength && EndsInDirectorySeparator(validatedFullPath))
length--;
}
}
Wow, yep? It’s not so simple to decide whether the refactoring helped to make the code easier to read or not. Seems like it became harder to read, actually. There was a relatively short function with useful comments, which was turned into a class with four functions without comments. I would not say that the new class is a bad class and the overall refactoring was a mistake and the programmer who did this must be executed. Not at all. I’m not so bloodthirsty. There are a couple of differences between these two code examples. Let’s consider them:
- If you are trying to deeply understand what the higher-level function does, then this function now is harder to read, than it was originally, because you have to hop over all functions and understand what’s going on inside each of them. Conversely, the original example is simpler to quickly skim through.
- If you are trying to understand the overall concept of the original function, then it’s refactored counterpart is better, cause we immediately see what conceptually this function does inside of itself.
- The third difference I see here is the maintenance cost. Considering our concrete example, I’d say that the refactored version is more expensive, than the first (at least you have to spend some time to do refactoring). Generally, the answer to the question of which example is more expensive in maintenance lies in the field of requirements. Those requirements dictate to us whether it is so important in this situation to adhere to SRP (Single Responsibility Principle) or not. If this function can be once written and forgotten till the end of times, than there is no reason to extract additional class with four methods inside, there are no reasons to spend time on refactoring. In contrast, if the functionality is expected to grow over time, then you have all reasons to refactor it into a class.
In addition, I want to address the situation when you accidentally (or deliberately) come across the similar function in a legacy system. Are you going to rush for extracting the class with four functions? My advice – don’t do that without a reason, even if your code base test coverage is closer to 100%. Why? Because there is no technical debt. I’m talking about a solid technical debt which causes the real harm.
So, there is nothing wrong with the “extract till you drop” technique. You just have to keep in mind some considerations, in my opinion.
To sum up, I want to state that you shouldn’t do mindless acts. You should think at first, analyze, make the conclusion and only after that – act.
Filed under: Best Practices, CodeProject, Design, Refactoring Tagged: extract till you drop