A Method Object Helper - Putting More Class in Method Refactoring

George Henry 1954

19 Feb 2014, CPOL, 25 min read

Describes an innovative approach, aided by a small library, that lends itself to easy, successful use of the "replace method with method object" refactoring pattern.

Download source - 33.4 KB

Introduction

The method-refactoring strategy dubbed Replace Method with Method Object is very useful and merits a quick review, to set the background for the discussion to follow. There are many good explanations of the strategy available on the Web. My understanding and appreciation of it was deepened by reading the article at the preceding link.

The problem that motivates using this strategy is, "a long method [or code sequence within a method] that uses [many] local variables" so that one cannot readily decompose it into many smaller methods that have short parameter lists.

Closely related is the situation in which existing methods have too many parameters. A couple of other approaches can be used to reduce the numbers of parameters passed:

The Introduce Parameter Object strategy entails grouping a set of parameters that "naturally go together" into a new class or structure. In theory, after doing this, one should notice behavior that also naturally goes with the data, which can lead to additional beneficial refactoring.
The Preserve Whole Object strategy suggests that if several properties of an object are being passed as parameters to a method, why not pass the entire object?

If neither of the above approaches seems satisfactory, "replace method with method object" can be engaged as "heavy artillery" that will help resolve even the stickiest situations. This article and the associated code are aimed at helping with the application of this strategy.

Background

I tend to have reservations about the "preserve whole object" strategy, because there is a huge difference between passing a reference to an object and passing a snapshot of selected property values from that object. If I pass (by value) a snapshot of property values, the called method can do what it wants with that limited amount of data without risk of inadvertently changing the state of the original object; but if I pass a reference to the original object, the called method can cause all sorts of havoc. Taking into account factors like collaborating with other developers in real time and multiple people making changes to the code over a possibly long period of time, it just seems safer to me to expose to methods only the data they genuinely need to access.
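The difference between the two passing styles can be made concrete with a minimal sketch. The Article class and both methods below are hypothetical names invented for illustration; they are not part of the article's library.

```csharp
using System;

// Hypothetical type used only for this illustration.
public sealed class Article
{
    public string Title { get; set; }
    public int Rating { get; set; }
}

public static class PassingStyles
{
    // Receives the whole object: nothing stops it from mutating the caller's state.
    public static string DescribeWholeObject(Article article)
    {
        article.Rating = 0; // accidental side effect, visible to the caller
        return article.Title + " (" + article.Rating + ")";
    }

    // Receives copies of two values: it can only change its own locals.
    public static string DescribeSnapshot(string title, int rating)
    {
        rating = 0; // affects only the local copy
        return title + " (" + rating + ")";
    }
}
```

After calling DescribeSnapshot(article.Title, article.Rating) the caller's article is unchanged, whereas DescribeWholeObject(article) leaves its Rating at 0.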

"Introduce parameter object" runs up against a wall when the number of parameters that one wants to group together becomes large. This same wall is actually incorporated as a "feature" of the "replace method with method object" strategy as it was originally articulated: "Give the new class a constructor that takes the source object and each parameter." If that constructor is going to have to take more than some small number n - probably around n=5, as a matter of personal taste - parameters, no, that is not something that I want to do. While calling a constructor (or any method) that's excessively endowed with parameters may be viewed as a necessarily sub-optimal part of an overall improvement to be yielded by applying a given strategy, it is also going to involve trading one "code smell" for another, and one might earnestly hope for a better alternative. "Too many parameters" is one of the problems that I definitely want to solve - not one that I am willing to embrace as part of a long-term solution to other problems.

It seems clear (to me) that one of the fundamental questions that should be dealt with, but that is frequently glossed over, is, "What is wrong with a method (or constructor) having a lengthy parameter list? Why does this make code hard to read, understand and maintain?" Complaining about inherent code complexity and low level of abstraction may have its place, but for the purposes of this discussion I will assume that there isn't much we can do about those problems. So we have some large amount of data that needs to be dealt with on a piecemeal basis at some point, or perhaps multiple points. At the very least, we need to stuff all of that data, all at once, into an object so that it is more convenient to work with.

C#'s object initializer syntax can be an aid, along with good naming practice. Let's assume that we are creating a new "parameter object" class and have control of naming properties. Further, we understand the code and can develop meaningful property names, yielding good naming that promotes understanding for the next person who comes along, or for ourselves a few days / weeks / months in the future. Then we can associate each parameter (in its new incarnation as a property) with a name that gives it as much meaning as possible in the context of the new class that we are creating. The same applies if we are creating a "method object" that's going to help us refactor a long, complicated method that already takes too many parameters, or would end up doing so if we used a less potent approach to refactoring.

So, because we are creating a method object or parameter object and giving its properties very meaningful names, by using object initializer syntax instead of passing positional parameters to a method, we're able to display more information to the reader of our code - specifically, strong hints about what the bits of data we're collecting together will mean and how they will be used in the new context of the method object.

To ensure that this point is made clear, consider this method signature, which I just grabbed from a question about too many parameters posted on Code Review Stack Exchange:

XmlNode LastNArticles(
    int NumArticles,
    int Page,
    String Category = null, 
    String Author = null,
    DateTime? Date = null,
    String Template = null,
    ParseMode ParsingMode = ParseMode.Loose,
    PageParser Parser = null);

A call to the method might look like this:

var node = LastNArticles(20, 5, Categories.Miscellaneous, currentAuthor, selectedDate, null,
    ParseMode.Strict, selectedParser);

Regardless of whether such a call is considered "clear enough" or not by someone who already has some knowledge of the code and the application of which it is a part, wouldn't it be easier to understand if we could, and did, specify explicitly what parameter each argument value gets assigned to? One could write:

var node = LastNArticles(
    NumArticles: 20,
    Page: 5,
    Category: Categories.Miscellaneous,
    Author: currentAuthor,
    Date: selectedDate,
    ParsingMode: ParseMode.Strict,
    Parser: selectedParser);

This is already legal C# syntax, although we aren't usually encouraged to use it for the purpose illustrated above. (Notice that we can omit the initialization of Template because the method signature specified a default value for that parameter. With the parameters explicitly named, we could also rearrange their order, listing them alphabetically or according to some other scheme that makes it easier to look up the value to which a given parameter was initialized.)

And this looks very much like object initializer syntax, the biggest difference between the two being whether one uses an assignment operator or a colon to match the parameter names with the values being assigned.
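The named-argument behavior described above can be tried out with a small, self-contained sketch. Describe is a hypothetical stand-in for LastNArticles with a shortened parameter list; it only echoes what it receives.

```csharp
using System;

public static class NamedArgumentDemo
{
    // Hypothetical stand-in for LastNArticles; it only reports its arguments.
    public static string Describe(
        int numArticles,
        int page,
        string category = null,
        string author = null,
        string template = null)
    {
        return string.Format(
            "{0} articles, page {1}, category={2}, author={3}, template={4}",
            numArticles, page, category ?? "(any)", author ?? "(any)", template ?? "(default)");
    }

    public static string Positional()
    {
        // The reader has to know the signature to tell which null is which.
        return Describe(20, 5, null, "gordon", null);
    }

    public static string Named()
    {
        // Same call: names are explicit, order is free, defaulted parameters are omitted.
        return Describe(author: "gordon", page: 5, numArticles: 20);
    }
}
```

Both methods produce the same string; only the readability of the call site differs.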

If we were constructing a parameter object for this method, it might be called ArticleSearchParameters and might be initialized like this:

var parameters = new ArticleSearchParameters
{
    Author = currentAuthor,
    Category = Categories.Miscellaneous,
    Date = selectedDate,
    NumArticles = 20,
    Page = 5,
    ParseMode = ParseMode.Strict,
    Parser = selectedParser
};

Although this provides essentially the same information, to both the code and the reader, as the named-argument method call shown earlier, it is syntax that we are more used to seeing in C# - and that won't be flagged as "redundant" by style-checking tools!

Before we rush off to start creating new classes and structures to replace lengthy parameter lists, we do well to pause and consider that the semantics surrounding method parameters are significantly richer than those related to object creation. With methods, we have in, ref and out parameters, and the method signature also explicitly states which parameters are required and which are optional, with default values supplied if argument values aren't passed to the optional parameters. And we give up all of that, when we use either of the "introduce parameter object" or the "replace method with method object" strategies!
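As a reminder of what a signature expresses on its own, here is a minimal sketch with one parameter of each kind. TryParseRating and its parameter names are invented for this illustration.

```csharp
using System;

public static class SignatureSemantics
{
    // 'text' is required input, 'attempts' is read and updated (ref),
    // 'value' is output only (out), and 'fallback' is optional input with a default.
    public static bool TryParseRating(
        string text,
        ref int attempts,
        out int value,
        int fallback = 3)
    {
        attempts++;
        if (int.TryParse(text, out value))
        {
            return true;
        }

        value = fallback;
        return false;
    }
}
```

The compiler enforces every one of these roles at each call site. A plain class with settable properties carries none of that information, which is the gap the rest of this article tries to close.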

"What can be done so that we don't have to give up the advantages of method specification and call semantics in moving to the use of objects to pass parameters?" was the question that led me down a path of design and development that led to the creation of something potentially useful: a MethodObjectHelper class that provides specific purposeful infrastructure, along with a small number of custom attribute and exception types; and an associated standard pattern for creating a robust method object that doesn't require giving up the benefits associated with method signatures and calls as an unfortunate side-effect. Admittedly, in this approach, the analogs to ref and out parameters are imperfect; and in fact, everything requires just a bit more thought, ceremony and typing than specifying method signatures and calls (which are rendered economical via their inclusion in C# as direct syntactic features of the language) - but I think the code and associated patterns may be of benefit. So, I am hereby publishing this "system," and will see how many in the C# development community agree with me. And I don't see any reason why VB.NET developers couldn't also use this, with appropriate translation. :)

I'll put the MethodObjectHelper class front and center, and call what I am going to describe the MOH-assisted approach to the "replace method with method object" process and outcome.

MethodObjectHelper and the associated attributes, exceptions and code artifact miscellanea are available as a download, and if I were to try to explain how they work in detail, this would become a very long article (that no one would read). Instead, I am going to explain how to use them to build a MOH-assisted method object class in a very straightforward, algorithmic (clearly defined, step-by-step) manner.

Using the code

Step One: Construct a method signature, if you don't already have one. Please consider all of the usual options and use them freely:

Which parameters, if any, need to be declared as ref or out parameters?
Which parameters must be initialized by the calling code, and which ones can be given default values so that initialization by the calling code is optional?
What should the return value and type of the method be, if any?

If you already have a Big Ugly Method (BUM) that takes far too many parameters, so much the better, because you can skip a step, or at least you are somewhat ahead of the game. If the existing method is currently referencing values in its context that are currently "global" from its perspective, access to those will have to be preserved by adding them to its parameter list. (Things may have to apparently get worse before they get better.)

Step Two: Construct a class in which each of the parameters to the BUM resulting from the first step becomes a property. It is appropriate, at this stage, to actually start coding the class; just create a property with the same type and name as each of the BUM's parameters. Give them all the public access modifier - although they won't actually be visible to external code at run-time under normal circumstances, due to a trick of the implementation that will be explained later. In parameters need only a get accessor - for now, just trust me on that point. Ref and out parameters should have both get and set accessors. Unless your method signature specifies the return type as void, also create a property, called Result or ReturnValue, representing the return value.

For now, make all of the accessors empty; we'll do something special with them in a later step.

Step Three: Decorate the properties created in step two with attributes that will define how they are used. In and ref parameter properties need to be decorated with the ParameterProperty attribute. Ref and out parameter properties must be decorated with the ResultProperty attribute. Notice that ref parameters have to be decorated with both attributes.

The ResultProperty attribute is parameterless.

ParameterProperty has one required parameter, which sets a Boolean property called MustBeInitialized.

If you pass true, the decorated property must be initialized when the method object is used, at what I call its invocation or activation time. Like a method call, the method object will have a very brief lifetime (unless it ends up being treated as a data container, which is possible but not particularly recommended).
If you pass false, the decorated property can be left uninitialized by the code that invokes / activates the method object.

As with optional method parameters, each optionally-initialized property should have a default value, which can be specified in one of two ways:

The most straightforward way is to set the ParameterProperty attribute's DefaultValue property to the desired default value for the decorated property. This value can be "a constant expression, typeof expression or array creation expression" (to quote the error message that appears if you try to assign anything else) that is consistent with the attribute property's type.
If those restrictions are too confining, any default value that can be assigned to the decorated property can be set in code. There is a particular point in the flow of control where it makes sense to do this, which I will mention at the appropriate point further along in these instructions. If you find it necessary to resort to this technique, it will be a good idea to assign true to the ParameterProperty attribute's DefaultValueSetInCode property, which has a documentary purpose: It redirects a reader of the code to look for the default-setting code in the appropriate and customary place, and it serves notice that the developer didn't just (sloppily, and potentially disastrously) omit assigning a default value to the property.
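To show how the decorations from this step might look on a class, here is a sketch that declares its own stand-in attribute types. The real ParameterProperty and ResultProperty attributes ship with the download and may differ in detail; the declarations below are assumptions based only on the description above, and Gonkulator is a placeholder class name.

```csharp
using System;

// Hypothetical stand-ins; the real attribute types ship with the article's download.
[AttributeUsage(AttributeTargets.Property)]
public sealed class ParameterPropertyAttribute : Attribute
{
    public ParameterPropertyAttribute(bool mustBeInitialized)
    {
        MustBeInitialized = mustBeInitialized;
    }

    public bool MustBeInitialized { get; private set; }
    public object DefaultValue { get; set; }
    public bool DefaultValueSetInCode { get; set; }
}

[AttributeUsage(AttributeTargets.Property)]
public sealed class ResultPropertyAttribute : Attribute
{
}

public class Gonkulator
{
    // Required input (an "in" parameter analog).
    [ParameterProperty(true)]
    public string Name { get; }

    // Optional input and output (a "ref" analog) with a constant default.
    [ParameterProperty(false, DefaultValue = 3)]
    [ResultProperty]
    public int Rating { get; protected set; }

    // Optional input whose default (DateTime.Now) is not a constant expression,
    // so it cannot appear in the attribute and has to be assigned in code.
    [ParameterProperty(false, DefaultValueSetInCode = true)]
    public DateTime Timestamp { get; }

    // Return value analog.
    [ResultProperty]
    public string Result { get; protected set; }
}
```

The auto-properties here are placeholders only; a later step replaces the accessor bodies.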

Step Four: As you may have guessed, the ParameterProperty and ResultProperty attributes serve practical, functional purposes, in that code in the MethodObjectHelper class actually uses them at run-time. To use the MOH-assisted approach that I am here presenting, it is essential that every method object have an instance of the MethodObjectHelper class. The helper instance is inserted via composition, using this code (please copy and paste):

/// <summary>
/// Provides input, output and associated validation services.
/// </summary>
protected readonly MethodObjectHelper _methodObjectHelper = new MethodObjectHelper();

The most significant "why?" question likely to arise when considering that line of code is, "Why does the field have protected access?" The answer is that method object classes can be base classes (inherited or derived from) and every method object instance requires exactly one MethodObjectHelper instance.

Step Five: The method object must have an explicit default constructor. Here again, I will ask you to copy and paste:

/// <summary>
/// Initializes a new instance of the <see cref="MethodObject"/> class.
/// The protected access modifier prevents instantiation by classes that are not derived from this class.
/// </summary>
protected MethodObject()
{
}

Of course, please replace the instances of "MethodObject" with the name of your class, which I hope is very descriptive. As guidance on naming your class, consider this example: If the method you are replacing was named, or would have been named, Gonkulate, your method object class should be named Gonkulator. :)

Step Six: Provide a complete set of property names as public static readonly strings, using a method that extracts the name of a property from an expression that accesses it. Here is an example:

/// <summary>
/// Provides the name of the <see cref="Name"/> property as a <see cref="String"/>.
/// </summary>
public static readonly string NameOfName = MemberInfoHelper.GetMemberName(() => Instance.Name);

The names of all of the parameter and result properties that you have defined should be provided in a similar manner.

In executing this step, you will also need to have a static instance of your class available for the property name extractor method to work with. Here's the code for that instantiation (please copy and paste, then replace the class name):

/// <summary>
/// Facilitates initialization of the static readonly property names; probably
/// shouldn't be used for other purposes. 
/// </summary>
private static readonly MethodObject Instance = new MethodObject();
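For readers curious how a name extractor of this kind can work, here is a minimal sketch built on expression trees. MemberNameSketch is a hypothetical stand-in for the download's MemberInfoHelper, whose actual signature and behavior may differ; Gonkulator is again a placeholder class.

```csharp
using System;
using System.Linq.Expressions;

// Hypothetical stand-in for the download's MemberInfoHelper.
public static class MemberNameSketch
{
    public static string GetMemberName<T>(Expression<Func<T>> accessor)
    {
        Expression body = accessor.Body;

        // If T is wider than the member's type (for example object for an int
        // property), the compiler wraps the access in a Convert node; unwrap it.
        var unary = body as UnaryExpression;
        if (unary != null)
        {
            body = unary.Operand;
        }

        var member = body as MemberExpression;
        if (member == null)
        {
            throw new ArgumentException("Expression must access a field or property.", "accessor");
        }

        return member.Member.Name;
    }
}

public class Gonkulator
{
    // The lambda is only inspected, never executed, so Instance is not dereferenced.
    private static readonly Gonkulator Instance = new Gonkulator();

    public static readonly string NameOfName = MemberNameSketch.GetMemberName(() => Instance.Name);
    public static readonly string NameOfRating = MemberNameSketch.GetMemberName(() => Instance.Rating);

    protected Gonkulator()
    {
    }

    public string Name { get; protected set; }
    public int Rating { get; protected set; }
}
```

In C# 6 and later, nameof(Name) yields the same string at compile time without needing a static instance, which may be a simpler option where that language version is available.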

Step Seven: The time has come to write the special method that provides public access to the functionality of the method object class. I suggest creating two overloads which use different types of data transfer objects to help with input and output, as well as to enforce the requirements that were expressed via attributes when the parameter and result properties were developed. I prefer the name Invoke for this method; you could also call it something else if you prefer - but please be consistent. :)

The overloads of Invoke (by whatever name) are short boilerplate-ish methods that call the Activate method in the helper object, which does all of the interesting and useful infrastructure tasks, like validating the data transfer objects, initializing parameters, binding output data (ref and out parameters, essentially) to the calling context, and invoking the single entry point method that is provided by the method object - which a future step deals with; don't worry about it now.

I will provide the code for both overloads here (with most intra-code comments removed, for brevity) so that you can compare and contrast them. The only aspect in which your implementation should vary from the code given here is in the parameter value defaulting code that follows the verification that the first argument isn't null.

public static string Invoke(
    Dictionary<string, object> parameterValueByName,
    Action<Dictionary<string, object>> bindResultValueByName = null)
{
    if (parameterValueByName == null)
    {
        throw new ArgumentNullException(MethodObjectHelper.ParameterInitializer);
    }

    // Parameter value defaulting code:
    if (!parameterValueByName.ContainsKey(NameOfTimestamp))
    {
        parameterValueByName.Add(NameOfTimestamp, DateTime.Now);
    }

    var instance = new MethodObject();
    instance._methodObjectHelper.Activate(instance, parameterValueByName, bindResultValueByName);
    return instance.Result;
}

public static string Invoke(
    dynamic parameterPackage,
    Action<dynamic> bindResultPackage = null)
{
    if (parameterPackage == null)
    {
        throw new ArgumentNullException(MethodObjectHelper.ParameterInitializer);
    }

    // Parameter value defaulting code:
    if (!DynamicHelper.ContainsProperty(parameterPackage, NameOfTimestamp))
    {
        parameterPackage = DynamicHelper.SetPropertyValue(parameterPackage, NameOfTimestamp, DateTime.Now);
    }

    var instance = new MethodObject();
    instance._methodObjectHelper.Activate(instance, parameterPackage, bindResultPackage);
    return instance.Result;
}

There is a rationale for providing both overloads, with their characteristic data transfer object types. Briefly: It is safer to use Dictionary<string, object> objects but more convenient (and readable) to use anonymous, dynamic objects. Please compare these short examples, which display the two techniques:

var returnValue = MethodObject.Invoke(
    new Dictionary<string, object>
    {
        { MethodObject.NameOfName, "Artemus Gordon" },
        { MethodObject.NameOfRating, 0 }
    });

var returnValue = MethodObject.Invoke(
   new
   {
       Name = "Artemus Gordon",
       Rating = 0
   });

The practical difference is that in the first case, if you make a mistake in typing the dictionary key / parameter property name, Visual Studio will immediately notify you; but when defining an anonymous object, you can give the properties any names you like and Visual Studio cannot verify that they match anything in particular, or that they conform to any restrictions other than those imposed on property names by the C# language specification. Taking this and the relative amount of typing that one option requires in comparison to the other into account, a decision can be made whether to support one usage or the other, or both. Thus, either one or both of the Invoke method overloads should be included, depending on your decision - or whether you elect to defer the decision, in which case I recommend supporting both variants in the interim.

Notice that (mainly for the sake of brevity) the above examples omit results-binding action parameters.

Doesn't providing both Invoke overloads and supporting both data transfer object types encourage inconsistency? Perhaps, but would it necessarily be a harmful inconsistency? And it's worth noting that code written to use one approach can be easily modified to use the other; for example, the process of modifying code that uses the dictionary to use the anonymous object instead involves mainly the use of the Delete and/or Backspace keys. So, "first draft" code can easily be written using the safer dictionary-based approach and "cleaned up" later to use the more readable anonymously-typed-object-based approach.
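The trade-off between the two package types can be illustrated with a sketch that flattens an anonymous object into a dictionary and then looks for names that match nothing. This is an assumption about how such a conversion could be done with reflection, not a description of what the download's DynamicHelper actually does; ParameterPackages and its methods are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class ParameterPackages
{
    // Copies the public instance properties of any object (for example an
    // anonymous type) into a name-to-value dictionary.
    public static Dictionary<string, object> ToDictionary(object package)
    {
        if (package == null)
        {
            throw new ArgumentNullException("package");
        }

        var result = new Dictionary<string, object>();
        foreach (PropertyInfo property in package.GetType().GetProperties(BindingFlags.Public | BindingFlags.Instance))
        {
            result[property.Name] = property.GetValue(package, null);
        }

        return result;
    }

    // Returns the supplied names that do not match any expected parameter name.
    public static List<string> FindUnknownNames(Dictionary<string, object> supplied, ICollection<string> expected)
    {
        var unknown = new List<string>();
        foreach (string name in supplied.Keys)
        {
            if (!expected.Contains(name))
            {
                unknown.Add(name);
            }
        }

        return unknown;
    }
}
```

A misspelled property such as Ratng in new { Name = "Artemus Gordon", Ratng = 0 } compiles without complaint and can only be caught by a run-time check of this kind.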

The MethodObjectHelper.Activate method performs run-time validations that will quickly, pro-actively and decisively inform the developer (or, depending on whether or how exceptions are handled, possibly application users) of any errors that were made in setting up and using the method object class - in general, any harmful deviation from the rules and procedures articulated in this article. It's too bad that Visual Studio and the compiler can't currently do this checking, but the comprehensive validations performed by the helper class are the next-best thing.
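One validation of the kind described might check that every must-be-initialized parameter was supplied. The sketch below is an assumption about how such a check could be written; it declares its own simplified ParameterPropertyAttribute and does not reproduce the helper's real code. ValidationSketch and Sample are hypothetical names.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical, simplified stand-in for the download's attribute.
[AttributeUsage(AttributeTargets.Property)]
public sealed class ParameterPropertyAttribute : Attribute
{
    public ParameterPropertyAttribute(bool mustBeInitialized)
    {
        MustBeInitialized = mustBeInitialized;
    }

    public bool MustBeInitialized { get; private set; }
}

public static class ValidationSketch
{
    // Returns the names of required parameter properties that are missing
    // from the supplied name-to-value dictionary.
    public static List<string> FindMissingRequired(Type methodObjectType, IDictionary<string, object> supplied)
    {
        var missing = new List<string>();
        foreach (PropertyInfo property in methodObjectType.GetProperties())
        {
            var attribute = (ParameterPropertyAttribute)Attribute.GetCustomAttribute(
                property, typeof(ParameterPropertyAttribute));

            if (attribute != null && attribute.MustBeInitialized && !supplied.ContainsKey(property.Name))
            {
                missing.Add(property.Name);
            }
        }

        return missing;
    }
}

public class Sample
{
    [ParameterProperty(true)]
    public string Name { get; protected set; }

    [ParameterProperty(false)]
    public int Rating { get; protected set; }
}
```

A real helper would presumably throw one of its custom exceptions when the returned list is not empty; returning the list keeps this sketch easy to inspect.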

(Some readers may not know that ExpandoObject, a common vehicle for dynamic objects in .NET, implements IDictionary<string, object> and so behaves much like a thinly-wrapped dictionary. That fact and some of its ramifications come to the fore when developing and testing code like this - which would be interesting to explain at length but, sadly, not directly relevant to this article. The DynamicHelper class provided with the downloaded code may be of some interest to readers.)

Earlier, I committed to pointing out where the defaulting of parameters to values that can't be assigned to the ParameterProperty attribute's DefaultValue property should be done. Just in case you missed it: There is an example of that above, where in both Invoke overloads the code arranges to default the Timestamp property to DateTime.Now.

Notice the very brief and private (local to the Invoke method) existence of the MethodObject instance. It is not intended, in normal situations, to be exposed to external code. In unusual situations, however, it might be necessary to return most or all of the object's properties as result data - in which case, the instance itself could simply be returned by the Invoke method. It's possible to control access to the object's features so that no methods and no property set accessors are publicly exposed, which is the convention that's followed in the examples in this article and the accompanying downloadable code, and that I tend to recommend. The object can thus behave strictly as a (read-only) data transfer object from the perspective, and for the purposes, of external code.

As a final note on this step, I will briefly discuss results-binding actions. They're just delegates that are intended to take the values of any ref / out parameter analogs (result properties), which are automatically packaged up for export by the helper class's Activate method after the method object's code has been invoked (that action to be shortly discussed below), and copy those values into variables in the calling context, after optionally doing any necessary / appropriate computing on them. Unlike in an actual method call, where this is done (minus optional computing) automatically, when using a MOH-assisted method object, a little bit of explicit code must be supplied to do the job. It's a small price to pay, and it isn't needed in all circumstances; in fact, like the use of ref and out parameters with method calls, it can be "religiously" avoided, if that's your preference.
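A results-binding action can be sketched without the helper library at all. RatingClamper.Invoke below is a hypothetical stand-in that packages its result values by hand, where the real Activate method would do so automatically; the names and the clamping logic are invented for illustration.

```csharp
using System;
using System.Collections.Generic;

public static class RatingClamper
{
    // Hypothetical stand-in for a method object's Invoke; no helper library involved.
    // Clamps "Rating" to 1..5 and reports the outcome through the optional binding action.
    public static string Invoke(
        Dictionary<string, object> parameterValueByName,
        Action<Dictionary<string, object>> bindResultValueByName = null)
    {
        if (parameterValueByName == null)
        {
            throw new ArgumentNullException("parameterValueByName");
        }

        int rating = (int)parameterValueByName["Rating"];
        int clamped = Math.Min(Math.Max(rating, 1), 5);
        string result = clamped == rating ? "unchanged" : "adjusted";

        if (bindResultValueByName != null)
        {
            // Package the result values (the ref/out analogs) for the caller.
            bindResultValueByName(new Dictionary<string, object>
            {
                { "Rating", clamped },
                { "Result", result }
            });
        }

        return result;
    }
}

public static class Caller
{
    public static int ClampedRatingOf(int rating)
    {
        int boundRating = 0;

        RatingClamper.Invoke(
            new Dictionary<string, object> { { "Rating", rating } },
            results => boundRating = (int)results["Rating"]); // copy into the calling context

        return boundRating;
    }
}
```

The lambda passed by Caller does the job that the compiler does implicitly for a ref or out argument: it copies a result value back into a local variable.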

Step Eight: The MethodObjectHelper class's Activate method is very powerful, but it requires you to make one change, at this point, to the implementation of your parameter and result properties. (Recall that we left those as simple as possible, earlier; it is now time to expand and complete the coding of the accessor bodies.) It is required that those properties use a dictionary in the MethodObjectHelper class as their backing store, and a couple of methods, GetValue and SetValue, are provided to facilitate this. Below is an example of the fully-fleshed-out code for one property - showing that arbitrary code can be included in the property accessors, as usual; but where ordinarily one might use a class field as a backing store, GetValue and SetValue are used instead.

[ParameterProperty(false, DefaultValue = 3)]
[ResultProperty]
public int Rating
{
    get { return (int)_methodObjectHelper.GetValue(NameOfRating); }

    protected set
    {
        var adjustedValue = Math.Min(Math.Max(value, 1), 5);
        _methodObjectHelper.SetValue(NameOfRating, adjustedValue);
    }
}

In the example, Rating is used as a ref method parameter analog, in that it is used for both input to and output from the method object, so it makes sense that it should have a set accessor. Method code (yet to be written / discussed) within a method object instance may, or will, need to modify whatever initial value Rating was given in its role as a parameter, or its default value, 3.

Set accessors need to be public or protected unless you want to seal the class, thus preventing inheritance. I think it is wisest to observe the convention of making them protected. This way, if an instance of the method object class is ever returned from an invocation - which is certainly possible - it will be read-only to external code.

If the method object class ends up with additional properties that are not parameter or result properties (aren't decorated with the ParameterProperty and/or ResultProperty attribute(s)), they may, but are not required to, use the MOH's dictionary as a backing store; using it is a requirement only for parameter and result properties.
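The backing-store arrangement can be tried in isolation with a small sketch. BackingStoreSketch is a hypothetical stand-in that shows only the dictionary idea; the real MethodObjectHelper does considerably more, and its GetValue and SetValue may behave differently in edge cases such as unset values.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in showing only the backing-store idea.
public sealed class BackingStoreSketch
{
    private readonly Dictionary<string, object> _valueByName = new Dictionary<string, object>();

    public object GetValue(string propertyName)
    {
        object value;
        if (!_valueByName.TryGetValue(propertyName, out value))
        {
            throw new InvalidOperationException("No value has been set for '" + propertyName + "'.");
        }

        return value;
    }

    public void SetValue(string propertyName, object value)
    {
        _valueByName[propertyName] = value;
    }
}

public class RatingHolder
{
    protected readonly BackingStoreSketch _store = new BackingStoreSketch();

    public int Rating
    {
        get { return (int)_store.GetValue("Rating"); }

        protected set
        {
            // Accessor logic still works as usual; only the storage location changes.
            _store.SetValue("Rating", Math.Min(Math.Max(value, 1), 5));
        }
    }

    // Sets and reads back a rating to show the accessor pair working together.
    public static int Clamp(int candidate)
    {
        var holder = new RatingHolder();
        holder.Rating = candidate;
        return holder.Rating;
    }
}
```

One plausible reason for this design is that a helper holding the dictionary can read and write any property value by name, without reflecting over private fields in each method object class.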

Step Nine is the one in which we finally supply action to the software "machine" that we have so painstakingly constructed. This is the step in which the writing of the method code begins. I can't predict or prescribe what the method code will or should look like; it will necessarily vary widely and wildly from one implementation to the next. However, it is possible to state, or define, that there will be exactly one instance method that will be automatically invoked by the helper object's Activate method, and (as with a program's Main method) all other code within the method object that executes between the binding of imported values to parameter properties and the bundling of result property values for export must be run by that single automatically-invoked instance method.

An attribute is used to tell Activate which method to invoke. Here is an example:

[InvokeOnActivation]
protected void Compute()
{
    // ...
}

I have called the entry-point method Compute. (Obviously, if you choose to rename the method I have called Invoke to Compute, you'll have to pick a different name for this method.) The name is functionally irrelevant; it's the InvokeOnActivation attribute that enables Activate to identify and invoke the method.

Further restrictions are that the method must return void and take no parameters. This makes perfect sense, given that the parameter properties of the method object are effectively this method's parameters and that its main job is to compute and set the values of the result properties. The Invoke method can return the principal return value (stored in a result property named, according to convention, Result or ReturnValue), and if there are other exports, analogous to ref and out parameters, those are made available - along with the method object's principal return value - to the results-binding delegate that the calling code optionally passes to Invoke. Of course, it is also possible that no values will need to be returned, in which case the Invoke method should return void, provide no results-binding action parameter and pass none to the Activate method. (All of the variations available via a normal method call are easily and handily covered.)

For emphasis: It doesn't matter whether Invoke returns void or a value of some type, and it doesn't matter whether it has a required results-binding delegate parameter, an optional results-binding delegate parameter, or no such parameter; those details must be determined by the needs of the particular implementation.
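The attribute-driven entry point described in this step can be approximated with a few lines of reflection. The sketch declares its own InvokeOnActivationAttribute as a stand-in and is only a guess at the general mechanism, not the helper's actual implementation; ActivationSketch and Greeter are hypothetical names.

```csharp
using System;
using System.Linq;
using System.Reflection;

// Hypothetical stand-in; the real attribute ships with the download.
[AttributeUsage(AttributeTargets.Method)]
public sealed class InvokeOnActivationAttribute : Attribute
{
}

public static class ActivationSketch
{
    // Looks for instance methods marked with the attribute, insists on exactly
    // one that returns void and takes no parameters, and then calls it.
    public static void InvokeEntryPoint(object methodObject)
    {
        const BindingFlags flags = BindingFlags.Instance | BindingFlags.Public | BindingFlags.NonPublic;

        MethodInfo[] candidates = methodObject.GetType()
            .GetMethods(flags)
            .Where(m => m.IsDefined(typeof(InvokeOnActivationAttribute), true))
            .ToArray();

        if (candidates.Length != 1)
        {
            throw new InvalidOperationException(
                "Expected exactly one entry-point method, found " + candidates.Length + ".");
        }

        MethodInfo entryPoint = candidates[0];
        if (entryPoint.ReturnType != typeof(void) || entryPoint.GetParameters().Length != 0)
        {
            throw new InvalidOperationException(
                "The entry-point method must return void and take no parameters.");
        }

        entryPoint.Invoke(methodObject, null);
    }
}

public class Greeter
{
    public string Result { get; protected set; }

    [InvokeOnActivation]
    protected void Compute()
    {
        Result = "computed";
    }
}
```

Because the lookup is by attribute rather than by name, the protected Compute method could be renamed freely without affecting activation.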

Is there a Step Ten? Yes, there is. This is the step in which the original code gets carefully and lovingly refactored within the confines of the method object class. Although it requires very little explanation within this article, it is perhaps the most important step, the one that all of the other explanation and the related code serves to facilitate. The technique under discussion is all about freeing that code from problematic tight and complex coupling with its context so that it will be much easier to manipulate it into whatever form is best for the long-term benefit and well-being of the application.

At this step, you will also need to modify the original contextual code to invoke the method object instead of whatever it was doing before (calling a method with a large number of parameters?), of course.

Step Eleven requires you to test your code, debug it if necessary, and ultimately ensure that it works as you intend.

Conclusion

Here is a quick summary of the refactoring steps in outline form:

1. Construct the signature of a completely independent method; or if you're working with an existing method, modify its signature as needed to ensure its complete independence.
2. Construct a class having a property corresponding to each parameter of the method signature created in step 1 and the return value, if any. (By convention, name the property that represents the return value Result or ReturnValue.)
3. Decorate the properties with attributes.
   - Properties corresponding to in parameters must be decorated with the ParameterProperty attribute.
     1. The required parameter is mustBeInitialized, which expresses whether the invoking code must provide an initial value for the parameter.
     2. The property DefaultValue should be set if mustBeInitialized = false was specified.
     3. If there is a need to use a default value that can't be assigned to DefaultValue, set DefaultValueSetInCode = true and plan to set the default value in code, later on.
   - Properties corresponding to out parameters and the original method signature's return value (if any) must be decorated with the ResultProperty attribute.
   - Properties corresponding to ref parameters must be decorated with both the ParameterProperty attribute and the ResultProperty attribute, as described above.
4. Declare and initialize a protected read-only field that contains a MethodObjectHelper instance.
5. Define an empty default constructor for the method object class that has protected access.
6. Provide the name of each property as a static readonly string. (See the article and example code for how-to recommendations.)
7. Provide one or two overload(s) of the public static Invoke method, following the examples and detailed discussion given above. The Invoke method (which you can rename if you prefer) must:
   1. Ensure that the parameter-initializer parameter is not null. [Required.]
   2. Take care of any necessary intra-code assignment of default values to parameter properties. [Optional.]
   3. Create a method object instance and pass it, along with the parameter-initializer object and the results-binding action parameter, if used, to the method object helper instance's Activate method. [Required.]
   4. Return the value of the Result / ReturnValue property, if appropriate. [Optional.]
8. Ensure that all parameter and result properties use the method object helper instance's internal dictionary as a backing store, accessing it via the GetValue and SetValue methods.
9. Create one protected void method that takes no parameters and decorate it with the InvokeOnActivation attribute. This method is the entry point for all computing to be done by the method object. (Consider naming this method Compute by convention.) The method must ensure that values are assigned to all result properties.
10. Refactor the original code to either be contained in, or be called by, the entry point method defined in the previous step. Modify contextual code to invoke the method object.
11. Test your code and ensure that it works correctly.

If you have a need to execute the "replace method with method object" refactoring strategy, please consider using the code and process introduced in this article as an aid to creating a successful outcome with relative ease, using a thoroughly standardized but very flexible approach.

Is the MOH-assisted approach to replacing methods with method objects a wondrous boon or a @#$% exercise in over-engineering? You decide.

License

This article, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)