This is another article about code readability. This time focusing more on how we should group members together, and also showing some cases of just bad rules that some places and people seem to enforce for no actual benefit.
Background
In the article, Code Style & Readability, I talked about code style focusing mostly on how to indent things.
This time, I will talk about a different area of code style, focusing mostly on how to group members together, but also discussing some other items, like naming rules.
Example
Let's start with the following code. Let's not talk about how useful this class is, if MinValue
and MaxValue
should be set in the constructor and be readonly
or the like. Let's just focus on the grouping of things:
public sealed class MinMaxContainer
{
public MinMaxContainer()
{
_value = 50;
_maxValue = 100;
}
private int _minValue;
public int MinValue
{
get => _minValue;
set
{
if (value > _value)
throw new ArgumentException("MinValue cannot be greater than Value.");
_minValue = value;
}
}
private int _maxValue;
public int MaxValue
{
get => _maxValue;
set
{
if (value < _value)
throw new ArgumentException("MaxValue cannot be less than Value.");
_maxValue = value;
}
}
private int _value;
public int Value
{
get => _value;
set
{
if (value < _minValue || value > _maxValue)
throw new ArgumentException("Value must be between MinValue and MaxValue, inclusive.");
_value = value;
}
}
}
As you can see, there are just three properties, named MinValue
, MaxValue
and Value
. The three properties use backing fields, named respectively _minValue
, _maxValue
and _value
, and those fields are declared just before the properties that manipulate them.
I consider this to be the most logical grouping for this class. Yet, if I use some style validation tools, I might be forced to change this class to look like this:
public sealed class MinMaxContainer
{
private int _minValue;
private int _maxValue;
private int _value;
public MinMaxContainer()
{
_value = 50;
_maxValue = 100;
}
public int MinValue
{
get => _minValue;
set
{
if (value > _value)
throw new ArgumentException("MinValue cannot be greater than Value.");
_minValue = value;
}
}
public int MaxValue
{
get => _maxValue;
set
{
if (value < _value)
throw new ArgumentException("MaxValue cannot be less than Value.");
_maxValue = value;
}
}
public int Value
{
get => _value;
set
{
if (value < _minValue || value > _maxValue)
throw new ArgumentException("Value must be between MinValue and MaxValue, inclusive.");
_value = value;
}
}
}
It is not that bad. I just had to put all the fields together at the top of the class declaration. The argument is that all fields should be grouped together, and they come first. Then the constructor (and destructor), then properties and, if we had, methods. I will leave events out of the discussion at this moment.
This kind of grouping also works perfectly for this simple class. People who use this grouping usually say that it is better to have all the fields on top, so we can easily know all the information the class holds without having to scan all the class. It is a solid argument at this moment.
Some Different Properties
So, what about adding another property. Let's say, for this class, I also want to have a Caption
. The idea is that on the screen, we will have a slider that will move between MinValue
and MaxValue
, and that slider will be presented with a text telling what it is for. Again, let's not focus how useful this class is... the new property, if added to the first block of code, could be:
public string Caption { get; set; }
Considering the first block of code had fields and properties grouped together, using an auto-property makes perfect sense when we don't need to do any processing on the get
and set
. Yet, to improve readability even more, I assume auto-properties must come before other properties... in particular, if there are many auto-properties, that makes reading the class easier, in my opinion.
What About the Alternative?
About the second piece of code, I wonder if:
- I can use an auto-property
- The auto-property should be grouped with the fields or properties
To explain, the idea is to have fields first, then the constructor/destructor, then properties. If I use an auto-property, that means I am declaring a property that can be used without validations, like a field, as well as creating an "invisible backing field".
So, if I keep this property grouped with the other properties, the argument that we can easily know all the information the class holds by just reading the top of the class is thrown out of the window.
Then, my thinking is that for those auto-properties, it is best to put them together with the fields, as they behave mostly as fields and also represent the data the type actually holds.
But, then, a style validation tool will probably complain that a property cannot come before the constructor.
So, one option is to move the property to be grouped together with the other properties. If I keep it as an auto-property, it works. The tool doesn't complain, but we lost the ability to know all the data that the class holds by just looking at the fields. So, should we try to fix that by:
- Adding an actual field to the class, grouped with the other fields:
private string _caption;
- Making the property use the field instead of being an auto-property:
public string Caption
{
get => _caption;
set => _caption = value;
}
That is more code to do the same. Yet, that's not the real problem, as I was never discussing amount of code written. The new problem is that now we will probably get a "suggestion" that the code could be refactored to use an auto-property. But the name suggestion is not right, as the code will not pass any validation to be able to be submitted/committed for production.
So, the answer is to go back to using an auto-property, together with the other properties. And the argument about having all the information together at the top of the class is just a broken argument (which people will probably keep using). Auto-properties will void that argument when used, but need to be used if the wrong style is enforced.
So, similar to what I presented in the previous article, my point is not about using auto-properties or not. Grouping fields with the properties that manipulate them or grouping all fields together are two valid styles. The most important thing is to be consistent with the argument on why to use them.
When fields are grouped together with the properties that manipulate them, we lose the ability to just scan the top of the class and see all the information it contains. On the other hand, if we decide to kill the property or copy it to a different file, we have a single place to look at, as both the field and the property are written near each other. So, if no special processing is needed for a property, we can go for an auto-property. It will just make perfect sense to do that.
When all fields are grouped together, before other members, it is not just about "fields need to be grouped with fields". It is about the idea that we have everything the objects of the class hold in a single place (so we can even tell how much memory those objects will consume). In fact, fields can all be grouped together at the top (most common for C#) or at the bottom (most common for C++). Yet, the important trait is: We know everything the objects will hold is going to be declared together. But then, we cannot use auto-properties grouped as properties, or we lose that trait. So, if this is the logic used, we should decide if we are going to group those auto-properties with the fields, or if we always need to declare a field for a property, and then implement the get
and set
of that property to use that field, even if no additional processing is needed.
But, guess what, the rule most places use (and enforce) is to:
- Group all fields together so we know all information the class contains by just looking at the fields
- Use auto-properties when possible
- Properties (be full properties or auto-properties) cannot be near the fields. They need to be grouped as properties.
And I really think the third item is just a huge mistake and defeats the purpose of the first item.
So, before just enforcing rules, I think we need to see how well they deal with corner cases. If they don't work for those corner cases, they should be just suggestions. Real ones, not suggestions that will forbid the code from being checked-in.
Is There More?
Of course there is. I purposely took events out on the previous discussion, but we should talk about them too.
We also have static
members. What is the right rule on how to group static
members?
What about the use of _
(the underscore) before field names? In fact, is it just for fields?
So, let's explore these topics:
Events
Since I programmed in Delphi, my understanding was that events were always declared last.
C# originally followed that logic. Although events can have the add
and remove
methods implemented, most of them are single-liners using the default compiler implementation. It was like that since C# appeared.
Properties, on the other hand, needed to have their get
and set
implemented in the first versions of C#. Now, properties and events look really alike. We can have auto-implemented properties as well as auto-implemented events, or we can implement the get
and set
of properties, and the add
and remove
of events, which will probably mean we will also have a backing field for each one of those pairs.
So, if properties and events are so similar it makes sense to group them together, right?
Well... that's an argument I keep hearing more and more lately. But let's look back at Delphi. There, the events are properties of a special type. Yet they are grouped differently.
To me, the grouping of properties and events was never about how similar or different their declarations are, but the purpose they fulfill.
Properties are the things we manipulate and are interested in reading. Events, on the other hand, cannot be read outside of the class declaring them (at least in C#), they are there to notify us about something that happened or is about to happen. I still think that, in most cases, properties should be grouped with other properties and events should be grouped with other events and that they should not be mixed together.
Yet, I will make exceptions when things deserve exceptions. For example, if I decide to add a ValueChanged
event in that class I used as example, and I am talking about an event to notify about the change of the Value
property alone, not to notify about all property changes, I will feel that grouping ValueChanged
with its property Value
will make perfect sense, at least when using the first kind of grouping, where I can group fields and properties together if they are related. In this case, grouping _value
, Value
and ValueChanged
together will make sense.
Anyway, just to let it clear, I will keep an event like PropertyChanged
grouped with other events at the end of the class definition, as such an event is not related to a single property.
Static Members
About the example code, did you notice that I initialized the Value
property with 50
and the MaxValue
with 100
(by setting the field values directly)? What if I decided to actually make those defaults configurable?
I could add static
members to the class, like DefaultMinValue
, DefaultValue
, DefaultMaxValue
and DefaultCaption
. Of course, this would also come with the backing fields where needed. The real point is, where should I put those Default members?
I know about two main lines of thought here:
- All
static
members first, then all instance members - It's always fields, then constructor/destructor, then properties (maybe with events), then methods and, if events aren't together with properties, then events. The
static
members come first inside each of their respective groups.
And I honestly think that constraining code in such a way all the time is just wrong and counter-productive. In many situations, I might create a static
method to help with the validation of an instance
property setter, for example. If such method is related to a single property, I will put such a method together with the property, ignoring both styles.
For other cases, I usually prefer all static
members first, then all instance
members. But, independently of my general preference, talking about the sample class, I am very inclined to say that grouping things like:
DefaultMinValue
with MinValue
DefaultValue
with Value
- and
DefaultMaxValue
with MaxValue
... makes the most sense. If I decide to put DefaultValue
and Value
before or after all other properties, it's not as important.
In this case, what matters is that everything related to Value
(that is, its backing field, its event, and its default value) is grouped together. The same applies for all other properties.
Most validation tools, though, will just assume that such a style is broken and will try to enforce a style where you cannot have all the related things together, because they want "kind of things" to be grouped together, rather than the purpose of those things. That is, they would either group static
with static
, or properties with properties, and would not allow static
fields, static
properties, fields and properties about the same thing together, before another group following the same pattern.
The Use of Underscore
It was 2008 when I had this discussion with some colleagues at work. As I used Delphi for a long time, I usually put "f" in front of all fields.
At that time, the fields for the sample class would be named fMinValue
, fMaxValue
and fValue
. Actually, in Delphi, it would be an upper-case F but, as in C# fields started in lower case, I ended-up using "f".
Some colleagues would either not use any prefix or would use p_
from private
. So, the code was somewhat a mess as we had fields prefixed with p_
, fields prefixed f
and fields with no prefix at all.
After discussing about having a single standard, some developers felt strongly that we shouldn't use f
, while others, including me, felt that p_
was not a good idea. In the end, we all agreed to use just _
for fields and private
members.
In fact, it turned out that we should just use _
for private
and internal
members (but not internal protected
, as those are visible by external assemblies) and, as fields are supposed to be always private
, that worked great.
One of the things that was really nice about having such a _
prefix is that we would never access a field thinking it was a local variable. As you probably know, two consecutive access to a field might suffer from threading issues if the value is modified by another thread. A local variable doesn't have that issue.
Notice that here I am talking about accidental uses of a field thinking it was a local. People who don't use any kind of prefixes for fields say that we should always use this.
when accessing fields. Yet, my point here is that when people don't use this.
they are probably assuming they are accessing a local variable, and maybe they will accidentally access a field (especially on large methods).
That doesn't happen when fields always have _
as their prefix. And we only write one extra character (the underscore) instead of having to write the five characters from this.
to access the field, also reducing the chances of needing multi-line statements when fixed line-lengths are enforced.
It's interesting that I found many different projects over the world also used the underscore, be it for private
members or just for fields.
Also, years later, I found that Python developers also use _
to say that a member is private
and not supposed to be used out of their class.
But, some more years later, I see that many people writing code in C# are getting rid of any prefix for fields and writing more verbose code that is also more prone to accidental usages.
In any case, not using any prefix for fields is OK when we enforce the use of this.
to access them. Notice that this rule is somewhat the opposite of what I was saying for the previous items. For groupings, I think that strongly enforcing a pattern is bad. For this case, I consider that not enforcing the use of this.
to access a field, when fields don't have a prefix, is just bad.
For example, from the sample class I provided, if we rename the field _value
to value
, we need to be sure we are correctly differentiating the field from the local, as the property Value
will need to assign value
(the local) to value
(the field).
Conclusion
My conclusion is that whatever style a company chooses to use, it should be really helping the readability of the code instead of being just "rules to be followed".
Strangely, some people seem to think that the more rules are enforced, the better the code would be, when many of those rules are actually hindering the development of the project, as well as making it less readable, which is exactly the opposite of what the rules were supposed to be doing.
So, if you are able to enforce or avoid those rules, think about the corner cases and how much the overall product might suffer if those cases are badly handled.
History
- 15th June 2022: Initial version