Introduction
Have you ever seen a function signature that includes a "ref" parameter? I'm sure you have, but how often have you encountered an instance of a class being passed to a function as a "ref" parameter? Since a class in C# is by definition already a reference type, this is usually a dead giveaway that the engineer does not understand the difference between "value" and "reference" types.
Background
Back in the day, the C language popularized the concept of a "pointer". Reference types in C# work much like pointers do in C. So what is a pointer, and how does it work? Well, in C, we used to write:
int value = 15;
int *p = &value;
In the statements above, the pointer p is assigned the "address-of" the int variable 'value'. Two things are of interest with a pointer: the address it stores and the value found at that address. In this case, p stores the address of 'value', and the value found at that address is 15. This allows us to pass p to a function where (inside) we can "dereference" it to reach that value, as in:
int copy = *p;
This statement is read as "assign the value that p points to, to copy". Imagine then that you have a function that takes an int pointer as a parameter, such as this one:
void foo(int *pointer_to_value)
{
    *pointer_to_value = 5;
}
Within the function foo, we can assign a new value through the pointer by dereferencing it. What happens when we call foo(p)? Well, the original variable "value" is now changed to 5.
Okay, so why do we need "ref" parameters in C#? Well, the "ref" keyword is used to pass a parameter by reference (like a pointer) instead of by value. To illustrate, we could rewrite foo as:
void foo(ref int pointer_to_value)
{
    pointer_to_value = 5;
}
... and we can simply pass the "value" variable (by reference) like this:
foo(ref value);
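Put together as a complete program, the idea might look like the minimal sketch below. The class name RefIntDemo and the methods SetByValue and SetByRef are hypothetical names chosen for illustration; SetByValue is added only to contrast with the "ref" version and is not part of the snippet above.

```csharp
using System;

public static class RefIntDemo
{
    // Receives a copy of the caller's int; the assignment is lost on return.
    public static void SetByValue(int number)
    {
        number = 5;
    }

    // Receives an alias for the caller's variable; the assignment is visible to the caller.
    public static void SetByRef(ref int number)
    {
        number = 5;
    }

    public static void Main()
    {
        int value = 15;

        SetByValue(value);
        Console.WriteLine(value); // prints 15

        SetByRef(ref value);
        Console.WriteLine(value); // prints 5
    }
}
```

Note that "ref" has to appear both in the method signature and at the call site, which makes the by-reference intent visible to anyone reading the calling code.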
Now to drive home the point of this article, let's look at the code snippet I like to use when interviewing candidates for a developer position.
Using the Code
Here are three snippets of code:
using System;

namespace ValueVsRefType
{
    class Example
    {
        // First scenario: foo is a class (reference type).
        internal class foo { internal int x { get; set; } internal int y { get; set; } }

        static void Main(string[] args)
        {
            foo value = new foo();
            Console.WriteLine("Before calling ChangeFoo(): foo.x = " +
                value.x + " and foo.y = " + value.y);
            ChangeFoo(value);
            Console.WriteLine("After calling ChangeFoo(): foo.x = " +
                value.x + " and foo.y = " + value.y);
        }

        // Receives a copy of the reference, so it modifies the caller's object.
        static void ChangeFoo(foo parameter)
        {
            parameter.x = 10;
            parameter.y = 20;
        }
    }
}
using System;

namespace ValueVsRefType
{
    class Example
    {
        // Second scenario: foo is a struct (value type).
        internal struct foo { internal int x { get; set; } internal int y { get; set; } }

        static void Main(string[] args)
        {
            foo value = new foo();
            Console.WriteLine("Before calling ChangeFoo(): foo.x = " +
                value.x + " and foo.y = " + value.y);
            ChangeFoo(value);
            Console.WriteLine("After calling ChangeFoo(): foo.x = " +
                value.x + " and foo.y = " + value.y);
        }

        // Receives a copy of the struct, so the caller's variable is untouched.
        static void ChangeFoo(foo parameter)
        {
            parameter.x = 10;
            parameter.y = 20;
        }
    }
}
using System;

namespace ValueVsRefType
{
    class Example
    {
        // Third scenario: foo is still a struct, but it is passed with "ref".
        internal struct foo { internal int x { get; set; } internal int y { get; set; } }

        static void Main(string[] args)
        {
            foo value = new foo();
            Console.WriteLine("Before calling ChangeFoo(): foo.x = " +
                value.x + " and foo.y = " + value.y);
            ChangeFoo(ref value);
            Console.WriteLine("After calling ChangeFoo(): foo.x = " +
                value.x + " and foo.y = " + value.y);
        }

        // Receives an alias for the caller's variable, so it modifies the original struct.
        static void ChangeFoo(ref foo parameter)
        {
            parameter.x = 10;
            parameter.y = 20;
        }
    }
}
In the code snippets above, we have three nearly identical pieces of code. The only differences are:
First-Scenario: The construct "foo" is a class (reference-type)
Second-Scenario: The construct "foo" is a struct (value-type)
Third-Scenario: "foo" is still a struct but is passed by reference (using the "ref" keyword)
What I hope you get from this article is this: passing a reference type (class) is like passing a pointer, and passing a value type (struct) is like passing a copy of a value (if you will). In the former case, we change the object that the reference points to; in the latter case, we change a private copy of the same data. That is why the first and third scenarios print foo.x = 10 and foo.y = 20 after the call, while the second scenario still prints foo.x = 0 and foo.y = 0.
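The three scenarios can also be compared side by side in one small program. The following is only a condensed sketch: the names PassingDemo, PointClass, PointStruct and the three Change... methods are hypothetical, and public fields are used instead of the article's auto-properties purely for brevity.

```csharp
using System;

public static class PassingDemo
{
    public class PointClass { public int X; public int Y; }
    public struct PointStruct { public int X; public int Y; }

    // Scenario 1: the reference is copied, but both copies refer to the same object.
    public static void ChangeClass(PointClass p) { p.X = 10; p.Y = 20; }

    // Scenario 2: the whole struct is copied; only the local copy is modified.
    public static void ChangeStructByValue(PointStruct p) { p.X = 10; p.Y = 20; }

    // Scenario 3: the parameter aliases the caller's struct variable.
    public static void ChangeStructByRef(ref PointStruct p) { p.X = 10; p.Y = 20; }

    public static void Main()
    {
        var c = new PointClass();
        var s = new PointStruct();
        var r = new PointStruct();

        ChangeClass(c);
        ChangeStructByValue(s);
        ChangeStructByRef(ref r);

        Console.WriteLine("class:           " + c.X + ", " + c.Y); // 10, 20
        Console.WriteLine("struct by value: " + s.X + ", " + s.Y); // 0, 0
        Console.WriteLine("struct by ref:   " + r.X + ", " + r.Y); // 10, 20
    }
}
```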
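One can certainly get into trouble passing reference types with the 'ref' keyword. Thanks to Astakhov Andrey for his valuable insight. Note his little snippet of code, in which the first Operate overload only nulls its own local copy of the reference, while the 'ref' overload nulls the caller's variable: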
using System;

namespace ValueVsRefType2
{
    class Program
    {
        class Foo
        {
            public int x { get; set; }
            public int y { get; set; }
        }

        static void Main(string[] args)
        {
            var victim = new Foo();
            Operate(victim);               // victim still refers to the Foo instance
            Console.WriteLine(victim.x);   // prints 0
            Operate(ref victim);           // victim itself is now null
            try
            {
                Console.WriteLine(victim.x);
            }
            catch (NullReferenceException)
            {
                Console.WriteLine("attempt to use null Foo!");
            }
        }

        // Nulls only the local copy of the reference.
        static void Operate(Foo val)
        {
            val = null;
        }

        // Nulls the caller's variable.
        static void Operate(ref Foo val)
        {
            val = null;
        }
    }
}
My recommendation: only use the 'ref' keyword when passing value-type parameters!
Points of Interest
All this is interesting, but have you ever wondered what the mechanics are underneath the hood? Well, it is truly all about the mechanics of a stack. If you've ever used the (System.Collections) Stack class, you learned to "push" values (save operation) and "pop" values (restore operation). In compiled code, when a function is called, we conceptually first "push" all of the parameters on the stack. Later, inside the function, we read those parameter values from the program stack rather than from the heap. To illustrate this, based on the First-Scenario (above), when we push the "foo instance" (value), we simply push a pointer: the address of the object that "value" refers to. This is very efficient because, depending on the architecture of the machine, only a few bytes (more precisely, one pointer-sized word, such as a DWORD on a 32-bit machine) are pushed on to the stack.
Consider the second scenario, however: the "value" type. In this code, we cause the actual values of foo.x and foo.y to be pushed on to the stack. Imagine if foo had a large number of fields: we would have to push lots of values on to the stack. Imagine also the case of recursion, where we call the same function many times. We have the potential to push a tremendous number of values on the stack.
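The difference in cost between passing a reference and passing a whole value can be made visible by comparing sizes. The sketch below is an illustration under some assumptions: it relies on Unsafe.SizeOf from System.Runtime.CompilerServices (available in recent .NET versions), and the names CopyCostDemo, BigStruct and BigClass are hypothetical. How the runtime actually hands arguments to a method (registers or stack) is an implementation detail, so treat the numbers as the amount of data that has to be copied rather than as a literal stack layout.

```csharp
using System;
using System.Runtime.CompilerServices;

public static class CopyCostDemo
{
    // Hypothetical value type with four 8-byte fields (32 bytes in total).
    public struct BigStruct { public long A, B, C, D; }

    // Hypothetical reference type with the same fields.
    public class BigClass { public long A, B, C, D; }

    // Bytes copied when a BigStruct is passed by value.
    public static int BytesForStructByValue()
    {
        return Unsafe.SizeOf<BigStruct>();
    }

    // Bytes copied when a BigClass is passed: only the reference itself.
    public static int BytesForClassReference()
    {
        return Unsafe.SizeOf<BigClass>();
    }

    public static void Main()
    {
        Console.WriteLine(BytesForStructByValue());   // prints 32
        Console.WriteLine(BytesForClassReference());  // pointer size of the current process
        Console.WriteLine(IntPtr.Size);               // same pointer size, for comparison
    }
}
```

Adding more fields to BigStruct grows the first number, while the second stays at the pointer size no matter how large BigClass becomes.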
In the third scenario (value type by reference), it works much like the first scenario: we pass only an address, in this case the address of the variable "value" itself.
Hence, when value types are passed by value, the compiler creates private local copies of the data on the program stack within the scope of the called function. So the next time you have an instance of a class (say foo) and pass it to a function, there's no need to use the "ref" keyword just to let the function change the object's members, either in the call itself or in the signature of the function you are calling. The "ref" keyword is for passing value-type parameters by reference instead of by value.
Value types in C#: all numeric data types, bool, char, DateTime, structs, and enums
Reference types in C#: string, arrays, classes, and delegates
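If you are ever unsure which category a type falls into, you can ask the runtime. The sketch below uses the Type.IsValueType property; the names TypeKindDemo, Color, Point and IsValue are hypothetical helpers introduced only for this illustration.

```csharp
using System;

public static class TypeKindDemo
{
    public enum Color { Red, Green }        // hypothetical enum
    public struct Point { public int X; }   // hypothetical struct

    // Returns true when T is a value type, false when it is a reference type.
    public static bool IsValue<T>()
    {
        return typeof(T).IsValueType;
    }

    public static void Main()
    {
        Console.WriteLine(IsValue<int>());      // True
        Console.WriteLine(IsValue<bool>());     // True
        Console.WriteLine(IsValue<char>());     // True
        Console.WriteLine(IsValue<DateTime>()); // True
        Console.WriteLine(IsValue<Point>());    // True
        Console.WriteLine(IsValue<Color>());    // True

        Console.WriteLine(IsValue<string>());   // False
        Console.WriteLine(IsValue<int[]>());    // False
        Console.WriteLine(IsValue<object>());   // False
        Console.WriteLine(IsValue<Action>());   // False
    }
}
```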
History
This is the initial revision of this article.