Introduction
With my admittedly limited knowledge, I am writing this article to clarify certain essentials of the C# language for the beginner. C# distinguishes between built-in types and user-defined types, and you create a new user-defined type with the class keyword. Classes have behavior, which is implemented by the class's member methods, and state, which you model with member variables (sometimes called fields). You provide access to a class's state through properties. C# also draws a distinction between a class, which is the definition of a new type, and an object, which is an instance of that class. For example, Dog is a class (it describes the Dog type), but Rover, Fido, and Spot are all objects: they are instances of the class Dog. Similarly, Button is a class, but Save, Cancel, and Delete are all objects, instances of the class Button.
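The class/object distinction can be sketched in a few lines. This is only an illustration: the Describe() method and the Kennel class are hypothetical names invented for the sketch, not part of any library.

```csharp
using System;

// Dog is the class: it defines what every dog object looks like.
public class Dog
{
    private string name;

    public Dog(string name)
    {
        this.name = name;
    }

    // Hypothetical method, added only so each object can identify itself.
    public string Describe()
    {
        return name + " is a Dog";
    }
}

public class Kennel
{
    public static void Main()
    {
        // rover and fido are two separate objects (instances) of the one class Dog.
        Dog rover = new Dog("Rover");
        Dog fido = new Dog("Fido");
        Console.WriteLine(rover.Describe());   // prints "Rover is a Dog"
        Console.WriteLine(fido.Describe());    // prints "Fido is a Dog"
    }
}
```

Each call to new produces a distinct object with its own copy of the name field, while the definition of Dog exists only once.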
Methods must return from their call before the next line of code can execute. For example:
using System;
public class Application {
public static void Main() {
Console.WriteLine("Before the method call.");
SomeMethod();
Console.WriteLine("After the method call.");
}
static void SomeMethod()
{
Int32 x = 4 * 16;
Console.WriteLine(x);
}
}
Results in:
Before the method call.
64
After the method call.
The point of this extremely simple example is that every method involved is static: Main and SomeMethod are declared static, and Console.WriteLine is a static method of the Console class. In C#, the Main method is always static. Static methods are a compromise on the C concept of “global” functions. C# has no global methods, but it is convenient at times to be able to invoke a method without having a particular instance of a class. When you call WriteLine(), you invoke it not on an instance of Console, but on the Console class itself. To clarify the difference between instance methods and static methods, consider another simple example of a user-defined class:
using System;
public class Employee {
private Int32 age;
static private Int32 numEmployees = 0;
public Employee(Int32 age)
{
this.age = age;
Employee.numEmployees++;
}
public static Int32 GetNumEmployees()
{
return numEmployees;
}
public void DisplayEmployeeInfo()
{
Console.WriteLine("This employee is {0} years old",age );
Console.WriteLine("{0} employees have been created",Employee.GetNumEmployees());
}
static void Main(string[] args)
{
Int32 age = 46;
Employee emp1 = new Employee(age);
emp1.DisplayEmployeeInfo();
Employee emp2 = new Employee(35);
emp2.DisplayEmployeeInfo();
Employee emp3 = new Employee(21);
emp3.DisplayEmployeeInfo();
}
}
OUTPUT
This employee is 46 years old
1 employees have been created
This employee is 35 years old
2 employees have been created
This employee is 21 years old
3 employees have been created
To call a non-static method, there must be an instance of the class. You instantiate an object, an instance of the Employee class, by using the new operator and writing new Employee(age). This gives you back a reference to the new object, which you hold in emp1, a variable of data type Employee. You can then call a non-static method, in our case DisplayEmployeeInfo(), by using the dot operator on that instance. In short, you invoke instance methods on an object (an instance of the class), and you invoke static methods on the class itself.
As stated earlier, Main is always static, but other helper methods might be static when it is inconvenient to have an instance. For instance, the Convert class has conversion methods such as Convert.ToInt32(), which converts a string to a 32-bit signed integer (Int32). You'd rather not have to create an instance of the Convert class just to use its Convert.ToInt32() method.
Similar to static methods are static members, or fields. Static members belong to the class, not to an object. They do not represent the state of any one object; rather, they are shared among all instances of the class. A typical use is to keep track of the number of instances created so far. Because our static member is private, code outside the class reads it through a static method, GetNumEmployees(). Notice in the output that the number of employees is tracked. In the code, static private Int32 numEmployees is initialized to zero, and within the constructor Employee.numEmployees is incremented (numEmployees++). The beginner should also note that the constructor always has the same name as the class.
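As a small sketch of calling a static helper without creating an instance, the snippet below uses Convert.ToInt32() directly on the Convert class. The ConvertDemo class and its AddOne method are hypothetical names made up for the illustration.

```csharp
using System;

public class ConvertDemo
{
    // Hypothetical helper: parses the text as an Int32 and adds one to it.
    public static Int32 AddOne(string text)
    {
        // Convert.ToInt32 is static, so it is called on the class, not on an object.
        Int32 number = Convert.ToInt32(text);
        return number + 1;
    }

    public static void Main()
    {
        Console.WriteLine(AddOne("41"));   // prints 42
    }
}
```

Note that AddOne is itself static, so Main can call it without first creating a ConvertDemo object.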
Look at the output again. That static member keeps track of how many employees have been created. Since C# provides garbage collection, we can create objects as needed, and the garbage collector disposes of them when they are no longer needed, that is, once an object can no longer be reached through any reference. An important note is that finalization is non-deterministic: you can't know when an unreachable object will be finalized and its memory reclaimed, and normally you do not determine or force that moment yourself. It is the CLR's garbage collector that calls an object's Finalize() method, which in C# you write using destructor syntax (for example, ~Employee()). You need a finalizer only rarely, when the object directly holds a resource that is not just scarce, such as an open file handle, but also unmanaged, so that the garbage collector knows nothing about how to release it. Even then, you should not wait for finalization: if you have opened a file, close it as soon as you are done with it, typically through the Close() or Dispose() method of the object that wraps it, and let the garbage collector reclaim the memory later.
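The following is a minimal sketch of deterministic cleanup through Dispose(), as opposed to waiting for the garbage collector. The LogFile class is hypothetical and only pretends to hold a resource; the IDisposable interface and the using statement are standard parts of .NET and C#.

```csharp
using System;

// Hypothetical wrapper that stands in for an object holding an open file.
public class LogFile : IDisposable
{
    private bool open = true;

    public bool IsOpen()
    {
        return open;
    }

    // Called explicitly, or automatically at the end of a using block.
    public void Dispose()
    {
        open = false;
        Console.WriteLine("LogFile closed");
    }
}

public class DisposeDemo
{
    public static void Main()
    {
        LogFile log = new LogFile();
        using (log)
        {
            Console.WriteLine("Open inside using: {0}", log.IsOpen());   // True
        }
        // Dispose() has already run here, long before any garbage collection.
        Console.WriteLine("Open after using: {0}", log.IsOpen());        // False
    }
}
```

The using block guarantees that Dispose() is called when the block is left, even if an exception is thrown inside it.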
The “this” Keyword
Within a class, the this keyword refers to the current instance of the class. It is used to distinguish a member variable from a parameter. It is possible to have a member variable, say age, and a parameter to the constructor that is also named age; you tell them apart because the member variable is written this.age. Consider a class, Employee, with a constructor Employee:
public Employee(Int32 age, string name)
{
    // Use = (assignment), not == (comparison), to store the parameters.
    this.age = age;
    this.name = name;
}
We are passing two parameters (age, of data type Int32, and name, of data type string) to the constructor. The Employee class has two member variables that are also called age and name. We assign the value in the parameter age to the member variable age by writing this.age = age. The this keyword refers to the current object; within a constructor, it refers to the object being created. It is common to give private member variables the same names as the parameters passed to the constructor; in such a case you can use the this keyword to differentiate the member variable from the parameter:
public class Employee {
private Int32 age;
private string name;
public Employee(Int32 age, string name)
{
this.age = age;
this.name = name;
}
}
In this example, this.age refers to the member variable, while age refers to the parameter. Similarly, this.name refers to the member variable, while name refers to the parameter.
Returning to instance methods: they are methods of an object, and they typically affect the object itself. An Employee class might declare a method to increment the salary field:
public class Employee {
    private Int32 salary;

    // Every method needs a return type; void means nothing is returned.
    public void PayManCash(Int32 increment)
    {
        salary += increment;
    }
}
The Common Language Runtime executes code inside the bounds of a well-defined type system called the Common Type System (CTS). The CTS is part of the Common Language Infrastructure (CLI) specification and constitutes the interface between managed programs and the runtime itself. C#, in turn, is a strongly typed language: you must tell the compiler the types of the objects that you will be using. So if one were to try to pass an Int32 where a string is expected, the compiler would catch it and inform you of the error. This makes compiler errors helpful, because such bugs are caught at compile time rather than at runtime, where they are harder to find.
So while C# makes a distinction between user-defined types and built-in types, it also makes a distinction between value types and reference types. As a simplification, value types are stored inline as a sequence of bytes, on the stack when they are local variables, whereas objects of reference types are stored on the heap. The stack is a chunk of memory that behaves like the abstract data structure of the same name: it is of limited size, values are added to it as needed, and they are “popped” off as they go out of scope. The heap is an undifferentiated area of memory that is used for objects of reference types, such as instances of user-defined classes. When an object is created on the heap a reference is returned, and the built-in garbage collector cleans objects off the heap when there are no longer active references to them.
When you pass parameters to a method, they are passed by value by default. More to the point, a copy is made and the copy is used in the method. If you pass a value type, a copy of the value is passed: the method gets its own private copy, and the instance in the caller isn't affected. If you pass a reference type, a copy of the reference (or pointer) to the object is passed. This is an important distinction: the copied reference still points at the same object, so the method can modify the object and the caller will see the change.
Now recall the distinction between instance methods and static methods, and the use of the constructor. Value type (struct) constructors work differently from reference type (class) constructors. In addition to instance constructors, the CLR also supports type constructors (also known as static constructors, class constructors, or type initializers).
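The default pass-by-value behavior for the two kinds of types can be sketched side by side. Box, PassingDemo, and AddTen are hypothetical names chosen for the illustration; Box exposes a public field only to keep the sketch short.

```csharp
using System;

// Hypothetical reference type that wraps a single number.
public class Box
{
    public Int32 Value;
}

public class PassingDemo
{
    // Receives a copy of the Int32 itself, so the caller's variable is untouched.
    public static void AddTen(Int32 number)
    {
        number += 10;
    }

    // Receives a copy of the reference; both copies point at the same Box.
    public static void AddTen(Box box)
    {
        box.Value += 10;
    }

    public static void Main()
    {
        Int32 plain = 1;
        AddTen(plain);
        Console.WriteLine(plain);         // prints 1

        Box boxed = new Box();
        boxed.Value = 1;
        AddTen(boxed);
        Console.WriteLine(boxed.Value);   // prints 11
    }
}
```

In both calls the argument is copied; the difference is only in what gets copied, the number itself or the reference to the object.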
For example, assume you have an object of type Dog and pass it to a method. The method can change the dog because it receives a reference to it. If you pass an integer to a method, the method cannot change the Int32 in the calling method because it only has a copy. To illustrate this, examine the code shown below. We create a method to swap two integers. This method takes two parameters, left and right. It creates a temporary variable and assigns the value in the parameter left to that temporary variable. It then assigns the value in right to left, and the value in the temporary variable to right, thus swapping the values. The method displays the values in left and right both before and after the swap:
using System;
public class Program {
static void DoSwap(Int32 left, Int32 right)
{
Int32 temp;
Console.WriteLine("Swap before. left: {0}, right: {1}",left, right);
temp = left;
left = right;
right = temp;
Console.WriteLine("Swap after. left: {0}, right: {1}",left, right);
}
static void Main()
{
Int32 x = 5;
Int32 y = 7;
Console.WriteLine("Main before. x: {0}, y: {1}", x, y);
DoSwap(x, y);
Console.WriteLine("Main after. x: {0}, y: {1}", x, y);
}
}
Running this code yields:
Main before. x: 5, y: 7
Swap before. left: 5, right: 7
Swap after. left: 7, right: 5
Main after. x: 5, y: 7
You can see that the swap method shows the values swapped, but back in Main the values are unchanged. This is consistent with the idea that the Int32 variables are passed by value. The Common Language Runtime also allows you to pass parameters by reference instead of by value. In the C# language, you do this by using the out and ref keywords. Both keywords tell the compiler to emit metadata indicating that the designated parameter is passed by reference, and the compiler uses this to generate code that passes the address of the argument rather than the argument itself. From the CLR's perspective, out and ref are nearly identical: the same IL is produced whichever keyword you use, and the metadata differs only by a bit that records which one you specified. The C# compiler does treat them differently: a ref argument must be initialized before the call, whereas an out argument need not be, but the called method must assign it before returning. Given the issue in the code above, you'd like the Int32 values to be passed by reference, which you arrange by adding the ref keyword both to the method declaration and to the invocation of the method:
using System;
public class Program {
static void DoSwap( ref Int32 left, ref Int32 right)
{
Int32 temp;
Console.WriteLine("Swap before. left: {0}, right: {1}",left, right);
temp = left;
left = right;
right = temp;
Console.WriteLine("Swap after. left: {0}, right: {1}",left, right);
}
static void Main()
{
Int32 x = 5;
Int32 y = 7;
Console.WriteLine("Main before. x: {0}, y: {1}", x, y);
DoSwap( ref x, ref y);
Console.WriteLine("Main after. x: {0}, y: {1}", x, y);
}
}
Running this code shows that adding the ref keyword makes all the difference: the values are now swapped back in Main:
Main before. x: 5, y: 7
Swap before. left: 5, right: 7
Swap after. left: 7, right: 5
Main after. x: 7, y: 5
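The swap example uses ref; for completeness, here is a brief sketch of the out keyword, which also passes by reference but is meant for values the method produces. The OutDemo class and its Divide method are hypothetical names invented for this illustration.

```csharp
using System;

public class OutDemo
{
    // Hypothetical helper: returns the quotient and hands back the remainder via out.
    public static Int32 Divide(Int32 dividend, Int32 divisor, out Int32 remainder)
    {
        // An out parameter must be assigned before the method returns.
        remainder = dividend % divisor;
        return dividend / divisor;
    }

    public static void Main()
    {
        Int32 rest;   // no initial value is required for an out argument
        Int32 quotient = Divide(17, 5, out rest);
        Console.WriteLine("17 / 5 = {0} remainder {1}", quotient, rest);   // 3 remainder 2
    }
}
```

As with ref, the keyword appears both in the method declaration and at the call site, so a reader of the calling code can see that the variable may be changed.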
Properties
As mentioned earlier, classes have behavior and state. You model the behavior using methods and you model the state with member variables (sometimes called fields). You provide access to the class's state through properties. Member variables ought to be private. One of the principles behind object-oriented programming is data encapsulation. Data encapsulation means that your type's fields should never be publicly exposed, because it is too easy to write code that improperly uses the fields, corrupting the object's state. Then, to allow a user of your type to get or set state information, you expose methods for that purpose. Methods that wrap access to a field are typically called accessor methods. A property looks like a method to the developer of the type, but to the user of the type it looks like direct access to a field, even though every read and write actually goes through code that can guard the object's state. A property declaration has two parts: a get part for retrieving the value, and a set part for setting the value. The set part has an implicit parameter, value, which is the value being set. Consider this example:
public class Employee {
    private Int32 age;

    public Int32 Age {
        get
        {
            return age;
        }
        set
        {
            age = value;
        }
    }
}
Admittedly, properties complicate the definition of the type, but the cost of writing more code is outweighed by the problems you avoid by managing access to the object's state. Below is an example meant to explain this further:
using System;
public class Employee
{
private int baseLevel;
private string name;
public Employee(string name, int baseLevel)
{
this.name = name;
this.baseLevel = baseLevel;
}
public int BaseLevel
{
get
{
return baseLevel;
}
set
{
baseLevel = value;
}
}
public string Name
{
get
{
return name;
}
set
{
name = value;
}
}
static void Main()
{
Employee fred = new Employee("Fred", 2);
Employee joe = new Employee("Joe", 5);
Console.WriteLine("{0}'s base: {1}",fred.Name,fred.BaseLevel);
joe.BaseLevel = 12;
Console.WriteLine("{0}'s base: {1}",joe.Name,joe.BaseLevel);
}
}
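The properties in the example above pass values straight through to the fields. Where encapsulation pays off is when the set part checks the incoming value first. The following is a minimal sketch under the assumption, made up for this illustration, that an age may never be negative; the PropertyDemo class is likewise hypothetical.

```csharp
using System;

public class Employee
{
    private Int32 age;

    public Int32 Age
    {
        get { return age; }
        set
        {
            // Assumed rule for this sketch: reject values that would corrupt the state.
            if (value < 0)
            {
                throw new ArgumentOutOfRangeException("value", "Age cannot be negative.");
            }
            age = value;
        }
    }
}

public class PropertyDemo
{
    public static void Main()
    {
        Employee emp = new Employee();
        emp.Age = 30;
        Console.WriteLine(emp.Age);   // prints 30
        try
        {
            emp.Age = -5;             // looks like a field assignment, but runs the set code
        }
        catch (ArgumentOutOfRangeException)
        {
            Console.WriteLine("Rejected; age is still {0}", emp.Age);   // still 30
        }
    }
}
```

Had age been a public field, the assignment of -5 would have succeeded silently; with the property, the object's state stays valid.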
References
- CLR via C#, 2nd Edition, by Jeffrey Richter
- Professional .NET Framework 2.0, by Joe Duffy
- Notes by Jesse Liberty
History
- 31st October, 2009: Initial post