This article is Part 1 of a four-part tutorial series that aims to give a brief but advanced introduction to programming with C#.
These articles are lecture notes that were originally published as tutorials on Tech.Pro.
This tutorial aims to give a brief and advanced introduction into programming with C#. The prerequisites for understanding this tutorial are a working knowledge of programming, the C programming language and a little bit of basic mathematics. Some basic knowledge of C++ or Java could be helpful, but should not be required.
Introduction
This is the first part of a series of tutorials on C#. In this part, we are introducing the fundamental concepts of the language and its output, the Microsoft Intermediate Language (MSIL). We will take a look at object-oriented programming (OOP) and what C# does to make OOP as efficient as possible to realize in practice.
For further reading, a list of references will be given at the end. The references provide a deeper look at some of the topics discussed in this tutorial.
A word of warning before reading this tutorial: it follows a path of usability. This part is essentially required before doing anything with C#. After this first tutorial, everyone should be able to write a simple C# program using basic object-oriented principles. Hence, more advanced or curious readers will probably miss some topics. For example, we will not investigate exception handling, exciting possibilities of the .NET-Framework like LINQ, or more advanced language features like generics or lambda expressions. Our aim here is to give a soft introduction to C# for people coming from other languages.
The Right Development Environment
Before we can start doing something, we should think about where we want to do it. A logical choice for writing a program is a text editor. In the case of C#, there is an even better option: Microsoft Visual Studio (VS). The VS Community Edition can be used for open-source projects without purchasing a license. There are also other options that have been designed specifically with C# in mind. If we want a cross-platform IDE, then MonoDevelop is a good choice. On Windows, a free alternative to VS and MonoDevelop is SharpDevelop. More generic editors such as Visual Studio Code (VS Code), Sublime Text, or GitHub's Atom (to name a few) exist as well. All of them can be turned into powerful C# IDEs with OmniSharp.
Using a powerful IDE like Visual Studio helps a lot when writing programs with the .NET-Framework. Features like IntelliSense (an intelligent version of auto-complete, which shows the possible method calls, variables and available keywords for a certain code position), breakpoints (the program's execution can be paused and inspected at desired code positions) and graphical designers for UI development let us program more efficiently.
Additionally, Visual Studio gives us integrated version control, an object browser and documentation tools. Another bonus for writing C# projects with Visual Studio is the concept of solutions and projects. In Visual Studio, one usually creates a solution for a development project. A solution file can contain several projects, which can be compiled to libraries (.dll) and executables (.exe files). The solution explorer enables us to efficiently manage even large scale projects.
The project files are used by the MSBuild application to compile all required files and link them to dependencies like libraries or the .NET-Framework. As developers, we no longer need to worry about writing makefiles. Instead, we just add files and references (dependencies) to projects, which will be compiled and linked in the right order automatically.
There are several shortcuts / functions that will make VS a real pleasure:
- CTRL + SPACE, which forces IntelliSense to open
- CTRL + ., which opens the menu if VS shows an option point (this will happen if a namespace is missing or if we rename a variable)
- F5 to build and start debugging
- CTRL + F5 to build and execute without debugging
- F6 just build
- F10 to jump to the next line within the current function (step over) when debugging
- F11 to jump to the line in the next or current function (step into) when debugging
- F12 to go to the definition of the identifier at the caret's position
- SHIFT + F12 to find all references of the identifier at the caret's position
- CTRL + , to search the whole project for a given symbol
Of course, those keyboard shortcuts can be changed and are not required. Everything that can be done with shortcuts is also accessible by using the mouse. On the other hand, there are more options and possibilities than shortcuts.
Where do we get Visual Studio? There are several options, some of them even free. Students enrolled in a university that participates in the DreamSpark / MSDNAA program usually have the option to download Visual Studio (up to Ultimate) for free. Otherwise, one can download public beta versions or the freely available Community Edition. Formerly, there were also language-specific versions of Visual Studio, called Express Editions.
The Community Edition is available on VisualStudio.com. For individual developers, Visual Studio Community places no restrictions on the kind of apps, free or paid, that may be developed. For organizations, the license restricts use of the product to the following scenarios:
- in a classroom learning environment,
- for academic research, or
- for contributing to open source projects.
In this tutorial, we will focus only on console applications. GUIs will be introduced in the next tutorial. To create a new console application project in VS, we simply use the File menu and select New, then Project. In the dialog, we select C# on the left side and then Console application on the right side. Finally, we can give our project a name like SampleConsoleApp. That's it! We have already created our first C# application.
Basic Concepts
C# is a managed, statically and strongly typed language with a C-like syntax and object-oriented features similar to Java. All in all, one can say that C# is very close to Java to start with. There are some really great features in the current version of C#, but we exclude them in this first tutorial.
Managed means two things. First of all, we do not need to care about memory anymore. This means that people coming from C or C++ can stop worrying about freeing the memory allocated for their objects. We only create objects and do something with them. Once we have stopped using them, a smart program called the Garbage Collector (GC) will take care of them. The next figure shows approximately how the Garbage Collector works. It will detect unreferenced objects, collect them and free the corresponding memory. We do not have much control over the point in time when this happens. Additionally, the GC will do some memory optimization; however, this is usually not done directly after freeing the memory.
This results in some overhead on the memory and performance side, but has the advantage that it is basically impossible to have segmentation faults in C#. However, memory leaks are still a problem if we keep references of objects that are no longer required.
Statically and strongly typed means that the C# compiler needs to know the exact type of every variable and that the type system must be coherent. There is no cast to a void datatype, which would let us do basically anything. However, there is a data type called Object at the top of the type hierarchy, which might result in similar problems. We will discuss the consequences of the Object base type later on. The strong part hints that an operation may only be used if it is defined for its operands. No lossy casts happen implicitly without our knowledge; they have to be written explicitly.
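As a minimal sketch of why the Object base type can cause similar problems: a variable of type object accepts any value, so the compiler can no longer tell us what is actually stored in it. The class name ObjectPitfall and the method RuntimeTypeName are hypothetical names chosen for this illustration.

```csharp
using System;

static class ObjectPitfall
{
    // Returns the name of the type that is really stored behind an object reference.
    public static string RuntimeTypeName(object value)
    {
        return value.GetType().Name;
    }

    static void Main()
    {
        object anything = 42;                            // an int stored as object
        Console.WriteLine(RuntimeTypeName(anything));    // Int32

        anything = "forty-two";                          // now a string, the compiler does not object
        Console.WriteLine(RuntimeTypeName(anything));    // String

        // int number = "forty-two";    // would not compile: the types do not match
        // int number = (int)anything;  // compiles, but would fail at runtime,
        //                              // because anything currently holds a string
    }
}
```

The static type check still works for ordinary variables; it is only bypassed once everything is treated as object.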
We have another consequence of C# being managed: C# is neither native nor interpreted - it is something in between. The compiler does not generate native machine code, but the so-called Microsoft Intermediate Language (MSIL). This trick saves us from recompiling for different platforms. In the case of C#, we just compile once and obtain a so-called Common Language Runtime (CLR) assembly. This assembly is Just-In-Time (JIT) compiled at runtime. Another feature is that optimizations also take place at runtime: frequently occurring method calls can be inlined and unnecessary statements omitted automatically.
In theory, this could result in platform-independent programs (depending on the references used). However, this requires that every target platform is able to load and JIT compile CLR assemblies. Right now Microsoft limits the .NET-Framework to the Windows family, but Xamarin offers a product called Mono, which gives us a solution that also works on Linux and Mac.
Coming back to the language itself, we will see that the object-oriented features are inspired by those of Java. C# only uses a slightly different set of keywords. The extends keyword of Java has been replaced by the colon operator (:) known from C++. There are other areas where the colon is used as an operator with a different (but related) meaning.
Let's have a look at a sample Hello Tech.Pro code.
using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;

namespace HelloWorld
{
    class Program
    {
        static void Main(string[] args)
        {
            Console.WriteLine("Hello Tech.Pro!");
        }
    }
}
This looks quite similar to Java, except for the casing. Comments can be placed by using two slashes (the comment runs until the end of the line, with no possibility to switch back before), or by using a slash followed by an asterisk. In the latter case, the comment has to be ended with the reverse, i.e., an asterisk followed by a slash. This kind of comment is called a block comment. Visual Studio also has a special comment mode that is triggered once three slashes are entered.
Back to the code: While we do not need to place our classes in a namespace, Visual Studio creates one by default. The creation of a class is, however, required. Every method and variable needs to be encapsulated. This is why C# is considered to be strongly-object-oriented. Data encapsulation is, as we will see, an important feature of object-oriented programming and therefore necessary in C#.
There are already some other things that we can learn from this small sample. First, C# uses (as it has to) a class called Console for console-based interaction. This class contains only so-called static members like the WriteLine function. A static member can be accessed without creating an instance of the class. Static members belong to the class itself: they exist exactly once and exist without being explicitly created. Functions of a class are called methods.
The main entry point called Main can only exist once, which is one reason why it has to be static. Another reason is that it has to live in a class. If it were not static, an instance of the class would first have to be created (as non-static members can only be accessed through created objects, called instances of a class). Now we have a chicken-and-egg problem: how can we tell the program to build an instance of the class (such that the Main method can be accessed) without having a method where we start doing something? Hence the requirement that the Main method is static.
Another thing we can learn is that the parameter args is obviously an array. Here the array is part of the datatype (string[]), whereas in C the variable would be declared as the array and the element type would be written on its own. This is by design and has some useful implications. We will later see that every array is based on an object-oriented principle called inheritance, which specifies a basic set of members and implementations. Every array type has a property called Length, which contains the number of elements in the array. The length is therefore no longer hidden and does not need to be passed as an additional parameter, which is why the standard Main method of a C# program has only 1 parameter compared to the standard C main function with 2 parameters.
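To make the point about Length concrete, here is a small sketch that walks over the command line arguments without any separate count parameter. The class ArgsDemo and the method Describe are names invented for this example.

```csharp
using System;

class ArgsDemo
{
    // Builds a summary of the arguments; the array itself knows its length.
    public static string Describe(string[] args)
    {
        string summary = args.Length + " argument(s):";

        for (int i = 0; i < args.Length; i++)
        {
            summary += " [" + i + "]=" + args[i];
        }

        return summary;
    }

    static void Main(string[] args)
    {
        // In C we would need both argc and argv here.
        Console.WriteLine(Describe(args));
    }
}
```

Started with the arguments a and b, the program prints "2 argument(s): [0]=a [1]=b".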
Namespaces
Namespaces might be a new concept for people coming from the C programming language. Namespaces try to bring order to the world of types. C# allows us to have multiple methods with the same name (called method overloading), as long as each has a unique signature in terms of parameter types. The return type and the parameter names are not considered part of this signature. This avoids ambiguities when we have, e.g., functions with the same input types but different output types, since in that scenario there would be no way for the compiler to distinguish between the overloads. For types there is no comparable mechanism: a type is identified by its name alone.
This restriction could result in serious problems. If we consider the case of using (requiring) two (independent) internal libraries, it is possible that both libraries define a type with the same name. Now there would be no way to use both types (if compiling would be possible at all!). At best, we could use one type.
This is where namespaces come to the rescue. A namespace is like a container in which we can place types. However, that container is only a string that is used by the compiler to distinguish between types. Even though the dot (like in a.b) is usually used to create the impression of a relation between two strings (in this case a and b), there is no requirement that a namespace has to exist before another one is created in it, e.g., a does not have to be defined or used anywhere before using a.b.
Usually, we would have to place the namespace in front of every type. However, with C#'s using keyword we tell the compiler to do that implicitly for all types of the used namespace. Types of the current namespace are also implicitly used by the compiler. There is only one exception: if we want to use a type whose name is defined in multiple used namespaces, we always have to specify the namespace of the type explicitly.
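The exception described above can be sketched as follows. The namespaces LibraryA and LibraryB and the type Parser are hypothetical; they stand in for two independent libraries that happen to pick the same type name.

```csharp
using System;
using LibraryA;
using LibraryB;

namespace LibraryA
{
    public class Parser
    {
        public string Name() { return "Parser from LibraryA"; }
    }
}

namespace LibraryB
{
    public class Parser
    {
        public string Name() { return "Parser from LibraryB"; }
    }
}

namespace App
{
    class Program
    {
        static void Main()
        {
            // Parser p = new Parser(); // would not compile: Parser is ambiguous here

            // Stating the namespace explicitly resolves the conflict.
            LibraryA.Parser first = new LibraryA.Parser();
            LibraryB.Parser second = new LibraryB.Parser();

            Console.WriteLine(first.Name());
            Console.WriteLine(second.Name());
        }
    }
}
```

Both types can be used side by side, because for the compiler their full names differ.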
Data Types and Operators
Before we can actually start doing something, we need to introduce the basic set of datatypes and operators. As already mentioned, C# is a static strongly-typed language, which is why we need to care about data types. In the end, we will have to specify what's the type behind any variable.
There is a set of elementary data types, which contains the following types:
- bool (1 byte) used for logical expressions (true, false)
- char (2 bytes) used for a single (Unicode) character
- short (2 bytes) used for storing short integers
- int (4 bytes) used for computations with integers
- long (8 bytes) used for computations with long integers
- float (4 bytes) used for computations with single precision floating point numbers
- double (8 bytes) used for computations with double precision floating point numbers
- decimal (16 bytes) used for computations with high-precision decimal floating point numbers
There are also modified data types like unsigned integer (uint), unsigned long (ulong) and many more. Additionally, a really well-working class for strings is included. The class is simply called string and works as one would expect, except that a lot of useful helpers are directly available.
Types alone are quite boring, since we can only instantiate them, i.e., create objects based on their definition. It gets more interesting once we connect them by using operators. In C#, we have the same set of operators (and more) as in C:
- Logical operators like ==, !=, <, <=, >, >=, !, &&, ||
- Bit operators like ^, &, |, <<, >>
- Arithmetic operators like +, -, *, /, %, ++, --
- The assignment operator = and combinations of the assignment operator with the binary operators
- The ternary operator ? : (inline condition) to return either the left or right side of the colon depending on a condition specified before the question mark
- Brackets () to change the operator hierarchy
Additionally, C# has some built-in methods (defined as unary operators) like typeof or sizeof and a set of really useful type operators:
- The standard cast operator () as in C
- The reference cast operator as
- The type conformance checker is
- The null-coalescing operator ??
- The inheritance operator :
Let's see some of those types and operators in action:
using System;

class Program
{
    static void Main(string[] args)
    {
        int a = 5;
        double x = 2.5 + 3.7;
        float y = 3.1f;
        string someLiteral = "This is a string!";
        string input = Console.ReadLine();
        a = int.Parse(input);
        Console.WriteLine("a % 10 = " + a % 10);
    }
}
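The type operators from the list above do not appear in this sample. As a minimal sketch, the following program exercises is, as, ?? and the standard cast, plus typeof and sizeof. The class name TypeOperators and the method Describe are made up for this illustration.

```csharp
using System;

static class TypeOperators
{
    // Describes a value with the help of is, as and ??.
    public static string Describe(object value)
    {
        if (value is int)
        {
            int number = (int)value;      // standard cast, safe after the is check
            return "int " + number;
        }

        string text = value as string;    // null if value is not a string
        return text ?? "neither int nor string";
    }

    static void Main()
    {
        Console.WriteLine(Describe(5));        // int 5
        Console.WriteLine(Describe("hello"));  // hello
        Console.WriteLine(Describe(2.5));      // neither int nor string
        Console.WriteLine(typeof(double));     // System.Double
        Console.WriteLine(sizeof(int));        // 4
    }
}
```

Unlike the standard cast, the as operator never fails loudly; it simply yields null, which is why it pairs well with the ?? operator.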
We will see more operators and basic types in action throughout this series of tutorials. It should be noted that most operators return a new value and leave the original variables unchanged; the exceptions are the assignment operators and ++ / --. Another important aspect is that there is a fixed operator hierarchy, which is basically the same as in C. We do not need to memorize it, since we can always use brackets to change the default hierarchy. Additionally, the operator hierarchy is quite natural in following familiar rules, i.e., multiplication and division before addition and subtraction.
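A short sketch of both remarks, the operator hierarchy and operators returning new values. The class OperatorDemo and its two helper methods are hypothetical names used only here.

```csharp
using System;

static class OperatorDemo
{
    public static int WithoutBrackets()
    {
        return 2 + 3 * 4;       // multiplication binds stronger: 14
    }

    public static int WithBrackets()
    {
        return (2 + 3) * 4;     // brackets change the hierarchy: 20
    }

    static void Main()
    {
        int a = 5;
        int b = a + 1;          // + returns a new value, a stays untouched

        Console.WriteLine(a);   // 5
        Console.WriteLine(b);   // 6

        a += 1;                 // a combined assignment does change a
        Console.WriteLine(a);   // 6

        Console.WriteLine(WithoutBrackets()); // 14
        Console.WriteLine(WithBrackets());    // 20
    }
}
```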
Reference and Value Types
A very important concept for understanding C# is the difference between reference and value types. Once we use a class, we are using a reference type. On the other hand, any object that is a struct will be handled as a value type. The important difference is that for reference types only the reference is passed around, i.e., we pass a copy of the position of the object, and not a copy of the object itself. Any modification made through that reference is a modification of the original object.
The other case is the one with value types. If we give a method an argument that is a struct (that includes integers, floating point numbers, a boolean, a character and more), we will get a copy of the passed object. This means that modifications on the argument will never result in a modification on the original object.
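The difference can be observed with a pair of tiny types. This is only a sketch; Counter, Point and PassingDemo are hypothetical names, and public fields are used purely to keep the example short.

```csharp
using System;

class Counter       // reference type
{
    public int Value;
}

struct Point        // value type
{
    public int X;
}

static class PassingDemo
{
    // Receives a copy of the reference: the caller's object is modified.
    public static void Increment(Counter counter)
    {
        counter.Value++;
    }

    // Receives a copy of the whole structure: only the local copy is modified.
    public static void Increment(Point point)
    {
        point.X++;
    }

    static void Main()
    {
        Counter counter = new Counter();
        Point point = new Point();

        Increment(counter);
        Increment(point);

        Console.WriteLine(counter.Value); // 1
        Console.WriteLine(point.X);       // 0
    }
}
```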
The whole concept is quite similar to the concept of pointers in C. The only difference is that we do not actually have access to the address and we do not have to dereference the variable to get access to the values behind it. One could say that C# handles some of the things automatically that we would have to write by hand in C/C++.
In C#, we have access to two keywords that can be placed in front of a parameter definition. One important keyword is called ref. This allows us to access the original value when passing in structures. For classes, the consequence is that the original pointer is actually passed in, allowing us to change where it points. This sounds quite strange at first, but we will see that in most cases there is no difference between an explicit ref on a class instance and implicitly just passing the class instance. The only difference is that we can actually reset the pointer, like in the following code:
using System;

class Program
{
    static void Main()
    {
        string s = "Hi there";
        Console.WriteLine(s);   // prints "Hi there"
        ChangeString(s);
        Console.WriteLine(s);   // still "Hi there": only a copy of the reference was reset
        ChangeString(ref s);
        Console.WriteLine(s);   // prints an empty line: s itself is now null
    }

    static void ChangeString(string str)
    {
        str = null;
    }

    static void ChangeString(ref string str)
    {
        str = null;
    }
}
The other important keyword is called out. Basically, out is like ref. There are a few differences that are worth mentioning:
- out variables need to be assigned in the method that has them as parameters.
- out variables do not need to be assigned in the method that passes them as parameters.
- out variables mark the variable as being used for (additional) outgoing information.
The main usage of out parameters is as the name suggests: we now have an option to distinguish between "just" references and parameters that will actually return something. The .NET-Framework uses out parameters in some scenarios. Some of those scenarios are the TryParse methods found on the basic structures like int, double and others. Here a bool is returned, giving us an indicator if the given string can be converted. In case of a successful conversion, the variable given in form of an out parameter will be set to the corresponding value.
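A minimal sketch of the TryParse pattern just described, using int.TryParse. The wrapper class ParseDemo and the method ParseOrDefault are hypothetical helpers for this example.

```csharp
using System;

static class ParseDemo
{
    // Returns the parsed number, or the fallback if the text is not a valid integer.
    public static int ParseOrDefault(string text, int fallback)
    {
        int result;     // does not need to be assigned before the call

        if (int.TryParse(text, out result))
        {
            return result;  // TryParse reported success and filled result
        }

        return fallback;
    }

    static void Main()
    {
        Console.WriteLine(ParseOrDefault("42", -1));    // 42
        Console.WriteLine(ParseOrDefault("abc", -1));   // -1
    }
}
```

Compared to int.Parse, this variant lets us react to invalid input with a simple condition.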
All the talk about reference (class) and value (struct) types is useless if we do not know how to create such types. Let's have a look at a simple example:
using System;

class Program
{
    static void Main(string[] args)
    {
        SampleClass sampleClass;
        SampleStruct sampleStruct;
        HaveALook(out sampleClass);
        HaveALook(out sampleStruct);
    }

    static void HaveALook(out SampleClass c)
    {
        c = new SampleClass();
    }

    static void HaveALook(out SampleStruct s)
    {
        s = new SampleStruct();
    }

    class SampleClass
    {
    }

    struct SampleStruct
    {
    }
}
In the example above, we are creating two types called SampleClass and SampleStruct. We can instantiate new objects from those type definitions using the new keyword. This is not new (pun intended) for programmers coming from Java or C++, but certainly something new for programmers coming from C. In C, we would use the malloc function for the equivalent of a class instance (giving us a pointer) and nothing special for a structure (giving us a value). There is, however, one big advantage of using the new keyword: it will not only do the right memory allocation (on the heap in the case of a reference type, or typically on the stack for a local variable of a value type), but also call the corresponding constructor. We will later see what a constructor is, and what kind of benefits it gives us.
Coming back to our example, we see that we do not instantiate anything with the new keyword in the Main method; however, we do use it in the methods called HaveALook, which differ by the parameter type they expect. Using breakpoints in those methods, we can see that the class variable is actually NOT set (the passed-in value of the variable is null, which is the constant for a pointer that is not set), while the structure already has some value.
Control Flow
Now that we introduced the basic concepts behind C#, as well as elementary data types and available operators, we just need one more thing before we can actually go ahead and write actual C# programs. We need to know how to control the program flow, i.e., how to introduce conditions and loops.
This is pretty much the same as in C, as we are dealing with a C-style syntax. That being said, we have to follow these rules:
- Conditions can be introduced by using if or switch.
- A loop is possible by using for, while, do-while.
- A loop will be stopped by using the break keyword (will stop the innermost loop).
- A loop can skip the rest and return to the condition with the continue keyword.
- C# also has an iterator loop called foreach.
- Another possibility is the infamous goto statement - we will not discuss this.
- There are other ways of controlling the program flow, but we will introduce those later.
The next program code will introduce a few of the mentioned possibilities:
using System;

class Program
{
    static void Main(string[] args)
    {
        string input = Console.ReadLine();

        if (input == "")
        {
            Console.WriteLine("The input is empty!");
        }
        else
        {
            for (int i = 0; i < input.Length; i++)
            {
                switch (input[i])
                {
                    case 'a':
                        Console.Write("An a - and not ... ");
                        goto case 'z';
                    case 'z':
                        Console.WriteLine("A z!");
                        break;
                    default:
                        Console.WriteLine("Whatever ...");
                        break;
                }
            }
        }
    }
}
The iterator loop called foreach is not available in C. It is possible to use foreach with every type that defines some kind of iterator. We will see what this means later on. For now, we only have to know that every array already defines such an iterator. The following code snippet will use the foreach-loop to output each element of an array.
int[] myints = new int[4];
myints[0] = 2;
myints[1] = 3;
myints[2] = 17;
myints[3] = 24;

foreach (int myint in myints)
{
    Console.Write("The element is given by ");
    Console.WriteLine(myint);
}
There are some restrictions on foreach as compared to for. First of all, it is in general not quite as efficient as a for-loop, since it takes one more operation to obtain the iterator at the start of the loop and always calls the iterator's MoveNext method for each iteration. Second, we cannot change the current element. This is due to the fact that foreach operates on iterators, which in general only provide read access, i.e., a single element cannot be replaced through the loop variable. This is also required to keep the iteration consistent.
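The second restriction is easy to see in code. In this sketch (LoopDemo, DoubleAll and Sum are hypothetical names), changing the elements requires a for-loop, while reading them works fine with foreach.

```csharp
using System;

static class LoopDemo
{
    // Changes every element in place, which requires index access via for.
    public static void DoubleAll(int[] values)
    {
        for (int i = 0; i < values.Length; i++)
        {
            values[i] = values[i] * 2;
        }
    }

    // Reading the elements is all foreach allows, which is enough for a sum.
    public static int Sum(int[] values)
    {
        int sum = 0;

        foreach (int value in values)
        {
            // value = 0; // would not compile: the loop variable is read-only
            sum += value;
        }

        return sum;
    }

    static void Main()
    {
        int[] myints = new int[] { 2, 3, 17, 24 };
        DoubleAll(myints);
        Console.WriteLine(Sum(myints)); // 92
    }
}
```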
Object-Oriented Programming
Object-oriented programming is a method of focusing around objects instead of functions. Therefore, the declaration of types is a key aspect of object-oriented programming. Everything has to be part of a type, even if it is just static without any instance dependency.
There are downsides of this pattern of course. Instead of writing sin(), cos(), sign(), etc. we have to write Math.Sin(), Math.Cos() and Math.Sign(), since the (very helpful) math functions need to be inside a type (in this case the class Math) as well.
So what are the key aspects of object-oriented programming?
- Data encapsulation
- Inheritance
- Relations between types
- Declaring dependencies
- Maintainability
- Readability
By creating classes to carry large, reusable packages of data we provide encapsulation. The inheritance process helps us mark a strong relation between types and reuse the same basic structure. Encapsulating functions in types will group what belongs together and reduce code, since parameters that would otherwise be required can be omitted. Also, misuse will be prevented by default. All in all, the main goal is to reduce maintenance efforts by improving readability and increasing the compiler's power in error detection.
The main concept for OOP is the type-focus. The central type is certainly a class. Structures are also important, but will only be used in edge cases. Structures make sense if we have only a small payload, or want to create quite elementary small types that will have stronger immutable features than classes.
Let's have a look again at how we create a class (the type) and how we create class objects (instances):
class MyClass
{
    public void Write(string name)
    {
        Console.WriteLine("Hi there... {0}!", name);
    }
}

MyClass instance = new MyClass();
A class makes sense once we want to reuse a set of methods with a fixed set of variables in addition to some parameters for those methods. A class is also very useful once we want to use an already existing set of variables and / or methods. If we just want a collection of functions that is unrelated to any set of fixed (called instance dependent) variables, then we create static classes where we can just insert static methods and variables. Good examples of such static classes are the Console and Math class. They cannot be instantiated (instances of static classes, i.e., classes that do not contain instance dependent code, do not make any sense) and provide only functions with a set of parameters.
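Writing such a static class ourselves takes only a few lines. The following sketch defines a hypothetical helper class Geometry in the spirit of Math; its name and methods are assumptions made for this example.

```csharp
using System;

// A static class may only contain static members and cannot be instantiated.
static class Geometry
{
    public static int Square(int x)
    {
        return x * x;
    }

    public static double CircleArea(double radius)
    {
        return Math.PI * radius * radius;
    }
}

class Program
{
    static void Main()
    {
        // Geometry g = new Geometry(); // would not compile: Geometry is static

        Console.WriteLine(Geometry.Square(4));       // 16
        Console.WriteLine(Geometry.CircleArea(1.0)); // the value of Math.PI
    }
}
```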
Inheritance and Polymorphism
Now we are coming to the inheritance issue. To simplify things, we can think of inheritance as a recursive copy paste process by the compiler. All members of the parent (base) class will be copied.
class MySubClass : MyClass
{
    public void WriteMore()
    {
        Console.WriteLine("Hi again!");
    }
}
As already mentioned, the inheritance operator is the colon (:). In this example, we create a new type called MySubClass, which inherits from MyClass. MyClass has been defined in the previous section and does not define an explicit inheritance. Therefore MyClass inherits from Object. Object itself just defines four public instance methods:
- ToString, which is a very comfortable way of defining how an instance of the type should be presented as a string
- Equals, which is a generic way of comparing two arbitrary objects for equality
- GetHashCode, which gets a numeric indicator of whether two objects could be equal
- GetType, which gets the meta-information on the specific type of the current instance
These four methods are available on MyClass and MySubClass instances (copy paste!). Additionally, MyClass defines a method called Write, which will also be available for all MySubClass instances. Finally, MySubClass defines a method called WriteMore, which will only be available for MySubClass instances.
Right now the inheritance concept is already a little bit useful, but it is not very powerful. The concept of polymorphism will enable us to specialize objects using inheritance. First we will introduce the virtual keyword. This keyword lets us specify that a method (marked as virtual) can be re-implemented by more specialized (or derived) classes.
class MyClass
{
    public virtual void Write(string name)
    {
        Console.WriteLine("Hi {0} from MyClass!", name);
    }
}
If we now want to re-implement the Write method in the MySubClass class, then we have to do that explicitly by marking the re-implementation as override. Let's have a look:
class MySubClass : MyClass
{
    public override void Write(string name)
    {
        Console.WriteLine("Hi {0} from MySubClass!", name);
    }
}
What is the great benefit of this? Let's check out some example code snippet:
MyClass a = new MyClass();
MyClass b = new MySubClass();
a.Write("Flo");
b.Write("Flo");
So the trick is that, without knowing about the more specialized instance behind the variable, we are able to access the specialized implementation available in MySubClass. This is called polymorphism and basically states that classes can re-implement certain methods, which can then be used again without knowing about the specialization or re-implementation at all.
Already here, we can benefit from polymorphism, since we are able to override the four methods given by Object. Let's consider the following example:
using System;

class Program
{
    static void Main(string[] args)
    {
        MyClassOne one = new MyClassOne();
        MyClassTwo two = new MyClassTwo();
        Console.WriteLine(one);
        Console.WriteLine(two);
    }
}

class MyClassOne
{
}

class MyClassTwo
{
    public override string ToString()
    {
        return "This is my own class output";
    }
}
Here, the method WriteLine solves the problem of having to display any input as a sequence of characters by using the ToString method of Object. This enables WriteLine to output any object, even objects that are unknown. Everything that WriteLine cares about is that the given argument is actually an instance of Object (that applies to every object in C#), which means that the argument has a ToString method. Finally the specific ToString method of the argument is called.
Access Modifiers
Access modifiers play an important role in forcing programmers to adhere to a given object-oriented design. They hide members to prevent undefined access, define which members take part in the inheritance process and what objects are visible outside of a library.
Right here, we already have to note that all restrictions placed by modifiers are only artificial. The compiler is the only protector of those rules. This means that those rules will not prevent unauthorized access to, e.g., a variable during runtime. Therefore setting access modifiers to spawn some kind of security system is certainly a really bad idea. The main idea behind those modifiers is the same as with object-oriented programming: Creating classes that encapsulate data and force other programmers in a certain pattern of access. This way, finding the right way of using certain objects should be simpler and more straight forward.
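To illustrate that access modifiers are no runtime protection, the following sketch reads a private field through reflection (the System.Reflection API of .NET). The classes Vault and ReflectionDemo and the field name secret are hypothetical, and real code should of course not rely on this trick.

```csharp
using System;
using System.Reflection;

class Vault
{
    // Private: the compiler rejects any direct access from outside this class.
    private int secret = 1234;
}

static class ReflectionDemo
{
    // Reads the private field at runtime, bypassing the compiler's check.
    public static int ReadSecret(Vault vault)
    {
        FieldInfo field = typeof(Vault).GetField(
            "secret", BindingFlags.NonPublic | BindingFlags.Instance);

        return (int)field.GetValue(vault);
    }

    static void Main()
    {
        Vault vault = new Vault();

        // Console.WriteLine(vault.secret); // would not compile: secret is private
        Console.WriteLine(ReadSecret(vault)); // 1234
    }
}
```

The modifier therefore documents and enforces a design at compile time; it does not keep data confidential.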
C# knows a whole bunch of such modifier keywords. Let's have a look at them with a short description:
- private, declares that a member is neither visible from outside the object, nor does it take part in the inheritance process
- protected, declares that a member is not visible from outside the object, however, the member takes part in the inheritance process
- internal, declares that a member or type is visible outside the object, but not outside the current library
- internal protected, has the meaning of internal OR protected
- public, declares that a member or type is visible everywhere
Most of the time, we can specify the modifier (there are some exceptions to this rule, as we will see later); however, we can also always omit it. For types directly placed in a namespace, the default modifier is internal. This makes quite some sense. For types and members placed in a type (like a class or structure), the default modifier is private.
This makes sense since it is the same choice as in C++ classes. For a struct in C++, however, we always had to start with a private declaration, otherwise every member would have been public. This was the logical choice to ensure compatibility with C. In this regard, C# is much more coherent (and predictable), always using the most restrictive access modifier independent of the struct or class choice (i.e., in C#, nothing is public unless stated so).
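The defaults can be checked with a short sketch. The Defaults class below is a hypothetical example that omits every modifier it can; reflection is used here only to make the compiler's choice visible.

```csharp
using System;
using System.Reflection;

// No modifier on a type placed directly in a namespace: internal.
class Defaults
{
    // No modifier on a member: private.
    int hidden = 42;

    // Explicitly public, so that Program is allowed to call it.
    public int Reveal()
    {
        return hidden;
    }
}

class Program
{
    static void Main()
    {
        Type type = typeof(Defaults);
        FieldInfo field = type.GetField(
            "hidden", BindingFlags.NonPublic | BindingFlags.Instance);

        Console.WriteLine(type.IsNotPublic);        // True: the type is not public
        Console.WriteLine(field.IsPrivate);         // True: the field is private
        Console.WriteLine(new Defaults().Reveal()); // 42
    }
}
```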
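using System;

class Program
{
    static void Main()
    {
        MyClass c = new MyClass();
        int num = c.WhatNumber(); // Works: WhatNumber is public
    }
}

public class MyClass
{
    private int a;
    protected int b;

    // No modifier given, hence private
    int RightNumber()
    {
        return a;
    }

    public int WhatNumber()
    {
        return RightNumber();
    }
}

internal class MySubClass : MyClass
{
    int AnotherRightNumber()
    {
        b = 8;       // Works: b is protected and therefore inherited
        // return a; // Would not compile: a is private to MyClass
        return b;
    }
}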
There are some restrictions that will be enforced by the compiler. The reverse case of the example above, where we set MyClass to internal and MySubClass to public, is not possible. The compiler detects that having MySubClass visible to the outside requires MyClass to also be visible to the outside. Otherwise, we would have a specialization of a type where the base type is unknown.
The same is true in general, like when we return an instance of a type that is internal from a method that is visible to the outside (a public method of a public type). In this case, the compiler will also tell us that the returned type is less accessible than the method.
In C#, every non-static method has access to the class instance pointer variable this. This variable is treated like a keyword and points to the current class instance. Usually, the keyword can be omitted before calling methods of the class instance; however, there are multiple scenarios where this is very useful.
One of those scenarios is to distinguish between local variables (or parameters) and instance variables that share the same name. Consider the following example:
class MyClass
{
    string str;

    public void Change(string str)
    {
        // this.str is the instance variable, str is the parameter
        this.str = str;
    }
}
Since methods marked as static are independent of instances, we cannot use the this keyword in them. In addition to the this pointer, there is also a base pointer, which gives us access to all members of the base class instance that are accessible for the derived class. This way, it is possible to call already re-implemented or hidden methods.
class MySubClass : MyClass
{
    public override void Write(string name)
    {
        base.Write(name);
        Console.WriteLine("Hi {0} from MySubClass!", name);
    }
}
In the example, we are accessing the original implementation of the Write method from the re-implementation.
Properties
People coming from C++ will know the problem of restricting access to variables of a class. In general, one should never expose variables of a class, such that other classes could change them without the class being notified. Therefore, the following piece of code was written quite often in C++ (code given in C#):
private int myVariable;

public int GetMyVariable()
{
    return myVariable;
}

public void SetMyVariable(int value)
{
    myVariable = value;
}
This is clean code and we (as developers) now have the possibility to react to external changes of the variable by inserting some lines of code before myVariable = value. The problem with this code is that:
- we really only want to show that this is just a wrapper around myVariable and that
- we need to write too much code for this simple pattern.
Therefore, the C# team introduced a new language feature called properties. Using properties, the code above boils down to:
private int myVariable;

public int MyVariable
{
    get { return myVariable; }
    set { myVariable = value; }
}
This looks much cleaner now. Also, the access changed. Before, we accessed myVariable like a method (using a = GetMyVariable() or SetMyVariable(b)), but now we access myVariable like a variable (using a = MyVariable or MyVariable = b). This is more like the programmer's original intention and saves us some lines of code.
Internally, the compiler will still create those (get / set) methods, but we do not care about this. We will just use properties with either a get block, a set block, or both, and everything will work.
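The following sketch shows both variants in action: a property whose set block reacts to the incoming value, and a property with only a get block. The Thermostat class and its value range of 5 to 30 are assumptions made purely for this illustration.

```csharp
using System;

// Hypothetical class for illustration.
class Thermostat
{
    private int target = 20;

    // Read-write property: the set block clamps the value
    // to an assumed valid range of 5 to 30.
    public int Target
    {
        get { return target; }
        set
        {
            if (value < 5)
                target = 5;
            else if (value > 30)
                target = 30;
            else
                target = value;
        }
    }

    // Read-only property: there is no set block,
    // so an assignment to IsHeating does not compile.
    public bool IsHeating
    {
        get { return target > 18; }
    }
}

class Program
{
    static void Main()
    {
        Thermostat t = new Thermostat();
        t.Target = 99;                  // calls the set block
        Console.WriteLine(t.Target);    // calls the get block, prints 30
        Console.WriteLine(t.IsHeating); // prints True
    }
}
```

From the outside, Target looks like a variable, but the class keeps full control over what is stored.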
Please note that Microsoft Visual C++ also has an extension in the form of the __declspec keyword (used as __declspec(property)), which gives us the ability to write properties in C++ similar to the C# properties.
The Constructor
The constructor is a special kind of method that can only be called implicitly and never explicitly. A constructor is automatically called when we allocate memory with the new keyword. In perfect alignment with standard methods, we can overload the constructor by having multiple definitions that differ by their parameters.
Every class (and structure) has at least one constructor. If we did not write one (until now we did not), then the compiler inserts a default constructor (no parameters, empty body). Once we define a constructor ourselves, the compiler no longer inserts the default constructor for the class.
The signature of a constructor is special. It has no return type, since it implicitly returns the new instance, i.e., an instance of the class. Also, a constructor is defined by its name, which is the same name as the class. Let's have a look at some constructors:
class MyClass
{
    public MyClass()
    {
    }

    public MyClass(int a)
    {
    }

    public MyClass(int a, string b)
    {
    }

    public MyClass(int a, int b)
    {
    }
}
This looks quite straightforward. In short, a constructor is a method with the name of the class that specifies no return type. We can use the various constructors when instantiating an object of the class.
MyClass a = new MyClass();
MyClass b = new MyClass(2);
MyClass c = new MyClass(2, "a");
MyClass d = new MyClass(2, 3);
Of course, it could be that one constructor would need to do the same work as another constructor. In this case, it seems like we only have two options:
- Copy & paste the content.
- Extract the content into a method, which is then called by both constructors.
The first is a simple no-go according to the DRY (Don't Repeat Yourself) principle. The second one is maybe also not fine, since this could result in the method being misused in other places. Therefore, C# introduces the concept of chaining constructors. Before we actually execute instructions from one constructor, we call another constructor. The syntax relies on the colon : and the current class instance pointer this:
class MyClass
{
    public MyClass()
        : this(1, 1)
    {
    }

    public MyClass(int a, int b)
    {
    }
}
Here, the default constructor uses the constructor with two parameters to do some initialization work. The initialization work is the most popular use-case of a constructor. A constructor should be a lightweight method that does some preprocessing / setup / variable initialization.
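The order of execution can be made visible with a small sketch. The Rectangle class is a hypothetical example; the chained constructor runs first, and only afterwards the body of the constructor that was originally called.

```csharp
using System;

// Hypothetical class for illustration.
class Rectangle
{
    private int width;
    private int height;

    // Delegates to the two-parameter constructor before running its own body.
    public Rectangle()
        : this(2, 3)
    {
        Console.WriteLine("Rectangle() body");
    }

    public Rectangle(int width, int height)
    {
        this.width = width;
        this.height = height;
        Console.WriteLine("Rectangle(int, int) body");
    }

    public int Area()
    {
        return width * height;
    }
}

class Program
{
    static void Main()
    {
        Rectangle r = new Rectangle();
        Console.WriteLine(r.Area());
        // Output:
        // Rectangle(int, int) body
        // Rectangle() body
        // 6
    }
}
```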
The colon operator for the constructor chaining is used for a reason. Like with inheritance, every constructor has to call another constructor. If no call is specified (i.e., no previous constructor), then the called constructor is the default constructor of the base class. Therefore, the second constructor in the previous example does actually look like the following:
public MyClass(int a, int b)
    : base()
{
}
The additional line is, however, redundant, since the compiler will automatically insert this. There are only two cases where we have to specify the base constructor for the constructor chaining:
- when we actually want to call another base constructor than the default constructor of the base class, and
- when there is no default constructor of the base class.
The reason for the constructor chaining with the base class constructor is illustrated in the next figure:
We see that in this class hierarchy, in order to create an instance of Porsche, an instance of Car has to be created. This creation, however, requires the creation of an instance of a Vehicle, which requires the instantiation of Object. Each instantiation is associated with calling a constructor, which has to be specified. The C# compiler will automatically call the empty constructor, but this is only possible in case such a constructor exists. Otherwise, we have to tell the compiler explicitly what to call.
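The hierarchy described above can be sketched as follows. The class names follow the text; the doors parameter is an assumption added only to create a base class without a default constructor.

```csharp
using System;

class Vehicle
{
    public Vehicle()
    {
        Console.WriteLine("Vehicle");
    }
}

class Car : Vehicle
{
    // Car offers no parameterless constructor (an assumption for this sketch).
    // The compiler silently adds ": base()", since Vehicle() exists.
    public Car(int doors)
    {
        Console.WriteLine("Car with " + doors + " doors");
    }
}

class Porsche : Car
{
    // Without ": base(2)" this would not compile,
    // because there is no Car() constructor to call implicitly.
    public Porsche()
        : base(2)
    {
        Console.WriteLine("Porsche");
    }
}

class Program
{
    static void Main()
    {
        new Porsche();
        // Output:
        // Vehicle
        // Car with 2 doors
        // Porsche
    }
}
```

The constructors run from the most general type down to the most specialized one, so every base part is fully set up before the derived part starts its own work.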
There are also cases where other access modifiers for a constructor might make sense. If we want to prevent instantiation of a certain type (like with abstract), we could create one default constructor and make it protected. On the other hand, the following is a simple so-called Singleton pattern:
class MyClass
{
    private static MyClass instance;

    private MyClass() { }

    public static MyClass Instance
    {
        get
        {
            if (instance == null)
                instance = new MyClass();

            return instance;
        }
    }
}
Now we cannot create instances of the class from the outside, but we can access the static property Instance by using MyClass.Instance. This property not only has access to the static variable instance, but also has access to all private members like the private constructor. Therefore, it can create an instance and return the created instance.
This implementation has two main advantages:
- Because the instance is created inside the Instance property method, the class can exercise additional functionality (for example, instantiating a subclass), even though it may introduce unwelcome dependencies.
- The instantiation is not performed until an object asks for an instance. This approach is referred to as lazy instantiation. This avoids instantiating unnecessary singletons when the application starts.
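A short usage sketch makes the lazy instantiation visible. It repeats the singleton from above and only adds a WriteLine call to the private constructor for demonstration purposes. Note that this simple variant makes no attempt to be thread-safe.

```csharp
using System;

class MyClass
{
    private static MyClass instance;

    private MyClass()
    {
        // Added for this sketch only, to show when the instance is created.
        Console.WriteLine("Instance created");
    }

    public static MyClass Instance
    {
        get
        {
            if (instance == null)
                instance = new MyClass();

            return instance;
        }
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine("Before first access");
        MyClass first = MyClass.Instance;  // triggers the constructor
        MyClass second = MyClass.Instance; // reuses the existing instance
        Console.WriteLine(ReferenceEquals(first, second)); // True

        // MyClass third = new MyClass(); // would not compile: private constructor
    }
}
```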
We will not discuss other design patterns in this series of tutorials.
Abstract Classes and Interfaces
There is one more thing we need to discuss in this tutorial. Sometimes, we want to create classes that should just be a sketch for some more specialized implementations. This is like creating a template for classes. We do not want to use the template directly (instantiate it), but we want to derive from the class (use the template), which should save us some time. The keyword for marking a class as being a template is abstract. Abstract classes cannot be instantiated, but can be used as types of course. Such a class can also mark members as being abstract. This will require derived classes to deliver the implementation:
abstract class MyClass
{
    public abstract void Write(string name);
}

class MySubClass : MyClass
{
    public override void Write(string name)
    {
        Console.WriteLine("Hi {0} from an implementation of Write!", name);
    }
}
Here, we mark the Write method as being abstract, which has two consequences:
- There is no method body (the curly brackets are missing) in the first method definition.
- MySubClass is required to override the Write method (or in general: all methods that are marked abstract and not implemented yet).
Also, the first line of the following code will fail to compile, since it creates an instance of MyClass, which is now marked as being abstract. The second line is fine.
MyClass a = new MyClass();    // Does not compile: MyClass is abstract
MyClass b = new MySubClass(); // Works: MySubClass implements Write
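As a sketch of how such a template pays off, the following example uses a hypothetical Shape hierarchy. The abstract class implements Describe once, while every derived class only delivers its own Area.

```csharp
using System;

abstract class Shape
{
    // No body: every concrete subclass has to provide one.
    public abstract int Area();

    // Abstract classes may still contain ordinary, implemented members.
    public string Describe()
    {
        return GetType().Name + " with area " + Area();
    }
}

class Square : Shape
{
    private int side;

    public Square(int side)
    {
        this.side = side;
    }

    public override int Area()
    {
        return side * side;
    }
}

class Rectangle : Shape
{
    private int width;
    private int height;

    public Rectangle(int width, int height)
    {
        this.width = width;
        this.height = height;
    }

    public override int Area()
    {
        return width * height;
    }
}

class Program
{
    static void Main()
    {
        // The abstract class serves as the type of both variables.
        Shape first = new Square(4);
        Shape second = new Rectangle(2, 3);

        Console.WriteLine(first.Describe());  // Square with area 16
        Console.WriteLine(second.Describe()); // Rectangle with area 6
    }
}
```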
An important restriction in doing OOP with C# is the limitation to inheritance from one class only. If we do not specify a base class, then Object will be used implicitly, otherwise the explicitly specified class will be used. The restriction to one class in the inheritance process makes sense, since it keeps everything well-defined and prohibits weird edge cases. There is an elegant way around this limitation, which builds upon using so-called interface types.
An interface is like a code contract. Interfaces define which functionality should be provided by the classes or structures that implement them, but they do not say a word about what the exact implementation looks like. That being said, we can think of those interfaces as abstract classes without variables and with only abstract members (methods, properties, ...).
Let's define a very simple interface:
interface MyInterface
{
    void DoSomething();

    string GetSomething(int number);
}
The defined interface contains two methods called DoSomething and GetSomething. The definitions of these methods look very similar to the definitions of abstract methods, except that we are missing the keywords public and abstract. This is by design. The idea is that since every member of an interface is abstract (or to be more precise: misses an implementation), the keyword is redundant. Another feature is that every method is automatically treated as public.
Implementing an interface is possible by using the same syntax as with classes. Let's consider two examples:
class MyOtherClass : MyInterface
{
    public void DoSomething()
    { }

    public string GetSomething(int number)
    {
        return number.ToString();
    }
}

class MySubSubClass : MySubClass, MyInterface
{
    public void DoSomething()
    { }

    public string GetSomething(int number)
    {
        return number.ToString();
    }
}
This snippet should demonstrate a few things:
- It is possible to implement only an interface and no class (this will result in inheriting directly from Object).
- We can also implement one or more interfaces and additionally a class (explicit inheritance).
- We always have to implement all methods of the "inherited" interface(s).
- As a side note, we do not need to re-implement the Write method in MySubSubClass, since MySubClass already implements it.
It should be clear that we cannot instantiate interfaces (they are like abstract classes), but we can use them as types. Therefore, it would be possible to do the following:
MyInterface myif = new MySubSubClass();
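Using an interface as a type is most useful when otherwise unrelated classes should be handled by the same code. The following sketch uses hypothetical names (Describable, Book, Machine, Printer) to show a method that accepts anything fulfilling the contract.

```csharp
using System;

// Hypothetical contract for illustration.
interface Describable
{
    string Describe();
}

// Implements only the interface, so it derives directly from Object.
class Book : Describable
{
    public string Describe()
    {
        return "A book";
    }
}

class Machine
{
}

// Derives from a class and implements the interface at the same time.
class Printer : Machine, Describable
{
    public string Describe()
    {
        return "A printer";
    }
}

class Program
{
    // Accepts any instance that fulfills the contract, whatever its base class is.
    static void Show(Describable item)
    {
        Console.WriteLine(item.Describe());
    }

    static void Main()
    {
        Show(new Book());    // A book
        Show(new Printer()); // A printer
    }
}
```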
Usually, interface types start with a capital I in the .NET-Framework. This is a useful convention to recognize interfaces immediately. In our journey, we will discover some useful interfaces that are quite important for the .NET-Framework. Some of these interfaces are used by C# implicitly.
Interfaces also give us another option for implementing their methods. Since we can implement multiple interfaces, it is possible that two methods with the same name and signature will be included. In this case, there must be a way to distinguish between the different implementations. This is possible by a so-called explicit implementation. An explicitly implemented interface member will not contribute to the class directly. Instead, one has to cast the instance to the specific interface type in order to access the members of the interface.
Here is an explicit implementation:
class MySubSubClass : MySubClass, MyInterface
{
    void MyInterface.DoSomething()
    { }

    string MyInterface.GetSomething(int number)
    {
        return number.ToString();
    }
}
Explicit and implicit implementations of definitions from an interface can be mixed. Hence, we can only be sure to get access to all members defined by an interface, if we cast an instance to that interface.
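The name clash mentioned above can be sketched with two hypothetical interfaces, Drawable and Printable, that both define a Render method. The Chart class implements each of them explicitly, so the interface type decides which implementation is called.

```csharp
using System;

// Two hypothetical interfaces that happen to define the same method.
interface Drawable
{
    string Render();
}

interface Printable
{
    string Render();
}

class Chart : Drawable, Printable
{
    // Explicit implementations carry no access modifier
    // and are prefixed with the name of the interface.
    string Drawable.Render()
    {
        return "Rendering for the screen";
    }

    string Printable.Render()
    {
        return "Rendering for the printer";
    }
}

class Program
{
    static void Main()
    {
        Chart chart = new Chart();

        // chart.Render(); // would not compile: Render is not a member of Chart itself

        Drawable drawable = chart;
        Console.WriteLine(drawable.Render());           // Rendering for the screen
        Console.WriteLine(((Printable)chart).Render()); // Rendering for the printer
    }
}
```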
Exception Handling
There are many things that have been designed with OOP in mind in C#. One of those things is exception handling. Every exception has to derive from the Exception class, which has been placed in the System namespace. In general, we should always try to avoid exceptions; however, there are cases where an exception could easily happen. One such example is found in communication with the file system. Here, we are talking to the OS, which sometimes has no other choice than to throw an exception. There could be various reasons, e.g.:
- The given path is invalid.
- The file cannot be found.
- We do not have sufficient rights to access the files.
- The file is corrupt and cannot be read.
Of course, the OS API could just return a pseudo file or pseudo content and everything would work. The problem with such handling is that it does not represent reality, and we would have no way to detect that something obviously went wrong. Another option would be to return an error code, but this would result in a C-like API and it would leave the handling to the programmer. If the programmer then did a bad job (like ignoring the returned error code), the user would never see that something went wrong.
Here is where exceptions come into play. The important thing about an exception is that once an exception is possible, we should think about handling it. In order to handle such an exception, we need a way to react to it. The construct is the same as in C++ or Java: Thrown exceptions can be caught.
try
{
    FunctionWhichMightThrowException();
}
catch
{
}
In the example, we call a method named FunctionWhichMightThrowException. Calling this method might result in an exception, which is why we put it in a try-block. The catch-block is only entered if an exception is thrown, otherwise it will be ignored. What this example is not capable of doing is reacting to the specific exception. Right now, we just react to any exception, without touching the exception that has been thrown. This is, however, very important and should therefore be done:
try
{
    FunctionWhichMightThrowException();
}
catch (Exception ex)
{
    Console.WriteLine(ex.Message);
}
Since every exception has to derive from Exception, this will always work and we will always be able to access the property Message. This is a so-called catch-all block. Sometimes, however, we want to distinguish between the various exceptions. Coming back to our example with the file system above, we can expect that every unique scenario (e.g., path invalid, file not found, insufficient rights, ...) will throw a different kind of exception. We could differentiate between those exceptions by defining more catch-blocks:
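// File and the I/O exceptions live in the System.IO namespace
byte[] content = null;

try
{
    content = File.ReadAllBytes("Path to the file");
}
catch (PathTooLongException)
{
    // The given path is too long
}
catch (FileNotFoundException)
{
    // The file cannot be found
}
catch (UnauthorizedAccessException)
{
    // We do not have sufficient rights
}
catch (IOException)
{
    // Any other I/O problem
}
catch (Exception)
{
    // Everything else
}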
There should be two lessons from this example.
- We can specify multiple catch-blocks, each with its own handling. The only limitation is that we have to specify them in such an order that the most specific exception is caught first, while the most general one is caught last.
- We do not need to name the variable of the exception. If we name it, we will get access to the Exception object, but sometimes we do not care about the specific object. Instead, we just want to differentiate between the various exceptions.
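Both lessons can be tried out with a small runnable sketch. The FileReader class and its TryRead method are hypothetical helpers for this illustration. The sketch asks for a randomly named file in the temporary directory, which is assumed not to exist, so the FileNotFoundException handler is the one expected to run.

```csharp
using System;
using System.IO;

// Hypothetical helper class for illustration.
class FileReader
{
    // Returns a short description of what happened while reading the file.
    public static string TryRead(string path)
    {
        try
        {
            byte[] content = File.ReadAllBytes(path);
            return "Read " + content.Length + " bytes";
        }
        catch (FileNotFoundException)
        {
            // Most specific handler first; no variable name needed here.
            return "File not found";
        }
        catch (IOException)
        {
            // Base class of FileNotFoundException, hence listed afterwards.
            return "Other I/O problem";
        }
        catch (Exception ex)
        {
            // Most general handler last; here we do use the exception object.
            return "Unexpected: " + ex.Message;
        }
    }
}

class Program
{
    static void Main()
    {
        // A random name inside the temp directory, assumed not to exist.
        string missing = Path.Combine(Path.GetTempPath(), Guid.NewGuid().ToString());
        Console.WriteLine(FileReader.TryRead(missing));
    }
}
```

Swapping the first two handlers would be rejected by the compiler, since the IOException handler would already catch every FileNotFoundException.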
Now that we can catch those nasty exceptions, we may want to throw exceptions ourselves. This is done by using the throw keyword. Let's see some sample code:
void MyBuggyMethod()
{
    Console.WriteLine("Entering my method");
    throw new Exception("This is my exception");
    Console.WriteLine("Leaving my method");
}
If we call this method, we will see that the second WriteLine call is never executed. Once an exception is thrown, the method is left immediately. This goes on up the call stack until a suitable try-catch-block wraps the method call. If no such block is found, then the application will crash. This behavior is called bubbling. Alternatively, we could have also written our own class that derives from Exception:
class MyException : Exception
{
    public MyException()
        : base("This is my exception")
    {
    }
}
Now our code above could have been changed to become the following:
void MyBuggyMethod()
{
    Console.WriteLine("Entering my method");
    throw new MyException();
    Console.WriteLine("Leaving my method");
}
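To see the bubbling in action, here is a minimal sketch with two nested method calls. The Worker class with its Inner and Outer methods is a hypothetical construct; the exception thrown in Inner passes through Outer and is finally caught in Main.

```csharp
using System;

class MyException : Exception
{
    public MyException()
        : base("This is my exception")
    {
    }
}

// Hypothetical class for illustration.
class Worker
{
    public static void Inner()
    {
        throw new MyException();
    }

    public static void Outer()
    {
        Inner();
        // Never executed: the exception leaves Outer immediately.
        Console.WriteLine("Leaving Outer");
    }
}

class Program
{
    static void Main()
    {
        try
        {
            Worker.Outer();
        }
        catch (MyException ex)
        {
            // The exception bubbled up through Inner and Outer.
            Console.WriteLine("Caught: " + ex.Message);
        }
        // Output:
        // Caught: This is my exception
    }
}
```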
Let's come back again to our example that plays around with the file system. In this scenario, we might end up with some open file handle. Therefore, whether we get an exception or not, we want to close that handle to clean up the open resources. In this scenario, another block would be very helpful: a block that performs a final action that does not depend on the actions in the try or any catch block. Of course, such a block exists and is called a finally-block.
FileStream fs = null;

try
{
    fs = new FileStream("Path to the file", FileMode.Open);
}
catch (Exception)
{
    Console.WriteLine("An exception occurred.");
    return;
}
finally
{
    if (fs != null)
        fs.Close();
}
Here, we should note that a return in one block will still call the code in the finally-block. So in total, we have the option of using a try-catch, a try-catch-finally or a try-finally block. The last one will not catch the exception (i.e., the exception will bubble up), but still invoke the code that is given in the finally-block (no matter what happens in the try-block).
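The interplay of return and finally can be observed with the following sketch. The Demo class and its ParseOrDefault method are hypothetical; the fallback value -1 is an arbitrary choice for this example.

```csharp
using System;

// Hypothetical class for illustration.
class Demo
{
    public static int ParseOrDefault(string text)
    {
        try
        {
            Console.WriteLine("try");
            return int.Parse(text);
        }
        catch (FormatException)
        {
            Console.WriteLine("catch");
            return -1;
        }
        finally
        {
            // Runs after either return statement, before the caller continues.
            Console.WriteLine("finally");
        }
    }
}

class Program
{
    static void Main()
    {
        Console.WriteLine(Demo.ParseOrDefault("12"));
        Console.WriteLine(Demo.ParseOrDefault("abc"));
        // Output:
        // try
        // finally
        // 12
        // try
        // catch
        // finally
        // -1
    }
}
```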
Outlook
In the next tutorial, we will learn about more advanced features in C# and extend our knowledge in object-oriented programming. With our knowledge in C# improving, we are ready to dive more into the .NET-Framework.
Other Articles in this Series
- Lecture Notes Part 1 of 4 - An Advanced Introduction to C#
- Lecture Notes Part 2 of 4 - Mastering C#
- Lecture Notes Part 3 of 4 - Advanced Programming with C#
- Lecture Notes Part 4 of 4 - Professional Techniques for C#
References
History
- v1.0.0 | Initial release | 18th April, 2016
- v1.0.1 | Added article list | 19th April, 2016
- v1.0.2 | Refreshed article list | 20th April, 2016
- v1.0.3 | Refreshed article list | 21st April, 2016
- v1.1.0 | Thanks to Steve for pointing out VS Community Edition | 22nd April, 2016
- v1.1.1 | Updated some typos | 23rd April, 2016
- v1.2.0 | Updated structure w. anchors | 25th April, 2016
- v1.2.1 | Added table of contents | 29th April, 2016
- v1.2.2 | Remarks from Kenneth Haugland and Prateek Dalbehera added | 3rd May, 2016
- v1.2.3 | Updated with corrections from Austria | 5th May, 2016
- v1.2.4 | Clarified the considered overload signature | 2nd October, 2016
- v1.2.5 | Mentioned VS CE, Code, ST, Atom, OmniSharp | 4th October, 2016