This is a chapter excerpt from C++/CLI in Action authored by Nishant Sivakumar
and published by Manning Publications. The content has been reformatted for
CodeProject and may differ in layout from the printed book and the e-book.
1.5 Instantiating CLI classes
In this section, you'll see how CLI classes are instantiated using the gcnew
operator. You'll also learn how constructors, copy constructors, and
assignment operators work with managed types. Although the basic concepts
remain the same, the nature of the CLI imposes some behavioral differences in
the way constructors and assignment operators work; when you start writing
managed classes and libraries, it's important that you understand those
differences. Don't worry about it, though. Once you've seen how managed objects
work with constructors and assignment operators, the differences between
instantiating managed and native objects will automatically become clear.
1.5.1 The gcnew operator
The gcnew operator is used to instantiate CLI objects. It returns a handle to
the newly created object on the CLR heap. Although it's similar to the new
operator, there are some important differences: gcnew has neither an array
form nor a placement form, and it can't be overloaded either globally or
specifically to a class. A placement form wouldn't make a lot of sense for a
CLI type, when you consider that the memory is allocated by the Garbage
Collector. It's for the same reason you aren't permitted to overload the gcnew
operator. There is no array form for gcnew because CLI arrays use an entirely
different syntax from native arrays, which we'll cover in detail in the next
chapter. If the CLR can't allocate enough memory for creating the object, a
System::OutOfMemoryException is thrown, although chances are slim that you'll
ever run into that situation. (If you do get an OutOfMemoryException, and your
system isn't running low on virtual memory, it's likely due to badly written
code such as an infinite loop that keeps creating objects that are erroneously
kept alive.) The following code listing shows a typical usage of the gcnew
keyword to instantiate a managed object (in this case, the Student object):
ref class Student
{
...
};
...
Student^ student = gcnew Student();
student->SelectSubject("Math", 97);
The gcnew operator is compiled into the newobj MSIL instruction by the C++/CLI
compiler. The newobj MSIL instruction creates a new CLI object (either a ref
object on the CLR heap or a value object on the stack), although the C++/CLI
compiler uses a different mechanism to handle the usage of the gcnew operator
to create value type objects (which I'll describe later in this section).
Because gcnew in C++ translates to newobj in the MSIL, the behavior of gcnew
is pretty much dependent on, and therefore similar to, that of the newobj MSIL
instruction. In fact, newobj throws System::OutOfMemoryException when it can't
find enough memory to allocate the requested object. Once the object has been
allocated on the CLR heap, the constructor is called on this object with zero
or more arguments (depending on the constructor overload that was used). On
successful completion of the call to the constructor, gcnew returns a handle
to the instantiated object. It's important to note that if the constructor
call doesn't successfully complete, as would be the case if an exception was
raised inside the constructor, gcnew won't return a handle. This can be easily
verified with the following code snippet:
ref class Student
{
public:
    Student()
    {
        throw gcnew Exception("hello world");
    }
};
Student^ student = nullptr;
try
{
    student = gcnew Student();
}
catch(Exception^)
{
}
if(student == nullptr)
    Console::WriteLine("reference not allocated to handle");
Not surprisingly, student is still nullptr when execution reaches the if
statement. Because the constructor didn't complete executing, the CLR
concludes that the object hasn't fully initialized, and it doesn't push the
handle reference on the stack (as it would if the constructor had completed
successfully).
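Standard C++ behaves in a comparable way, which may make the managed behavior easier to picture. The following is a minimal native sketch, not C++/CLI: the NativeStudent struct and the try_create and pointer_stays_null helpers are hypothetical names introduced only for this illustration. When the constructor throws, the new expression never yields a value, so the pointer keeps its initial nullptr.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical native stand-in for the managed Student class.
struct NativeStudent
{
    NativeStudent()
    {
        throw std::runtime_error("hello world");
    }
};

// Tries to create a NativeStudent and returns whatever the pointer holds
// after the attempt.
NativeStudent* try_create()
{
    NativeStudent* student = nullptr;
    try
    {
        // The constructor throws, so the assignment never takes place.
        // The storage obtained by new is released automatically.
        student = new NativeStudent();
    }
    catch (const std::runtime_error&)
    {
        // Swallow the exception, mirroring the managed snippet.
    }
    return student;
}

// Usage: the pointer is still null after the failed construction.
bool pointer_stays_null()
{
    NativeStudent* student = try_create();
    assert(student == nullptr);
    return student == nullptr;
}
```

The native case differs in its mechanics (there is no CLR evaluation stack involved), but the visible result is the same: a failed constructor leaves the variable untouched.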
NOTE: C++/CLI introduces the concept of a universal null literal called
nullptr. This lets you use the same literal (nullptr) to represent a null
pointer and a null handle value. The nullptr literal implicitly converts to a
pointer or handle type; for the pointer, it evaluates to 0, as dictated by
standard C++; for the handle, it evaluates to a null reference. You can use
nullptr in relational, equality, and assignment expressions with both pointers
and handles.
As I mentioned earlier, using gcnew to instantiate a value type object
generates MSIL that is different from what is generated when you instantiate a
ref type. For example, consider the following code, which uses gcnew to
instantiate a value type:
value class Marks
{
public:
    int Math;
    int Physics;
    int Chemistry;
};
Marks^ marks = gcnew Marks();
For this code, the C++/CLI compiler uses the initobj MSIL instruction to
create a Marks object on the stack. This object is then boxed to a Marks^
object. We'll discuss boxing and unboxing in the next section; for now, note
that unless it's imperative to the context of your code to gcnew a value type
object, doing so is inefficient. A stack object has to be created, and this
must be boxed to a reference object. Not only do you end up creating two
objects (one on the managed stack, the other on the managed heap), but you
also incur the cost of boxing. The more efficient way to create an object of
type Marks (or any value type) is to declare it on the stack, as follows:
Marks marks;
You've seen how gcnew invokes the constructor on the instance of the type
being created. In the next section, we'll take a more involved look at how
constructors work with CLI types.
1.5.2 Constructors
If you have a ref class, and you haven't written a default constructor, the
compiler generates one for you. In MSIL, the constructor is a specially named
instance method called .ctor. The default constructor that is generated for
you calls the constructor of the immediate base class for the current class.
If you haven't specified a base class, it calls the System::Object
constructor, because every ref object implicitly derives from System::Object.
For example, consider the following two classes, neither of which has a
user-defined constructor:
ref class StudentBase
{
};
ref class Student: StudentBase
{
};
Neither Student nor StudentBase has a user-provided default constructor, but
the compiler generates constructors for them. You can use a tool such as
ildasm.exe (the IL Disassembler that comes with the .NET Framework) to examine
the generated MSIL. If you do that, you'll observe that the generated
constructor for Student calls the constructor for the StudentBase object:
call instance void StudentBase::.ctor()
The generated constructor for StudentBase calls the System::Object
constructor:
call instance void [mscorlib]System.Object::.ctor()
Just as with standard C++, if you have a constructor (either a default
constructor or one that takes one or more arguments), the compiler won't
generate a default constructor for you. In addition to instance constructors,
ref classes also support static constructors (not available in standard C++).
A static constructor, if present, initializes the static members of a class.
Static constructors can't have parameters, must be private, and are
automatically called by the CLR. In MSIL, static constructors are represented
by a specially named static method called .cctor. One possible reason both
special methods have a . in their names is that this avoids name clashes,
because none of the CLI languages allow a . in a function name. If you have at
least one static field in your class, the compiler generates a default static
constructor for you if you don't include one on your own. When you have a
simple class, such as the following, the generated MSIL will have a static
constructor even though you haven't specified one:
ref class StudentBase
{
    static int number;
};
Due to the compiler-generated constructors and the implicit derivation from
System::Object, the generated class looks more like this:
ref class StudentBase : System::Object
{
    static int number;
    StudentBase() : System::Object()
    {
    }
    static StudentBase()
    {
    }
};
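The rule stated earlier in this section, that a user-declared constructor suppresses the compiler-generated default constructor, is described as being the same in standard C++. As a rough native sketch of that rule (the NativeStudentBase and NativeStudent types and the id_round_trip helper are hypothetical names, and nothing here is C++/CLI), the standard type traits can confirm which class ends up default-constructible:

```cpp
#include <cassert>
#include <type_traits>

// No user-declared constructors: the compiler supplies a default constructor.
struct NativeStudentBase
{
};

// A user-declared constructor suppresses the implicit default constructor.
struct NativeStudent
{
    explicit NativeStudent(int id) : m_id(id) {}
    int m_id;
};

static_assert(std::is_default_constructible<NativeStudentBase>::value,
              "NativeStudentBase gets an implicit default constructor");
static_assert(!std::is_default_constructible<NativeStudent>::value,
              "NativeStudent has no default constructor");

// Usage: NativeStudent can only be built through its one-argument constructor.
int id_round_trip(int id)
{
    NativeStudent student(id);
    assert(student.m_id == id);
    return student.m_id;
}
```

Static constructors have no direct counterpart in this sketch, because standard C++ doesn't provide them.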
A value type can't declare a default constructor because the CLR can't
guarantee that any default constructors on value types will be called
appropriately, although members are 0-initialized automatically by the CLR. In
any case, a value type should be a simple type that exhibits value semantics,
and it shouldn't need the complexity of a default constructor, or even a
destructor, for that matter. Note that in addition to not allowing default
constructors, value types can't have user-defined destructors, copy
constructors, or copy-assignment operators.
Before you end up concluding that value types are useless, you need to think
of value types as the POD equivalents in the .NET world. Use value types just
as you'd use primitive types, such as ints and chars, and you should be OK.
When you need simple types, without the complexities of virtual functions,
constructors, and operators, value types are the more efficient option,
because they're allocated on the stack. Stack access will be faster than
accessing an object from the garbage-collected CLR heap. If you're wondering
why this is so, the stack implementation is far simpler when compared to the
CLR heap. When you consider that the CLR heap also intrinsically supports a
complex garbage-collection algorithm, it becomes obvious that the stack object
is more efficient.
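To make the POD comparison a little more concrete, here is a small standard C++ sketch of a plain struct used with value semantics. It is only an analogy: NativeMarks and make_copy_with_math are hypothetical names, and native C++ differs from the CLR in that a plain declaration such as NativeMarks m; leaves the members uninitialized, so the sketch uses {} to request zero-initialization explicitly.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical native counterpart of the Marks value class.
struct NativeMarks
{
    int Math;
    int Physics;
    int Chemistry;
};

// A POD-style type: it can be copied member by member with no user code.
static_assert(std::is_trivially_copyable<NativeMarks>::value,
              "NativeMarks is trivially copyable");
static_assert(std::is_standard_layout<NativeMarks>::value,
              "NativeMarks has standard layout");

// Creates a zeroed object on the stack, copies it, and modifies the copy.
NativeMarks make_copy_with_math(int math)
{
    NativeMarks original{};       // value-initialization: every member is 0
    NativeMarks copy = original;  // plain member-wise copy
    copy.Math = math;             // changing the copy leaves the original alone
    assert(original.Math == 0);
    return copy;
}
```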
It must be a tad confusing when I mention how value types behave differently
from reference types in certain situations. But as a developer, you
should be able to distinguish the conceptual differences between value types and
reference types, especially when you design complex class hierarchies. As we
progress through this book and see more examples, you should feel more
comfortable with these differences.
Because we've already talked about constructors, we'll discuss copy
constructors next.
1.5.3 Copy constructors
A copy constructor is one that instantiates an object by creating a copy of
another object. The C++ compiler generates a copy constructor for your native
classes if you haven't written one yourself. This isn't the case for managed
classes. Consider the following bit of code, which attempts to copy-construct
a ref object:
ref class Student
{
};
int main(array<System::String^>^ args)
{
    Student^ s1 = gcnew Student();
    Student^ s2 = gcnew Student(s1); // (1)
}
If you run that through the compiler, the line marked (1) produces compiler
error C3673 (class does not have a copy-constructor). The reason for this
error is that unlike in standard C++, the compiler won't generate a default
copy constructor for your class. At least one reason is that all ref objects
implicitly derive from System::Object, which doesn't have a copy constructor.
Even if the compiler attempted to generate a copy constructor for a ref type,
it would fail, because it wouldn't be able to access the base class copy
constructor (it doesn't exist).
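For contrast, the native behavior mentioned at the start of this section can be sketched in standard C++. The class below declares no copy constructor, yet copy-construction works because the compiler generates one implicitly. NativeStudent and copy_of are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <string>

// Hypothetical native class with no user-written copy constructor.
class NativeStudent
{
public:
    explicit NativeStudent(const std::string& name) : m_name(name) {}
    const std::string& name() const { return m_name; }

private:
    std::string m_name;
};

// Copy-constructs using the constructor the compiler generated implicitly.
NativeStudent copy_of(const NativeStudent& source)
{
    NativeStudent duplicate(source); // member-wise copy of m_name
    assert(duplicate.name() == source.name());
    return duplicate;
}
```

The equivalent line for a ref class is exactly the one that fails with C3673.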
To make that clearer, think of a native C++ class Base with a private copy
constructor, and a derived class Derived (that publicly inherits from Base).
Attempting to copy-construct a Derived object will fail because the base class
copy constructor is inaccessible. To demonstrate, let's write a class that is
derived from a base class that has a private copy constructor:
class Base
{
public:
    Base(){}
private:
    Base(const Base&);
};
class Derived : public Base
{
};
int _tmain(int argc, _TCHAR* argv[])
{
    Derived d1;
    Derived d2(d1); // error: Base's copy constructor is inaccessible
}
Because the base object's copy constructor is declared as private and
therefore is inaccessible from the derived object, this code won't compile:
the compiler is unable to copy-construct the derived object. What happens with
a ref class is similar to this code. In addition, unlike native C++ objects,
which aren't polymorphic unless you access them via a pointer or reference,
ref objects are implicitly polymorphic (because they're always accessed via
reference handles to the CLR heap). This means a compiler-generated copy
constructor may not always do what you expect it to do. When you consider that
ref types may contain member ref types, there is the question of whether a
copy constructor implements shallow copy or deep copy for those members. The
VC++ team presumably decided that there were too many open questions to have
the compiler automatically generate copy constructors for classes that don't
define them.
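The native Base and Derived case can also be checked without triggering a compile error, by asking the standard type traits whether Derived is copy-constructible. This is a sketch that assumes a C++11 or later compiler; derived_is_copyable is a hypothetical helper added for illustration.

```cpp
#include <cassert>
#include <type_traits>

class Base
{
public:
    Base() {}

private:
    Base(const Base&); // declared private and never defined
};

class Derived : public Base
{
};

// Derived can still be default-constructed...
static_assert(std::is_default_constructible<Derived>::value,
              "Derived has a usable default constructor");
// ...but it can't be copied, because Base's copy constructor is inaccessible.
static_assert(!std::is_copy_constructible<Derived>::value,
              "Derived has no usable copy constructor");

// Runtime restatement of the compile-time fact above.
bool derived_is_copyable()
{
    const bool copyable = std::is_copy_constructible<Derived>::value;
    assert(!copyable);
    return copyable;
}
```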
If you want copy-construction support for your class, you must implement it
explicitly, which fortunately isn't a difficult task. Let's add a copy
constructor to the Student class:
ref class Student
{
public:
    Student(){}
    Student(const Student^)
    {
    }
};
That wasn't all that tough, was it? Notice how you have to explicitly add a
default parameterless constructor to the class. This is because it won't be
generated by the compiler when the compiler sees that there is another
constructor present. One limitation with this copy constructor is that the
parameter has to be a Student^, which is OK except that you may have a Student
object that you want to pass to the copy constructor. If you're wondering how
that's possible, C++/CLI supports stack semantics, which we'll cover in detail
in chapter 3. Assume that you have a Student object s1 instead of a Student^,
and you need to use that to invoke a copy constructor:
Student s1;
Student^ s2 = gcnew Student(s1);
That code won't compile, because the only copy constructor takes a Student^.
There are two ways to resolve the problem. One way is to use the unary %
operator on the s1 object to get a handle to the Student object:
Student s1;
Student^ s2 = gcnew Student(%s1);
Although that compiles and solves the immediate problem, it isn't a complete
solution when you consider that every caller of your code needs to do the same
thing if they have a Student object instead of a Student^. An alternate
solution is to have two overloads for the copy constructor, as shown in
listing 1.2.
ref class Student
{
    String^ m_name;
public:
    Student(){}
    Student(String^ str):m_name(str){}
    Student(const Student^) // (1)
    {
    }
    Student(const Student%) // (2)
    {
    }
};
Student s1;
Student^ s2 = gcnew Student(s1);
Listing 1.2 Declaring two overloads for the copy constructor
This solves the issue of a caller requiring the right form of the object, but
it brings with it another problem: code duplication. You could wrap the common
code in a private method and have both overloads of the copy constructor call
this method, but then you couldn't take advantage of initialization lists.
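The trade-off is easier to see in code. The following standard C++ sketch (a native analogy only, with a pointer overload standing in for the handle overload) routes both copy overloads through a private helper. NativeStudent, copy_from, and copied_name_via_pointer are hypothetical names. Because the initialization lists are empty, m_name is default-constructed first and then assigned inside the helper.

```cpp
#include <cassert>
#include <string>

class NativeStudent
{
public:
    explicit NativeStudent(const std::string& name) : m_name(name) {}

    // Both copy overloads leave the initialization list empty, so m_name is
    // default-constructed first and then overwritten inside copy_from().
    NativeStudent(const NativeStudent& other) { copy_from(other); }

    // The caller must pass a non-null pointer.
    explicit NativeStudent(const NativeStudent* other) { copy_from(*other); }

    const std::string& name() const { return m_name; }

private:
    // Common copying code shared by both overloads.
    void copy_from(const NativeStudent& other) { m_name = other.m_name; }

    std::string m_name;
};

// Usage: copy through the pointer overload and report the copied name.
std::string copied_name_via_pointer(const NativeStudent& source)
{
    NativeStudent copy(&source);
    assert(copy.name() == source.name());
    return copy.name();
}
```

The duplication is gone, but every member now pays for a default construction followed by an assignment.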
Ultimately, it's a design choice you have to make. (1) If you only have the
copy constructor overload taking a Student^, then you need to use the unary %
operator when you have a Student object; and (2) if you only have the overload
taking a Student%, then you need to dereference a Student^ using the *
operator before using it in copy-construction. If you have both, you may end
up with code duplication; and the only way to avoid code duplication (using a
common function called by both overloads) deprives you of the ability to use
initialization lists.
My recommendation is to use the overload that takes a handle (in the previous
example, the one that takes a Student^), because this overload is visible to
other CLI languages such as C# (unlike the other overload), which is a good
thing if you ever run into language interop situations. The unary % operator
won't really slow down your code; it's just an extra character that you
need to type. I also suggest that you stay away from using two overloads, unless
it's a specific case of a library that will be exclusively used by C++ callers;
even then, you must consider the issue of code duplication.
Now you know that if you need copy construction on your ref types, you must
implement it yourself. So, it may not be surprising to see in
the next section that the same holds true for copy-assignment operators.
1.5.4 Assignment operators
The copy-assignment operator is one that the compiler generates automatically
for native classes in standard C++, but this isn't so for a ref class. The
reasons are similar to those that dictate that a copy constructor isn't
automatically generated. The following code (using the Student class defined
earlier) won't compile:
Student s1("Nish");
Student s2;
s2 = s1;
Defining an assignment operator is similar to what you do in standard C++,
except that the types are managed:
Student% operator=(const Student% s)
{
    m_name = s.m_name;
    return *this;
}
Note that the copy-assignment operator can be used only by C++ callers,
because it's invisible to other languages like C# and VB.NET. Also note that,
for handle variables, you don't need to write a copy-assignment operator,
because the handle value is copied over intrinsically.
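Assigning one handle to another is loosely comparable to assigning one native pointer to another: only the reference is copied, and both variables end up referring to the same object. The sketch below shows the native pointer case in standard C++; it is an analogy rather than C++/CLI, and NativeStudent and pointers_share_object are hypothetical names.

```cpp
#include <cassert>
#include <string>

struct NativeStudent
{
    std::string name;
};

// Assigning one pointer to another copies the address, not the object.
bool pointers_share_object()
{
    NativeStudent student{"Nish"};
    NativeStudent* first = &student;
    NativeStudent* second = nullptr;

    second = first;            // copies the pointer value only
    second->name = "Nishant";  // the change is visible through both pointers

    assert(first == second);
    return first->name == "Nishant";
}
```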
You should try to bring many of the good C++ programming practices you
followed into the CLI world, except where they aren't applicable. As an example,
the assignment operator defined for Student earlier doesn't handle
self-assignment. Although it doesn't matter in our specific example, consider
the case in listing 1.3.
ref class Grades // (1)
{
};
ref class Student
{
    String^ m_name;
    Grades^ m_grades;
public:
    Student(){}
    Student(String^ str):m_name(str){}
    Student% operator=(const Student% s)
    {
        m_name = s.m_name;
        if(m_grades)
            delete m_grades; // (2)
        m_grades = s.m_grades;
        return *this;
    }
    void SetGrades(Grades^ grades)
    {
    }
};
Listing 1.3 The self-assignment problem
In the preceding listing, (1) assume that Grades is a class with a nontrivial
constructor and destructor; thus, in the Student class assignment operator,
before the m_grades member is copied, (2) the existing Grades object is
explicitly disposed by calling delete on it, which is all very efficient.
Let's assume that a self-assignment occurs:
while(some_condition)
{
    studarr[i++] = studarr[j--];
    if(some_other_condition)
        break;
}
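In the preceding code snippet, if ever i equals j, the operator disposes the
Grades object and then copies the handle to that same, already disposed
object. You end up with a corrupted Student object with an invalid m_grades
member. Just as you would do in standard C++, you should check for
self-assignment: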
Student% operator=(const Student% s)
{
    if(%s == this)
    {
        return *this;
    }
    m_name = s.m_name;
    if(m_grades)
        delete m_grades;
    m_grades = s.m_grades;
    return *this;
}
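For readers who want the standard C++ form of this check side by side, here is a native sketch. It is not a translation of the managed listing: NativeStudent, Grades, and math_after_self_assignment are hypothetical names, and the native class owns its Grades object and deep-copies it, whereas the managed operator above copies the handle. The self-assignment guard itself follows the same pattern.

```cpp
#include <cassert>
#include <string>

// Hypothetical native Grades type holding a single mark.
struct Grades
{
    explicit Grades(int m) : math(m) {}
    int math;
};

class NativeStudent
{
public:
    NativeStudent(const std::string& name, int math)
        : m_name(name), m_grades(new Grades(math)) {}
    NativeStudent(const NativeStudent& other)
        : m_name(other.m_name), m_grades(new Grades(*other.m_grades)) {}
    ~NativeStudent() { delete m_grades; }

    NativeStudent& operator=(const NativeStudent& other)
    {
        if (&other == this) // self-assignment: nothing to do
            return *this;
        m_name = other.m_name;
        delete m_grades;                        // release the old object
        m_grades = new Grades(*other.m_grades); // deep copy of the source
        return *this;
    }

    const std::string& name() const { return m_name; }
    int math() const { return m_grades->math; }

private:
    std::string m_name;
    Grades* m_grades; // owned by this object
};

// Usage: assign an object to itself through an alias and read it back.
int math_after_self_assignment()
{
    NativeStudent student("Nish", 97);
    NativeStudent& alias = student;
    student = alias; // would read freed memory without the guard
    assert(student.name() == "Nish");
    return student.math();
}
```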
We've covered some ground in this section—and if you feel that a lot of
information has been presented too quickly, don't worry. Most of the things
we've discussed so far will come up again throughout this book; eventually, it
will all make complete sense to you. We'll now look at boxing and unboxing,
which are concepts that I feel many .NET programmers don't properly
understand—with not-so-good consequences.