In this article, we will discuss move semantics in C++ and try to work out what exactly a move is.
Problem Statement
Copying is not the optimal solution in all situations. Some situations call for moving instead, because copying duplicates resources and can be an expensive operation.
Another problem arises from temporaries. These temporaries may take up memory and slow down the execution of a C++ program.
Solution
The solution comes in the form of move semantics. We will gradually discover what this is.
rvalue Reference
The rvalue reference is a new addition in C++11. We will see what its purpose is and why it was introduced.
The original definition of lvalues and rvalues, inherited from C, is as follows: an lvalue is an expression that may appear on the left or on the right hand side of an assignment, whereas an rvalue is an expression that can only appear on the right hand side of an assignment.
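int a = 42;
int b = 43;
a = b;            // OK: a and b are both lvalues
b = a;            // OK
a = a * b;        // OK: the rvalue a * b is on the right hand side
int c = a * b;    // OK: the rvalue a * b is on the right hand side
// a * b = 42;    // error: the rvalue a * b is on the left hand side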
C++, with its user-defined types, has introduced some subtleties regarding modifiability and assignability that make this definition incorrect. So now we say that an lvalue is an expression that refers to a memory location and lets us take the address of that location. What is an rvalue? Simple: whatever is not an lvalue.
Now let us formally define these terms and properties a little better.
First, what is an expression?
- An expression is a sequence of operators and operands that specifies a computation. It tells the computer what to do.
- An expression can produce a value; for example, 1 + 2 has the value 3.
- An expression can have side effects too, as a function call can.
- An expression can be simple or complex.
- Each expression has a type and a value category, i.e., whether the expression is an lvalue, an rvalue, etc.
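One way to see the value category of an expression for ourselves is decltype with a doubly parenthesized operand: it gives T& for an lvalue, T&& for an xvalue (a category described further down), and plain T for a prvalue. The following is only a small sketch of that idea. The helper names is_lvalue, is_xvalue, is_prvalue, make_name and categories_as_expected are made up for this illustration and are not part of the standard library.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical helpers for illustration; they inspect the type that
// decltype((expr)) produces for an expression.
template <class T> constexpr bool is_lvalue()  { return std::is_lvalue_reference<T>::value; }
template <class T> constexpr bool is_xvalue()  { return std::is_rvalue_reference<T>::value; }
template <class T> constexpr bool is_prvalue() { return !std::is_reference<T>::value; }

// Hypothetical function that returns by value.
std::string make_name() { return "temp"; }

bool categories_as_expected() {
    int i = 0;
    std::string s = "abc";
    (void)i;  // i and s are only used inside decltype, which does not evaluate them
    (void)s;
    return is_lvalue<decltype((i))>()              // name of a variable
        && is_lvalue<decltype((++i))>()            // built-in pre-increment
        && is_lvalue<decltype(("hi"))>()           // string literal
        && is_prvalue<decltype((i++))>()           // built-in post-increment
        && is_prvalue<decltype((42))>()            // literal
        && is_prvalue<decltype((make_name()))>()   // call returning by value
        && is_xvalue<decltype((std::move(s)))>();  // call returning T&&
}
```

Calling categories_as_expected() returns true when every expression falls into the category named in its comment.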
lvalue
An lvalue is an expression that identifies a non-temporary object or a non-member function.
- The address of an lvalue can be taken (bit-fields are an exception, see the last point).
- A modifiable (non-const) lvalue can be used on the left side of "=".
- An lvalue can be used to initialize an lvalue reference.
- Where the expression permits it, an lvalue can have an incomplete type.
- An expression that designates a bit-field (e.g. s.x where s is an object of type struct S { int x:3; };) is an lvalue expression (or an xvalue if s is one): it may be used on the left hand side of the assignment operator, but its address cannot be taken and a non-const lvalue reference cannot be bound to it. A const lvalue reference can be initialized from a bit-field lvalue, but a temporary copy of the bit-field will be made: the reference won't bind to the bit-field directly.
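The bit-field rule is easier to believe once we see it. Here is a minimal sketch using the struct S from the text; the function name bitfield_demo is made up for this example, and the two commented-out lines are the ones a compiler is expected to reject.

```cpp
#include <cassert>

struct S { int x : 3; };   // the bit-field from the text

// Hypothetical demo function, for illustration only.
int bitfield_demo() {
    S s{};
    s.x = 1;               // OK: a bit-field lvalue can be assigned to
    // int* p = &s.x;      // error: the address of a bit-field cannot be taken
    // int& r = s.x;       // error: a non-const lvalue reference cannot bind to it
    const int& cr = s.x;   // OK, but cr refers to a temporary copy holding 1
    s.x = 2;               // changes the bit-field, not the copy
    return cr;             // the copy still holds 1
}
```

bitfield_demo() returns 1, which shows that the const reference is bound to a copy and not to the bit-field itself.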
Examples
- The name of a variable or function in scope, regardless of type, such as std::cin or std::endl. Even if the variable's type is rvalue reference, the expression consisting of its name is an lvalue expression.
- A function call or overloaded operator expression if the function's or overloaded operator's return type is an lvalue reference, such as std::getline(std::cin, str) or std::cout << 1 or str1 = str2 or ++iter.
- Built-in pre-increment and pre-decrement, dereference, assignment and compound assignment, subscript (except on an array xvalue), member access (except for non-static non-reference members of xvalues, member enumerators, and non-static member functions), member access through pointer to data member if the left-hand operand is an lvalue, comma operator if the right-hand operand is an lvalue, ternary conditional if the second and third operands are lvalues.
- A cast expression to lvalue reference type.
- A string literal.
- A function call expression if the function's return type is rvalue reference to function type.
- A cast expression to rvalue reference to function type.
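The first example in the list surprises many people: a variable whose type is rvalue reference is still an lvalue when we use its name. A small sketch makes this visible. The overload pair which and the two demo functions are hypothetical names invented for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical overload pair that only reports which overload was chosen.
std::string which(int&)  { return "lvalue"; }
std::string which(int&&) { return "rvalue"; }

std::string named_rvalue_reference() {
    int&& r = 42;                // the type of r is "rvalue reference to int"...
    return which(r);             // ...but the expression r is an lvalue
}

std::string after_move() {
    int&& r = 42;
    return which(std::move(r));  // std::move(r) is an xvalue, so it is an rvalue
}
```

named_rvalue_reference() returns "lvalue" and after_move() returns "rvalue". This is why a function that receives a T&& parameter has to call std::move on it again if it wants to pass it on as an rvalue.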
prvalue
A pure rvalue (prvalue) is an expression that identifies a temporary object (or a subobject thereof) or is a value not associated with any object.
- A prvalue is an rvalue.
- A prvalue cannot be polymorphic: the dynamic type of the object it identifies is always the type of the expression.
- A non-class non-array prvalue cannot be const-qualified.
- A prvalue cannot have incomplete type (except for type void, see below).
- The expressions obj.func and ptr->func, where func is a non-static member function, and the expressions obj.*mfp and ptr->*mfp where mfp is a pointer to member function, are classified as prvalue expressions, but they cannot be used to initialize references, as function arguments, or for any purpose at all, except as the left-hand argument of a function call expression, e.g. (pobj->*ptr)(args).
- Function call expressions returning void, cast expressions to void, and throw-expressions are classified as prvalue expressions, but they cannot be used to initialize references or as function arguments. They can be used in some contexts (e.g. on a line of their own, as the left argument of the comma operator, etc.) and in the return statement in a function returning void.
Examples
- A literal (except a string literal), such as 42 or true or nullptr.
- A function call or overloaded operator expression if the function's or the overloaded operator's return type is not a reference, such as str.substr(1, 2) or str1 + str2.
- Built-in post-increment and post-decrement, arithmetic and logical operators, comparison operators, address-of operator, member access for a member enumerator, a non-static member function, or a non-static non-reference data member of an rvalue, member access through pointer to a data member of an rvalue or to a non-static member function, comma operator where the right-hand operand is an rvalue, ternary conditional where either the second or the third operand isn't an lvalue.
- A cast expression to any type other than reference type.
- Lambda expressions, such as [](int x){return x*x;}
xvalue
An xvalue is an expression that identifies an "eXpiring" object, that is, an object that may be moved from. The object identified by an xvalue expression may be a nameless temporary, it may be a named object in scope, or any other kind of object, but if used as a function argument, an xvalue will always bind to the rvalue reference overload if available.
- An xvalue is both an rvalue and a glvalue.
- Like prvalues, xvalues bind to rvalue references.
- Unlike prvalues, an xvalue may be polymorphic, and a non-class xvalue may be cv-qualified.
Examples
- A function call or overloaded operator expression if the function's or the overloaded operator's return type is an rvalue reference to object type, such as std::move(val).
- A cast expression to an rvalue reference to object type, such as static_cast<T&&>(val) or (T&&)val.
- A non-static class member access expression, in which the object expression is an xvalue.
- A pointer-to-member expression in which the first operand is an xvalue and the second operand is a pointer to data member.
glvalue
A glvalue ("generalized" lvalue) is an expression that is either an lvalue or an xvalue.
- Its properties are mostly those that applied to pre-C++11 lvalues.
- A glvalue may be implicitly converted to a prvalue with the lvalue-to-rvalue, array-to-pointer, or function-to-pointer implicit conversion.
- A glvalue may be polymorphic: the dynamic type of the object it identifies is not necessarily the static type of the expression.
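The difference between "may be polymorphic" and "cannot be polymorphic" can be shown with a virtual function. The sketch below is an illustration under our own assumptions: the classes Base and Derived and the three via_ functions are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <string>
#include <utility>

// Hypothetical class hierarchy for illustration.
struct Base {
    virtual ~Base() = default;
    virtual std::string name() const { return "Base"; }
};
struct Derived : Base {
    std::string name() const override { return "Derived"; }
};

std::string via_lvalue() {
    Derived d;
    Base& b = d;                  // lvalue with static type Base
    return b.name();              // dynamic type is Derived
}

std::string via_xvalue() {
    Derived d;
    Base& b = d;
    return std::move(b).name();   // xvalue with static type Base, still dispatches to Derived
}

std::string via_prvalue() {
    Derived d;
    return Base(d).name();        // prvalue: a new Base object copied from d
}
```

via_lvalue() and via_xvalue() both return "Derived", because both expressions are glvalues that refer to the original object. via_prvalue() returns "Base", because the prvalue is a fresh temporary whose dynamic type is exactly the type of the expression.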
rvalue
An rvalue is an expression that is either a prvalue or an xvalue.
- It has the properties that apply to both xvalues and prvalues, which means they apply to the pre-C++11 rvalues as well.
- The address of an rvalue may not be taken: &int(), &i++, &42, and &std::move(val) are invalid.
- An rvalue may be used to initialize a const lvalue reference, in which case the lifetime of the object identified by the rvalue is extended until the scope of the reference ends.
- An rvalue may be used to initialize an rvalue reference, in which case the lifetime of the object identified by the rvalue is extended until the scope of the reference ends.
- When used as a function argument and when two overloads of the function are available, one taking an rvalue reference parameter and the other taking an lvalue reference to const parameter, rvalues bind to the rvalue reference overload (thus, if both copy and move constructors are available, rvalue arguments invoke the move constructor, and likewise with copy and move assignment operators).
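Two of these properties, lifetime extension and the choice between the copy and the move constructor, can be tried out with a few lines. This is a sketch only; the type Tracker and the function names are hypothetical and were invented for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// Hypothetical type that records how each object was constructed.
struct Tracker {
    std::string how;
    Tracker() : how("default") {}
    Tracker(const Tracker&) : how("copy") {}
    Tracker(Tracker&&) noexcept : how("move") {}
};

std::string from_lvalue() {
    Tracker a;
    Tracker b(a);              // a is an lvalue: the copy constructor is chosen
    return b.how;
}

std::string from_rvalue() {
    Tracker a;
    Tracker b(std::move(a));   // std::move(a) is an rvalue: the move constructor is chosen
    return b.how;
}

std::size_t extended_lifetime() {
    const std::string& cr = std::string("temporary");  // lifetime extended to the scope of cr
    std::string&& rr = std::string("another");         // likewise for an rvalue reference
    rr += "!";                                         // rr is not const, so we may modify it
    return cr.size() + rr.size();                      // both temporaries are still alive here
}
```

from_lvalue() returns "copy" and from_rvalue() returns "move". extended_lifetime() returns 17, the combined length of "temporary" and "another!", which is only safe to compute because both temporaries live as long as the references bound to them.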
Moving
Let's say we have a 3D model class. The model class holds textures, which are image files, vertex points that can number in the thousands, and color information for each vertex. Something like this:
class Vertex {
public:
    void addVertex ( ) {
    }
    ~Vertex ( ) {
    }
};
class Texture {
public:
    void load ( ) {
    }
    ~Texture ( ) {
    }
};
class Model3D {
private:
    Vertex* _ver = nullptr;    // null until initialize ( ) runs, so the destructor is always safe
    Texture* _tex = nullptr;
public:
    void initialize ( ) {
        _ver = new Vertex;
        _tex = new Texture;
        for ( int i = 0; i<10000; ++i ) {
            _ver->addVertex ( );
        }
        for ( int i = 0; i<500; ++i ) {
            _tex->load ( );
        }
    }
    ~Model3D ( ) {
        delete _ver;
        delete _tex;
    }
};
Model3D retGraphics ( ) {
    Model3D g;
    return g;
}
Model3D g1 = retGraphics ( );
Here, as you can see, the Model3D class does some heavy-duty vertex and texture loading. Now take a look at the statement Model3D g1 = retGraphics ( );. Conceptually, this statement can be converted to the following pseudo code.
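Model3D tempG;                 // hidden temporary
Model3D retGraphics ( ) {
    Model3D g;
    tempG = g;                 // copy g into the temporary
    g.~Model3D ( );            // destroy the local object
}
Model3D g1 = tempG;            // copy the temporary into g1
tempG.~Model3D ( );            // destroy the temporary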
As you can see, there is a temporary involved. That means the vertex and texture loading and destruction happen twice, which is time-consuming and unnecessary. The careful programmer is left to write code that swaps the resources rather than letting them be destroyed and rebuilt. This is tedious work that everyone has to repeat, and it increases the size of the source. Wouldn't it be good if the language supported this directly and took the burden off the programmer? C++11 does exactly that with move semantics. It allows something like:
Model3D& Model3D::operator = ( <move type> rhs ) {
    // take over the resources of rhs instead of copying them
    return *this;
}
For this purpose C++ lets us add an overload that takes the move type, a special type that tells the compiler to move the resources rather than delete and reconstruct them. The move type has to satisfy the following requirements:
- The move type must be a reference.
- When there is a choice between two overloads where one takes an ordinary reference and the other takes the move type, rvalues must prefer the move type and lvalues must prefer the ordinary reference.
So what exactly is this move type? It is the rvalue reference, i.e., Model3D&&. The ordinary reference, Model3D&, is now called the lvalue reference. So what are the properties of the rvalue reference?
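- During function overload resolution, an lvalue prefers the lvalue reference overload and an rvalue prefers the rvalue reference overload.
void f ( Model3D& m );     // lvalue reference overload
void f ( Model3D&& m );    // rvalue reference overload
f ( g1 );                  // g1 is an lvalue: calls f ( Model3D& )
f ( retGraphics ( ) );     // the returned temporary is an rvalue: calls f ( Model3D&& )
- Any function can be overloaded on rvalue references, but in practice this is mostly done for the copy constructor and the assignment operator.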
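To make this concrete, here is one possible way to give Model3D a move constructor and a move assignment operator. Please read it as a sketch under our own assumptions and not as the only way to do it: Vertex and Texture are reduced to simple counters, copying is disabled to keep the example short, and the helpers loaded and vertices are hypothetical additions that only exist so we can observe the result.

```cpp
#include <cassert>
#include <utility>

// Simplified stand-ins for the classes in the text (assumption: they only count calls).
struct Vertex  { int count = 0; void addVertex() { ++count; } };
struct Texture { int count = 0; void load()      { ++count; } };

class Model3D {
    Vertex*  _ver = nullptr;
    Texture* _tex = nullptr;
public:
    Model3D() = default;
    void initialize() {
        _ver = new Vertex;
        _tex = new Texture;
        for (int i = 0; i < 10000; ++i) _ver->addVertex();
        for (int i = 0; i < 500; ++i)   _tex->load();
    }
    // Move constructor: take over the pointers and leave rhs empty.
    Model3D(Model3D&& rhs) noexcept : _ver(rhs._ver), _tex(rhs._tex) {
        rhs._ver = nullptr;
        rhs._tex = nullptr;
    }
    // Move assignment: swap the resources; the destructor of rhs frees our old ones.
    Model3D& operator=(Model3D&& rhs) noexcept {
        std::swap(_ver, rhs._ver);
        std::swap(_tex, rhs._tex);
        return *this;
    }
    // Copying is disabled in this sketch to keep it short.
    Model3D(const Model3D&) = delete;
    Model3D& operator=(const Model3D&) = delete;
    ~Model3D() { delete _ver; delete _tex; }

    // Hypothetical helpers, only here to observe what happened.
    bool loaded() const { return _ver != nullptr && _tex != nullptr; }
    const Vertex* vertices() const { return _ver; }
};

Model3D retGraphics() {
    Model3D g;
    g.initialize();
    return g;   // moved or elided, never copied
}
```

With this in place, Model3D g1 = retGraphics(); loads the vertices and textures only once. After Model3D g2; g2 = std::move(g1); the object g2 owns the very same Vertex that g1 loaded, and g1 is left empty but still safe to destroy.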
So what happens if you implement the rvalue reference overload and forget the lvalue reference overload? Well, try it yourself. We will cover it later.
For more on move semantics and rvalue references, please refer to the blog.
Bibliography