Lvalues, rvalues, lvalue reference types and rvalue reference types are a fairly reliable path to insanity... or are they?
This post is an attempt to cement a few things in my mind, as well as to explain to those who are interested what on earth is going on with lvalues, rvalues and references. If, after reading this, you are none the wiser, then I strongly advise you to go and check out Scott Meyers' talk on Universal References on Channel 9. He can explain it a lot better than I can.
lvalue vs rvalue
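OK, so first, what is an lvalue and what is an rvalue? My understanding is this:
An lvalue is an expression that refers to a named object or a location in memory, so its address can be taken and it can traditionally appear on the left of an assignment. An rvalue is a value with no such identity, such as a literal or the temporary result of an expression. In this example, x is an lvalue and 10 is an rvalue: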
int x = 10;
Also, in this case, the result of a + b is an rvalue:
int x = a + b;
“So what?”, I hear you ask. Well, it turns out that move semantics make use of this terminology to provide a type that can be readily identified as one that can be moved rather than copied. An rvalue is by its nature transient and it is this transience that provides the hint that perhaps we can use this feature to assist in move semantics.
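To make the distinction concrete, here is a minimal sketch of how overloading can tell the two apart. The describe functions and checkValueCategories are hypothetical names invented for this illustration; the assertions record which overload I expect the compiler to pick.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers, named for this illustration only.
// The const lvalue reference overload is chosen for named objects.
std::string describe(const int&) { return "lvalue"; }
// The rvalue reference overload is chosen for literals and temporaries.
std::string describe(int&&) { return "rvalue"; }

// Exercises both overloads; the assertions document which one is chosen.
void checkValueCategories()
{
    int a = 1;
    int b = 2;
    int x = 10;

    assert(describe(x) == "lvalue");      // x is a named object
    assert(describe(10) == "rvalue");     // a literal
    assert(describe(a + b) == "rvalue");  // the temporary result of a + b
}
```

The rvalue overload is only ever handed values that nobody else can refer to afterwards, which is exactly the property move semantics relies on.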
Move Semantics
The best example to explain why move semantics are so important is a large collection of objects such as a vector or some other list. If we are passing vectors around by value, every time we copy an instance the copy constructor (or copy assignment operator) of that vector runs, and each element is allocated and copied into the new vector. Sometimes we are not interested in the state of the original vector, and the overhead could prove unacceptable. It would be much more efficient to hand the already allocated memory from one vector to the other, which can improve performance significantly.
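As a rough sketch of the difference, the two functions below (hypothetical names, written for this post) compare the address of the underlying buffer before and after a copy and a move of a std::vector. I am assuming the default allocator here, in which case the move constructor hands over the existing buffer rather than allocating a new one.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Returns true if copying gave the new vector its own, separate storage.
bool copyAllocatesNewBuffer()
{
    std::vector<std::string> source(1000, "element");
    std::vector<std::string> copy = source;  // copy constructor: every string is duplicated
    return copy.data() != source.data() && copy.size() == source.size();
}

// Returns true if moving handed the existing storage to the new vector.
bool moveReusesBuffer()
{
    std::vector<std::string> source(1000, "element");
    const std::string* buffer = source.data();
    std::vector<std::string> target = std::move(source);  // move constructor: no elements copied
    return target.data() == buffer && target.size() == 1000;
}
```

With the move, the cost no longer depends on the number of elements; only a few pointers change hands.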
We could do this ourselves, but C++11 now has a feature, rvalue references, that allows us to identify situations where we could do this.
The reason it is called move SEMANTICS, as I understand it, is that the moving described above can be done without rvalue references. Rvalue references are just a means of telling the two situations apart, and therefore of overloading, so that a class can provide both copy and move versions of its constructors, assignment operators and other methods.
So how about some examples?
An Example using Move Semantics
So let's put together a quick example class:
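#include <string>
#include <utility>
#include <vector>

class Example
{
public:
    Example() = default;

    // Copy constructor: duplicates every element of other.mList.
    Example(const Example& other)
        : mList(other.mList)
    {
    }

    // Move constructor: takes over the storage of other.mList instead of copying it.
    Example(Example&& other)
        : mList(std::move(other.mList))
    {
    }

private:
    std::vector<std::string> mList;
};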
The two constructors are a standard copy constructor and a move constructor that takes an RVALUE REFERENCE TYPE, Example&&. Note that ‘&&’ is a single token as far as the compiler is concerned, rather than two & tokens. We also use std::move inside the move constructor. This is needed because other is a named variable and is therefore an lvalue, even though its TYPE is RVALUE REFERENCE TO AN Example OBJECT. This is important to get your head round.
There is also nothing stopping you from writing std::move in the copy constructor, but semantically that would not make sense in a COPY constructor (and because other is const there, the vector would still be copied anyway). A MOVE constructor is made possible by the new rvalue reference type, which permits overloading and so allows move SEMANTICS.
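To see which constructor actually runs, here is a sketch of a variant of the Example class above. TracedExample, its constructor from a vector, and the mWasMoved flag are all additions made up for this illustration; they are not part of the original class.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// A variant of Example with a flag recording which constructor ran
// (the flag and the constructor from a vector are illustrative additions).
class TracedExample
{
public:
    explicit TracedExample(std::vector<std::string> list)
        : mList(std::move(list)), mWasMoved(false) {}

    TracedExample(const TracedExample& other)
        : mList(other.mList), mWasMoved(false) {}

    TracedExample(TracedExample&& other)
        : mList(std::move(other.mList)), mWasMoved(true) {}

    bool wasMoveConstructed() const { return mWasMoved; }
    std::size_t size() const { return mList.size(); }

private:
    std::vector<std::string> mList;
    bool mWasMoved;
};

void checkConstructorSelection()
{
    std::vector<std::string> items(3, "item");
    TracedExample original(items);

    TracedExample copied(original);            // lvalue argument: copy constructor
    assert(!copied.wasMoveConstructed());
    assert(original.size() == 3);              // the source is untouched

    TracedExample moved(std::move(original));  // rvalue argument: move constructor
    assert(moved.wasMoveConstructed());
    assert(moved.size() == 3);
}
```

The only difference between the two call sites is the std::move, which is what lets overload resolution pick the move constructor.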
So if named variables are ALWAYS lvalues, how do we get an rvalue? As I mentioned earlier, an rvalue is a transient value. The call to std::move RETURNS AN RVALUE REFERENCE, and the result of that call is treated as an rvalue. std::move itself moves nothing; it is only a cast. In fact, std::move is the standard way to convert an lvalue to an rvalue, and it hints at the possibility that other may change, as it can now be moved from.
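The point about a named rvalue reference being an lvalue can be checked at compile time. The sketch below uses a hypothetical function, inspect, and relies on the fact that decltype((expr)) yields T& for an lvalue expression and T&& for the result of std::move.

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical function used only to inspect types; it does nothing at run time.
void inspect(std::string&& other)
{
    // The declared type of other is an rvalue reference...
    static_assert(std::is_same<decltype(other), std::string&&>::value,
                  "declared type is std::string&&");
    // ...but the expression other has a name, so it is an lvalue.
    static_assert(std::is_lvalue_reference<decltype((other))>::value,
                  "a named rvalue reference is an lvalue expression");
    // std::move(other) turns it back into an rvalue.
    static_assert(std::is_same<decltype(std::move(other)), std::string&&>::value,
                  "std::move returns an rvalue reference");
    // A plain static_cast to std::string&& has the same type.
    static_assert(std::is_same<decltype(static_cast<std::string&&>(other)),
                               std::string&&>::value,
                  "static_cast to T&& yields the same type");
}
```

If any of these claims were wrong the file would simply fail to compile, which makes static_assert a handy way to test your understanding.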
So, to summarise, moving can obviously be done without rvalue reference types, but rvalue reference types provide a distinction that allows for standardised move semantics that are used throughout the standard library and can be used to optimize parts of code where copying would naturally take place.
Further Insanity Inducing Rvalue References
Hopefully, some of what I have said makes some form of sense, and it is clear at least that there is a new reference construct that can help provide a standard way of optimization via move semantics. Now we are going to throw templates and, more generally, type deduction into the mix. Type deduction happens in templates, with the auto keyword and in a few other places in C++. I’m going to look only at templates; hopefully that will indicate how the other features are affected by this.
Let's create a couple of template functions:
template<typename T>
void doSomething(const std::vector<T>& withThis);
template<typename T>
void doSomething(std::vector<T>&& withThis);
We now have two overloaded functions that can take advantage of scenarios that might require optimization.
We can call these as follows:
std::vector<std::string> v;
doSomething(v);
doSomething(std::move(v));
The first call uses the first template function, and the second call, as it is passing an rvalue via std::move, uses the second. Quite simple really: the std::move call tells the reader of the code that doSomething will probably also ‘do something’ to v, and that we should probably not expect to use v beyond the call to doSomething.
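Here is a sketch of what the two overloads might look like with bodies filled in. The bodies and the std::size_t return type are my own assumptions for illustration (the declarations above return void): the lvalue overload only reads its argument, while the rvalue overload takes its contents.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Illustrative body: only reads the caller's vector.
template<typename T>
std::size_t doSomething(const std::vector<T>& withThis)
{
    return withThis.size();
}

// Illustrative body: takes over the caller's storage.
template<typename T>
std::size_t doSomething(std::vector<T>&& withThis)
{
    std::vector<T> taken = std::move(withThis);
    return taken.size();
}

void checkVectorOverloads()
{
    std::vector<std::string> v(3, "item");

    assert(doSomething(v) == 3);
    assert(v.size() == 3);  // still intact after the lvalue call

    assert(doSomething(std::move(v)) == 3);
    assert(v.empty());      // std::vector's move constructor leaves the source empty
}
```

This is why the std::move at the call site matters to the reader: after the second call, v should be treated as used up.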
Now what if I had written the following template functions instead:
template<typename T>
void doSomething(const T& withThis);
template<typename T>
void doSomething(T&& withThis);
It will probably surprise you that in the code above where doSomething is called, BOTH calls will use the second template function.
If I were to call the function as follows, the first overload would be called:
const std::vector<std::string> v;
doSomething(v);
“Eh?” I hear you ask.
The important thing here is TYPE DEDUCTION. In the first pair of doSomething functions, the parameter type is a std::vector<T>& or a std::vector<T>&&: the T is deduced, BUT the std::vector is not, so the && there can only ever mean an rvalue reference to a vector.
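The overload choices described above can be checked with a small sketch. The signatures match the second pair of doSomething functions, but the label each one returns is an addition of mine so that the chosen overload is visible.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Same parameter types as above, returning a label for illustration.
template<typename T>
const char* doSomething(const T&) { return "const T&"; }

template<typename T>
const char* doSomething(T&&) { return "T&&"; }

void checkOverloadSelection()
{
    std::vector<std::string> v;
    const std::vector<std::string> cv;

    // Non-const lvalue: T&& deduces T as vector&, an exact match, so it wins.
    assert(std::string(doSomething(v)) == "T&&");
    // Rvalue: T&& deduces T as vector and binds directly.
    assert(std::string(doSomething(std::move(v))) == "T&&");
    // Const lvalue: both overloads match equally well, and the more
    // specialised const T& template is preferred.
    assert(std::string(doSomething(cv)) == "const T&");
}
```

The T&& overload is greedy: it is an exact match for almost anything, so the const T& version is only reached for const lvalues.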
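In our new pair of doSomething functions, the parameter type is built from the deduced type T alone. A template type T can stand for a const type, a non-const type, an lvalue reference, an rvalue reference, in fact almost anything. For a parameter written as T&&, the deduction rule is that an lvalue argument makes T an lvalue reference, while an rvalue argument makes T a plain non-reference type. So for our two calls, the compiler INTERNALLY works with the following function signatures:
void doSomething(std::vector<std::string>& && withThis); // T = std::vector<std::string>&
void doSomething(std::vector<std::string>&& withThis); // T = std::vector<std::string>
The first of these is illegal in written code, but internally that is how the compiler sorts out type deduction. (A ‘&& &&’ can also arise, for example if T is explicitly specified as an rvalue reference.) So, you may ask, “What the hell is the type once it makes its way to the function body???”. This is obviously an important question, as the semantics of the type used to call the function become blurred if we are not careful.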
Well, it turns out that the references collapse according to the following rules:
T& & -> T&
T&& & -> T&
T& && -> T&
T&& && -> T&&
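Although you cannot write a reference to a reference directly, you can add a reference to a type alias that is already a reference, and the same collapsing rules apply. The sketch below uses two aliases of my own naming, LRef and RRef, to check all four rules at compile time.

```cpp
#include <string>
#include <type_traits>

// Aliases introduced for this illustration only.
using LRef = std::string&;
using RRef = std::string&&;

// Adding a reference to an alias that is already a reference is allowed,
// and the result collapses according to the four rules.
static_assert(std::is_same<LRef&,  std::string&>::value,  "T& &   -> T&");
static_assert(std::is_same<RRef&,  std::string&>::value,  "T&& &  -> T&");
static_assert(std::is_same<LRef&&, std::string&>::value,  "T& &&  -> T&");
static_assert(std::is_same<RRef&&, std::string&&>::value, "T&& && -> T&&");
```

The pattern is that an lvalue reference anywhere in the mix wins; only && combined with && stays an rvalue reference.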
Still sane?
So what the compiler effectively instantiates is:
void doSomething(std::vector<std::string>& withThis);
void doSomething(std::vector<std::string>&& withThis);
This means that the body of the template function could be receiving either an lvalue reference or an rvalue reference, so it cannot assume that the caller expects move semantics to come into play. If we unconditionally moved withThis, using std::move for example, a caller that passed an lvalue would find its object emptied, and the state of the application might not be what the caller expected after the call to doSomething.
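Inside such a function, the deduced T is the only record of what the caller passed. As a minimal sketch, the hypothetical probe calledWithLvalue below reports it using std::is_lvalue_reference.

```cpp
#include <cassert>
#include <string>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical probe: T is a reference type only when the argument was an lvalue.
template<typename T>
bool calledWithLvalue(T&&)
{
    return std::is_lvalue_reference<T>::value;
}

void checkDeduction()
{
    std::vector<std::string> v;

    assert(calledWithLvalue(v));                            // T = std::vector<std::string>&
    assert(!calledWithLvalue(std::move(v)));                // T = std::vector<std::string>
    assert(!calledWithLvalue(std::vector<std::string>()));  // a temporary
}
```

A body that wants to move only when it is safe to do so therefore has to consult T rather than the parameter itself, which is always an lvalue once it has a name.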
Fortunately, the standard library comes to our aid with std::forward, which allows the object to be forwarded to a non-deduced function with its original value category intact. I think I will leave it there for now and perhaps come back to std::forward some other time. There is quite a good explanation in this thread of how std::forward works.
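For a quick taste in the meantime, here is a small sketch of std::forward in use. The process overloads are hypothetical stand-ins for whatever non-deduced function you want to hand the object to, and the labels they return are only there to show which one was reached.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical non-deduced targets, overloaded on value category.
const char* process(const std::vector<std::string>&) { return "copy path"; }
const char* process(std::vector<std::string>&&)      { return "move path"; }

// std::forward<T> yields an rvalue only if T says the caller passed one.
template<typename T>
const char* doSomething(T&& withThis)
{
    return process(std::forward<T>(withThis));
}

void checkForwarding()
{
    std::vector<std::string> v;

    assert(std::string(doSomething(v)) == "copy path");             // lvalue stays an lvalue
    assert(std::string(doSomething(std::move(v))) == "move path");  // rvalue stays an rvalue
}
```

Compared with an unconditional std::move, this keeps the promise made at the call site: only a caller who passed an rvalue ends up on the move path.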
If you are still none the wiser, watch what Scott Meyers has to say about it; he’s a bit more of an expert than me.