Commit c80f4dfa authored by GILLES Sebastien

Add RVO / copy elision in move semantics notebook.

parent 9e4c8cce
%% Cell type:markdown id: tags:
# [Getting started in C++](/) - [Useful concepts and STL](/notebooks/5-UsefulConceptsAndSTL/0-main.ipynb) - [Move semantics](/notebooks/5-UsefulConceptsAndSTL/5-MoveSemantics.ipynb)
%% Cell type:markdown id: tags:
<h1>Table of contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Motivation:-eliminate-unnecessary-deep-copies" data-toc-modified-id="Motivation:-eliminate-unnecessary-deep-copies-1">Motivation: eliminate unnecessary deep copies</a></span></li><li><span><a href="#A-traditional-answer:-to-allow-the-exchange-of-internal-data" data-toc-modified-id="A-traditional-answer:-to-allow-the-exchange-of-internal-data-2">A traditional answer: to allow the exchange of internal data</a></span></li><li><span><a href="#Reminder-on-references-in-C++03" data-toc-modified-id="Reminder-on-references-in-C++03-3">Reminder on references in C++03</a></span></li><li><span><a href="#C++11/14-:-temporary-references" data-toc-modified-id="C++11/14-:-temporary-references-4">C++11/14 : temporary references</a></span></li><li><span><a href="#Function-with-r-value-arguments" data-toc-modified-id="Function-with-r-value-arguments-5">Function with r-value arguments</a></span></li><li><span><a href="#std::move" data-toc-modified-id="std::move-6"><code>std::move</code></a></span></li><li><span><a href="#Return-value-optimization-(RVO)-and-copy-elision" data-toc-modified-id="Return-value-optimization-(RVO)-and-copy-elision-7">Return value optimization (RVO) and copy elision</a></span></li><li><span><a href="#Move-constructors" data-toc-modified-id="Move-constructors-8">Move constructors</a></span></li><li><span><a href="#Temporary-reference-argument-within-a-function" data-toc-modified-id="Temporary-reference-argument-within-a-function-9">Temporary reference argument within a function</a></span></li><li><span><a href="#Move-semantics-in-the-STL" data-toc-modified-id="Move-semantics-in-the-STL-10">Move semantics in the STL</a></span></li><li><span><a href="#Forwarding-reference-(or-universal-reference)" data-toc-modified-id="Forwarding-reference-(or-universal-reference)-11">Forwarding reference (or universal reference)</a></span></li></ul></div>
%% Cell type:markdown id: tags:
## Motivation: eliminate unnecessary deep copies
In many situations, unnecessary deep copies are made.
In the example below, swapping the two instances of the `Text` class costs 3 memory deallocations, 3 allocations and 3 character copy loops, where 3 pointer copies would be sufficient.
%% Cell type:code id: tags:
``` C++17
#include <algorithm> // for std::copy
#include <cstring>
#include <iostream>
class Text
{
public :
    // For next section - don't bother yet
    friend void swap(Text& lhs, Text& rhs);
    Text(const char* string);
    // Copy constructor.
    Text(const Text& t);
    // Copy assignment operator; defined here due to an issue of Xeus-cling with operators
    Text& operator=(const Text& t)
    {
        std::cout << "Operator= called" << std::endl;
        if (this == &t)
            return *this ; // standard idiom to deal with self-assignment
        delete [] data_;
        size_ = t.size_ ;
        data_ = new char[t.size_] ;
        std::copy(t.data_, t.data_ + size_, data_);
        return *this ;
    }
    ~Text();
    // Overload of operator<<, defined here due to an issue of Xeus-cling with operators.
    friend std::ostream & operator<<(std::ostream& stream, const Text& t)
    {
        return stream << t.data_ ;
    }
private :
    unsigned int size_{0};
    char* data_ = nullptr;
} ;
```
%% Cell type:code id: tags:
``` C++17
Text::Text(const char* string)
{
    std::cout << "Constructor called" << std::endl;
    size_ = std::strlen(string) + 1;
    data_ = new char[size_] ;
    std::copy(string, string + size_, data_);
}
```
%% Cell type:code id: tags:
``` C++17
Text::Text(const Text& t)
: size_(t.size_), data_(new char [t.size_])
{
    std::cout << "Copy constructor called" << std::endl;
    std::copy(t.data_, t.data_ + size_, data_);
}
```
%% Cell type:code id: tags:
``` C++17
Text::~Text()
{
    std::cout << "Destructor called" << std::endl;
    delete[] data_;
}
```
%% Cell type:code id: tags:
``` C++17
{
    Text t1("world!") ;
    Text t2("Hello") ;
    // Swap of values:
    Text tmp = t1 ;
    t1 = t2 ;
    t2 = tmp ;
    std::cout << t1 << " " << t2 << std::endl;
}
```
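%% Cell type:markdown id: tags:
To make this count concrete, here is a minimal sketch that is not part of the original notebook. The class `CountingBuffer`, the global counter `g_deep_copies` and the function `CountDeepCopiesInNaiveSwap` are hypothetical names introduced for illustration only; the class mimics `Text` but increments a counter each time a deep copy (allocation plus copy loop) is performed.

```cpp
#include <algorithm>
#include <cstddef>

// Hypothetical global counter: number of deep copies performed so far.
int g_deep_copies = 0;

// Hypothetical stand-in for `Text`: owns a heap buffer and counts deep copies.
struct CountingBuffer
{
    explicit CountingBuffer(std::size_t size)
    : size_(size), data_(new char[size]())
    { }

    CountingBuffer(const CountingBuffer& rhs)
    : size_(rhs.size_), data_(new char[rhs.size_])
    {
        std::copy(rhs.data_, rhs.data_ + size_, data_);
        ++g_deep_copies;
    }

    CountingBuffer& operator=(const CountingBuffer& rhs)
    {
        if (this != &rhs)
        {
            char* fresh = new char[rhs.size_];
            std::copy(rhs.data_, rhs.data_ + rhs.size_, fresh);
            delete[] data_;
            data_ = fresh;
            size_ = rhs.size_;
            ++g_deep_copies;
        }
        return *this;
    }

    ~CountingBuffer() { delete[] data_; }

    std::size_t size_;
    char* data_;
};

// Returns how many deep copies the naive three-step swap performs.
int CountDeepCopiesInNaiveSwap()
{
    CountingBuffer a(10), b(20);
    const int before = g_deep_copies;
    CountingBuffer tmp = a; // copy construction
    a = b;                  // copy assignment
    b = tmp;                // copy assignment
    return g_deep_copies - before;
}
```

Under these assumptions, `CountDeepCopiesInNaiveSwap()` returns 3: one deep copy per line of the swap, even though no new data is ever created.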
%% Cell type:markdown id: tags:
## A traditional answer: to allow the exchange of internal data
By allowing two texts to exchange (swap) their internal data, we can rewrite our program in a much more economical way in terms of execution time:
%% Cell type:code id: tags:
``` C++17
void swap(Text& lhs, Text& rhs)
{
    unsigned int tmp_size = lhs.size_;
    char* tmp_data = lhs.data_;
    lhs.size_ = rhs.size_;
    lhs.data_ = rhs.data_;
    rhs.size_ = tmp_size;
    rhs.data_ = tmp_data;
}
```
%% Cell type:code id: tags:
``` C++17
{
    Text t1("world!") ;
    Text t2("Hello") ;
    // Swap of values:
    swap(t1, t2);
    std::cout << t1 << " " << t2 << std::endl;
}
```
%% Cell type:markdown id: tags:
There is even a `std::swap` in the STL that may be overloaded for your own types.
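As a hedged illustration (the `demo` namespace, the `Handle` type and the `GenericExchange` function are hypothetical names, not part of this notebook), a common way to plug in your own version is to provide a free `swap` function in the namespace of your type, and to call `swap` unqualified after a `using std::swap;` so that the custom version is picked when it exists and `std::swap` is used otherwise:

```cpp
#include <utility>

namespace demo // hypothetical namespace
{
    struct Handle
    {
        int id = 0;
        bool swapped_by_custom_overload = false;
    };

    // Custom swap, found through argument-dependent lookup.
    void swap(Handle& lhs, Handle& rhs) noexcept
    {
        std::swap(lhs.id, rhs.id);
        lhs.swapped_by_custom_overload = true;
        rhs.swapped_by_custom_overload = true;
    }
}

// Generic code: works for types with or without a custom swap.
template<class T>
void GenericExchange(T& a, T& b)
{
    using std::swap; // fallback if no custom swap exists for T
    swap(a, b);      // unqualified call: the custom swap is preferred if any
}
```

With this sketch, `GenericExchange` applied to two `demo::Handle` objects goes through `demo::swap`, whereas applied to two `int` it falls back to `std::swap`.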
Now let's see how C++11 introduces new concepts to solve this (and many other) problems in a more elegant way.
## Reminder on references in C++03
C++ references allow you to attach a new name to an existing object on the stack or on the heap. All accesses and modifications made through the reference affect the original object:
%% Cell type:code id: tags:
``` C++17
#include <iostream>
{
    int var = 42;
    int& ref = var; // Create a reference to var
    ref = 99;
    std::cout << "And now var is also 99: " << var << std::endl;
}
```
%% Cell type:markdown id: tags:
A reference can only be attached to a stable value (left value or **l-value**), which may broadly be summarized as a value whose address may be taken (see \cite{Meyers2015}, which is especially worth reading on this topic, as it is not always explained properly elsewhere, especially on the Web).
By contrast, an **r-value** is a temporary value such as a literal expression or a temporary object created by implicit conversion.
%% Cell type:code id: tags:
``` C++17
{
    int& i = 42 ; // Compilation error: 42 is an r-value!
}
```
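%% Cell type:markdown id: tags:
The distinction can also be checked at compile time. The sketch below is an addition to the notebook and relies on a standard property of `decltype`: with double parentheses, `decltype((expression))` yields a `T&` type when the expression is an l-value, and a plain non-reference type `T` when it is a pure temporary. The variable name `answer` is hypothetical.

```cpp
#include <type_traits>

int answer = 42; // a named object: using its name gives an l-value

// A named variable is an l-value: decltype((answer)) is int&.
static_assert(std::is_lvalue_reference<decltype((answer))>::value,
              "a named variable is an l-value");

// A literal is an r-value: decltype((42)) is plain int.
static_assert(!std::is_reference<decltype((42))>::value,
              "a literal is an r-value");

// The result of an arithmetic expression is a temporary, hence an r-value.
static_assert(!std::is_reference<decltype((answer + 1))>::value,
              "the result of answer + 1 is an r-value");
```

If one of these assertions were wrong, compilation would fail with the associated message, which makes such checks handy when in doubt about the category of an expression.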
%% Cell type:code id: tags:
``` C++17
#include <iostream>
#include <string>
void print(std::string& lvalue)
{
    std::cout << "l-value is " << lvalue << std::endl;
}
```
%% Cell type:code id: tags:
``` C++17
{
    print("hello") ; // Compilation error: the std::string converted from "hello" is an r-value!
}
```
%% Cell type:markdown id: tags:
Look carefully at the error message: the issue is not between `const char[6]` and `std::string` (an implicit conversion from `const char*` to `std::string` exists) but with the reference; the same function with pass-by-copy works seamlessly:
%% Cell type:code id: tags:
``` C++17
#include <iostream>
#include <string>
void print_by_copy(std::string value) // no reference here!
{
    std::cout << "l- or r- value is " << value << std::endl;
}
```
%% Cell type:code id: tags:
``` C++17
{
    print_by_copy("hello") ; // Ok!
}
```
%% Cell type:markdown id: tags:
Noteworthy exception: a "constant reference" (a convenient misnomer for a reference to a constant value) can be attached to a temporary value, in particular to facilitate implicit conversions:
%% Cell type:code id: tags:
``` C++17
void print_by_const_ref(const std::string& value)
{
    std::cout << "l- or r-value is " << value << std::endl;
}
```
%% Cell type:code id: tags:
``` C++17
{
    print_by_const_ref("hello") ; // Ok!
}
```
%% Cell type:markdown id: tags:
## C++11/14 : temporary references
To go further, C++11 introduces the concept of **r-value reference**, which can only refer to temporary values, and is declared using an `&&`.
%% Cell type:code id: tags:
``` C++17
{
    int&& i = 42;
}
```
%% Cell type:code id: tags:
``` C++17
{
    int j = 42;
    int&& k = j; // Won't compile: j is an l-value!
}
```
%% Cell type:markdown id: tags:
It is now possible to overload a function to differentiate the treatment to be performed according to whether it is provided with a stable value or a temporary value. Below, function `f` is provided in three variants:
````
void f(T&); // I : argument must be a l-value
void f(const T&) ; // II : argument may be l-value or r-value but can't be modified
void f(T&&); // III : argument must be a r-value
````
%% Cell type:markdown id: tags:
When `f` is called with a temporary value, it is now form III that will be invoked, if it is defined. This is the cornerstone of the notion of **move semantics**.
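The selection among the three forms can be observed with a small sketch, which is not part of the original notebook: `Widget` is a hypothetical empty type standing in for `T`, and each overload returns the label of the form that was chosen instead of doing any real work.

```cpp
#include <string>

struct Widget { }; // hypothetical type standing in for `T`

std::string f(Widget&)       { return "I"; }   // l-value only
std::string f(const Widget&) { return "II"; }  // l-value or r-value, read-only
std::string f(Widget&&)      { return "III"; } // r-value only
```

With these definitions, calling `f` on a non-constant variable selects form I, calling it on a constant variable selects form II, and calling it on a temporary such as `Widget{}` selects form III. If form III were removed, the call with a temporary would fall back on form II.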
%% Cell type:markdown id: tags:
## Function with r-value arguments
When we know that a value is temporary, we are free to reuse it, or "loot" its content, without harmful consequences: we may _move_ it instead of _copying_ it. When handling large dynamic data structures, this can save many costly operations.
Let's take a function that receives a vector of integers and copies it in order to modify the copy. The old way would be as follows:
%% Cell type:code id: tags:
``` C++17
#include <iostream>
#include <vector>
void print_double(const std::vector<int>& vec)
{
    std::cout << "print_double for l-value" << std::endl;
    std::vector<int> copy(vec);
    for (auto& item : copy)
        item *= 2;
    for (auto item : copy)
        std::cout << item << "\t";
    std::cout << std::endl;
}
```
%% Cell type:code id: tags:
``` C++17
{
    std::vector<int> primes { 2, 3, 5, 7, 11, 13, 17, 19 };
    print_double(primes);
}
```
%% Cell type:markdown id: tags:
If the original object is temporary, copying it is not necessary. This can be exploited through this overload of the function:
%% Cell type:code id: tags:
``` C++17
#include <iostream>
#include <vector>
void print_double(std::vector<int>&& vec)
{
    std::cout << "print_double for r-value" << std::endl;
    for (auto& item : vec)
        item *= 2;
    for (auto item : vec)
        std::cout << item << "\t";
    std::cout << std::endl;
}
```
%% Cell type:code id: tags:
``` C++17
{
    print_double(std::vector<int>{ 2, 3, 5, 7, 11, 13, 17, 19 });
}
```
%% Cell type:markdown id: tags:
## `std::move`
Now, if we get an l-value and know we do not need it anymore in the current scope, we may choose to cast it as an r-value through a **static_cast**:
%% Cell type:code id: tags:
``` C++17
{
    std::vector<int> primes { 2, 3, 5, 7, 11, 13, 17, 19 };
    print_double(static_cast<std::vector<int>&&>(primes));
}
```
%% Cell type:markdown id: tags:
And we see that the overload called is indeed the one for r-values.
The syntax is a bit heavy to type, so a shorter one was introduced as well: **`std::move`**:
%% Cell type:code id: tags:
``` C++17
{
    std::vector<int> primes { 2, 3, 5, 7, 11, 13, 17, 19 };
    print_double(std::move(primes)); // strictly equivalent to the static_cast in the previous cell!
}
```
%% Cell type:markdown id: tags:
Please notice that the call to `std::move` does not move `primes` per se. It only makes it a temporary value in the eyes of the compiler, so it is a "possibly" movable object if the context allows it; if for instance the object doesn't define a move constructor (see the section on move constructors below), no move will occur!
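Both points can be checked with a short sketch, which is an addition to the notebook. `CopyOnly` is a hypothetical class: because it declares a copy constructor, the compiler does not generate a move constructor for it, so an r-value of this type can only bind to the `const CopyOnly&` parameter of the copy constructor. The function names are hypothetical as well.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical class: the user-declared copy constructor suppresses the
// implicit move constructor.
struct CopyOnly
{
    CopyOnly() = default;
    CopyOnly(const CopyOnly& rhs)
    : values(rhs.values), n_copies(rhs.n_copies + 1)
    { }

    std::vector<int> values { 1, 2, 3 };
    int n_copies = 0;
};

// std::move alone is only a cast: binding its result to a reference
// leaves the original vector untouched.
bool MoveAloneLeavesSourceIntact()
{
    std::vector<int> primes { 2, 3, 5 };
    std::vector<int>&& ref = std::move(primes); // nothing is moved here
    return primes.size() == 3 && &ref == &primes;
}

// Without a move constructor, "moving" falls back on the copy constructor.
std::size_t SourceSizeAfterFallbackCopy()
{
    CopyOnly source;
    CopyOnly target(std::move(source)); // calls CopyOnly(const CopyOnly&)
    return target.n_copies == 1 ? source.values.size() : 0;
}
```

In this sketch `MoveAloneLeavesSourceIntact()` returns `true`, and `SourceSizeAfterFallbackCopy()` returns 3: the source still holds its three values because it was copied, not looted.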
%% Cell type:markdown id: tags:
## Return value optimization (RVO) and copy elision
When you define a function which returns a (possibly large) object, you might worry that an unneeded copy is performed:
%% Cell type:code id: tags:
``` C++17
#include <vector>
std::vector<unsigned int> FiveDigitsOfPi()
{
    std::vector<unsigned int> ret { 3, 1, 4, 1, 5 };
    return ret; // copy should be incurred here... Right? (No in fact!)
}
```
%% Cell type:markdown id: tags:
and attempt to circumvent it with a `std::move`:
%% Cell type:code id: tags:
``` C++17
#include <utility> // for std::move
#include <vector>
std::vector<unsigned int> FiveDigitsOfPi_WithMove()
{
    std::vector<unsigned int> ret { 3, 1, 4, 1, 5 };
    return std::move(ret); // Don't do that!
}
```
%% Cell type:markdown id: tags:
or even avoid returning a large object entirely by using a reference argument:
%% Cell type:code id: tags:
``` C++17
#include <vector>
void FiveDigitsOfPi(std::vector<unsigned int>& result)
{
    result = { 3, 1, 4, 1, 5 };
}
```
%% Cell type:markdown id: tags:
The version with the reference argument works as you intend, but it is way clunkier to use. Which do you prefer:
%% Cell type:code id: tags:
``` C++17
{
    auto digits = FiveDigitsOfPi();
}
```
%% Cell type:markdown id: tags:
or:
%% Cell type:code id: tags:
``` C++17
{
    std::vector<unsigned int> digits;
    FiveDigitsOfPi(digits);
}
```
%% Cell type:markdown id: tags:
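In fact, you shouldn't worry: all modern compilers implement a __return value optimization__ (a form of copy elision) which builds the returned object directly in the storage provided by the caller, so the potentially large object is not copied. Even in the cases where the compiler does not apply it, a local object named in a `return` statement is treated as an r-value, so it is moved rather than copied whenever a move constructor is available.
However, the optimization only applies when the `return` statement names the local object itself, so casting it to an r-value reference with `std::move(ret)` actually prevents this optimization from kicking in!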
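A hedged way to observe this is to instrument a class, as in the sketch below, which is not part of the original notebook: `Tracker`, the global counters `g_copies` and `g_moves`, and both functions are hypothetical names. The counters record how many copy and move constructions occur when an object is returned with and without `std::move`.

```cpp
#include <utility>

// Hypothetical global counters.
int g_copies = 0;
int g_moves = 0;

// Hypothetical class that records how it is constructed.
struct Tracker
{
    Tracker() = default;
    Tracker(const Tracker&) { ++g_copies; }
    Tracker(Tracker&&) noexcept { ++g_moves; }
};

Tracker MakePlain()
{
    Tracker ret;
    return ret; // candidate for elision; at worst a move, never a copy
}

Tracker MakeWithMove()
{
    Tracker ret;
    return std::move(ret); // forces a move construction: elision is disabled
}
```

Calling `MakePlain()` never increments `g_copies`, and with the usual compilers it does not increment `g_moves` either since the object is built in place. Calling `MakeWithMove()` always costs at least one move construction. Some compilers even emit a warning for the second form.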
%% Cell type:markdown id: tags:
So in a nutshell, you should (almost) never use `std::move` on a return line (you may learn more about it in [this StackOverflow question](https://stackoverflow.com/questions/12953127/what-are-copy-elision-and-return-value-optimization)).
The only exception is detailed in item 25 of \cite{Meyers2015} and is very specific: it is when you want to return a value that was passed as an r-value reference argument, e.g.:
%% Cell type:code id: tags:
``` C++17
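#include <utility> // for std::move
// Hypothetical minimal stand-in for a real matrix class, so that the example compiles.
struct Matrix
{
    double value = 0.;
    Matrix& operator+=(const Matrix& rhs) { value += rhs.value; return *this; }
};
Matrix Add(Matrix&& lhs, const Matrix& rhs)
{
    lhs += rhs;
    return std::move(lhs); // ok in this case!
}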
```
%% Cell type:markdown id: tags:
This case is very limited (I have never needed it myself so far), so I invite you to read the item in Scott Meyers' book if you want to learn more (items 23 to 30 are really enlightening about move semantics and are highly recommended reading!).
%% Cell type:markdown id: tags:
## Move constructors
With move semantics, C++11 added two elements to the canonical form of a class:
- A **move constructor**
- A **move assignment operator**
%% Cell type:code id: tags:
``` C++17
#include <algorithm> // for std::copy
#include <cstring>
#include <iostream>
class Text2
{
public :
    friend void swap(Text2& lhs, Text2& rhs);
    Text2(const char* string);
    // Copy constructor.
    Text2(const Text2& t);
    // Move constructor
    Text2(Text2&& t);
    // Copy assignment operator; defined here due to an issue of Xeus-cling with operators
    Text2& operator=(const Text2& t)
    {
        std::cout << "Operator= called" << std::endl;
        if (this == &t)
            return *this ; // standard idiom to deal with self-assignment
        delete [] data_;
        size_ = t.size_ ;
        data_ = new char[t.size_] ;
        std::copy(t.data_, t.data_ + size_, data_);
        return *this ;
    }
    // Move assignment operator; defined here due to an issue of Xeus-cling with operators
    Text2& operator=(Text2&& t)
    {
        std::cout << "Operator= called for r-value" << std::endl;
        if (this == &t)
            return *this;
        delete[] data_;
        // Steal the buffer of `t` instead of copying it:
        size_ = t.size_;
        data_ = t.data_;
        // Don't forget to properly invalidate `t` content:
        t.size_ = 0 ;
        t.data_ = nullptr ;
        return *this ;
    }
    ~Text2();
    // Overload of operator<<, defined here due to an issue of Xeus-cling with operators.
    friend std::ostream & operator<<(std::ostream& stream, const Text2& t)
    {
        return stream << t.data_ ;
    }
private :
    unsigned int size_{0};
    char* data_ = nullptr;
} ;
```
%% Cell type:code id: tags:
``` C++17
Text2::Text2(const char* string)
{
    std::cout << "Constructor called" << std::endl;
    size_ = std::strlen(string) + 1;
    data_ = new char[size_] ;
    std::copy(string, string + size_, data_);
}
```
%% Cell type:code id: tags:
``` C++17
Text2::Text2(const Text2& t)
: size_(t.size_), data_(new char [t.size_])
{
    std::cout << "Copy constructor called" << std::endl;
    std::copy(t.data_, t.data_ + size_, data_);
}
```
%% Cell type:code id: tags:
``` C++17
Text2::Text2(Text2&& t)
: size_(t.size_), data_(t.data_)
{
    std::cout << "Move constructor called" << std::endl;
    t.size_ = 0 ;
    t.data_ = nullptr ;
}
```
%% Cell type:code id: tags:
``` C++17
Text2::~Text2()
{
    std::cout << "Destructor called" << std::endl;
    delete[] data_;
}
```
%% Cell type:code id: tags:
``` C++17
{
    Text2 t1("world!") ;
    Text2 t2("Hello") ;
    // Swap of values:
    Text2 tmp = std::move(t1);
    t1 = std::move(t2);
    t2 = std::move(tmp);
    std::cout << t1 << " " << t2 << std::endl;
}
```
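%% Cell type:markdown id: tags:
The three-line exchange above is essentially what `std::swap` does for a type that provides both move operations. The sketch below, which is an addition to the notebook, checks this with a hypothetical instrumented class `Probe`; the global counters `g_swap_copies` and `g_swap_moves` and the function `SwapProbes` are hypothetical names too.

```cpp
#include <utility>

// Hypothetical global counters.
int g_swap_copies = 0;
int g_swap_moves = 0;

// Hypothetical class counting copy and move operations.
struct Probe
{
    explicit Probe(int v) : value(v) { }

    Probe(const Probe& rhs) : value(rhs.value) { ++g_swap_copies; }
    Probe& operator=(const Probe& rhs)
    {
        value = rhs.value;
        ++g_swap_copies;
        return *this;
    }

    Probe(Probe&& rhs) noexcept : value(rhs.value) { ++g_swap_moves; }
    Probe& operator=(Probe&& rhs) noexcept
    {
        value = rhs.value;
        ++g_swap_moves;
        return *this;
    }

    int value;
};

// Exchanges two probes through the generic std::swap.
void SwapProbes(Probe& a, Probe& b)
{
    std::swap(a, b);
}
```

With the usual standard library implementations, one call to `SwapProbes` performs one move construction and two move assignments, and no copy at all.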
%% Cell type:markdown id: tags: