Commit 20df7997 authored by GILLES Sebastien

Cleaning-up part 5 (still in progress).

parent 9da9b895
%% Cell type:markdown id: tags:
# [Getting started in C++](/) - [Useful concepts and STL](./0-main.ipynb) - [RAII idiom](./2-RAII.ipynb)
%% Cell type:markdown id: tags:
<h1>Table of contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Introduction" data-toc-modified-id="Introduction-1">Introduction</a></span></li><li><span><a href="#Example:-dynamic-array" data-toc-modified-id="Example:-dynamic-array-2">Example: dynamic array</a></span></li></ul></div>
%% Cell type:markdown id: tags:
## Introduction
This chapter is one of the most important in this tutorial: it presents an idiom without which the most common criticism against C++ is totally justified!
Often, people who criticize the language say that C++ is really tricky, that it is extremely easy to leak memory all over the place, and that it sorely misses a [**garbage collector**](https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)) which does the job of cleaning up and freeing the memory when the data are no longer used.
However, garbage collection, used for instance in Python and Java, is not without issues itself: the memory is not always freed as swiftly as possible, and the bookkeeping of references takes a toll on the efficiency of the program.
C++ in fact provides the best of both worlds: a way to free memory safely and as soon as possible... provided you know how to use it adequately.
The **Resource Acquisition Is Initialization** (**RAII**) idiom is the key mechanism for this: the idea is just to use an object with:
* The constructor in charge of allocating the resources (memory, mutexes, etc.)
* The destructor in charge of freeing all that as soon as the object goes out of scope.
And that's it!
%% Cell type:markdown id: tags:
## Example: dynamic array
%% Cell type:code id: tags:
``` C++17
#include <string>
#include <iostream>
class Array
{
public:
    Array(std::string name, std::size_t dimension);
    ~Array();
private:
    std::string name_;
    double* array_ = nullptr;
};
```
%% Cell type:code id: tags:
``` C++17
Array::Array(std::string name, std::size_t dimension)
: name_(name)
{
    std::cout << "Acquire ressources for " << name_ << std::endl;
    array_ = new double[dimension];
    for (auto i = 0ul; i < dimension; ++i)
        array_[i] = 0.;
}
```
%% Cell type:code id: tags:
``` C++17
Array::~Array()
{
    std::cout << "Release ressources for " << name_ << std::endl;
    delete[] array_;
}
```
%% Cell type:code id: tags:
``` C++17
{
    Array array1("Array 1", 5);
    {
        Array array2("Array 2", 2);
        {
            Array array3("Array 3", 2);
        }
        Array array4("Array 4", 4);
    }
    Array array5("Array 5", 19);
}
```
%%%% Output: stream
Acquire ressources for Array 1
Acquire ressources for Array 2
Acquire ressources for Array 3
Release ressources for Array 3
Acquire ressources for Array 4
Release ressources for Array 4
Release ressources for Array 2
Acquire ressources for Array 5
Release ressources for Array 5
Release ressources for Array 1
%% Cell type:markdown id: tags:
Of course, don't use such a class in real code: the STL's `std::vector` and `std::array` are already there for that (and use the RAII principle under the hood!) and also provide more complicated mechanisms such as copy.
The resource itself need not be memory; for instance `std::ofstream` also uses RAII: its destructor calls `close()` if this was not done manually beforehand, ensuring the file on disk properly reflects the changes you made to it during the run of your program.
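The `std::ofstream` behaviour described above can be sketched as follows. The helper name `WriteThenReadBack` and whatever file name it is called with are assumptions made for this illustration only; the point is that the file is complete on disk as soon as the stream goes out of scope, without any explicit `close()`.

```cpp
#include <cassert>
#include <cstdio>
#include <fstream>
#include <string>

// Hypothetical helper: writes one line in an inner scope, then reads it back.
std::string WriteThenReadBack(const std::string& filename, const std::string& line)
{
    {
        std::ofstream out(filename);
        assert(out.is_open() && "The file could not be opened for writing.");
        out << line << '\n';
    } // 'out' is destroyed here: its buffer is flushed and the file is closed.

    std::string content;
    {
        std::ifstream in(filename);
        std::getline(in, content);
    } // 'in' is closed here as well.

    std::remove(filename.c_str()); // clean up the file created for the demo
    return content;
}
```

Calling for instance `WriteThenReadBack("raii_demo.txt", "Hello RAII")` should return the line that was written, since the inner scope guarantees the data reached the file before it is reopened.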
%% Cell type:markdown id: tags:
© _CNRS 2016_ - _Inria 2018-2020_
_This notebook is an adaptation of a lecture prepared by David Chamont (CNRS) under the terms of the licence [Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0)](http://creativecommons.org/licenses/by-nc-sa/4.0/)_
_The present version has been written by Sébastien Gilles and Vincent Rouvreau (Inria)_
......
%% Cell type:markdown id: tags:
# [Getting started in C++](/) - [Useful concepts and STL](./0-main.ipynb) - [Containers](./3-Containers.ipynb)
%% Cell type:markdown id: tags:
<h1>Table of contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Introduction" data-toc-modified-id="Introduction-1">Introduction</a></span></li><li><span><a href="#std::vector" data-toc-modified-id="std::vector-2"><code>std::vector</code></a></span><ul class="toc-item"><li><span><a href="#Allocator-template-parameter" data-toc-modified-id="Allocator-template-parameter-2.1">Allocator template parameter</a></span></li><li><span><a href="#Most-used-constructors" data-toc-modified-id="Most-used-constructors-2.2">Most used constructors</a></span></li><li><span><a href="#Size" data-toc-modified-id="Size-2.3">Size</a></span></li><li><span><a href="#Adding-new-elements" data-toc-modified-id="Adding-new-elements-2.4">Adding new elements</a></span></li><li><span><a href="#Direct-access:-operator[]-and-at()" data-toc-modified-id="Direct-access:-operator[]-and-at()-2.5">Direct access: <code>operator[]</code> and <code>at()</code></a></span></li><li><span><a href="#Under-the-hood:-storage-and-capacity" data-toc-modified-id="Under-the-hood:-storage-and-capacity-2.6">Under the hood: storage and capacity</a></span></li><li><span><a href="#reserve()-and-resize()" data-toc-modified-id="reserve()-and-resize()-2.7"><code>reserve()</code> and <code>resize()</code></a></span></li><li><span><a href="#std::vector-as-a-C-array" data-toc-modified-id="std::vector-as-a-C-array-2.8"><code>std::vector</code> as a C array</a></span></li><li><span><a href="#Iterators" data-toc-modified-id="Iterators-2.9">Iterators</a></span></li></ul></li><li><span><a href="#Incrementing-/-decrementing-iterators" data-toc-modified-id="Incrementing-/-decrementing-iterators-3">Incrementing / decrementing iterators</a></span></li><li><span><a href="#Other-containers" data-toc-modified-id="Other-containers-4">Other containers</a></span></li></ul></div>
%% Cell type:markdown id: tags:
## Introduction
Containers are the standard answer to a very common problem: how to store a collection of homogeneous data, while ensuring the kind of safety RAII provides.
In this chapter, I won't deal with **associative containers** - which will be handled in the [very next chapter](/notebooks/5-UsefulConceptsAndSTL/4-AssociativeContainers.ipynb).
## `std::vector`
The container of choice, which I couldn't resist using a little in the previous examples...
### Allocator template parameter
Its full prototype is:
````
template
<
class T,
class Allocator = std::allocator<T>
> class vector;
````
where the second template argument specifies how the memory is allocated. Most of the time the default value is fine, so in practice you usually specify only the type of the stored elements, e.g. `std::vector<double>`.
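As a quick check of the statement above, the following sketch verifies at compile time that omitting the allocator gives exactly the same type as spelling out the default one. The variable name `same_type` is an arbitrary choice for this illustration.

```cpp
#include <memory>
#include <type_traits>
#include <vector>

// True if omitting the allocator yields the same type as writing it explicitly.
constexpr bool same_type =
    std::is_same<std::vector<double>, std::vector<double, std::allocator<double>>>::value;

// Compilation fails if both spellings did not designate the same type.
static_assert(same_type, "The default allocator should be std::allocator<T>.");
```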
### Most used constructors
%% Cell type:markdown id: tags:
* Default constructor: no elements inside.
%% Cell type:code id: tags:
``` C++17
#include <vector>
{
std::vector<double> bar;
}
```
%% Cell type:markdown id: tags:
* Constructor with a given number of elements. The elements are value-initialized in this case (so 0 for arithmetic types).
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
std::vector<double> bar(3);
for (auto item : bar)
std::cout << item << std::endl;
}
```
%%%% Output: stream
0
0
0
%% Cell type:markdown id: tags:
* Constructor with a given number of elements and a given value.
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
std::vector<double> bar(3, 4.3);
for (auto item : bar)
std::cout << item << std::endl;
}
```
%%%% Output: stream
4.3
4.3
4.3
%% Cell type:markdown id: tags:
* Since C++11, constructor with the initial content.
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
std::vector<int> foo { 3, 5, 6 };
for (auto item : foo)
std::cout << item << std::endl;
}
```
%%%% Output: stream
3
5
6
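One caveat follows from the constructors listed so far: for a `std::vector<int>`, parentheses and braces with two integers select different constructors. The sketch below illustrates this; the function names are hypothetical and only serve the example.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Parentheses: "count, value" constructor -> 3 elements, all equal to 4.
std::size_t SizeWithParentheses()
{
    std::vector<int> v(3, 4);
    assert(v[0] == 4);
    return v.size();
}

// Braces: initializer-list constructor -> 2 elements, namely 3 and 4.
std::size_t SizeWithBraces()
{
    std::vector<int> v { 3, 4 };
    assert(v[0] == 3);
    return v.size();
}
```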
%% Cell type:markdown id: tags:
* And of course copy (and move) constructions
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
std::vector<int> foo { 3, 5, 6 };
std::vector<int> bar { foo };
for (auto item : bar)
std::cout << item << std::endl;
}
```
%%%% Output: stream
3
5
6
%% Cell type:markdown id: tags:
### Size
A useful perk is that, in true object-oriented fashion, `std::vector` knows its size at every moment (in C, with dynamic arrays, you needed to keep track of the size independently: the array was actually a pointer which indicates where the array starts, but not where it ends).
The method to know it is `size()`:
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
std::vector<int> foo { 3, 5, 6 };
std::cout << "Size = " << foo.size() << std::endl;
}
```
%%%% Output: stream
Size = 3
%% Cell type:markdown id: tags:
### Adding new elements
`std::vector` provides an easy and (most of the time) cheap way to add an element **at the end of the array**. The method to add a new element is `push_back`:
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
std::vector<int> foo { 3, 5, 6 };
std::cout << "Size = " << foo.size() << std::endl;
foo.push_back(7);
std::cout << "Size = " << foo.size() << std::endl;
}
```
%%%% Output: stream
Size = 3
Size = 4
%% Cell type:markdown id: tags:
There is also an `insert()` method to add an element anywhere, but it is not very efficient: all the elements located after the insertion point must be shifted, and a reallocation may occur (see capacity below).
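A minimal sketch of `insert()`, which takes an iterator designating the position before which the new element is placed. The helper `InsertAt` is hypothetical and works on a copy, so that the original vector is left untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: returns a copy of 'v' with 'value' inserted before position 'index'.
std::vector<int> InsertAt(std::vector<int> v, std::size_t index, int value)
{
    assert(index <= v.size() && "Insertion position must be within [0, size]");
    using difference_type = std::vector<int>::difference_type;

    // Every element from 'index' onwards is shifted by one position.
    v.insert(v.begin() + static_cast<difference_type>(index), value);
    return v;
}
```

For instance, `InsertAt({ 3, 5, 6 }, 1, 4)` yields a vector containing 3, 4, 5, 6.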
%% Cell type:markdown id: tags:
### Direct access: `operator[]` and `at()`
`std::vector` provides direct access to an element through its index (that is not true for all containers) with `operator[]`:
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
std::vector<int> foo { 3, 5, 6 };
std::cout << "foo[1] = " << foo[1] << std::endl; // Remember: indexing starts at 0 in C and C++
}
```
%%%% Output: stream
foo[1] = 5
%% Cell type:markdown id: tags:
Direct access is not checked: if you go beyond the size of the vector you enter undefined behaviour territory:
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
std::vector<int> foo { 3, 5, 6 };
std::cout << "foo[4] = " << foo[4] << std::endl; // undefined territory
}
```
%%%% Output: stream
foo[4] = 0
%% Cell type:markdown id: tags:
A specific method `at()` exists that performs the adequate check and throws an exception (`std::out_of_range`) if needed:
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
std::vector<int> foo { 3, 5, 6 };
std::cout << "foo[4] = " << foo.at(4) << std::endl; // exception thrown
}
```
%%%% Output: stream
foo[4] =
%%%% Output: error
Standard Exception: vector
%% Cell type:markdown id: tags:
I do not necessarily recommend it: I would rather check the index is correct with an `assert`, which provides the runtime check in debug mode only and doesn't slow down the code in release mode.
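The two approaches can be put side by side in a short sketch. Both function names are hypothetical; the first relies on `assert` (removed when `NDEBUG` is defined, as is usual in release builds), the second on the exception thrown by `at()`.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

// Access checked in debug mode only: the assert vanishes when NDEBUG is defined.
int GetWithAssert(const std::vector<int>& v, std::size_t index)
{
    assert(index < v.size() && "Index out of bounds!");
    return v[index];
}

// Access checked in all modes: returns true if at() rejected the index.
bool AtThrows(const std::vector<int>& v, std::size_t index)
{
    try
    {
        static_cast<void>(v.at(index));
    }
    catch (const std::out_of_range&)
    {
        return true;
    }
    return false;
}
```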
%% Cell type:markdown id: tags:
### Under the hood: storage and capacity
In practice, `std::vector` is a dynamic array allocated with safety through the use of RAII.
To make `push_back` an O(1) operation most of the time, more memory than what is strictly needed is allocated.
The `capacity()` must not be mistaken for the `size()`:
* `size()` is the number of elements in the array and might be of use for the end-user.
* `capacity()` is more internal: it is the number of elements the memory currently allocated by the container can hold, which may be larger than the size to make room for a few new elements.
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
    std::vector<std::size_t> foo;
    for (auto i = 0ul; i < 10ul; ++i)
    {
        std::cout << "Vector: size = " << foo.size() << " and capacity = " << foo.capacity() << std::endl;
        foo.push_back(i);
    }
}
```
%%%% Output: stream
Vector: size = 0 and capacity = 0
Vector: size = 1 and capacity = 1
Vector: size = 2 and capacity = 2
Vector: size = 3 and capacity = 4
Vector: size = 4 and capacity = 4
Vector: size = 5 and capacity = 8
Vector: size = 6 and capacity = 8
Vector: size = 7 and capacity = 8
Vector: size = 8 and capacity = 8
Vector: size = 9 and capacity = 16
%% Cell type:markdown id: tags:
The pattern for capacity is clear here but is not dictated by the standard: it is up to the STL vendor to choose the way it deals with it.
So what happens when the capacity is reached and a new element is added?
* A new dynamic array with the new capacity is created.
* Each element of the former dynamic array is **copied** (or possibly **moved**) into the new one.
* The former dynamic array is destroyed.
The least we can say is we're far from O(1) here: this step is linear in the number of elements! (And here we use a POD type, for which copy is cheap; this is not the case for certain types of objects...) So obviously it is better to avoid this operation as much as possible!
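To get a feel for how often this costly step occurs, the sketch below counts the capacity changes while elements are pushed back one by one. The function name `CountReallocations` is hypothetical, and the exact count depends on the STL vendor; with the doubling pattern shown above, 1000 insertions would trigger on the order of ten reallocations only, which is why `push_back` remains cheap on average.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts how many times the capacity changes while pushing back 'n' elements
// into an initially empty vector, i.e. how many reallocations occurred.
std::size_t CountReallocations(std::size_t n)
{
    std::vector<std::size_t> v;
    std::size_t count = 0ul;
    auto previous_capacity = v.capacity();

    for (std::size_t i = 0ul; i < n; ++i)
    {
        v.push_back(i);
        if (v.capacity() != previous_capacity)
        {
            ++count;
            previous_capacity = v.capacity();
        }
    }

    assert(v.size() == n);
    return count;
}
```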
### `reserve()` and `resize()`
`reserve()` is the method to manually set the capacity (more precisely, a lower bound on it). When you have a clue of the expected number of elements, it is better to provide it: even if your guess was flawed, it limits the number of reallocations:
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
    std::vector<std::size_t> foo;
    foo.reserve(5); // 10 would have been better of course!
    for (auto i = 0ul; i < 10ul; ++i)
    {
        std::cout << "Vector: size = " << foo.size() << " and capacity = " << foo.capacity() << std::endl;
        foo.push_back(i);
    }
}
```
%%%% Output: stream
Vector: size = 0 and capacity = 5
Vector: size = 1 and capacity = 5
Vector: size = 2 and capacity = 5
Vector: size = 3 and capacity = 5
Vector: size = 4 and capacity = 5
Vector: size = 5 and capacity = 5
Vector: size = 6 and capacity = 10
Vector: size = 7 and capacity = 10
Vector: size = 8 and capacity = 10
Vector: size = 9 and capacity = 10
%% Cell type:markdown id: tags:
It must not be confused with `resize()`, which changes the size of the meaningful content of the dynamic array.
%% Cell type:code id: tags:
``` C++17
#include <cassert> // for the assert used in the loop below
#include <iostream>
#include <string>
// Utility to print the content of a non-associative container.
// Don't bother with it now: it uses iterators we'll see a bit below.
template
<
    class VectorT
>
void PrintVector(const VectorT& vector,
                 std::string separator = ", ", std::string opener = "[", std::string closer = "]\n")
{
    auto size = vector.size();
    std::cout << "Size = " << size << " Capacity = " << vector.capacity() << " Content = ";
    std::cout << opener;
    auto it = vector.cbegin();
    auto end = vector.cend();
    static_cast<void>(end); // to avoid compilation warning in release mode
    // All elements but the last one are followed by the separator.
    for (decltype(size) i = 0u; i + 1u < size; ++it, ++i)
    {
        assert(it != end);
        std::cout << *it << separator;
    }
    if (size > 0u)
        std::cout << *it;
    std::cout << closer;
}
```
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
std::vector<std::size_t> foo { 3, 5};
PrintVector(foo);
foo.resize(8, 10); // Second optional argument gives the values to add.
PrintVector(foo);
foo.resize(3, 15);
PrintVector(foo);
}
```
%%%% Output: stream
Size = 2 Capacity = 2 Content = [3, 5]
Size = 8 Capacity = 8 Content = [3, 5, 10, 10, 10, 10, 10, 10]
Size = 3 Capacity = 8 Content = [3, 5, 10]
%% Cell type:markdown id: tags:
As you can see, `resize()` may increase or decrease the size of the `std::vector`; if it decreases it, some values are lost.
You may also see that the capacity is not adapted accordingly; you may use the `shrink_to_fit()` method to ask for a reduction of the capacity, but the request is not binding and the implementation may ignore it (it honours it here):
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
std::vector<std::size_t> foo { 3, 5};
PrintVector(foo);
foo.resize(8, 10); // Second optional argument gives the values to add.
PrintVector(foo);
foo.resize(3, 10);
PrintVector(foo);
foo.shrink_to_fit();
PrintVector(foo);
}
```
%%%% Output: stream
Size = 2 Capacity = 2 Content = [3, 5]
Size = 8 Capacity = 8 Content = [3, 5, 10, 10, 10, 10, 10, 10]
Size = 3 Capacity = 8 Content = [3, 5, 10]
Size = 3 Capacity = 3 Content = [3, 5, 10]
%% Cell type:markdown id: tags:
As a rule:
* When you use `reserve`, it often means you intend to add new content with `push_back()` which increases the size by 1 (and the capacity would be unchanged provided you estimated the argument given to reserve well).
* When you use `resize`, you intend to modify on the spot the values in the container, with for instance `operator[]`, a loop or iterators.
A common mistake is to mix the two up:
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
std::vector<int> five_pi_digits;
five_pi_digits.resize(5);
five_pi_digits.push_back(3);
five_pi_digits.push_back(1);
five_pi_digits.push_back(4);
five_pi_digits.push_back(1);
five_pi_digits.push_back(5);
for (auto item : five_pi_digits)
std::cout << "Digit = " << item << std::endl; // Not what we intended!
}
```
%%%% Output: stream
Digit = 0
Digit = 0
Digit = 0
Digit = 0
Digit = 0
Digit = 3
Digit = 1
Digit = 4
Digit = 1
Digit = 5
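The two consistent ways of filling the vector from the example above can be sketched as follows: either `reserve()` followed by `push_back()`, or `resize()` followed by direct assignment. The function names are hypothetical; both are expected to return the five digits 3, 1, 4, 1, 5 and nothing else.

```cpp
#include <cassert>
#include <vector>

// Option 1: reserve() the memory, then push_back() each value.
std::vector<int> FillWithReserve()
{
    std::vector<int> digits;
    digits.reserve(5); // size is still 0 here
    for (int digit : { 3, 1, 4, 1, 5 })
        digits.push_back(digit);
    assert(digits.size() == 5ul);
    return digits;
}

// Option 2: resize() the vector, then overwrite the existing elements.
std::vector<int> FillWithResize()
{
    const int values[] = { 3, 1, 4, 1, 5 };
    std::vector<int> digits;
    digits.resize(5); // size is 5 here, all elements equal to 0
    for (auto i = 0ul; i < digits.size(); ++i)
        digits[i] = values[i];
    assert(digits.size() == 5ul);
    return digits;
}
```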
%% Cell type:markdown id: tags:
### `std::vector` as a C array
In your code, you might at some point use a C library which deals with dynamic arrays. If the function doesn't mess with the structure of the dynamic array (by reallocating the content for instance), you may pass it a `std::vector` without any issue through its method `data()`.
**NOTE:** Xeus-cling does not yet support C `printf`, so you may try this [@Coliru](https://coliru.stacked-crooked.com/a/f55f1e11833594aa).
%% Cell type:code id: tags:
``` C++17
#include <cstdio>
// A C function