Compare revisions

Showing
with 708 additions and 425 deletions
......@@ -212,16 +212,15 @@
],
"metadata": {
"kernelspec": {
"display_name": "C++17",
"language": "C++17",
"name": "xcpp17"
"display_name": "Cppyy",
"language": "c++",
"name": "cppyy"
},
"language_info": {
"codemirror_mode": "text/x-c++src",
"codemirror_mode": "c++",
"file_extension": ".cpp",
"mimetype": "text/x-c++src",
"name": "c++",
"version": "17"
"name": "c++"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
......
%% Cell type:markdown id: tags:
# [Getting started in C++](./) - [Templates](./0-main.ipynb) - [Metaprogramming](./4-Metaprogramming.ipynb)
%% Cell type:markdown id: tags:
## Introduction
We will not go very far in this direction: metaprogramming is a very rich subset of C++ in its own right, and one that is especially tricky to code.
You can be a very skilled C++ developer and never use this; however it opens some really interesting prospects that can't be achieved easily (or at all...) without it.
I recommend reading [Modern C++ design](../bibliography.ipynb#Modern-C++-Design) to get the gist of it: even if it relies upon older versions of C++ (and therefore some of its hand-made constructs are now available in one form or another in modern C++ or the STL), it is very insightful for understanding the reasoning behind metaprogramming.
## Example: same action upon a collection of heterogeneous objects
Let's say we want to put together in the same container multiple objects of heterogeneous types (one concrete case for which I have used that: reading an input data file in which each entry is handled differently by a dedicated object).
In C++11, `std::tuple` was introduced for that purpose:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <tuple>
std::tuple<int, std::string, double, float, long> tuple =
std::make_tuple(5, "hello", 5., 3.2f, -35l);
```
%% Cell type:markdown id: tags:
What if we want to apply the same treatment to all of the entries?
%% Cell type:markdown id: tags:
In more usual containers, we would just write a `for` loop, but this is not an option here: the syntax to access the `I`-th element of the tuple is `std::get<I>`, so `I` has to be known at compile time... which is not the case for the mutable variable used in a typical `for` loop!
%% Cell type:markdown id: tags:
Let's roll with just printing each of them:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
{
std::cout << std::get<0>(tuple) << std::endl;
std::cout << std::get<1>(tuple) << std::endl;
std::cout << std::get<2>(tuple) << std::endl;
std::cout << std::get<3>(tuple) << std::endl;
std::cout << std::get<4>(tuple) << std::endl;
}
```
%% Cell type:markdown id: tags:
The treatment was rather simple here, but the code was duplicated manually for each element. **Metaprogramming** is the art of making the compiler generate the whole code by itself.
The syntax relies heavily on templates, and those of you familiar with functional programming will feel at ease here:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
template <std::size_t IndexT, std::size_t TupleSizeT>
struct PrintTuple
{
template<class TupleT> // I'm lazy I won't put there std::tuple<int, std::string, double, float, long> again...
static void Do(const TupleT& tuple)
{
std::cout << std::get<IndexT>(tuple) << std::endl;
PrintTuple<IndexT + 1ul, TupleSizeT>::Do(tuple); // that's the catch: call recursively the next one!
}
};
```
%% Cell type:markdown id: tags:
A side reminder here: we need to use a combination of template specialization of a `struct` and a static method to work around the fact that partial template specialization of functions is not possible (see [here](2-Specialization.ipynb#Mimicking-the-partial-template-specialization-for-functions) for more details).
%% Cell type:markdown id: tags:
You may see the gist of it, but there is still an issue: the recursion never ends... (don't worry, your compiler will yell before that!). So you need a specialization to stop it; you may see now why I used a class template and not a function!
%% Cell type:code id: tags:
``` C++17
``` c++
template <std::size_t TupleSizeT>
struct PrintTuple<TupleSizeT, TupleSizeT>
{
template<class TupleT>
static void Do(const TupleT& tuple)
{
// Do nothing!
}
};
```
%% Cell type:markdown id: tags:
With that, the code is properly generated:
%% Cell type:code id: tags:
``` C++17
``` c++
{
std::tuple<int, std::string, double, float, long> tuple =
std::make_tuple(5, "hello", 5., 3.2f, -35l);
PrintTuple<0, std::tuple_size<decltype(tuple)>::value>::Do(tuple);
}
```
%% Cell type:markdown id: tags:
Of course, the call is not yet very easy: it's cumbersome to have to explicitly reach the tuple size... But as often in C++ an extra level of indirection may lift this issue:
%% Cell type:code id: tags:
``` C++17
``` c++
template<class TupleT>
void PrintTupleWrapper(const TupleT& t)
{
PrintTuple<0, std::tuple_size<TupleT>::value>::Do(t);
}
```
%% Cell type:markdown id: tags:
And then the call may be simply:
%% Cell type:code id: tags:
``` C++17
``` c++
{
std::tuple<int, std::string, double, float, long> tuple = std::make_tuple(5, "hello", 5., 3.2f, -35l);
PrintTupleWrapper(tuple);
}
```
%% Cell type:markdown id: tags:
In fact, my laziness earlier when I used a template argument rather than the exact tuple type pays off now, as this function may be used with any tuple (or more precisely with any tuple for which all elements comply with `operator<<`):
%% Cell type:code id: tags:
``` C++17
``` c++
{
int a = 5;
std::tuple<std::string, int*> tuple = std::make_tuple("Hello", &a);
PrintTupleWrapper(tuple);
}
```
%% Cell type:markdown id: tags:
## Slight improvement with C++ 17 `if constexpr`
With C++ 17 compile-time check `if constexpr`, you may even do the same with much less boilerplate:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
template <std::size_t IndexT, class TupleT>
void PrintTupleIfConstexpr(const TupleT& tuple)
{
constexpr auto size = std::tuple_size<TupleT>();
static_assert(IndexT <= size);
if constexpr (IndexT < size)
{
std::cout << std::get<IndexT>(tuple) << std::endl;
PrintTupleIfConstexpr<IndexT + 1, TupleT>(tuple);
}
};
```
%% Cell type:code id: tags:
``` C++17
``` c++
template<class TupleT>
void PrintTupleIfConstexprWrapper(const TupleT& tuple)
{
PrintTupleIfConstexpr<0ul>(tuple);
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
{
std::tuple<int, std::string, double, float, long> tuple = std::make_tuple(5, "hello", 5., 3.2f, -35l);
PrintTupleIfConstexprWrapper(tuple);
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
{
int a = 5;
std::tuple<std::string, int*> tuple = std::make_tuple("Hello", &a);
PrintTupleIfConstexprWrapper(tuple);
}
```
%% Cell type:markdown id: tags:
The gist of it remains the same (it amounts to a recursive call) but the compile-time check makes us avoid entirely the use of the stopping specialization and the use of a struct with `static` method.
%% Cell type:markdown id: tags:
## `std::apply`
Another option provided by C++ 17 is to use `std::apply`, whose purpose is to call a function with all the elements of a tuple as its arguments, which makes it possible to apply the same operation to each of them.
[Cppreference](https://en.cppreference.com/w/cpp/utility/apply) provides a snippet that solves the exact problem we tackled above in a slightly different way: they wrote a generic overload of `operator<<` for any instance of a `std::tuple` object.
I have somewhat simplified their snippet, as it was a bit harder to read due to unrelated niceties to handle the comma separators.
Don't bother if you do not understand all of it:
- The weird `...` syntax is for variadic templates, which we will present [briefly in the next notebook](../5-MoreAdvanced.ipynb#Variadic-templates). We try to avoid referring to upcoming material as much as possible in this training session, but this paragraph is a late addition and doesn't mesh completely well with the structure of this document. You just have to know it's a way to handle a variable number of arguments (here the template arguments of `std::tuple`).
%% Cell type:code id: tags:
``` C++17
``` c++
template<typename... Ts>
std::ostream& operator<<(std::ostream& os, std::tuple<Ts...> const& theTuple)
{
std::apply
(
[&os](Ts const&... tupleArgs)
{
((os << tupleArgs << std::endl), ...);
}, theTuple
);
return os;
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
std::cout << tuple << std::endl;
```
%% Cell type:markdown id: tags:
# Bonus: metaprogramming Fibonacci
In the [notebook about constexpr](../1-ProceduralProgramming/7-StaticAndConstexpr.ipynb), I said implementing the Fibonacci series before C++ 11 involved metaprogramming; here is an implementation (much more wordy than the `constexpr` one):
%% Cell type:code id: tags:
``` C++17
``` c++
template<std::size_t N>
struct Fibonacci
{
static std::size_t Do()
{
return Fibonacci<N-1>::Do() + Fibonacci<N-2>::Do();
}
};
```
%% Cell type:code id: tags:
``` C++17
``` c++
// Don't forget the specialization for 0 and 1!
template<>
struct Fibonacci<0ul>
{
static std::size_t Do()
{
return 0ul;
}
};
template<>
struct Fibonacci<1ul>
{
static std::size_t Do()
{
return 1ul;
}
};
```
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
std::cout << Fibonacci<5ul>::Do() << std::endl;
std::cout << Fibonacci<10ul>::Do() << std::endl;
```
%% Cell type:markdown id: tags:
And if the syntax doesn't suit you... you could always add an extra level of indirection to remove the `::Do()` part:
%% Cell type:code id: tags:
``` C++17
``` c++
template<std::size_t N>
std::size_t FibonacciWrapper()
{
return Fibonacci<N>::Do();
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
std::cout << FibonacciWrapper<5ul>() << std::endl;
std::cout << FibonacciWrapper<10ul>() << std::endl;
```
%% Cell type:markdown id: tags:
As you can see, in some cases `constexpr` really alleviates some tedious boilerplate...
It should be noted that although the templates are instantiated at compile time, the result is nonetheless not automatically recognized as `constexpr`:
%% Cell type:code id: tags:
``` C++17
``` c++
constexpr auto fibo_5 = FibonacciWrapper<5ul>(); // COMPILATION ERROR!
```
%% Cell type:markdown id: tags:
To fix that, you need to declare `constexpr`:
- Each of the `Do` static methods (`static constexpr std::size_t Do()`)
- The `FibonacciWrapper` function (`template<std::size_t N> constexpr std::size_t FibonacciWrapper()`)
So in this specific case you should really go with the much less wordy and more expressive `constexpr` implementation given in the [aforementioned notebook](../1-ProceduralProgramming/7-StaticAndConstexpr.ipynb).
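As a sketch of the fix described above: the names `ConstexprFibonacci` and `ConstexprFibonacciWrapper` are hypothetical, chosen only to avoid clashing with the `Fibonacci` struct already defined. Marking each `Do()` and the wrapper as `constexpr` should make the result usable in a constant expression:

```c++
#include <cstddef>

// Hypothetical variant of the Fibonacci struct above, with `constexpr` added.
template<std::size_t N>
struct ConstexprFibonacci
{
    static constexpr std::size_t Do()
    {
        return ConstexprFibonacci<N - 1>::Do() + ConstexprFibonacci<N - 2>::Do();
    }
};

// Specializations that stop the recursion.
template<>
struct ConstexprFibonacci<0ul>
{
    static constexpr std::size_t Do() { return 0ul; }
};

template<>
struct ConstexprFibonacci<1ul>
{
    static constexpr std::size_t Do() { return 1ul; }
};

template<std::size_t N>
constexpr std::size_t ConstexprFibonacciWrapper()
{
    return ConstexprFibonacci<N>::Do();
}

// Now accepted: the value is required to be known at compile time.
constexpr auto fibo_10 = ConstexprFibonacciWrapper<10ul>();
static_assert(fibo_10 == 55ul, "Fibonacci(10) should be 55");
```

The `static_assert` on the last line only compiles because the whole call chain is `constexpr`.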
%% Cell type:markdown id: tags:
[© Copyright](../COPYRIGHT.md)
......
%% Cell type:markdown id: tags:
# [Getting started in C++](./) - [Templates](./0-main.ipynb) - [Hints to more advanced concepts with templates](./5-MoreAdvanced.ipynb)
%% Cell type:markdown id: tags:
We have barely scratched the surface of what can be done with templates; I will just drop a few names here, with a very brief explanation of each, to allow you to dig deeper if it seems of interest for your codes (a Google search for any of them will give you plenty of references) and also to spare you from frowning at a seemingly daunting syntax...
## Curiously recurrent template pattern (CRTP)
One of my own favourite idioms (so much so that I couldn't resist writing an [entry](../7-Appendix/Crtp.ipynb) about it in the appendix).
The idea behind it is to provide the same set of functionalities to classes that otherwise have nothing in common.
The basic example is if you want to assign a unique identifier to the objects of a class of yours: the implementation would be exactly the same in each otherwise different class in which you need this:
* Properly initializing this identifier at construction.
* Checking that no other object of the same class uses it already.
* Providing an accessor `GetUniqueIdentifier()`.
Usual inheritance or composition aren't very appropriate to put in common once and for all (DRY principle!) these functionalities: either they may prove dangerous (inheritance) or be very wordy (composition).
The **curiously recurrent template pattern** is a very specific inheritance:
```c++
class MyClass : public UniqueIdentifier<MyClass>
```
where your class inherits from a template class whose parameter is... your class itself.
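To make the idea more concrete, here is a minimal sketch of what such a `UniqueIdentifier` base might look like. It is an assumption made for illustration, not the implementation from the appendix: the counter-based scheme and the `Mesh` and `Solver` classes are hypothetical, and the handling of copies is deliberately left out.

```c++
#include <cstddef>

// Hypothetical CRTP base: UniqueIdentifier<A> and UniqueIdentifier<B> are
// distinct classes, so each derived class gets its own counter.
template<class DerivedT>
class UniqueIdentifier
{
public:
    UniqueIdentifier()
    : id_(NextId()++)
    { }

    std::size_t GetUniqueIdentifier() const
    {
        return id_;
    }

private:
    // Function-local static: one counter per instantiation of the template.
    static std::size_t& NextId()
    {
        static std::size_t next_id = 0ul;
        return next_id;
    }

    std::size_t id_;
};

// Two otherwise unrelated classes that reuse the same functionality.
class Mesh : public UniqueIdentifier<Mesh>
{ };

class Solver : public UniqueIdentifier<Solver>
{ };
```

With this sketch, two `Mesh` objects created in a row get the identifiers 0 and 1, whereas the first `Solver` starts again at 0, since the two classes do not share a counter.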
%% Cell type:markdown id: tags:
## Traits
A **trait** is a member of a class which exclusively provides information about a type. For instance let's go back to the `HoldAValue` class we wrote [earlier](./2-Specialization.ipynb) in our template presentation:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
#include <string>
template<class T>
class HoldAValue
{
public:
HoldAValue(T value);
T GetValue() const;
private:
T value_;
};
template<class T>
HoldAValue<T>::HoldAValue(T value)
: value_(value)
{ }
template<class T>
T HoldAValue<T>::GetValue() const
{
return value_;
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
{
HoldAValue<int> hint(5);
std::cout << hint.GetValue() << std::endl;
HoldAValue<std::string> sint("Hello world!");
std::cout << sint.GetValue() << std::endl;
}
```
%% Cell type:markdown id: tags:
This class was not especially efficient: the accessor `GetValue()`:
- Requires that `T` is copyable.
- Copies `T`, which is potentially a time-consuming operation.
We could replace it with `const T& GetValue() const` to solve both those issues, but it's a bit on the nose (and less efficient) for plain old data types. The best of both worlds may be achieved with a trait:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
#include <string>
#include <type_traits> // for std::conditional, std::is_trivial
template<class T>
class ImprovedHoldAValue
{
public:
// Traits: information about type!
using return_value =
typename std::conditional<std::is_trivial<T>::value, T, const T&>::type;
ImprovedHoldAValue(T value);
return_value GetValue() const;
// For the tutorial purpose!
return_value GetValueAlternateSyntax() const;
private:
T value_;
};
template<class T>
ImprovedHoldAValue<T>::ImprovedHoldAValue(T value)
: value_(value)
{ }
```
%% Cell type:markdown id: tags:
Beware: the trait that acts as the return type must be scoped correctly in the definition:
%% Cell type:code id: tags:
``` C++17
``` c++
template<class T>
typename ImprovedHoldAValue<T>::return_value ImprovedHoldAValue<T>::GetValue() const
{
return value_;
}
```
%% Cell type:markdown id: tags:
(unless you use the alternate syntax for function):
(unless you use the alternate syntax for function, which you should definitely consider!):
%% Cell type:code id: tags:
``` C++17
// Doesn't work in Xeus-cling but completely valid in a real environment
``` c++
template<class T>
auto ImprovedHoldAValue<T>::GetValue() const -> return_value
auto ImprovedHoldAValue<T>::GetValueAlternateSyntax() const -> return_value
{
return value_;
}
```
%% Cell type:markdown id: tags:
And the result remains the same, albeit more efficient as a copy is avoided:
%% Cell type:code id: tags:
``` C++17
``` c++
{
ImprovedHoldAValue<int> hint(5);
std::cout << hint.GetValue() << std::endl;
ImprovedHoldAValue<std::string> sint("Hello world!");
std::cout << sint.GetValue() << std::endl;
}
```
%% Cell type:markdown id: tags:
A more complete example with an uncopyable class and the proof the alternate syntax works is available [@Coliru](https://coliru.stacked-crooked.com/a/698eb583b5a94bc7).
We can even roll up a class which is uncopyable to check the constant reference is properly used:
%% Cell type:code id: tags:
``` c++
#include <iostream>
class UncopyableObject
{
public:
explicit UncopyableObject(double value);
UncopyableObject(const UncopyableObject& rhs) = default;
UncopyableObject(UncopyableObject&& rhs) = default;
UncopyableObject& operator=(const UncopyableObject&) = delete;
UncopyableObject& operator=(UncopyableObject&&) = delete;
void Print(std::ostream& out) const;
private:
double value_;
};
std::ostream& operator<<(std::ostream& out, const UncopyableObject& object);
```
%% Cell type:code id: tags:
``` c++
%%cppmagics cppyy/cppdef
UncopyableObject::UncopyableObject(double value)
: value_{value}
{ }
```
%% Cell type:code id: tags:
``` c++
void UncopyableObject::Print(std::ostream& out) const
{
out << value_;
}
```
%% Cell type:code id: tags:
``` c++
std::ostream& operator<<(std::ostream& out, const UncopyableObject& object)
{
object.Print(out);
return out;
}
```
%% Cell type:code id: tags:
``` c++
ImprovedHoldAValue<UncopyableObject> object(UncopyableObject(10.));
std::cout << object.GetValue() << std::endl;
```
%% Cell type:markdown id: tags:
In fact sometimes you may even have a **traits class**: a class whose sole purpose is to provide type information! Such classes are often used as template parameters of other classes.
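As an illustration, the trait used in `ImprovedHoldAValue` could be moved into such a traits class. The sketch below is only one possible layout; the names `AccessTraits` and `HoldAValueWithTraits` are hypothetical and do not come from the notebooks:

```c++
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical traits class: no data, no behaviour, only type information.
template<class T>
struct AccessTraits
{
    // Trivial types are returned by value, the others by constant reference.
    using return_type =
        typename std::conditional<std::is_trivial<T>::value, T, const T&>::type;
};

// The traits class is then given as a template parameter (with a default).
template<class T, class TraitsT = AccessTraits<T>>
class HoldAValueWithTraits
{
public:
    using return_type = typename TraitsT::return_type;

    explicit HoldAValueWithTraits(T value)
    : value_(std::move(value))
    { }

    return_type GetValue() const
    {
        return value_;
    }

private:
    T value_;
};

static_assert(std::is_same<HoldAValueWithTraits<int>::return_type, int>::value,
              "int is returned by value");
static_assert(std::is_same<HoldAValueWithTraits<std::string>::return_type,
                           const std::string&>::value,
              "std::string is returned by constant reference");
```

A user who disagrees with the default choice could provide another traits class as the second template argument without touching `HoldAValueWithTraits` itself.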
%% Cell type:markdown id: tags:
## Policies
Policies are a way to provide a class for which a given aspect is entirely configurable by another class you provide as a template parameter.
The STL makes use of policies: for instance there is a second optional template parameter to `std::vector` which deals with the way to allocate the memory (and only with that aspect). So you may write your own way to allocate the memory and provide it to `std::vector`, which will use it instead of its default behaviour. [Modern C++ design](../bibliography.ipynb#Modern-C++-Design) dedicates a whole chapter to this example: its author wrote an allocator aimed at being more efficient for the allocation of small objects.
The syntax of a policy is a template class which also derives from at least one of its template parameters:
%% Cell type:code id: tags:
``` C++17
``` c++
template<class ColorPolicyT>
class Car : public ColorPolicyT
{ };
```
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
struct Blue
{
void Print() const
{
std::cout << "My color is blue!" << std::endl;
}
};
```
%% Cell type:code id: tags:
``` C++17
``` c++
#include <string>
// Let's assume in the future a car provides a mechanism to change its color at will:
class Changing
{
public:
void Display() const // I do not use `Print()` intentionally to illustrate there is no constraint
// but in true code it would be wise to use same naming scheme!
{
std::cout << "Current color is " << color_ << "!" << std::endl;
}
void ChangeColor(const std::string& new_color)
{
color_ = new_color;
}
private:
std::string color_ = "white";
};
```
%% Cell type:code id: tags:
``` C++17
``` c++
{
Car<Blue> blue_car;
blue_car.Print();
Car<Changing> future_car;
future_car.Display();
future_car.ChangeColor("black");
future_car.Display();
}
```
%% Cell type:markdown id: tags:
## Variadic templates
If you have already written some C, you must be familiar with `printf`:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <cstdio>
{
int i = 5;
double d = 3.1415;
printf("i = %d\n", i);
printf("i = %d and d = %lf\n", i, d);
}
```
%% Cell type:markdown id: tags:
This function is atypical as it may take an arbitrary number of arguments. You can devise a similar function of your own in C (look for `va_arg` if you insist...) but it is not recommended: under the hood it is quite messy, and it greatly limits the checks your compiler may perform on your code (especially regarding the type of the arguments).
C++ 11 introduced **variadic templates**, which provide a much neater way to get this kind of functionality (albeit with a *very* tricky syntax: check all the `...` below... and it becomes worse if you need to propagate them).
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
// Overload when one value only.
template<class T>
void Print(T value)
{
std::cout << value << std::endl;
}
// Overload with a variadic number of arguments
template<class T, class ...Args>
void Print(T value, Args... args) // args here will be all parameters passed to the function from the
// second one onward.
{
Print(value);
Print(args...); // Will call recursively `Print()` with one less argument.
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
Print(5, "hello", "world", "ljksfo", 3.12);
```
%% Cell type:code id: tags:
``` C++17
``` c++
Print("One");
```
%% Cell type:code id: tags:
``` C++17
``` c++
Print(); // Compilation error: no arguments isn't accepted!
```
%% Cell type:markdown id: tags:
To learn more about them, I recommend [Effective Modern C++](../bibliography.ipynb#Effective-Modern-C++), which provides healthy explanations about `std::forward` and `std::move` you will probably need soon if you want to use these variadic templates.
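To give a first taste of how `std::forward` combines with a parameter pack, here is a minimal sketch. The `Concatenate` helper is hypothetical (it is not part of the notebooks), and it relies on a C++ 17 fold expression rather than on the recursion used in `Print()`:

```c++
#include <sstream>
#include <string>
#include <utility>

// Hypothetical helper: streams all its arguments into a single string.
template<class... Args>
std::string Concatenate(Args&&... args)
{
    std::ostringstream oconv;
    // C++17 fold expression over `operator<<`; it expands to
    // oconv << arg1 << arg2 << ..., each argument being forwarded
    // with its original value category.
    (oconv << ... << std::forward<Args>(args));
    return oconv.str();
}
```

For instance `Concatenate(5, " hello ", 3.5)` yields the string `"5 hello 3.5"`. Contrary to `Print()` above, the arguments are taken by forwarding reference, so nothing is copied along the way.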
%% Cell type:markdown id: tags:
## Template template parameters (not a mistake...)
You may want to be way more specific when defining a template parameter: instead of saying it might be whatever you want, you may impose that a specific template parameter should only be a type which is itself an instantiation of a template.
Let's consider a very dumb template function whose purpose is to print the value of `size()` for an STL container. We'll see them more extensively in a [dedicated notebook](../5-UsefulConceptsAndSTL/3-Containers.ipynb), but for now you just have to know that these containers take two template parameters:
- One that describes the type inside the container (e.g. `double` for `std::vector<double>`).
- Another optional one which specifies how the memory is allocated.
We could not bother and use directly a usual template parameter:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
template<class ContainerT>
void PrintSize1(const ContainerT& container)
{
std::cout << container.size() << std::endl;
}
```
%% Cell type:markdown id: tags:
You may use it seamlessly on usual STL containers:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <vector>
#include <list>
#include <deque>
{
std::vector<double> vector { 3.54, -73.1, 1004. };
std::list<int> list { 15, -87, 12, 12, 0, -445 };
std::deque<unsigned int> deque { 2, 87, 95, 14, 451, 10, 100, 1000 };
PrintSize1(vector);
PrintSize1(list);
PrintSize1(deque);
}
```
%% Cell type:markdown id: tags:
However, it would also work for any class that defines a `size()` method, regardless of its nature.
%% Cell type:code id: tags:
``` C++17
``` c++
#include <string>
struct NonTemplateClass
{
std::string size() const
{
return "Might seem idiotic, but why not?";
}
};
```
%% Cell type:code id: tags:
``` C++17
``` c++
template<class U, class V, class W>
struct TemplateWithThreeParameters
{
int size() const
{
return -99;
}
};
```
%% Cell type:code id: tags:
``` C++17
``` c++
{
NonTemplateClass non_template_class;
TemplateWithThreeParameters<int, float, double> template_with_three_parameters;
PrintSize1(non_template_class);
PrintSize1(template_with_three_parameters);
}
```
%% Cell type:markdown id: tags:
We see here with my rather dumb example that `PrintSize1()` also works for my own types, which do not follow the expected prototype of an STL container (a class template with two template parameters).
It may seem pointless in this example, but the worst is that the method might be used to represent something entirely different from what we expect when we call `size()` upon a STL container.
A possibility to limit the risk is to use a **template template parameter** in the function definition:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
template<template <class, class> class ContainerT, class TypeT, class AllocatorT>
void PrintSize2(const ContainerT<TypeT, AllocatorT>& container)
{
std::cout << container.size() << std::endl;
}
```
%% Cell type:markdown id: tags:
By doing so, we impose that the type of the argument is an instantiation of a class template with two template parameters. With that, STL containers work:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <vector>
#include <list>
#include <deque>
{
std::vector<double> vector { 3.54, -73.1, 1004. };
std::list<int> list { 15, -87, 12, 12, 0, -445 };
std::deque<unsigned int> deque { 2, 87, 95, 14, 451, 10, 100, 1000 };
// At call site, you don't have to specify the template arguments that are inferred.
PrintSize2(vector);
PrintSize2(list);
PrintSize2(deque);
}
```
%% Cell type:markdown id: tags:
whereas my own defined types don't:
%% Cell type:code id: tags:
``` C++17
``` c++
{
NonTemplateClass non_template_class;
TemplateWithThreeParameters<int, float, double> template_with_three_parameters;
PrintSize2(non_template_class);
PrintSize2(template_with_three_parameters);
}
```
%% Cell type:markdown id: tags:
In practice you shouldn't need to use that too often, but in the context of this notebook it is worth knowing that the possibility exists (it may help you understand an error message should you use a library using them). I had to resort to them a couple of times, especially along policies.
If you want to learn more about them, you should really read [Modern C++ design](../bibliography.ipynb#Modern-C++-Design).
Of course, `concept`s, introduced in C++ 20, are a much more refined tool to check that a template parameter type fulfills some constraints, but template template parameters are a much older feature that worked way back in pre-C++ 11 versions of the language.
%% Cell type:markdown id: tags:
[© Copyright](../COPYRIGHT.md)
......
......@@ -34,16 +34,15 @@
],
"metadata": {
"kernelspec": {
"display_name": "C++17",
"language": "C++17",
"name": "xcpp17"
"display_name": "Cppyy",
"language": "c++",
"name": "cppyy"
},
"language_info": {
"codemirror_mode": "text/x-c++src",
"codemirror_mode": "c++",
"file_extension": ".cpp",
"mimetype": "text/x-c++src",
"name": "c++",
"version": "17"
"name": "c++"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
......
%% Cell type:markdown id: tags:
# [Getting started in C++](./) - [Useful concepts and STL](./0-main.ipynb) - [Error handling](./1-ErrorHandling.ipynb)
%% Cell type:markdown id: tags:
## Introduction
It is of course very important to be able to track and manage as nicely as possible the cases in which something goes south in your code. We will see in this chapter the main ways to do so.
## Compiler warnings and errors
The first way to find out possible errors is at compile time: you may ensure your code is correct by making its compilation fail if it is not (that's exactly the spirit of the example we provided for [template template parameter](../4-Templates/5-MoreAdvanced.ipynb#Template-template-parameters-(not-a-mistake...))). There are many ways to do so, even more so if templates are involved; here are a few of them we have already seen:
* `static_assert` we saw in [template introduction](../4-Templates/1-Intro.ipynb#static_assert)
* Duck typing failure: if a template argument used somewhere doesn't comply with the expected API. If you're using C++ 20, consider using `concept` to restrain what is provided as template argument.
* Locality of reference: make heavy use of blocks so that a variable goes out of scope as soon as possible. This way, you will avoid the mistake of using a variable that is in fact no longer up-to-date.
* Strive to make your code compile without any warning: if there are even as few as 10 warnings, you might not see an eleventh that sneaks its way in at some point. Activate as many types of warnings as possible for your compiler, and deactivate the unwanted ones with care (see [this notebook](../6-InRealEnvironment/4-ThirdParty.ipynb) to see how to manage third party warnings).
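As a reminder of what the first bullet looks like in practice, here is a minimal sketch in which a misuse is turned into a compilation failure. The `Half()` function is a hypothetical example, not something defined in the previous notebooks:

```c++
#include <type_traits>

// Hypothetical function: compilation fails with a readable message if it
// is instantiated with something other than a floating-point type.
template<class T>
T Half(T value)
{
    static_assert(std::is_floating_point<T>::value,
                  "Half() expects a floating-point type!");
    return value / static_cast<T>(2);
}
```

`Half(3.)` compiles and returns `1.5`, whereas `Half(3)` is rejected by the compiler with the message given in the `static_assert`, rather than silently performing an integer division.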
## Assert
`assert` is a very handy tool to check that your code behaves exactly as expected. `assert` takes one argument; if this argument evaluates to `false` the code aborts with an error message:
%% Cell type:code id: tags:
``` C++17
#undef NDEBUG // Don't bother with this outside of Xeus-cling!
``` c++
#include <cassert>
#include <iostream>
// THIS CODE WILL KILL Xeus-cling kernel!
{
double* ptr = nullptr;
assert(ptr != nullptr && "Pointer should be initialized first!");
//< See the `&&`trick above: you can't provide a message like in `static_assert`,
// but you may use a AND condition and a string to detail the issue
// (it works because the string is evaluated as `true`).
std::cout << *ptr << std::endl;
}
```
%% Cell type:markdown id: tags:
(here in Xeus-cling it breaks the kernel; you may check on [Coliru](https://coliru.stacked-crooked.com/a/5fb74d5ae0118cb2))
The perk of `assert` is that the condition is checked *only in debug mode*!
So you may get extensive tests in debug mode that are ignored once your code has been thoroughly checked and is ready for production use (_debug_ and _release_ mode will be explained in a [later notebook](../6-InRealEnvironment/3-Compilers.ipynb#Debug-and-release-flags)).
The example above is a very useful case: before dereferencing a pointer, check it is not `nullptr` (hence the good practice of always initializing a pointer to `nullptr`...)
In **release mode**, the macro `NDEBUG` should be defined and all the `assert` declarations will be ignored by the compiler.
I recommend using `assert` extensively in your code:
* You're using a pointer? Check it is not `nullptr`.
* You get in a function a `std::vector` and you know it should be exactly 3 elements long? Fire up an assert to check this...
* A `std::vector` is expected to be sorted a given way? Check it through an `assert`... (yes it's a O(n) operation, but if your contract is broken you need to know it!)
Of course, your debug mode will be much slower; but its role is anyway to make sure your code is correct, not to be the fastest possible (release mode is there for that!)
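The last two checks of the list above could be sketched as follows. The function `MedianOfThreeSorted()` and its contract are hypothetical, chosen only to illustrate the idea:

```c++
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical function whose contract is: exactly three values,
// sorted in increasing order.
double MedianOfThreeSorted(const std::vector<double>& values)
{
    assert(values.size() == 3ul && "Exactly three values are expected!");
    assert(std::is_sorted(values.cbegin(), values.cend())
           && "Values should be sorted!");
    return values[1];
}
```

In debug mode, a call with an unsorted or wrongly sized vector aborts immediately with the message of the failing `assert`; in release mode (with `NDEBUG` defined) both checks are skipped and the function boils down to a single element access.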
%% Cell type:markdown id: tags:
## Exceptions
%% Cell type:markdown id: tags:
Asserts are clearly a **developer** tool: they are there to signal something does not behave as intended and therefore that there is a bug somewhere...
However, they are clearly not appropriate to handle an error made by the end-user of your program: for instance if they specify an invalid input file, you do not want an `abort` which is moreover handled only in debug mode!
### `throw`
There is an **exception** mechanism that is appropriate to deal with this; this mechanism is activated with the keyword `throw`.
%% Cell type:code id: tags:
``` c++
%%cppmagics clang
// Kernel yields a weird output so we're better off using compiler directly

#include <cstdlib>
#include <iostream>

void FunctionThatExpectsSingleDigitNumber(int n);

void FunctionThatExpectsSingleDigitNumber(int n)
{
    if (n < -9)
        throw -1;

    if (n > 9)
        throw 1;

    std::cout << "Valid digit is " << n << std::endl;
}

int main([[maybe_unused]] int argc, [[maybe_unused]] char** argv)
{
    FunctionThatExpectsSingleDigitNumber(5);
    std::cout << "After call 5" << std::endl;

    FunctionThatExpectsSingleDigitNumber(15);
    std::cout << "After call 15" << std::endl;

    return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
As you can see, an exception provokes an early exit of the function: the lines after the point where the exception is thrown are not run, and unless the exception is caught the program is aborted.
%% Cell type:markdown id: tags:
### `try`/`catch`
`throw` expects an object, which may be intercepted by a `catch` clause if the exception occurred in a `try` block; `catch` is followed by a block in which the error may be dealt with (or not!)
%% Cell type:code id: tags:
``` c++
%%cppmagics clang
// Kernel yields a weird output so we're better off using compiler directly
#include <cstdlib>
#include <iostream>
void FunctionThatExpectsSingleDigitNumber(int n);
void FunctionThatExpectsSingleDigitNumber(int n)
{
if (n < -9)
throw -1;
if (n > 9)
throw 1;
std::cout << "Valid digit is " << n << std::endl;
}
int main([[maybe_unused]] int argc, [[maybe_unused]] char** argv)
{
try
{
FunctionThatExpectsSingleDigitNumber(15);
}
catch(int n)
{
if (n == 1)
std::cerr << "Error: value is bigger than 9!" << std::endl;
if (n == -1)
std::cerr << "Error: value is less than -9!" << std::endl;
}
std::cout << "End" << std::endl;
return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
If the type doesn't match, the exception is not caught; if you want to be sure to catch everything you may use the `...` syntax. The drawback with this syntax is you can't use the thrown object information:
%% Cell type:code id: tags:
``` c++
%%cppmagics clang
// Kernel yields a weird output so we're better off using compiler directly
#include <cstdlib>
#include <iostream>
void FunctionThatExpectsSingleDigitNumber(int n);
void FunctionThatExpectsSingleDigitNumber(int n)
{
if (n < -9)
throw -1;
if (n > 9)
throw 1;
std::cout << "Valid digit is " << n << std::endl;
}
int main([[maybe_unused]] int argc, [[maybe_unused]] char** argv)
{
try
{
FunctionThatExpectsSingleDigitNumber(15);
}
catch(float n) // doesn't catch your `int` exception!
{
std::cerr << "Float case: " << n << " was provided and is not an integer" << std::endl;
}
catch(...)
{
std::cerr << "Gluttony case... but no object to manipulate to extract more information!" << std::endl;
}
return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
### Re-throw
Once an exception has been caught by a `catch` block, it is considered to be handled; the code will therefore go on to what is immediately after the block. If you want to throw the exception again (for instance after logging a message) you may just type `throw`:
%% Cell type:code id: tags:
``` c++
%%cppmagics clang
// Kernel yields a weird output so we're better off using compiler directly
#include <cstdlib>
#include <iostream>
void FunctionThatExpectsSingleDigitNumber(int n);
void FunctionThatExpectsSingleDigitNumber(int n)
{
if (n < -9)
throw -1;
if (n > 9)
throw 1;
std::cout << "Valid digit is " << n << std::endl;
}
int main([[maybe_unused]] int argc, [[maybe_unused]] char** argv)
{
try
{
FunctionThatExpectsSingleDigitNumber(15);
}
catch(int n)
{
std::cerr << "Int case: " << n << " not a single digit number" << std::endl;
}
std::cout << "After catch" << std::endl;
return EXIT_SUCCESS;
}
```
%% Cell type:code id: tags:
``` c++
%%cppmagics clang
// Kernel yields a weird output so we're better off using compiler directly
#include <cstdlib>
#include <iostream>
void FunctionThatExpectsSingleDigitNumber(int n);
void FunctionThatExpectsSingleDigitNumber(int n)
{
if (n < -9)
throw -1;
if (n > 9)
throw 1;
std::cout << "Valid digit is " << n << std::endl;
}
int main([[maybe_unused]] int argc, [[maybe_unused]] char** argv)
{
try
{
FunctionThatExpectsSingleDigitNumber(15);
}
catch(int n)
{
std::cerr << "Int case: " << n << " not a single digit number" << std::endl;
throw; // the only difference with previous cell! Note `n` is not needed here.
}
std::cout << "After catch" << std::endl;
return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
### Good practice: use as much as possible exceptions that derive from `std::exception`
Using the _catch all_ case is not recommended in most cases... In fact even the `int`/`float` case is not that smart: it is better to use an object with information about why the exception was raised in the first place.
It is advised to use exception classes derived from the `std::exception` one; this way you provide a catch all without the drawback mentioned earlier. This class provides a virtual `what()` method which gives away more intel about the issue:
%% Cell type:code id: tags:
``` c++
#include <exception>
struct TooSmallError : public std::exception
{
virtual const char* what() const noexcept override // we'll go back to `noexcept` later...
{
return "Value is less than -9!";
}
};
```
%% Cell type:code id: tags:
``` c++
struct TooBigError : public std::exception
{
virtual const char* what() const noexcept override
{
return "Value is more than 9!";
}
};
```
%% Cell type:code id: tags:
``` c++
#include <iostream>
void FunctionThatExpectsSingleDigitNumber2(int n)
{
if (n < -9)
throw TooSmallError();
if (n > 9)
throw TooBigError();
std::cout << "Valid digit is " << n << std::endl;
}
```
%% Cell type:code id: tags:
``` c++
{
try
{
FunctionThatExpectsSingleDigitNumber2(15);
}
catch(const std::exception& e)
{
std::cerr << "Properly caught: " << e.what() << std::endl;
}
}
```
%% Cell type:markdown id: tags:
The information comes now with the exception object, which is much better...
%% Cell type:markdown id: tags:
Unfortunately, the choice of deriving from `std::exception` is not always yours: if for instance you're using the [Boost library](https://www.boost.org), its `boost::exception` class doesn't inherit from `std::exception` (but some Boost exception classes such as `boost::filesystem::filesystem_error` do...). In this case, make sure to catch them with a dedicated block:
%% Cell type:code id: tags:
``` c++
// Pseudo-code - Do not run in notebook!
try
{
...
}
catch(const std::exception& e)
{
...
}
catch(const boost::exception& e)
{
...
}
```
%% Cell type:markdown id: tags:
### Storing more information in the class... and avoiding the `char*` pitfall!
In fact we could have gone even further and personalized the exception message, for instance by printing for which value of `n` the issue arose:
%% Cell type:code id: tags:
``` c++
struct TooBigErrorWithMessage : public std::exception
{
TooBigErrorWithMessage(int n);
virtual const char* what() const noexcept override
{
return msg_.c_str(); // c_str() as we need a const char*, not a std::string!
}
private:
std::string msg_;
};
```
%% Cell type:code id: tags:
``` c++
#include <sstream>
TooBigErrorWithMessage::TooBigErrorWithMessage(int n)
{
std::ostringstream oconv;
oconv << "Value '" << n << "' is more than 9!";
msg_ = oconv.str();
}
```
%% Cell type:code id: tags:
``` c++
#include <iostream>
void FunctionThatExpectsSingleDigitNumber3(int n)
{
if (n < -9)
throw TooSmallError();
if (n > 9)
throw TooBigErrorWithMessage(n);
std::cout << "Valid digit is " << n << std::endl;
}
```
%% Cell type:code id: tags:
``` c++
{
try
{
FunctionThatExpectsSingleDigitNumber3(15);
}
catch(const std::exception& e)
{
std::cerr << "Properly caught: " << e.what() << std::endl;
}
}
```
%% Cell type:markdown id: tags:
This might seem trivial here, but in real code it is really handy to get all relevant information from your exception.
However, I silently avoided a common pitfall when dabbling with `std::exception`: one of its cardinal sins is to use a C string as the return type of its `what()` method. It might seem innocuous enough, but it absolutely is not if you do not use a `std::string` data attribute to store the message.
Let's write it without the `msg_` data attribute:
%% Cell type:code id: tags:
``` c++
%%file /tmp/cell.cpp

#include <cstdlib>
#include <exception>
#include <sstream>
#include <iostream>
#include <string>

// ===========================
// Declarations
// ===========================

struct TooBigErrorWithMessagePoorlyImplemented : public std::exception
{
    TooBigErrorWithMessagePoorlyImplemented(int n);

    virtual const char* what() const noexcept override;

private:

    int n_;
};

// We skip negative case for readability's sake
void FunctionThatExpectsSingleDigitNumberWithPoorlyImplementedException(int n);

// ===========================
// Definitions
// ===========================

TooBigErrorWithMessagePoorlyImplemented::TooBigErrorWithMessagePoorlyImplemented(int n)
: n_{n}
{ }

const char* TooBigErrorWithMessagePoorlyImplemented::what() const noexcept
{
    std::ostringstream oconv;
    oconv << "Value '" << n_ << "' is more than 9!";
    std::string msg = oconv.str();
    std::cout << "\nCheck: message is |" << msg << "|" << std::endl;
    return msg.c_str();
}

void FunctionThatExpectsSingleDigitNumberWithPoorlyImplementedException(int n)
{
    if (n > 9)
        throw TooBigErrorWithMessagePoorlyImplemented(n);
    std::cout << "Valid digit is " << n << std::endl;
}

// ===========================
// Main
// ===========================

int main([[maybe_unused]] int argc, [[maybe_unused]] char** argv)
{
    try
    {
        FunctionThatExpectsSingleDigitNumberWithPoorlyImplementedException(15);
    }
    catch(const std::exception& e)
    {
        std::cerr << "Properly caught: " << e.what() << std::endl;
    }
    return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
You might wonder why we chose to write the content in a file rather than executing the cell.
The reason is that both our usual ways to run the cell (either using the Cppyy kernel directly or deferring to the local compiler) yield very obscure error messages that completely hide what goes wrong (even if there is a very helpful warning which pinpoints the issue).
For once, let's go to the terminal to compile and run the program from there (please don't bother about the `-W` options - we'll cover them in part 6):
%% Cell type:markdown id: tags:
**In a terminal**
*Compile the program*
```shell
clang++ -std=c++20 -Weverything -Wno-c++98-compat -Wno-c++98-compat-pedantic -Wno-newline-eof -Wno-padded /tmp/cell.cpp -o poorly_implemented_exception
```
*Run the program*
```shell
./poorly_implemented_exception
```
%% Cell type:markdown id: tags:
You should get something like:
```shell
Properly caught:
Check: message is |Value '15' is more than 9!|
```
that may be followed by gibberish characters that vary from one call to another (or not: sometimes the program will seemingly work fine...)
%% Cell type:markdown id: tags:
So what happens here?
The deal is that `c_str()` returns a pointer to the underlying data used in the `std::string` object... which is destroyed at the end of the `what()` function...
So we're directly in the realm of undefined behaviour (sometimes it might be a crash, sometimes partial or complete gibberish printed instead of the expected string).
The ordeal would of course be the same with another type (for instance if instead of using `std::string` you use a local `char` array): as soon as you get out of scope the variable is destroyed and behaviour is erratic.
So when you define an exception class you can't build the value returned by `what()` inside the implementation of the method itself; you **must** use a data attribute to store it. The most common choice is to use a `std::string`.
And please notice that your compiler was very helpful with its warning message:
%% Cell type:markdown id: tags:
```shell
/tmp/cell.cpp:41:12: warning: address of stack memory associated with local variable 'msg' returned [-Wreturn-stack-address]
```
%% Cell type:markdown id: tags:
### Good practice: be wary of a forest of exception classes
At first sight, it might be tempting to provide a specific exception whenever you want to throw one: this way, you are able to catch only this one later on.
In practice, it's not necessarily such a good idea:
* When the code becomes huge, you (and even more importantly a new developer) may be lost in all the possible exceptions.
* It is rather time consuming: defining a specific exception means a bit of boilerplate to write, and those minutes might have been spent more efficiently, as...
* Most of the time, you don't even need the filtering capacity; in my code for instance if an exception is thrown it is 99 % of the time to be caught in the `main()` function to terminate properly the execution.
The only case in which it might be very valuable to use a tailored exception is for your integration tests: if you are writing a test in which an exception is expected, it is better to check the exception you caught is exactly the one that was expected and not a completely unrelated exception which was thrown for another reason.
The STL provides many classes derived from `std::exception` which you might use directly or as a base for your own class; see [cppreference](https://en.cppreference.com/w/cpp/error/exception) for more details. [OpenClassrooms](https://openclassrooms.com/fr/courses/7137751-programmez-en-oriente-objet-avec-c/7532931-gerez-des-erreurs-avec-les-exceptions) (in French) sorted out the go-to exceptions for lazy developers which cover most of the cases (don't get me wrong: laziness is often an asset for a software developer!):
* `std::domain_error`
* `std::invalid_argument`
* `std::length_error`
* `std::out_of_range`
* `std::logic_error`
* `std::range_error`
* `std::overflow_error`
* `std::underflow_error`
* `std::runtime_error`
with the latter being the default choice if no other fits your issue. These classes take a `std::string` argument in their constructor so that you may explain exactly what went wrong.
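As a minimal sketch of how one of these ready-made classes may replace a homemade one (`CheckSingleDigit` and `DescribeCall` are names chosen for this illustration only):

```c++
#include <stdexcept>
#include <string>

// Same contract as the functions used earlier, but with a standard library exception class.
void CheckSingleDigit(int n)
{
    if (n < -9 || n > 9)
        throw std::out_of_range("Value '" + std::to_string(n) + "' is not in [-9, 9]!");
}

// Returns the message of the exception if one was thrown, an empty string otherwise.
std::string DescribeCall(int n)
{
    try
    {
        CheckSingleDigit(n);
    }
    catch (const std::exception& e) // std::out_of_range derives from std::exception
    {
        return e.what();
    }

    return "";
}
```

No boilerplate class is needed here, and the message is stored safely inside the exception object, so the `char*` pitfall described above does not apply.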
### `noexcept`
Exceptions are in fact very subtle to use; see for instance [Herb Sutter's books](../bibliography.ipynb#Exceptional-C++-/-More-Exceptional-C++) that deal with them extensively (hence their title!).
In C++03, it was possible to specify which exceptions a method or a function could throw (or that it wouldn't throw at all), but the underlying mechanism, the exception specifications written with the keyword `throw` (e.g. `throw()` for a function that doesn't throw), was such a mess many C++ gurus warned against using it.
In C++11, they tried to rationalize it and a new keyword was introduced to replace it: `noexcept`.
In short, if you have a method you're 100 % sure can't throw an exception, add this suffix and the compiler may optimize even further. However, do not put it if an exception can be thrown: should an exception be raised there, `std::terminate` is called and you get an ugly runtime crash... (and compilers are of little help here: at best some of them warn when the `throw` sits directly in the body of the function).
As you saw, in recent C++ `what()` is a `noexcept` method. It is therefore a bad idea to try to allocate there the string to be returned: allocation and string manipulation could lead to an exception from the STL functions used.
FYI, the error messages currently provided when your program crashes at runtime due to a poorly placed `noexcept` may look like:
%% Cell type:markdown id: tags:
clang++:
```shell
libc++abi: terminating due to uncaught exception of type **your exception**
```
g++:
```shell
terminate called after throwing an instance of **your exception**
154: what(): Exception found
```
%% Cell type:markdown id: tags:
They're not great: it's not obvious the issue stems from an exception thrown where it shouldn't be, and they do not give much information about where you should look to fix it. The best is therefore to be extremely cautious before marking a function as `noexcept`. However, use it when you can (see item 14 of [Effective modern C++](../bibliography.ipynb#Effective-Modern-C++) for incentives to use it).
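`noexcept` is also an operator which tells at compile time whether an expression is declared not to throw. The sketch below uses it with two hypothetical functions written for the illustration; keep in mind it only reports what the declaration promises and does not inspect the body of the function:

```c++
#include <stdexcept>

// Hypothetical function that promises not to throw.
int SafeIncrement(int n) noexcept
{
    return n + 1;
}

// Hypothetical function that may throw, and therefore must not be marked noexcept.
int CheckedIncrement(int n)
{
    if (n < 0)
        throw std::invalid_argument("n must not be negative");

    return n + 1;
}

// The operator is evaluated at compile time and does not call the functions.
static_assert(noexcept(SafeIncrement(0)), "Declared noexcept");
static_assert(!noexcept(CheckedIncrement(0)), "May throw");
```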
%% Cell type:markdown id: tags:
### Good practice: never throw an exception from a destructor
The reason is quite subtle and is explained in detail in item 8 of [Effective C++](../bibliography.ipynb#Effective-C++-/-More-Effective-C++); for now, just know you should never throw an exception there. If you need to deal with an error there, use something else (`std::abort` for instance).
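One possible way to follow this advice, sketched here with a hypothetical `Connection` class whose clean-up may fail, is to catch inside the destructor whatever the clean-up throws:

```c++
#include <exception>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical resource whose clean-up operation fails (it always does in this sketch).
class Connection
{
public:

    explicit Connection(std::vector<std::string>& log)
    : log_{log}
    { }

    // Clean-up that may throw; a user who cares about the failure may call it explicitly.
    void Close()
    {
        if (!is_closed_)
        {
            is_closed_ = true;
            throw std::runtime_error("Could not flush pending data");
        }
    }

    // Destructors are implicitly noexcept: an exception escaping from here would call std::terminate.
    ~Connection()
    {
        try
        {
            Close();
        }
        catch (const std::exception& e)
        {
            // Report the issue but do not let it escape. In real code the reporting
            // itself should not throw either.
            log_.push_back(e.what());
        }
    }

private:

    std::vector<std::string>& log_;
    bool is_closed_ { false };
};
```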
### The exception class I use
I (Sébastien) provide in [appendix](../7-Appendix/HomemadeException.ipynb) my own exception class (which of course derives from `std::exception`) which provides additionally:
* A constructor with a string, to avoid defining a verbose dedicated exception class for each case.
* Better management of the string display, with an underlying `std::string` object.
* Information about the location from where the exception was thrown.
Vincent uses the STL exceptions described in [the previous section](#Good-practice:-be-wary-of-a-forest-of-exception-classes).
## Error codes
A quick word about C-style error management which you might find in use in some libraries: **error codes**.
The principle of error codes is that your functions and methods return an `int` which indicates whether the call succeeded; the values actually sought are returned through reference arguments. For instance:
%% Cell type:code id: tags:
``` c++
#include <type_traits>
constexpr auto INVALID_TYPE = -1;
```
%% Cell type:code id: tags:
``` c++
template<class T>
int AbsoluteValue(T value, T& result)
{
if constexpr (!std::is_arithmetic<T>())
return INVALID_TYPE;
else
{
if (value < 0)
result = -value;
else
result = value;
return EXIT_SUCCESS;
}
}
```
%% Cell type:markdown id: tags:
I don't like these error codes much, because:
* The result can't be naturally given in return value and must be provided in argument.
* You have to keep track of the possible error codes somewhere, and a user must know where to look them up if something happens (usually a header file: see for instance one for [PETSc library](https://www.mcs.anl.gov/petsc/petsc-master/include/petscerror.h.html)).
* In the libraries that use them, more often than not some are not self-descriptive and you have to figure out what the hell the issue is.
* And more importantly, this relies on the end-user thinking to check the error value:
%% Cell type:code id: tags:
``` c++
#include <string>
#include <iostream>
{
std::string hello { "Hello world" };
std::string absolute_str { "not modified at all by function call..." };
int negative { -5 };
int absolute_int { };
AbsoluteValue(negative, absolute_int);
std::cout << "Absolute value for integer is " << absolute_int << std::endl;
AbsoluteValue(hello, absolute_str); // No compilation or runtime error (or even warning)!
std::cout << "Absolute value for string is " << absolute_str << std::endl;
}
```
%% Cell type:markdown id: tags:
It should be noted that C++ 11 introduced a dedicated class to handle error codes more gracefully: [`std::error_code`](https://en.cppreference.com/w/cpp/error/error_code). I have no direct experience with it but it looks promising as illustrated by this [blog post](https://akrzemi1.wordpress.com/2017/07/12/your-own-error-code/).
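To give an idea of what this looks like, here is a minimal sketch relying on the predefined `std::errc` values rather than on a custom error category (the function `DigitFromChar` is hypothetical and written for this illustration only):

```c++
#include <system_error>

// Hypothetical function: converts a character into the digit it represents.
// The status is returned as a std::error_code; a default-constructed one means success.
std::error_code DigitFromChar(char c, int& result)
{
    if (c < '0' || c > '9')
        return std::make_error_code(std::errc::invalid_argument);

    result = c - '0';
    return std::error_code {};
}
```

The returned object converts to `true` when an error occurred, may be compared with a `std::errc` value, and provides a `message()` method which returns a readable description, so there is no separate header to consult to decipher a value.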
%% Cell type:markdown id: tags:
### nodiscard
The point about forgetting to check the value may however be mitigated since C++17 with the attribute [``nodiscard``](https://en.cppreference.com/w/cpp/language/attributes/nodiscard), which helps your compiler figure out the return value should have been checked.
%% Cell type:code id: tags:
``` c++
%%cppmagics clang

#include <cstdlib>
#include <string>
#include <type_traits>

constexpr auto INVALID_TYPE = -1;

template<class T>
[[nodiscard]] int AbsoluteValueNoDiscard(T value, T& result)
{
    if constexpr (!std::is_arithmetic<T>())
        return INVALID_TYPE;
    else
    {
        if (value < 0)
            result = -value;
        else
            result = value;
        return EXIT_SUCCESS;
    }
}

int main([[maybe_unused]] int argc, [[maybe_unused]] char** argv)
{
    std::string hello("Hello world");
    std::string absolute_value = "";

    AbsoluteValueNoDiscard(hello, absolute_value); // Now there is a warning! But only available since C++ 17...

    return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
[© Copyright](../COPYRIGHT.md)
%% Cell type:markdown id: tags:
# [Getting started in C++](./) - [Useful concepts and STL](./0-main.ipynb) - [RAII idiom](./2-RAII.ipynb)
%% Cell type:markdown id: tags:
## Introduction
This chapter is one of the most important in this tutorial: it presents an idiom without which the most common criticism against C++ is totally justified!
Often, people who criticize the language say C++ is really tricky, that it is extremely easy to leak memory all over the place, and that it sorely misses a [**garbage collector**](https://en.wikipedia.org/wiki/Garbage_collection_(computer_science)) which does the job of cleaning up and freeing the memory when the data are no longer used.
However, garbage collection, used for instance in Python and Java, is not without issues itself: the memory is not always freed as swiftly as possible, and the bookkeeping of references is not free performance-wise.
C++ in fact provides the best of both worlds: a way to free memory safely and as soon as possible... provided you know how to use it adequately.
The **Resource Acquisition Is Initialization** or **RAII** idiom is the key mechanism for this; the idea is just to use an object with:
* The constructor in charge of allocating the resources (memory, mutexes, etc...)
* The destructor in charge of freeing all that as soon as the object goes out of scope.
And that's it!
%% Cell type:markdown id: tags:
## Example: dynamic array
%% Cell type:code id: tags:
``` c++
#include <string>
#include <iostream>
class Array
{
public:
Array(std::string name, std::size_t dimension);
~Array();
private:
std::string name_;
double* underlying_array_ { nullptr };
};
```
%% Cell type:code id: tags:
``` c++
%%cppmagics cppyy/cppdef
Array::Array(std::string name, std::size_t dimension)
: name_{name}
{
std::cout << "Acquire resources for " << name_ << std::endl;
underlying_array_ = new double[dimension];
for (auto i = 0ul; i < dimension; ++i)
underlying_array_[i] = 0.;
}
```
%% Cell type:code id: tags:
``` c++
Array::~Array()
{
std::cout << "Release resources for " << name_ << std::endl;
delete[] underlying_array_;
}
```
%% Cell type:code id: tags:
``` c++
{
Array array1("Array 1", 5);
{
Array array2("Array 2", 2);
{
Array array3("Array 3", 2);
}
Array array4("Array 4", 4);
}
Array array5("Array 5", 19);
}
```
%% Cell type:markdown id: tags:
Of course, don't use such a class: STL `std::vector` and `std::array` are already there for that (and use the RAII principle under the hood!) and also provide more complicated mechanisms such as copy.
The resource itself need not be memory; for instance `std::ofstream` also uses RAII: its destructor calls `close()` if not done manually before, ensuring the file on disk properly features the changes you might have made to it during the run of your program.
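RAII is also what makes resource handling robust when exceptions are involved: destructors are called even when a scope is left because of an exception. The sketch below illustrates it with a hypothetical `Tracer` class (written for this illustration only) which records when it is created and destroyed:

```c++
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical RAII class: records in a log when it is created and destroyed.
class Tracer
{
public:

    Tracer(std::string name, std::vector<std::string>& log)
    : name_{name},
    log_{log}
    {
        log_.push_back("Acquire " + name_);
    }

    ~Tracer()
    {
        log_.push_back("Release " + name_);
    }

private:

    std::string name_;
    std::vector<std::string>& log_;
};

// Both objects are released in reverse order of construction, even though
// the scope is left through an exception rather than through its end.
std::vector<std::string> RunAndFail()
{
    std::vector<std::string> log;

    try
    {
        Tracer first("first", log);
        Tracer second("second", log);
        throw std::runtime_error("Something went wrong");
    }
    catch (const std::exception&)
    {
        log.push_back("Exception caught");
    }

    return log;
}
```

The releases are recorded before the `catch` block runs: the objects created in the `try` block are destroyed while the exception travels to its handler.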
%% Cell type:markdown id: tags:
[© Copyright](../COPYRIGHT.md)
%% Cell type:markdown id: tags:
# [Getting started in C++](./) - [Useful concepts and STL](./0-main.ipynb) - [Containers](./3-Containers.ipynb)
%% Cell type:markdown id: tags:
## Introduction
Containers are the standard answer to a very common problem: how to store a collection of homogeneous data, while ensuring the kind of safety RAII provides.
In this chapter, I won't deal with **associative containers** - which will be handled in the [very next chapter](/notebooks/5-UsefulConceptsAndSTL/4-AssociativeContainers.ipynb).
## `std::vector`
The container of choice, which I haven't resisted using a little in previous examples so far...
### Allocator template parameter
Its full prototype is:
```c++
template
<
class T,
class Allocator = std::allocator<T>
> class vector;
```
where the second template argument provides the way the memory is allocated. Most of the time the default value is ok and therefore in use you often have just the type stored within, e.g. `std::vector<double>`.
### Most used constructors
%% Cell type:markdown id: tags:
* Empty constructor: no element inside.
%% Cell type:code id: tags:
``` c++
#include <vector>
{
std::vector<double> bar;
}
```
%% Cell type:markdown id: tags:
* Constructor with a given number of elements. The elements are default-constructed in this case.
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<double> bar(3);
for (auto item : bar)
std::cout << item << std::endl;
}
```
%% Cell type:markdown id: tags:
* Constructor with a given number of elements and a given initial value.
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<double> bar(3, 4.3);
for (auto item : bar)
std::cout << item << std::endl;
}
```
%% Cell type:markdown id: tags:
* Since C++ 11, constructor with the initial content (prior to C++ 11 you had to use an empty constructor and then add all elements one by one with something like `push_back` (see below) or use a third party library such as Boost::Assign).
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<int> foo { 3, 5, 6 };
for (auto item : foo)
std::cout << item << std::endl;
}
```
%% Cell type:markdown id: tags:
* And of course copy (and move - that we will present soon...) constructions
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<int> foo { 3, 5, 6 };
std::vector<int> bar { foo };
for (auto item : bar)
std::cout << item << std::endl;
}
```
%% Cell type:markdown id: tags:
### Size
A useful perk is that, in true object paradigm, `std::vector` knows its size at every moment (in C with dynamic arrays you needed to keep track of the size independently: the array was actually a pointer which indicates where the array started, but absolutely not where it ended).
The method to know it is `size()`:
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<int> foo { 3, 5, 6 };
std::cout << "Size = " << foo.size() << std::endl;
}
```
%% Cell type:markdown id: tags:
### Adding new elements
`std::vector` provides an easy and (most of the time) cheap way to add an element **at the end of the array**. The method to add a new element is `push_back`:
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<int> foo { 3, 5, 6 };
std::cout << "Size = " << foo.size() << std::endl;
foo.push_back(7);
std::cout << "Size = " << foo.size() << std::endl;
}
```
%% Cell type:markdown id: tags:
There is also an `insert()` method to add an element anywhere, but it is not very efficient (see capacity below).
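As a short sketch of its use (the helper `InsertAt` is hypothetical and only wraps the call for the sake of the illustration): `insert()` expects an iterator to the position before which the new element is placed, and every element from this position onwards must be shifted by one slot.

```c++
#include <cstddef>
#include <vector>

// Hypothetical helper: inserts `value` before the element currently at position `index`.
// Every element from `index` onwards is shifted by one slot, hence the O(n) cost.
void InsertAt(std::vector<int>& vector, std::size_t index, int value)
{
    auto position = vector.begin() + static_cast<std::vector<int>::difference_type>(index);
    vector.insert(position, value);
}
```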
%% Cell type:markdown id: tags:
### Direct access: `operator[]` and `at()`
`std::vector` provides a direct access to an element through an index (that is not true for all containers) with the `operator[]`:
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<int> foo { 3, 5, 6 };
std::cout << "foo[1] = " << foo[1] << std::endl; // Remember: indexing starts at 0 in C and C++
}
```
%% Cell type:markdown id: tags:
Direct access is not checked: if you go beyond the size of the vector you enter undefined behaviour territory:
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<int> foo { 3, 5, 6 };
std::cout << "foo[4] = " << foo[4] << std::endl; // undefined territory
}
```
%% Cell type:markdown id: tags:
A specific method `at()` exists that performs the adequate check and throws an exception if needed:
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<int> foo { 3, 5, 6 };
std::cout << "foo[4] = " << foo.at(4) << std::endl; // exception thrown
}
```
%% Cell type:markdown id: tags:
I do not necessarily recommend it: I would rather check the index is correct with an `assert`, which provides the runtime check in debug mode only and doesn't slow down the code in release mode.
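A minimal sketch of this approach, with a hypothetical accessor `GetValue` written for the illustration:

```c++
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical accessor: the index is checked in debug mode only.
double GetValue(const std::vector<double>& values, std::size_t index)
{
    assert(index < values.size() && "Index out of bounds");
    return values[index]; // unchecked access, as fast as it gets in release mode
}
```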
%% Cell type:markdown id: tags:
### Under the hood: storage and capacity
In practice, `std::vector` is a dynamic array allocated with safety through the use of RAII.
To make `push_back` an O(1) operation most of the time, slightly more memory than what you want to use is allocated.
The `capacity()` must not be mistaken for the `size()`:
* `size()` is the number of elements in the array and might be of use for the end-user.
* `capacity()` is more internal: it is the number of elements that fit in the memory area currently allocated by the container, which is usually a bit larger than the size to make room for a few new elements.
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<std::size_t> foo;
for (auto i = 0ul; i < 10ul; ++i)
{
std::cout << "Vector: size = " << foo.size() << " and capacity = " << foo.capacity() << std::endl;
foo.push_back(i);
}
}
```
%% Cell type:markdown id: tags:
The pattern for capacity is clear here but is not dictated by the standard: it is up to the STL vendor to choose the way it deals with it.
So what happens when the capacity is reached and a new element is added?
* A new dynamic array with the new capacity is created.
* Each element of the former dynamic array is **copied** (or possibly **moved**) into the new one.
* The former dynamic array is destroyed.
The least we can say is we're far from O(1) here! (and here we are dealing with a POD type, for which copy is cheap, which is not the case for certain types of objects...) So obviously it is better to avoid this operation as much as possible!
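To get a feeling for how often this happens, the sketch below counts the reallocations by watching the capacity change while elements are added. The helper `CountReallocations` is hypothetical; its second argument uses the `reserve()` method presented in the next section to request the storage upfront.

```c++
#include <cstddef>
#include <vector>

// Hypothetical helper: adds `Nelement` values one by one and counts how many times
// the capacity changed, i.e. how many reallocations occurred.
// `Nreserved` is the capacity requested upfront (0 to request nothing).
std::size_t CountReallocations(std::size_t Nelement, std::size_t Nreserved)
{
    std::vector<std::size_t> values;

    if (Nreserved > 0ul)
        values.reserve(Nreserved);

    std::size_t Nreallocation = 0ul;
    auto previous_capacity = values.capacity();

    for (std::size_t i = 0ul; i < Nelement; ++i)
    {
        values.push_back(i);

        if (values.capacity() != previous_capacity)
        {
            ++Nreallocation;
            previous_capacity = values.capacity();
        }
    }

    return Nreallocation;
}
```

When enough storage is requested upfront no reallocation occurs at all; otherwise the count depends on the growth strategy chosen by your STL vendor.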
%% Cell type:markdown id: tags:
### `reserve()` and `resize()`
`reserve()` is the method to set manually the value of the capacity. When you have a clue of the expected number of elements, it is better to provide it: even if your guess was flawed, it limits the number of reallocations:
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<std::size_t> foo;
foo.reserve(5); // 10 would have been better of course!
for (auto i = 0ul; i < 10ul; ++i)
{
std::cout << "Vector: size = " << foo.size() << " and capacity = " << foo.capacity() << std::endl;
foo.push_back(i);
}
}
```
%% Cell type:markdown id: tags:
It must not be mistaken for `resize()`, which changes the size of the meaningful content of the dynamic array.
%% Cell type:code id: tags:
``` c++
#include <iostream>
#include <string>
// Helper function to avoid typing endlessly the same lines...
template<class VectorT>
void PrintVector(const VectorT& vector)
{
auto size = vector.size();
std::cout << "Size = " << size << " Capacity = " << vector.capacity() << " Content = [ ";
for (auto item : vector)
std::cout << item << ' ';
std::cout << ']' << std::endl;
}
```
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<std::size_t> foo { 3, 5};
PrintVector(foo);
foo.resize(8, 10); // Second optional argument gives the values to add.
PrintVector(foo);
foo.resize(12); // If not specified, a default value is used - here 0 for a POD
// The default value is the same as the one that would be used when constructing
// an element with empty braces - here `std::size_t myvariable {}`;
PrintVector(foo);
foo.resize(3, 15);
PrintVector(foo);
}
```
%% Cell type:markdown id: tags:
As you see, `resize()` may increase or decrease the size of the `std::vector`; if it decreases it, some values are lost.
You may also see that the capacity is not adapted accordingly; you may use the `shrink_to_fit()` method to ask for a reduction of the capacity, but the request is not binding and the library may not honour it (it does here):
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<std::size_t> foo { 3, 5};
PrintVector(foo);
foo.resize(8, 10); // Second optional argument gives the values to add.
PrintVector(foo);
foo.resize(3, 10);
PrintVector(foo);
foo.shrink_to_fit();
PrintVector(foo);
}
```
%% Cell type:markdown id: tags:
As a rule:
* When you use `reserve`, it often means you intend to add new content with `push_back()` which increases the size by 1 (and the capacity would be unchanged provided you estimated the argument given to reserve well).
* When you use `resize`, you intend to modify on the spot the values in the container, with for instance `operator[]`, a loop or iterators.
A common mistake is to mix up the two:
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<int> five_pi_digits;
five_pi_digits.resize(5);
five_pi_digits.push_back(3);
five_pi_digits.push_back(1);
five_pi_digits.push_back(4);
five_pi_digits.push_back(1);
five_pi_digits.push_back(5);
PrintVector(five_pi_digits); // not what we intended!
}
```
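%% Cell type:markdown id: tags:
For comparison, here is a minimal sketch of the two consistent ways to get the intended five digits; the function names are made up for this illustration.
%% Cell type:code id: tags:
```cpp
#include <vector>

// Option 1: reserve() only sets the capacity; push_back() then adds the elements.
std::vector<int> FivePiDigitsWithReserve()
{
    std::vector<int> five_pi_digits;
    five_pi_digits.reserve(5);
    five_pi_digits.push_back(3);
    five_pi_digits.push_back(1);
    five_pi_digits.push_back(4);
    five_pi_digits.push_back(1);
    five_pi_digits.push_back(5);
    return five_pi_digits;
}

// Option 2: resize() creates 5 elements (set to 0), which are then overwritten on the spot.
std::vector<int> FivePiDigitsWithResize()
{
    std::vector<int> five_pi_digits;
    five_pi_digits.resize(5);
    five_pi_digits[0] = 3;
    five_pi_digits[1] = 1;
    five_pi_digits[2] = 4;
    five_pi_digits[3] = 1;
    five_pi_digits[4] = 5;
    return five_pi_digits;
}
```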
%% Cell type:markdown id: tags:
### `std::vector` as a C array
In your code, you might at some point use a C library which deals with dynamic arrays. If the function doesn't mess with the structure of the dynamic array (by reallocating the content for instance), you may use a `std::vector` without any issue through its method `data()`:
%% Cell type:code id: tags:
``` c++
#include <cstdio>
// A C function
void C_PrintArray(double* array, size_t Nelt)
{
if (Nelt > 0ul)
{
printf("[");
for (size_t i = 0ul; i < Nelt - 1; ++i)
printf("%lf, ", array[i]);
printf("%lf]", array[Nelt - 1]);
}
else
printf("[]");
}
```
%% Cell type:code id: tags:
``` c++
#include <vector>
{
std::vector<double> cpp_vector { 3., 8., 9., -12.3, -32.35 };
C_PrintArray(cpp_vector.data(), cpp_vector.size());
}
```
%% Cell type:markdown id: tags:
`data()` was introduced in C++ 11; previously you could do the same with the equivalent but much less appealing call to the address of the first element:
%% Cell type:code id: tags:
``` c++
#include <vector>
{
std::vector<double> cpp_vector { 3., 8., 9., -12.3, -32.35 };
C_PrintArray(&cpp_vector[0], cpp_vector.size());
}
```
%% Cell type:markdown id: tags:
### Iterators
**Iterators** are a useful feature that is less prominent since C++ 11 (albeit still very useful if you use STL algorithms) but needs to be at least acknowledged, as under the hood they are still used in the more convenient [`for`](/notebooks/1-ProceduralProgramming/2-Conditions-and-loops.ipynb#New-for-loop) loops now available.
The idea of an iterator is to provide an object to navigate over all (or part of the) items of a container efficiently.
Let's forget for a while our syntactic sugar `for (auto item : container)` and see what our options are:
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<double> cpp_vector { 3., 8., 9., -12.3, -32.35 };
const auto size = cpp_vector.size();
for (auto i = 0ul; i < size; ++i)
std::cout << cpp_vector[i] << " ";
}
```
%% Cell type:markdown id: tags:
It may not be extremely efficient: at each call to `operator[]`, the program must figure out the element to fetch, without using the fact that it has just fetched the element in the previous memory location (in practice compilers are now rather smart and figure this out...)
Iterators provide another way to access the same data (which used to be more efficient, but compilers are now cleverer and both are equivalent):
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<double> cpp_vector { 3., 8., 9., -12.3, -32.35 };
std::vector<double>::const_iterator end = cpp_vector.cend();
for (std::vector<double>::const_iterator it = cpp_vector.cbegin(); it != end; ++it)
std::cout << *it << " ";
}
```
%% Cell type:markdown id: tags:
This is quite verbose; prior to C++ 11 you had to use it nonetheless (just `auto` would greatly simplify the syntax here, but it is also a C++ 11 addition...)
Iterators are *not* pointers, even if they behave really similarly, e.g. they may use the same `*` and `->` syntax (they might be implemented as pointers, but think of it as [private inheritance](../2-ObjectProgramming/6-inheritance.ipynb#IS-IMPLEMENTED-IN-TERMS-OF-relationship-of-private-inheritance) in this case...)
There are several flavors:
* Constant iterators, used here, with which you can only read the value under the iterator.
* Iterators, with which you can also modify it.
* Reverse iterators, to iterate over the container from the last element to the first (avoid them if you can: they are a bit messy to use and may not be used in every algorithm in which standard iterators are ok...)
There are default values for each container:
* `begin()` points to the very first element of the container.
* `end()` is **after** the last element of the container.
* `cbegin()` returns the `const_iterator` that does the same job as `begin()`; prior to C++ 11 it was confusingly obtained through `begin()` as well (the overload for constant containers).
* `cend()`: you might probably figure it out...
* `rbegin()` points to the very last element of the container.
* `rend()` is **before** the first element of the container.
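As an illustration of the reverse flavor, here is a minimal sketch that reads a vector from its last element to its first; `ReadBackwards` is a name made up for this example.
%% Cell type:code id: tags:
```cpp
#include <vector>

// Hypothetical helper: returns a copy of the content read from the last element to the first.
std::vector<int> ReadBackwards(const std::vector<int>& vec)
{
    std::vector<int> ret;
    ret.reserve(vec.size());

    // crbegin() is on the last element and crend() is before the first one;
    // incrementing a reverse iterator moves it towards the front of the container.
    for (auto it = vec.crbegin(); it != vec.crend(); ++it)
        ret.push_back(*it);

    return ret;
}
```
%% Cell type:markdown id: tags: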
What is tricky with them is that they may become invalid if some operations are performed on the container in the meantime. For instance, if a `std::vector` is extended and has to reallocate its storage, its iterators become invalid. Therefore, code like:
%% Cell type:code id: tags:
``` c++
#include <vector>
{
std::vector<int> vec { 2, 3, 4, 5, 7, 18 };
for (auto item : vec)
{
if (item % 2 == 0)
vec.push_back(item + 2); // don't do that!
}
PrintVector(vec);
}
```
%% Cell type:markdown id: tags:
is undefined behaviour: it might work (it did up to 2022; it prints gibberish on the first try in 2024) but is not robust. Even when it seemingly "works", you may see that the iteration is done over the initial vector only: the additional values aren't iterated over (otherwise we would end up with an infinite loop in this case).
So the bottom line is that you should really separate the actions that modify the structure of a container from those that iterate over it.
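One possible way to apply this separation to the example above is sketched below: the new values are first gathered in a second vector and appended only once the iteration is over. The helper name `AppendEvenPlusTwo` is made up for this illustration.
%% Cell type:code id: tags:
```cpp
#include <vector>

// Hypothetical helper: returns the vector extended with `item + 2` for each even item
// of the original content. `vec` is never modified while it is iterated over.
std::vector<int> AppendEvenPlusTwo(std::vector<int> vec)
{
    std::vector<int> to_add;

    for (auto item : vec)
    {
        if (item % 2 == 0)
            to_add.push_back(item + 2);
    }

    // The structure of `vec` is modified only once the iteration is over.
    vec.insert(vec.end(), to_add.cbegin(), to_add.cend());
    return vec;
}
```
%% Cell type:markdown id: tags:
With the vector `{ 2, 3, 4, 5, 7, 18 }` used above, the result is `{ 2, 3, 4, 5, 7, 18, 4, 6, 20 }`.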
## Incrementing / decrementing iterators
As with POD types, there are both a pre- and post-increment available:
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<double> cpp_vector { 3., 8., 9., -12.3, -32.35 };
std::vector<double>::const_iterator end = cpp_vector.cend();
for (std::vector<double>::const_iterator it = cpp_vector.cbegin(); it != end; ++it) // pre-increment
std::cout << *it << " ";
std::cout << std::endl;
for (std::vector<double>::const_iterator it = cpp_vector.cbegin(); it != end; it++) // post-increment
std::cout << *it << " ";
}
```
%% Cell type:markdown id: tags:
Unsurprisingly the result is the same... but the efficiency may not be: post-increment has to make a copy of the iterator in order to return its former value, whereas pre-increment just modifies the current one. So if you do not care whether pre- or post-increment is used (as in the case above), stick with pre-increment.
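To see where the copy comes from, here is a minimal sketch of how both operators are typically written for a class. `ToyIterator` is a toy class made up for this illustration, not the actual implementation of any STL iterator.
%% Cell type:code id: tags:
```cpp
// Toy class (made up for this illustration) mimicking the two increment operators of an iterator.
class ToyIterator
{
public:
    explicit ToyIterator(int position) : position_(position) { }

    // Pre-increment: modifies the current object and returns a reference to it.
    ToyIterator& operator++()
    {
        ++position_;
        return *this;
    }

    // Post-increment: must keep a copy of the former state to be able to return it.
    ToyIterator operator++(int)
    {
        ToyIterator former(*this);
        ++position_;
        return former;
    }

    int GetPosition() const { return position_; }

private:
    int position_;
};
```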
%% Cell type:markdown id: tags:
## Access a container element - Python like syntax
%% Cell type:code id: tags:
``` c++
#include <iostream>
#include <vector>
std::vector<int> v {1, // can be accessed with begin()[0] or end()[-4]
2, // can be accessed with begin()[1] or end()[-3]
3, // can be accessed with begin()[2] or end()[-2]
4 // can be accessed with begin()[3] or end()[-1]
};
std::cout << v.end()[-2] << " - " << v.begin()[1] << std::endl;
// Displays '3 - 2'
// But also some weird pointer value (something like @0x7ffff91e0de0) with Xeus-cling. Just forget about it, it looks like a Xeus-cling bug.
```
%% Cell type:markdown id: tags:
## Other containers
`std::vector` is not the only possible choice; I will present very briefly the other possibilities here:
* `std::list`: A doubly-linked list: the idea is that each element knows the addresses of the element before and the element after. It might be considered if you often need to add elements at specific locations in the list: adding a new element is just changing 2 pointers and setting 2 new ones. You can't directly access an element by its index with a `std::list`.
* `std::forward_list`: A singly-linked list: similar to a `std::list` except only the pointer to the next element is kept. This is a C++ 11 addition.
* `std::deque`: For "double ended queue"; this container may be helpful if you need to store a really huge amount of data that might not fit within a `std::vector`. It might also be of use if you often add elements at the front of the list: there is a `push_front()` method as well as a `push_back()` one. Item 18 of [Effective STL](http://localhost:8888/lab/tree/bibliography.ipynb#Effective-STL) recommends using `std::deque` with `bool`: `std::vector<bool>` was an experiment to provide a specific implementation to spare memory when storing booleans that went wrong and should therefore be avoided...
* `std::array`: You should use this one if the number of elements is known at compile time and doesn't change at all, as the compiler may provide even more optimizations than for `std::vector` (and your end user can't modify its size by mistake). This is a C++ 11 addition.
* `std::string`: Yes, it is actually a container! I will not tell much more about it; just add that it is the sole container besides `std::vector` and `std::array` that ensures contiguous storage.
%% Cell type:markdown id: tags:
[© Copyright](../COPYRIGHT.md)
%% Cell type:markdown id: tags:
# [Getting started in C++](./) - [Useful concepts and STL](./0-main.ipynb) - [Associative containers](./4-AssociativeContainers.ipynb)
%% Cell type:markdown id: tags:
## Introduction
A `std::vector` can be seen as an association between two types:
* A `std::size_t` index, whose value is in the interval [0, size[, that acts as a key.
* The value actually stored.
`operator[]` may be used to access the value associated with a given index:
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <iostream>
{
std::vector<int> prime { 2, 3, 5, 7, 11, 13, 17, 19 };
auto index = 3ul;
std::cout << "Element which key is " << index << " is " << prime[index] << std::endl;
}
```
%% Cell type:markdown id: tags:
An associative container is an extension: what if we could loosen the constraint upon the key and use something else?
%% Cell type:markdown id: tags:
## `std::map`
### Construction
`std::map` is a list of key/value pairs that is ordered through a relationship imposed on the keys.
%% Cell type:code id: tags:
``` c++
#include <iostream>
#include <map>
#include <string>
{
std::map<std::string, unsigned int> age_list
{
{ "Alice", 25 },
{ "Charlie", 31 },
{ "Bob", 22 },
};
auto index = "Charlie";
std::cout << "Element which key is " << index << " is " << age_list[index] << std::endl;
}
```
%% Cell type:markdown id: tags:
### Iteration
In this example, we set three people with their age. We may iterate through it; the actual storage of an item is here a `std::pair<std::string, unsigned int>`. We haven't seen `std::pair` so far, but think of it as a `std::tuple` with 2 elements (it existed prior to `std::tuple` in fact).
There are two handy attributes to access the respective first and second element: `first` and `second`.
%% Cell type:code id: tags:
``` c++
#include <map>
#include <iostream>
std::map<std::string, unsigned int> age_list
{
{ "Alice", 25 },
{ "Charlie", 31 },
{ "Bob", 22 },
};
for (const auto& pair : age_list)
std::cout << pair.first << " : " << pair.second << std::endl;
```
%% Cell type:markdown id: tags:
#### C++ 17: Structured bindings
C++ 17 introduced an alternate new syntax I like a lot that is called **structured bindings**:
%% Cell type:code id: tags:
``` c++
for (const auto& [person, age] : age_list)
std::cout << person << " : " << age << std::endl;
```
%% Cell type:markdown id: tags:
As you see, the syntax declares variables on the fly (here references) for the first and second elements of the pair, making the code much more expressive.
You may read more on them [here](https://www.fluentcpp.com/2018/06/19/3-simple-c17-features-that-will-make-your-code-simpler/); we will use them again in this notebook.
%% Cell type:markdown id: tags:
### Provide another ordering rule
The output order is not an accident: as I said it is an **ordered** associative container, and the keys must be comparable through an ordering relationship. The default one is `std::less` but you may specify another one as a template argument:
%% Cell type:code id: tags:
``` c++
#include <functional> // std::greater
#include <iostream>
#include <map>
#include <string>
{
std::map<std::string, unsigned int, std::greater<std::string>> age_list
{
{ "Alice", 25 },
{ "Charlie", 31 },
{ "Bob", 22 },
};
for (const auto& [name, age] : age_list) // structure binding!
std::cout << name << " : " << age << std::endl;
}
```
%% Cell type:markdown id: tags:
### `insert()`
You may insert another element later with `insert()`:
%% Cell type:code id: tags:
``` c++
#include <map>
#include <iostream>
{
std::map<std::string, unsigned int> age_list
{
{ "Alice", 25 },
{ "Charlie", 31 },
{ "Bob", 22 },
};
age_list.insert({"Dave", 44});
age_list.insert({"Alice", 32});
for (const auto& [name, age] : age_list)
std::cout << name << " : " << age << std::endl;
}
```
%% Cell type:markdown id: tags:
See here that Dave was correctly inserted... but Alice was unchanged!
In fact `insert()` returns a pair:
* `first` is an iterator to the newly inserted element, or to the existing element that made the insertion fail.
* `second` is a boolean which is `true` if the insertion worked.
%% Cell type:code id: tags:
``` c++
#include <map>
#include <iostream>
std::map<std::string, unsigned int> age_list
{
{ "Alice", 25 },
{ "Charlie", 31 },
{ "Bob", 22 },
};
{
auto result = age_list.insert({"Dave", 44});
if (!result.second)
std::cerr << "Insertion of Dave failed" << std::endl;
}
{
auto result = age_list.insert({"Alice", 32});
if (!result.second)
std::cerr << "Insertion of Alice failed" << std::endl;
}
for (const auto& [name, age] : age_list)
std::cout << name << " : " << age << std::endl;
```
%% Cell type:markdown id: tags:
Or even better with structured bindings:
%% Cell type:code id: tags:
``` c++
%%cppmagics cppyy/cppdef
const auto& [iterator, was_properly_inserted] = age_list.insert({"Alice", 32});
```
%% Cell type:code id: tags:
``` c++
if (!was_properly_inserted)
std::cerr << "Insertion of Alice failed" << std::endl;
```
%% Cell type:markdown id: tags:
That's something I dislike in this very useful class: error handling is not to my taste, as you have to remember to check explicitly that all went right... (this is the discussion we had previously about error codes all over again...)
### Access to one element: don't use `operator[]`!
And this is not the sole example: let's look for an element in a map:
%% Cell type:code id: tags:
``` c++
#include <map>
#include <iostream>
{
std::map<std::string, unsigned int> age_list
{
{ "Alice", 25 },
{ "Charlie", 31 },
{ "Bob", 22 },
};
std::cout << "Alice : " << age_list["Alice"] << std::endl;
std::cout << "Erin : " << age_list["Erin"] << std::endl;
std::cout << "========" << std::endl;
for (const auto& [person, age] : age_list)
std::cout << person << " : " << age << std::endl;
}
```
%% Cell type:markdown id: tags:
So if you provide a wrong key, it doesn't yell and instead creates a new entry on the spot, filling the associated value with the default constructor for the type...
To do it properly (but more verbosely!), use the `find()` method (if you're intrigued by the use of an iterator there, we will present them in more detail in the notebook about [algorithms](./7-Algorithms.ipynb)):
%% Cell type:code id: tags:
``` c++
#include <map>
#include <iostream>
{
std::map<std::string, unsigned int> age_list
{
{ "Alice", 25 },
{ "Charlie", 31 },
{ "Bob", 22 },
};
auto it = age_list.find("Alice");
if (it == age_list.cend())
std::cerr << "No Alice found in the listing!" << std::endl;
else
std::cout << "Alice's age is " << it->second << std::endl;
it = age_list.find("Erin");
if (it == age_list.cend())
std::cerr << "No Erin found in the listing!" << std::endl;
else
std::cout << "Erin's age is " << it->second << std::endl;
for (const auto& [name, age] : age_list)
std::cout << name << " : " << age << std::endl;
}
```
%% Cell type:markdown id: tags:
A side note which will be useful later to explain `std::unordered_map`: the search is performed by dichotomy (binary search), hence a ~O(log N) cost.
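If the verbosity of `find()` bothers you, it may be wrapped once and for all in a small helper. The sketch below is one possible way to do so; the name `FindAge` and the choice of returning a pointer are assumptions made for this illustration.
%% Cell type:code id: tags:
```cpp
#include <map>
#include <string>

// Hypothetical helper: returns a pointer to the age if `name` is a key of the map, nullptr otherwise.
// Contrary to operator[], find() never adds a new entry in the map.
const unsigned int* FindAge(const std::map<std::string, unsigned int>& age_list, const std::string& name)
{
    auto it = age_list.find(name);

    if (it == age_list.cend())
        return nullptr;

    return &it->second;
}
```
%% Cell type:markdown id: tags:
As the map is passed by constant reference, a call to `operator[]` inside the helper would not even compile, which is a decent safeguard against the accidental insertion seen above.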
%% Cell type:markdown id: tags:
### Uniqueness of keys
`std::map` is built on the fact a key must be unique.
If you need to allow repeated keys, you should look at `std::multimap`, which provides this possibility with a slightly different interface (rather obviously `find()` is no longer enough, and there are methods that return a range of iterators).
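Here is a minimal sketch of such a range lookup with `equal_range()`; the helper name `AgesOf` is made up for this illustration.
%% Cell type:code id: tags:
```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical helper: returns all the ages stored for the key `name`.
std::vector<unsigned int> AgesOf(const std::multimap<std::string, unsigned int>& age_list,
                                 const std::string& name)
{
    std::vector<unsigned int> ret;

    // equal_range() returns a pair of iterators delimiting all the entries whose key is `name`;
    // both iterators are equal if there are no such entries.
    auto range = age_list.equal_range(name);

    for (auto it = range.first; it != range.second; ++it)
        ret.push_back(it->second);

    return ret;
}
```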
%% Cell type:markdown id: tags:
### Using objects as keys
You may use your own objects as keys, provided that:
* Either you define `operator<` for it. It is really important to grasp that `operator==` **doesn't matter**: even in `find` it is really `operator<` that is used!
* Or provide as template parameter the ordering relationship you intend to use.
**WARNING:** If you're using pointers as keys, make sure to provide an adequate ordering relationship, typically one based on the pointed objects. Otherwise you might end up with different results from one run to another, as the addresses will probably not be given in the same order...
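A minimal sketch of such an ordering relationship for pointer keys is given below; the names `LessOnPointedValue` and `LabelsInMapOrder` are made up for this illustration.
%% Cell type:code id: tags:
```cpp
#include <map>
#include <string>
#include <vector>

// Ordering relationship for pointer keys: compares the pointed values rather than the addresses.
struct LessOnPointedValue
{
    bool operator()(const int* lhs, const int* rhs) const
    {
        return *lhs < *rhs;
    }
};

// Hypothetical helper: returns the values of the map in the order in which the map stores them.
std::vector<std::string> LabelsInMapOrder(const std::map<const int*, std::string, LessOnPointedValue>& map)
{
    std::vector<std::string> ret;

    for (const auto& pair : map)
        ret.push_back(pair.second);

    return ret;
}
```
%% Cell type:markdown id: tags:
The ordering relationship is given as the third template argument of `std::map`; the iteration order then no longer depends on where the pointed integers happen to be stored in memory.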
%% Cell type:markdown id: tags:
## `std::set`
`std::set` is a special case in which you do not associate a value to the key. The interface is roughly the same.
It might be used for instance if you want to keep a list of stuff you have encountered at least once: you don't care about how many times, but you want to know if it was encountered at least once. A `std::vector` would be inappropriate: you would have to look up its whole content before each insertion. With a `std::set` it is already built-in in the class.
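The "already encountered" use case may be sketched as follows; `FirstOccurrences` is a name made up for this illustration.
%% Cell type:code id: tags:
```cpp
#include <set>
#include <vector>

// Hypothetical helper: keeps only the first occurrence of each value, in order of appearance.
std::vector<int> FirstOccurrences(const std::vector<int>& values)
{
    std::set<int> already_encountered;
    std::vector<int> ret;

    for (auto value : values)
    {
        // insert() returns a pair; `second` is true only if the value was not yet in the set.
        if (already_encountered.insert(value).second)
            ret.push_back(value);
    }

    return ret;
}
```
%% Cell type:markdown id: tags:
For instance, calling it on `{ 3, 1, 3, 2, 1 }` yields `{ 3, 1, 2 }`.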
## `std::unordered_map`
This is another associative container introduced in C++ 11, with a different trade-off (and closer to a `dict` in Python for instance):
* Access is much more efficient (~O(1), i.e. independent of the number of elements!).
* Memory footprint is bigger.
* Adding new elements is more expensive.
* The result is not ordered, and there are no rules whatsoever: two runs on the same computer might not yield the list in the same order.
The constraint on the key is different too: the key must be **hashable**, meaning that there must be a specialization of `std::hash` for the type used for key. It must also define `operator==`.
The STL provides such **hashing functions** for POD types (and a few others like `std::string`); it is not trivial (but still possible: see for instance [The C++ Standard Library: A Tutorial and Reference](../bibliography.ipynb#The-C++-Standard-Library:-A-Tutorial-and-Reference) for a discussion on this topic) to add new ones.
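To give an idea of what adding a new one looks like, here is a minimal sketch for a toy type. The type `GridPoint` is made up for this illustration, and the way the two hashes are combined is deliberately simplistic: it is an assumption for the sake of the example, not a recommended hashing function.
%% Cell type:code id: tags:
```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Toy type made up for this illustration.
struct GridPoint
{
    int x;
    int y;
};

// operator== is required to tell apart two keys that end up with the same hash.
bool operator==(const GridPoint& lhs, const GridPoint& rhs)
{
    return lhs.x == rhs.x && lhs.y == rhs.y;
}

namespace std
{
    // Specialization of std::hash for our type.
    // The combination of both hashes below is simplistic and only here for the example.
    template<>
    struct hash<GridPoint>
    {
        std::size_t operator()(const GridPoint& point) const noexcept
        {
            const std::size_t hash_x = std::hash<int>{}(point.x);
            const std::size_t hash_y = std::hash<int>{}(point.y);
            return hash_x ^ (hash_y << 1);
        }
    };
}
```
%% Cell type:markdown id: tags:
With this specialization and `operator==`, a `std::unordered_map<GridPoint, std::string>` may be used like any other unordered map.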
So in a nutshell, if your key type is already handled by the STL and you spend more time reading data than inserting new ones, you should really use this type.
Just an additional note: [The C++ Standard Library: A Tutorial and Reference](../bibliography.ipynb#The-C++-Standard-Library:-A-Tutorial-and-Reference) recommends changing the default internal setting of the class for efficiency: there is an internal float value named `max_load_factor` which has a default value of 1; the API of the class provides a mutator to modify it. The author says 0.7f or 0.8f is more efficient; I haven't benchmarked it, trusted him on this and am using it in my library.
%% Cell type:code id: tags:
``` c++
#include <unordered_map>
{
std::unordered_map<int, double> list;
list.max_load_factor(0.7f);
}
```
%% Cell type:markdown id: tags:
[© Copyright](../COPYRIGHT.md)
......
%% Cell type:markdown id: tags:
# [Getting started in C++](./) - [Useful concepts and STL](./0-main.ipynb) - [Move semantics](./5-MoveSemantics.ipynb)
%% Cell type:markdown id: tags:
## Motivation: eliminate unnecessary deep copies
In many situations, unnecessary deep copies are made.
In the example below, during the exchange between the two instances of the `Text` class, we have to make 3 memory deallocations, 3 allocations, 3 character copy loops... where 3 pointer copies would be sufficient.
%% Cell type:code id: tags:
``` c++
#include <algorithm> // std::copy
#include <cstring>
#include <iostream>
class Text
{
public :
// For next section - don't bother yet
friend void Swap(Text& lhs, Text& rhs);
Text(const char* string);
// Copy constructor.
Text(const Text& t);
// Recopy operator.
Text& operator=(const Text& t);
~Text();
// Overload of operator<<.
friend std::ostream & operator<<(std::ostream& stream, const Text& t);
private :
unsigned int size_{0};
char* data_ { nullptr }; // to make our point - in a true code use an existing container!
} ;
```
%% Cell type:code id: tags:
``` c++
Text::Text(const char* string)
{
std::cout << "Constructor called with argument '" << string << "'" << std::endl;
size_ = std::strlen(string) + 1;
data_ = new char[size_] ;
std::copy(string, string + size_, data_);
}
```
%% Cell type:code id: tags:
``` c++
Text::Text(const Text& t)
: size_(t.size_), data_(new char [t.size_])
{
std::cout << "Copy constructor called" << std::endl;
std::copy(t.data_, t.data_ + size_, data_);
}
```
%% Cell type:code id: tags:
``` c++
Text& Text::operator=(const Text& t)
{
std::cout << "Operator= called" << std::endl;
if (this == &t)
return *this ; // standard idiom to deal with auto-recopy
delete [] data_;
size_ = t.size_ ;
data_ = new char[t.size_] ;
std::copy(t.data_, t.data_ + size_, data_);
return *this ;
}
```
%% Cell type:code id: tags:
``` c++
Text::~Text()
{
std::cout << "Destructor called" << std::endl;
delete[] data_;
}
```
%% Cell type:code id: tags:
``` c++
std::ostream & operator<<(std::ostream& stream, const Text& t)
{
return stream << t.data_ ;
}
```
%% Cell type:code id: tags:
``` c++
{
Text t1("world!") ;
Text t2("Hello") ;
// Swap of values:
Text tmp = t1 ;
t1 = t2 ;
t2 = tmp ;
std::cout << t1 << " " << t2 << std::endl;
}
```
%% Cell type:markdown id: tags:
## A traditional answer: to allow the exchange of internal data
By allowing two `Text` objects to exchange (swap) their internal data, we can rewrite our program in a much more economical way in terms of execution time, by leveraging the fact we know the internal structure of the class:
%% Cell type:code id: tags:
``` c++
void Swap(Text& lhs, Text& rhs)
{
unsigned int tmp_size = lhs.size_;
char* tmp_data = lhs.data_;
lhs.size_ = rhs.size_;
lhs.data_ = rhs.data_;
rhs.size_ = tmp_size;
rhs.data_ = tmp_data;
}
```
%% Cell type:code id: tags:
``` c++
{
Text t1("world!") ;
Text t2("Hello") ;
// Swap of values:
Swap(t1, t2);
std::cout << t1 << " " << t2 << std::endl;
}
```
%% Cell type:markdown id: tags:
There is even a `std::swap` in the STL that may be overloaded for your own types.
Now let's see how C++11 introduces new concepts to solve this (and many other) problems in a more elegant way.
%% Cell type:markdown id: tags:
## Reminder on references in C++03
C++ references allow you to attach a new name to an existing object in the stack or heap. All accesses and modifications made through the reference affect the original object:
%% Cell type:code id: tags:
``` c++
#include <iostream>
{
int var = 42;
int& ref = var; // Create a reference to var
ref = 99;
std::cout << "And now var is also 99: " << var << std::endl;
}
```
%% Cell type:markdown id: tags:
A reference can only be attached to a stable value (left value or **l-value**), which may broadly be summarized as a value whose address may be taken (see [Effective modern C++](../bibliography.ipynb#Effective-Modern-C++) on this topic: it is an especially interesting read, as this topic is not always explained properly elsewhere, especially on the Web).
By contrast, an **r-value** is a temporary value such as a literal expression or a temporary object created by implicit conversion.
%% Cell type:code id: tags:
``` c++
{
int& i = 42 ; // Compilation error: 42 is a r-value!
}
```
%% Cell type:code id: tags:
``` c++
#include <iostream>
void Print(std::string& lvalue)
{
std::cout << "l-value is " << lvalue << std::endl;
}
```
%% Cell type:code id: tags:
``` c++
{
Print("hello") ; // Compilation error: "hello" is a r-value!
}
```
%% Cell type:markdown id: tags:
Look carefully at the error message: the issue is not between `const char[6]` and `std::string` (implicit conversion from `char*` to `std::string` exists) but due to the reference; same function with pass-by-copy works seamlessly:
%% Cell type:code id: tags:
``` c++
#include <iostream>
#include <string>
void PrintByCopy(std::string value) // no reference here!
{
std::cout << "l- or r- value is " << value << std::endl;
}
```
%% Cell type:code id: tags:
``` c++
{
PrintByCopy("hello") ; // Ok!
}
```
%% Cell type:markdown id: tags:
Noteworthy exception: a "constant" reference (a misnomer for a reference to a constant value) can be attached to a temporary value, in particular to facilitate implicit conversions:
%% Cell type:code id: tags:
``` c++
void PrintByConstRef(const std::string& lvalue)
{
std::cout << "l-value is " << lvalue << std::endl;
}
```
%% Cell type:code id: tags:
``` c++
{
PrintByConstRef("hello") ; // Ok!
}
```
%% Cell type:markdown id: tags:
## C++11/14 : temporary references
To go further, C++11 introduces the concept of **r-value reference**, which can only refer to temporary values, and is declared using an `&&`.
%% Cell type:code id: tags:
``` c++
{
int&& i = 42;
}
```
%% Cell type:code id: tags:
``` c++
{
int j = 42;
int&& k = j; // Won’t compile: j is a l-value!
}
```
%% Cell type:markdown id: tags:
It is now possible to overload a function to differentiate the treatment to be performed according to whether it is provided with a stable value or a temporary value. Below, function `f` is provided in three variants:
```
void f(T&); // I : argument must be a l-value
void f(const T&) ; // II : argument may be l-value or r-value but can't be modified
void f(T&&); // III : argument must be a r-value
```
%% Cell type:markdown id: tags:
In case of a call of `f` with a temporary value, it is now form III that will be invoked, if it is defined. This is the cornerstone of the notion of **move semantics**.
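The resolution between the three forms may be checked with a minimal sketch such as the one below, in which each overload simply returns the number of its form. `Widget` and `WhichOverload` are names made up for this illustration.
%% Cell type:code id: tags:
```cpp
#include <string>

// Toy type made up for this illustration.
struct Widget { };

std::string WhichOverload(Widget&) { return "I"; }        // l-value only
std::string WhichOverload(const Widget&) { return "II"; } // l-value or r-value, not modifiable
std::string WhichOverload(Widget&&) { return "III"; }     // r-value only
```
%% Cell type:markdown id: tags:
A named variable is resolved by form I (or form II if it is constant), whereas a temporary such as `Widget{}` is resolved by form III.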
%% Cell type:markdown id: tags:
## Function with r-value arguments
When we know that a value is temporary, we must be able to use it again, or "loot" its content without harmful consequences; _move_ it instead of _copying_ it. When handling large dynamic data structures, it can save many costly operations.
Let's take a function that receives a vector of integers and replicates it to modify it. The old way would be as follows:
%% Cell type:code id: tags:
``` c++
#include <iostream>
#include <vector>
void PrintDouble(const std::vector<int>& vec)
{
std::cout << "PrintDouble for l-value" << std::endl;
std::vector<int> copy(vec);
for (auto& item : copy)
item *= 2;
for (auto item : copy)
std::cout << item << ' ';
std::cout << std::endl;
}
```
%% Cell type:code id: tags:
``` c++
{
std::vector<int> primes { 2, 3, 5, 7, 11, 13, 17, 19 };
PrintDouble(primes);
}
```
%% Cell type:markdown id: tags:
If the original object is temporary, copying it is not necessary. This can be exploited through this overload of the function:
%% Cell type:code id: tags:
``` c++
#include <iostream>
void PrintDouble(std::vector<int>&& vec)
{
std::cout << "PrintDouble for r-value" << std::endl;
for (auto& item : vec)
item *= 2;
for (auto item : vec)
std::cout << item << ' ';
std::cout << std::endl;
}
```
%% Cell type:code id: tags:
``` c++
{
PrintDouble(std::vector<int>{ 2, 3, 5, 7, 11, 13, 17, 19 });
}
```
%% Cell type:markdown id: tags:
We can check that the r-value overload did not entirely supersede the l-value one; a call with an l-value is still resolved by the first overload:
%% Cell type:code id: tags:
``` c++
{
std::vector<int> primes { 2, 3, 5, 7, 11, 13, 17, 19 };
PrintDouble(primes);
}
```
%% Cell type:markdown id: tags:
## `std::move`
Now, if we get an l-value and know we do not need it anymore in the current scope, we may choose to cast it as an r-value through a **static_cast**:
%% Cell type:code id: tags:
``` c++
{
std::vector<int> primes { 2, 3, 5, 7, 11, 13, 17, 19 };
PrintDouble(static_cast<std::vector<int>&&>(primes));
}
```
%% Cell type:markdown id: tags:
And we see the overload called is indeed the one for r-values.
The syntax is a bit heavy to type, so a shorter one was introduced as well: **`std::move`**:
%% Cell type:code id: tags:
``` c++
{
std::vector<int> primes { 2, 3, 5, 7, 11, 13, 17, 19 };
PrintDouble(std::move(primes)); // strictly equivalent to the static_cast in former cell!
}
```
%% Cell type:markdown id: tags:
Please notice that the call to `std::move` does not move `primes` per se. It only makes it a temporary value in the eyes of the compiler, so it is a "possibly" movable object if the context allows it; if for instance the object doesn't define a move constructor (see next section), no move will occur!
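This may be checked with a minimal sketch: the toy class below (made up for this illustration, as is the helper `IsCopiedDespiteMove`) declares a copy constructor but no move constructor, so an object built from the result of `std::move` is in fact copied.
%% Cell type:code id: tags:
```cpp
#include <utility>

// Toy class made up for this illustration: it declares a copy constructor but no move constructor
// (declaring the copy constructor prevents the compiler from generating a move constructor).
class CopyOnly
{
public:
    CopyOnly() = default;

    CopyOnly(const CopyOnly&) : was_copied_(true) { }

    bool WasCopied() const { return was_copied_; }

private:
    bool was_copied_ { false };
};

// Hypothetical helper: builds an object from an r-value obtained with std::move
// and tells whether the copy constructor was called to do so.
bool IsCopiedDespiteMove()
{
    CopyOnly original;
    CopyOnly other(std::move(original)); // resolved by the copy constructor: nothing else is available!
    return other.WasCopied();
}
```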
%% Cell type:markdown id: tags:
**[WARNING]** Do not use a local variable after it has been moved from! In our example, you should not rely on the content of `primes` after the `std::move` call: a moved-from STL object is left in a valid but unspecified state.
%% Cell type:markdown id: tags:
## Return value optimization (RVO) and copy elision
When you define a function which returns a (possibly large) object, you might worry that an unneeded copy is performed:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <vector>
std::vector<unsigned int> FiveDigitsOfPi()
{
std::vector<unsigned int> ret { 3, 1, 4, 1, 5 };
return ret; // copy should be incurred here... Right? (No in fact!)
}
```
%% Cell type:markdown id: tags:
and attempt to circumvent it by a `std::move`:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <vector>
std::vector<unsigned int> FiveDigitsOfPi_WithMove()
{
std::vector<unsigned int> ret { 3, 1, 4, 1, 5 };
return std::move(ret); // Don't do that!
}
```
%% Cell type:markdown id: tags:
or even to avoid entirely returning a large object by using a reference:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <vector>
void FiveDigitsOfPi(std::vector<unsigned int>& result)
{
result = { 3, 1, 4, 1, 5 };
}
```
%% Cell type:markdown id: tags:
The last version works as you intend, but it is way clunkier to use; do you prefer:
%% Cell type:code id: tags:
``` C++17
``` c++
{
auto digits = FiveDigitsOfPi();
}
```
%% Cell type:markdown id: tags:
or:
%% Cell type:code id: tags:
``` C++17
``` c++
{
std::vector<unsigned int> digits;
FiveDigitsOfPi(digits);
}
```
%% Cell type:markdown id: tags:
In fact, you shouldn't worry: all modern compilers provide a __return value optimization__ which avoids copying the potentially large object created (and should the optimization not apply to a local variable, the object is moved rather than copied).
However, it only works when the local object is returned by value as is, so casting it as an r-value reference with `std::move(ret)` actually prevents this optimization from kicking in!
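The difference can be observed with a small counting class. This is only a sketch: `Tracker`, `PlainReturn`, `ReturnWithMove` and `CountMoves` are hypothetical names introduced for the illustration, and the exact count for the plain `return` depends on whether your compiler elides the move (mainstream ones usually do).

```cpp
#include <utility>

// Hypothetical class for illustration: counts the calls to its move constructor.
struct Tracker
{
    Tracker() = default;

    Tracker(const Tracker&) = default;

    Tracker(Tracker&&) noexcept
    {
        ++Nmove();
    }

    static int& Nmove()
    {
        static int counter = 0;
        return counter;
    }
};

// Recommended form: the compiler is allowed to build `ret` directly at its destination.
Tracker PlainReturn()
{
    Tracker ret;
    return ret;
}

// Form to avoid: the cast forces a call to the move constructor
// (compilers may warn about it, e.g. with -Wpessimizing-move).
Tracker ReturnWithMove()
{
    Tracker ret;
    return std::move(ret);
}

// Number of move constructions triggered by a call to `factory`.
int CountMoves(Tracker (*factory)())
{
    Tracker::Nmove() = 0;
    Tracker obj = factory();
    static_cast<void>(obj);
    return Tracker::Nmove();
}
```

`CountMoves(ReturnWithMove)` reports at least one move, whereas `CountMoves(PlainReturn)` never reports more than that, and typically reports none.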
%% Cell type:markdown id: tags:
So, in a nutshell, you should (almost) never use `std::move` on a return line (you may learn more about it in [this StackOverflow question](https://stackoverflow.com/questions/12953127/what-are-copy-elision-and-return-value-optimization)).
The only exception is detailed in item 25 of [Effective modern C++](../bibliography.ipynb#Effective-Modern-C++) and is very specific: it is when you want to return a value that was passed by an rvalue argument, e.g.:
%% Cell type:code id: tags:
``` C++17
``` c++
// Snippet not complete enough to work.
class Matrix; // forward declaration - don't bother yet!
Matrix Add(Matrix&& lhs, const Matrix& rhs)
{
lhs += rhs;
return std::move(lhs); // ok in this case!
}
```
%% Cell type:markdown id: tags:
This case is very limited (I have never needed it myself so far), so I invite you to read the item in Scott Meyers' book if you want to learn more (items 23 to 30 are really enlightening about move semantics - highly recommended reading!).
%% Cell type:markdown id: tags:
## Move constructors
In classes, C++ introduced with move semantics two additional elements in the canonical form of the class:
- A **move constructor**
- A **move assignment operator**
%% Cell type:code id: tags:
``` C++17
``` c++
#include <cstring>
#include <iostream>
class Text2
{
public :
Text2(const char* string);
// Copy constructor.
Text2(const Text2& t);
// Move constructor
Text2(Text2&& t);
// Recopy operator; defined here due to an issue of Xeus-cling with operators
Text2& operator=(const Text2& t)
{
std::cout << "Operator= called" << std::endl;
if (this == &t)
return *this ; // standard idiom to deal with auto-recopy
delete [] data_;
size_ = t.size_ ;
data_ = new char[t.size_] ;
std::copy(t.data_, t.data_ + size_, data_);
return *this ;
}
// Move assignment operator; defined here due to an issue of Xeus-cling with operators
Text2& operator=(Text2&& t)
{
std::cout << "Operator= called for r-value" << std::endl;
if (this == &t)
return *this;
delete[] data_;
size_ = t.size_;
data_ = t.data_;
// Don't forget to properly invalidate `t` content:
t.size_ = 0 ;
t.data_ = nullptr ;
return *this ;
}
// Recopy operator.
Text2& operator=(const Text2& t);
// Move assignment operator
Text2& operator=(Text2&& t);
~Text2();
// Overload of operator<<, defined here due to an issue of Xeus-cling with operators.
friend std::ostream & operator<<(std::ostream& stream, const Text2& t)
{
return stream << t.data_ ;
}
// Overload of operator<<.
friend std::ostream & operator<<(std::ostream& stream, const Text2& t);
private :
unsigned int size_{0};
char* data_ = nullptr;
} ;
```
%% Cell type:code id: tags:
``` C++17
``` c++
Text2::Text2(const char* string)
{
std::cout << "Constructor called" << std::endl;
size_ = std::strlen(string) + 1;
data_ = new char[size_] ;
std::copy(string, string + size_, data_);
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
Text2::Text2(const Text2& t)
: size_(t.size_), data_(new char [t.size_])
{
std::cout << "Copy constructor called" << std::endl;
std::copy(t.data_, t.data_ + size_, data_);
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
Text2::Text2(Text2&& t)
: size_(t.size_), data_(t.data_)
{
std::cout << "Move constructor called" << std::endl;
t.size_ = 0 ;
t.data_ = nullptr ;
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
Text2& Text2::operator=(const Text2& t)
{
std::cout << "Operator= called" << std::endl;
if (this == &t)
return *this ; // standard idiom to deal with auto-recopy
delete [] data_;
size_ = t.size_ ;
data_ = new char[t.size_] ;
std::copy(t.data_, t.data_ + size_, data_);
return *this ;
}
```
%% Cell type:code id: tags:
``` c++
Text2& Text2::operator=(Text2&& t)
{
std::cout << "Operator= called for r-value" << std::endl;
if (this == &t)
return *this;
delete[] data_;
size_ = t.size_;
data_ = t.data_;
// Don't forget to properly invalidate `t` content:
t.size_ = 0 ;
t.data_ = nullptr ;
return *this ;
}
```
%% Cell type:code id: tags:
``` c++
Text2::~Text2()
{
std::cout << "Destructor called" << std::endl;
delete[] data_;
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
std::ostream & operator<<(std::ostream& stream, const Text2& t)
{
return stream << t.data_ ;
}
```
%% Cell type:code id: tags:
``` c++
{
Text2 t1("world!") ;
Text2 t2("Hello") ;
// Swap of values:
Text2 tmp = std::move(t1);
t1 = std::move(t2);
t2 = std::move(tmp);
std::cout << t1 << " " << t2 << std::endl;
}
```
%% Cell type:markdown id: tags:
With all this move semantics, the operations above are comparable to what we achieved with the `Swap` function for `Text` earlier... with the additional benefit that this semantic is not only used for swapping two values.
As already mentioned [there](../3-Operators/4-CanonicalForm.ipynb#[Advanced]-The-true-canonical-class), there are specific rules called __Rule of 0__, __Rule of 3__ and __Rule of 5__, which explain which constructor(s), destructor and assignment operator(s) you ought to define for your class.
%% Cell type:markdown id: tags:
## Temporary reference argument within a function
A crucial point now: if a function receives an r-value reference argument (which can only be bound to a temporary value or to the result of `std::move`), within the function this argument is considered an l-value (we can perfectly well put it to the left of an `=` and assign a new value to it). If the function does not itself loot the content of the variable and instead passes it on to another function (or constructor or operator), it can only restore its temporary character through a call to `std::move`.
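The second half of this statement may be sketched as follows; `Payload` and `Holder` are hypothetical classes introduced only to count which assignment operator is selected, they are not part of the notebook.

```cpp
#include <utility>

// Hypothetical class for illustration: counts how it is assigned.
struct Payload
{
    Payload() = default;
    Payload(const Payload&) = default;
    Payload(Payload&&) = default;

    Payload& operator=(const Payload&)
    {
        ++Ncopy();
        return *this;
    }

    Payload& operator=(Payload&&) noexcept
    {
        ++Nmove();
        return *this;
    }

    static int& Ncopy()
    {
        static int counter = 0;
        return counter;
    }

    static int& Nmove()
    {
        static int counter = 0;
        return counter;
    }
};

struct Holder
{
    // `payload` is an l-value inside the function: the copy assignment is selected.
    void StoreWithoutMove(Payload&& payload)
    {
        payload_ = payload;
    }

    // std::move restores the r-value character: the move assignment is selected.
    void StoreWithMove(Payload&& payload)
    {
        payload_ = std::move(payload);
    }

    Payload payload_;
};
```

Both methods must be called with an r-value (e.g. `holder.StoreWithMove(Payload())`), but only the second one actually transmits it as such to the assignment operator.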
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
#include <string>
void DoStuff(std::string&& string)
{
std::cout << "Argument given by r-value is: " << string << std::endl;
string = "Bye!";
std::cout << "It was nonetheless modified as it is **inside the function** a l-value: " << string << std::endl;
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
{
DoStuff("Hello!");
}
```
%% Cell type:markdown id: tags:
## Move semantics in the STL
All containers in the standard library are now enhanced with move constructors and move assignment operators.
Moreover, move semantics is not only about improving performance. There are classes (such as `std::unique_ptr` we'll see in [next notebook](./6-SmartPointers.ipynb#unique_ptr)) for which it makes no sense for objects to be copyable, but where it is necessary for them to be movable. In this case, the class has a move constructor and no copy constructor.
An object that has been "emptied" as a result of a move is no longer supposed to be useful for anything. However if not destroyed it is still recommended, when you implement this type of class and move, to leave the emptied object in a "valid" state; that's why we put in our `Text2` class the `data_` pointer to `nullptr` and the `size_` to 0. The best is of course to ensure its destruction at short notice to avoid any mishap.
%% Cell type:markdown id: tags:
## Forwarding reference (or universal reference)
Just a quick warning (you should really read [Effective modern C++](../bibliography.ipynb#Effective-Modern-C++) for an extensive discussion on the topic; the blog [FluentCpp](https://www.fluentcpp.com/2018/02/06/understanding-lvalues-rvalues-and-their-references) provides some intel about it... and tells you as well to read Scott Meyers' book to learn more!): seeing `&&` doesn't automatically mean it is an r-value reference.
There is a very specific case when:
- The type of the argument is a template parameter `T` deduced at the call site
- The parameter is **exactly** `T&&` (not `std::vector<T>&&` for instance)
in which the syntax stands for either case (l-value or r-value).
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
#include <string>
template<class T>
void PrintUniversalRef(T&& value)
{
std::cout << value << std::endl;
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
{
PrintUniversalRef(std::string("r-value call!")); // temporary: T is deduced as `std::string` (r-value case)
std::string hello("l-value call!");
PrintUniversalRef(hello); // l-value: T is deduced as `std::string&` (l-value case)
}
```
%% Cell type:markdown id: tags:
Unfortunately, the C++ 11 committee didn't immediately give a name to this specific case; Scott Meyers first publicized it under the name **universal reference**... and was not followed by the C++ committee, which finally chose **forwarding reference**. You may therefore find one or the other term, but the idea behind them is exactly the same.
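Whatever the name, the deduction rule may be observed with a type trait. The sketch below relies on a hypothetical helper `IsCalledWithLValue` (a name chosen for this illustration) which reports whether `T` was deduced as an l-value reference:

```cpp
#include <string>
#include <type_traits>

// T is deduced as `std::string&` for an l-value argument and as `std::string`
// for an r-value one; only the former is an l-value reference type.
template<class T>
constexpr bool IsCalledWithLValue(T&&)
{
    return std::is_lvalue_reference<T>::value;
}
```

With a named `std::string` variable the helper returns `true`, and with a temporary such as `std::string("tmp")` it returns `false`. A bare string literal is an l-value, so `IsCalledWithLValue("literal")` returns `true` as well.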
%% Cell type:markdown id: tags:
[© Copyright](../COPYRIGHT.md)
......
%% Cell type:markdown id: tags:
# [Getting started in C++](./) - [Useful concepts and STL](./0-main.ipynb) - [Smart pointers](./6-SmartPointers.ipynb)
%% Cell type:markdown id: tags:
## Introduction
In short, **smart pointers** are the application of [RAII](./2-RAII.ipynb) to pointers: objects which handle more nicely the acquisition and release of dynamic allocation.
There are many ways to define the behaviour of a smart pointer (the dedicated chapter in [Modern C++ design](../bibliography.ipynb#Modern-C++-Design) is a very interesting read for this, especially as it heavily uses the template [policies](../4-Templates/5-MoreAdvanced.ipynb#Policies) to implement them):
* How the pointer might be copied (or not).
* When is the memory freed.
* Whether `if (ptr)` syntax is accepted
* ...
The STL made the choice of providing two (and a half in fact...) kinds of smart pointers (introduced in C++ 11):
* **unique pointers**
* **shared pointers** (and the **weak** ones that go along with them).
One should also mention for legacy the first attempt: **auto pointers**, which were removed in C++ 17: you might encounter them in some libraries, but by all means don't use them yourself (look for *sink effect* on the Web if you want to know why).
By design all smart pointers keep the usual pointer syntax:
* `*` to dereference the (now smart) pointer.
* `->` to access an attribute of the underlying object.
Smart pointers are clearly a very good way to handle the ownership of a given object.
This does not mean they entirely supersede ordinary (often called **raw** or more infrequently **dumb**) pointers: raw pointers might be a good choice to pass an object as a function parameter (see the discussion for the third question in this [blog post by Herb Sutter](https://herbsutter.com/2013/06/05/gotw-91-solution-smart-pointer-parameters/)). The raw pointer behind a smart pointer may be accessed through the `get()` method.
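A minimal sketch of this usage, with hypothetical function names chosen for the illustration: the function which merely uses the object takes a raw pointer, and the caller keeps the ownership.

```cpp
#include <memory>

// This function only uses the object: it takes a raw pointer
// and is not involved at all in the ownership.
void Double(int* value)
{
    if (value != nullptr)
        *value *= 2;
}

int DoubleThroughGet()
{
    auto ptr = std::make_unique<int>(5);
    Double(ptr.get()); // `ptr` remains the sole owner: never call delete on the result of get()!
    return *ptr;
}
```

`DoubleThroughGet()` returns 10, and the memory is still released by the `unique_ptr` when it goes out of scope.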
Both smart pointers presented below may be constructed directly from a raw pointer; in this case they take the responsibility of destroying the pointed object:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <memory>
#include <iostream>
struct Foo
{
~Foo()
{
std::cout << "Destroy foo"<< std::endl;
}
};
{
Foo* raw = new Foo;
std::unique_ptr<Foo> unique(raw); // Now unique_ptr is responsible for pointer ownership: don't call delete
// on `raw`! Destructor of unique_ptr will call the `Foo` destructor.
}
```
%% Cell type:markdown id: tags:
## `unique_ptr`
This should be your first choice for a smart pointer.
The idea behind this smart pointer is that it can't be copied: there is exactly one instance of the smart pointer, and when this instance goes out of scope the resources are properly released.
In C++ 11 you had to use the classic `new` syntax to create one, but C++ 14 introduced a specific syntax `make_unique`:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <memory>
{
auto ptr = std::make_unique<int>(5);
}
```
%% Cell type:markdown id: tags:
The parentheses take the constructor arguments.
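For instance, with a constructor taking several arguments (a sketch; `MakeRepeated` is a hypothetical helper name, and the constructor used is the `std::string` one which repeats a character a given number of times):

```cpp
#include <cstddef>
#include <memory>
#include <string>

std::unique_ptr<std::string> MakeRepeated(std::size_t count, char character)
{
    // Same arguments as in `std::string(count, character)`, but the object is allocated dynamically.
    return std::make_unique<std::string>(count, character);
}
```

`*MakeRepeated(3, 'a')` is thus the string `"aaa"`.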
The smart pointer can't be copied, but it can be moved:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <memory>
{
auto ptr = std::make_unique<int>(5);
auto copy = ptr; // COMPILATION ERROR: can't be copied!
}
```
%% Cell type:code id: tags:
``` C++17
#include <memory>
``` c++
%%cppmagics clang
#include <cstdlib>
#include <iostream>
#include <memory>
int main([[maybe_unused]] int argc, [[maybe_unused]] char** argv)
{
auto ptr = std::make_unique<int>(5);
auto copy = std::move(ptr);
auto moved_ptr = std::move(ptr);
// std::cout << "Beware as now there are no guarantee upon the content of ptr: " << *ptr << std::endl;
// < This line is invalid (using `ptr` after move is undefined behaviour) and makes Xeus-cling crash
std::cout << "Beware as now there are no guarantee upon the content of ptr: " << *ptr << std::endl; // EXPECTED RUNTIME ISSUE!
return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
As usual with move semantics, beware in this second case: ptr is undefined after the `move` occurred... (this code run on [Coliru](http://coliru.stacked-crooked.com/a/a1aa87e64f64c9e8) leads to a more explicit segmentation fault).
As usual with move semantics, beware in this second case: `ptr` must not be dereferenced after the `move` occurred (a moved-from `std::unique_ptr` holds `nullptr`)... hence the segmentation fault you might have got.
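If you really need to know whether a `std::unique_ptr` still owns something, comparing it to `nullptr` is a safe check, contrary to dereferencing it. A minimal sketch (the function name is hypothetical):

```cpp
#include <memory>
#include <utility>

bool IsNullAfterMove()
{
    auto ptr = std::make_unique<int>(5);
    auto moved_ptr = std::move(ptr);

    // For std::unique_ptr the standard guarantees the moved-from pointer holds nullptr;
    // testing it is fine, dereferencing it is not.
    return ptr == nullptr && moved_ptr != nullptr && *moved_ptr == 5;
}
```

This guarantee is specific to `std::unique_ptr`; for other types it is safer not to assume anything about the content of a moved-from object.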
### Usage to store data in a class
`std::unique_ptr` are a really good choice to store objects in a class, especially ones that do not have a default constructor.
You may always define an object directly as a data attribute without pointer indirection, but in this case you have to call explicitly the constructor of the data attribute with the `:` syntax before the body of the constructor (that's exactly what we did when we introduced composition [back in the inheritance notebook](../2-ObjectProgramming/6-inheritance.ipynb#CONTAINS-A-relationship-of-composition)). By using a (smart) pointer, you loosen this constraint and may define the data attribute whenever you wish, not only at construction.
The underlying object may be accessed through reference or raw pointer; usually your class may look like:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <string>
// Class which will be stored in another one through a `unique_ptr`
class Content
{
public:
Content(std::string&& text); // notice: no default constructor!
const std::string& GetValue() const;
private:
std::string text_ {};
};
```
%% Cell type:code id: tags:
``` C++17
``` c++
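#include <utility>
Content::Content(std::string&& text)
: text_(std::move(text)) // `text` is an l-value here: std::move is required to actually move it
{ }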
```
%% Cell type:code id: tags:
``` C++17
``` c++
const std::string& Content::GetValue() const
{
return text_;
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
#include <memory>
class WithUniquePtr
{
public:
WithUniquePtr() = default;
void Init(std::string&& text); // rather artificial here, but we want to point out it can be done anywhere and not just in constructor!
const Content& GetContent() const; // adding `noexcept` would be even better but Xeus-cling
// doesn't like it!
const Content& GetContent() const;
private:
//! Store `Content`object through a smart pointer.
std::unique_ptr<Content> content_ { nullptr };
};
```
%% Cell type:code id: tags:
``` C++17
``` c++
void WithUniquePtr::Init(std::string&& text)
{
content_ = std::make_unique<Content>(std::move(text));
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
%%cppmagics cppyy/cppdef
#include <cassert>
const Content& WithUniquePtr::GetContent() const
{
assert(content_ != nullptr && "Make sure Init() has been properly called beforehand!");
return *content_;
}
```
%% Cell type:markdown id: tags:
Doing so:
* `Content` is stored by a `unique_ptr`, which will manage the destruction in due time of the object (when the `WithUniquePtr` object will be destroyed).
* The `Content` object might be manipulated through a reference; the end-user doesn't even need to know the resource was stored through a (smart) pointer:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
void PrintContent(const Content& content)
{
std::cout << content.GetValue() << std::endl;
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
{
auto obj = WithUniquePtr(); // auto-to-stick syntax, to avoid most vexing parse.
obj.Init("My priceless text here!");
decltype(auto) content = obj.GetContent();
PrintContent(content);
}
```
%% Cell type:markdown id: tags:
(if you need a refresher about most vexing parse and auto-to-stick syntax, it's [here](../2-ObjectProgramming/3-constructors-destructor.ipynb#[WARNING]-How-to-call-a-constructor-without-argument)).
%% Cell type:markdown id: tags:
### Releasing a `unique_ptr`
To free manually the content of a `unique_ptr`:
To free manually the content of a `unique_ptr`, assign `nullptr` to the pointer:
* Use `release()` method:
%% Cell type:markdown id: tags:
struct Class
{
explicit Class(int a)
: a_ { a }
{ }
~Class()
{
std::cout << "Release object with value " << a_ << '\n';
}
private:
int a_ {};
};
%% Cell type:code id: tags:
``` C++17
``` c++
#include <memory>
{
auto ptr = std::make_unique<int>(5);
ptr.release(); // Beware: `.` and not `->` as it is a method of the smart pointer class, not of the
// underlying class!
auto ptr = std::make_unique<Class>(5);
ptr = nullptr;
}
```
%% Cell type:markdown id: tags:
* Or assign `nullptr` to the pointer
#### Beware: `release()` doesn't do what you might think it does!
%% Cell type:markdown id: tags:
Smart pointer classes provide a `release()` method, but what they actually release is **ownership**, not memory.
%% Cell type:code id: tags:
``` C++17
``` c++
{
auto ptr = std::make_unique<int>(5);
ptr = nullptr;
auto ptr = std::make_unique<Class>(5);
Class* raw_ptr = ptr.release(); // Beware: `.` and not `->` as it is a method of the smart pointer class, not of the
// underlying class!
}
```
%% Cell type:markdown id: tags:
As you can see, there is no call to the destructor: the role of `release()` is to hand over the ownership of the allocated memory to `raw_ptr`, which now has the **responsibility** of freeing the memory.
What we ought to do to properly clean up memory is therefore to call `delete` on it (see [here](../1-ProceduralProgramming/5-DynamicAllocation.ipynb#Heap-and-free-store) if you need a refresher on memory allocation).
%% Cell type:code id: tags:
``` c++
{
auto ptr = std::make_unique<Class>(5);
Class* raw_ptr = ptr.release(); // Beware: `.` and not `->` as it is a method of the smart pointer class, not of the
// underlying class!
delete raw_ptr;
}
```
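%% Cell type:markdown id: tags:
Two alternatives avoid the manual `delete`, sketched below with a hypothetical `Counted` class which counts the calls to its destructor: the `reset()` method, which does free the memory, and the immediate transfer of the released pointer to another smart pointer.

```cpp
#include <memory>

// Hypothetical class for illustration: counts the calls to its destructor.
struct Counted
{
    ~Counted()
    {
        ++Ndestroyed();
    }

    static int& Ndestroyed()
    {
        static int counter = 0;
        return counter;
    }
};

int DestructionsAfterReset()
{
    Counted::Ndestroyed() = 0;
    auto ptr = std::make_unique<Counted>();
    ptr.reset(); // destroys the underlying object and sets `ptr` to nullptr
    return Counted::Ndestroyed();
}

int DestructionsAfterRelease()
{
    Counted::Ndestroyed() = 0;
    auto ptr = std::make_unique<Counted>();
    std::unique_ptr<Counted> new_owner(ptr.release()); // ownership handed over: nothing destroyed yet
    return Counted::Ndestroyed(); // `new_owner` destroys the object when the function ends
}
```

The first function reports one destruction at the time of the `return`, the second none: the object is destroyed only when `new_owner` goes out of scope.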
%% Cell type:markdown id: tags:
## `shared_ptr`
The philosophy of `shared_ptr` is different: this kind of smart pointers is fully copyable, and each time a copy is issued an internal counter is incremented (and decremented each time a copy is destroyed). When this counter reaches 0, the underlying object is properly destroyed.
As for `unique_ptr`, there is a specific syntax to build them (aptly named `make_shared`...); it was introduced earlier (C++ 11) and is not just cosmetic: the implementation is then able to store the counter more cleverly (typically alongside the object, in a single memory allocation) if you use `make_shared` rather than `new` (so make it so!).
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
#include <memory>
{
std::shared_ptr<double> ptr = std::make_shared<double>(5.);
std::cout << "Nptr = " << ptr.use_count() << std::endl;
auto ptr2 = ptr;
std::cout << "Nptr = " << ptr.use_count() << std::endl;
//< Notice the `.`: we access a method from std::shared_ptr, not from the type encapsulated
// by the pointer!
}
```
%% Cell type:markdown id: tags:
`shared_ptr` are clearly useful, but you should always wonder first if you really need them: for most uses a `unique_ptr`, possibly complemented by raw pointers extracted by `get()`, is enough.
There is also a risk of not properly releasing the memory if there is a circular dependency between two `shared_ptr`. A variation of this pointer named `weak_ptr` makes it possible to circumvent this issue, but is a bit tedious to set up. I have written in [appendix](../7-Appendix/WeakPtr.ipynb) a notebook to describe how to do so.
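Without going into the circular dependency case covered in the appendix, here is a minimal sketch of the basic `weak_ptr` mechanics (the function name is hypothetical): a `weak_ptr` observes an object without incrementing the counter, and must be converted into a `shared_ptr` with `lock()` before the object is used.

```cpp
#include <memory>

bool WeakPtrFollowsOwner()
{
    std::weak_ptr<int> observer;
    bool ok = true;

    {
        auto owner = std::make_shared<int>(5);
        observer = owner;
        ok = ok && owner.use_count() == 1; // the weak_ptr did not increment the counter

        if (auto locked = observer.lock()) // temporary shared_ptr to access the object safely
            ok = ok && *locked == 5;
        else
            ok = false;
    }

    // `owner` is out of scope: the object was destroyed and the weak_ptr knows about it.
    return ok && observer.expired();
}
```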
%% Cell type:markdown id: tags:
## Efficient storage with vectors of smart pointers
* `std::vector` are cool, but the copy when capacity is exceeded might be very costly for some objects. Moreover, it forces you to provide copy behaviour to your classes intended to be stored in `std::vector`, which is not a good idea if you do not want them to be copied.
* An idea could be to use pointers: copy is cheap, and there is no need to copy the underlying objects when the capacity is exceeded. Another good point is that the same object might be stored in two different containers, and the modifications made through one of them are immediately "seen" by the other (as the underlying object is the same).
However, when this `std::vector` of pointers is destroyed the objects inside aren't properly deleted, provoking memory leaks.
The way to combine advantages without retaining the flaws is to use a vector of smart pointers:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <array>
class NotCopyable
{
public:
NotCopyable(double value);
~NotCopyable();
NotCopyable(const NotCopyable& ) = delete;
NotCopyable& operator=(const NotCopyable& ) = delete;
NotCopyable(NotCopyable&& ) = delete;
NotCopyable& operator=(NotCopyable&& ) = delete;
private:
std::array<double, 1000> data_;
};
```
%% Cell type:code id: tags:
``` C++17
``` c++
NotCopyable::NotCopyable(double value)
{
data_.fill(value);
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
NotCopyable::~NotCopyable()
{
std::cout << "Call to NotCopyable destructor!" << std::endl;
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
#include <memory>
#include <vector>
#include <iostream>
{
std::vector<std::unique_ptr<NotCopyable>> list;
for (double x = 0.; x < 8.; x += 1.1)
{
std::cout << "Capacity = " << list.capacity() << std::endl;
list.emplace_back(std::make_unique<NotCopyable>(x)); // emplace_back builds the new element in place; here it is move-constructed from the temporary unique_ptr
}
}
```
%% Cell type:markdown id: tags:
Doing so:
- The `NotCopyable` are properly stored in a container.
- No costly copy occurred: there were just a few moves of `unique_ptr` when the capacity was exceeded.
- The memory is properly freed when the `list` goes out of scope.
- And as we saw in previous section, the underlying data remains accessible through reference or raw pointer if needed.
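The last point may be sketched as follows, with `double` standing in for a heavier class and a hypothetical `Sum` function which only reads the stored objects:

```cpp
#include <memory>
#include <vector>

// The container is passed by reference and never copied (it could not be anyway).
double Sum(const std::vector<std::unique_ptr<double>>& list)
{
    double ret = 0.;

    for (const auto& ptr : list) // `const auto&` is mandatory: a unique_ptr can't be copied
        ret += *ptr;

    return ret;
}
```

Iterating with a plain `auto ptr` would not compile, as it would require copying each `unique_ptr`.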
%% Cell type:markdown id: tags:
#### Using a trait as syntactic sugar
I like to create aliases in my classes to provide more readable code:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <array>
#include <vector>
class NotCopyable2
{
public:
// Trait to alias the vector of smart pointers.
using vector_unique_ptr = std::vector<std::unique_ptr<NotCopyable2>>;
NotCopyable2(double value);
NotCopyable2(const NotCopyable2& ) = delete;
NotCopyable2& operator=(const NotCopyable2& ) = delete;
NotCopyable2(NotCopyable2&& ) = delete;
NotCopyable2& operator=(NotCopyable2&& ) = delete;
private:
std::array<double, 1000> data_; // not copying it too much would be nice!
};
```
%% Cell type:code id: tags:
``` C++17
``` c++
NotCopyable2::NotCopyable2(double value)
{
data_.fill(value);
}
```
%% Cell type:code id: tags:
``` C++17
``` c++
#include <iostream>
#include <memory>
#include <type_traits>
#include <vector>
{
// Use the alias
NotCopyable2::vector_unique_ptr list;
// or not: it amounts to the same!
std::vector<std::unique_ptr<NotCopyable2>> list2;
// std::boolalpha is just a stream manipulator to write 'true' or 'false' for a boolean
std::cout << std::boolalpha << std::is_same<NotCopyable2::vector_unique_ptr, std::vector<std::unique_ptr<NotCopyable2>>>() << std::endl;
}
```
%% Cell type:markdown id: tags:
This simplifies the reading, especially if templates are also involved...
%% Cell type:markdown id: tags:
[© Copyright](../COPYRIGHT.md)
......
......@@ -81,16 +81,15 @@
],
"metadata": {
"kernelspec": {
"display_name": "C++17",
"language": "C++17",
"name": "xcpp17"
"display_name": "Cppyy",
"language": "c++",
"name": "cppyy"
},
"language_info": {
"codemirror_mode": "text/x-c++src",
"codemirror_mode": "c++",
"file_extension": ".cpp",
"mimetype": "text/x-c++src",
"name": "c++",
"version": "17"
"name": "c++"
},
"latex_envs": {
"LaTeX_envs_menu_present": true,
......
%% Cell type:markdown id: tags:
# [Getting started in C++](./) - [Useful concepts and STL](./0-main.ipynb) - [Algorithms](./7-Algorithms.ipynb)
%% Cell type:markdown id: tags:
## Introduction
%% Cell type:markdown id: tags:
Even if C++ can't be qualified as a _batteries included_ language like Python (until C++ 17 there was no proper filesystem management, and the support of this feature was still shaky at best in several STL implementations circa 2019...), there are plenty of algorithms that are already provided within the STL.
We obviously won't list them all here - the mighty [The C++ Standard Library: A Tutorial and Reference](../bibliography.ipynb#The-C++-Standard-Library:-A-Tutorial-and-Reference), which is more than 1000 pages long, doesn't do it either! - but show a few examples of how to use them. For instance, many STL algorithms rely upon iterators: this way the same algorithm may be used on `std::vector`, `std::list`, and so on...
A side note: if a STL class provides a method which has a namesake algorithm, use the method. For instance there is a `std::sort` algorithm, but `std::list` provides its own `sort()` method which takes advantage of the underlying structure of the object (the generic algorithm requires random-access iterators, which `std::list` does not provide).
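A minimal sketch of the method flavour for `std::list` (the function name `SortedCopy` is hypothetical):

```cpp
#include <list>

// The list is taken by value so that the caller's object is left untouched.
std::list<int> SortedCopy(std::list<int> values)
{
    values.sort(); // method of std::list, which relinks the nodes instead of moving the values
    return values;
}
```

The method also accepts an optional comparison argument, as `std::sort` does.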
## Example: `std::sort`
%% Cell type:code id: tags:
``` C++17
``` c++
#include <algorithm>
#include <vector>
#include <iostream>
std::vector<int> int_vec { -9, 87, 11, 0, -21, 100 };
std::sort(int_vec.begin(), int_vec.end());
for (auto item : int_vec)
std::cout << item << " ";
```
%% Cell type:code id: tags:
``` C++17
``` c++
#include <algorithm>
#include <deque>
#include <functional>
#include <iostream>
std::deque<double> double_deque { -9., 87., 11., 0., -21., 100. };
std::sort(double_deque.begin(), double_deque.end(), std::greater<double>()); // optional third parameter is used
for (auto item : double_deque)
std::cout << item << " ";
```
%% Cell type:markdown id: tags:
As you can see, the same algorithm works upon two different types of containers. It requires non constant iterators (the elements are reordered in place); an optional third argument to `std::sort` enables you to provide your own comparison criterion.
Lambda functions may be used as well to provide the comparison to use:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <algorithm>
#include <vector>
#include <iostream>
{
std::vector<int> int_vec { -9, 87, 11, 0, -21, 100 };
std::sort(int_vec.begin(), int_vec.end(),
[](auto lhs, auto rhs)
{
const bool is_lhs_even = (lhs % 2 == 0);
const bool is_rhs_even = (rhs % 2 == 0);
// Even must be ordered first, then odds
// Granted, this is really an oddball choice..
if (is_lhs_even && !is_rhs_even)
return true;
if (is_rhs_even && !is_lhs_even)
return false;
return lhs < rhs;
});
for (auto item : int_vec)
std::cout << item << " ";
}
```
%% Cell type:markdown id: tags:
Of course, we may use this on something other than `begin()` and `end()`; we just have to make sure iterators are valid:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <algorithm>
#include <iterator>
#include <vector>
#include <iostream>
#include <cassert>
{
std::vector<int> int_vec { -9, 87, 11, 0, -21, 100 };
// equivalent to 'auto it = int_vec.begin() + 4;'
// but std::advance is a more generic way to increment iterators on ranges. Requires #include <iterator>
auto it = int_vec.begin();
std::advance(it, 4);
assert(it < int_vec.end()); // Important condition to check iterator means something!
std::sort(int_vec.begin(), it); // Only the first four elements are sorted.
for (auto item : int_vec)
std::cout << item << " ";
}
```
%% Cell type:markdown id: tags:
## `std::find`
I will also show examples of `std::find` as it provides an additional common practice: it returns an iterator, and there is a specific behaviour if the algorithm failed to find something.
%% Cell type:code id: tags:
``` C++17
``` c++
#include <vector>
#include <algorithm>
#include <iterator>
#include <iostream>
{
std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21 };
const auto it = std::find(int_vec.cbegin(), int_vec.cend(), -21);
if (it != int_vec.cend())
// equivalent to `it - int_vec.cbegin()`
// but std::distance is a more generic way to get iterators position in ranges. Requires #include <iterator>
std::cout << "Found at position " << std::distance(int_vec.cbegin(), it) << std::endl;
else
std::cout << "Not found." << std::endl;
}
```
%% Cell type:markdown id: tags:
As you can see, `std::find` returns the first instance in the iterator range (and you can also do arithmetic over the iterators). You may know how many instances there are with `std::count`:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <vector>
#include <algorithm>
#include <iostream>
{
std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 };
const auto count = std::count(int_vec.cbegin(), int_vec.cend(), -21);
std::cout << "There are " << count << " instances of -21." << std::endl;
}
```
%% Cell type:markdown id: tags:
If you want to use a condition rather than a value, there are dedicated versions of the algorithms to do so:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <vector>
#include <algorithm>
#include <iostream>
{
std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 };
const auto count = std::count_if(int_vec.cbegin(), int_vec.cend(),
[](int value)
{
return value % 2 == 0;
});
std::cout << "There are " << count << " even values in the list." << std::endl;
}
```
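%% Cell type:markdown id: tags:
The same goes for `std::find`, whose conditional flavour is `std::find_if`. A sketch, in which the function name and the convention of returning -1 when nothing is found are choices made for this illustration:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// Position of the first even value, or -1 if there is none.
std::ptrdiff_t PositionOfFirstEven(const std::vector<int>& values)
{
    const auto it = std::find_if(values.cbegin(), values.cend(),
                                 [](int value)
                                 {
                                     return value % 2 == 0;
                                 });

    if (it == values.cend())
        return -1;

    return std::distance(values.cbegin(), it);
}
```

As for `std::find`, the end iterator is the conventional way for the algorithm to tell that nothing matched.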
%% Cell type:markdown id: tags:
## Output iterators and `std::back_inserter`
Some algorithms require output iterators: they don't work only upon existing content but need to shove new data somewhere. You must in this case provide the adequate memory beforehand:
%% Cell type:code id: tags:
``` C++17
``` c++
#include <vector>
#include <algorithm>
#include <iostream>
{
std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 };
std::vector<int> odd_only;
std::copy_if(int_vec.cbegin(), int_vec.cend(),
odd_only.begin(),
[](int value)
{
return value % 2 != 0;
}
); // SHOULD MAKE YOUR KERNEL CRASH!
}
```
%% Cell type:markdown id: tags:
The issue is that the memory is not allocated first: the algorithm doesn't provide the memory at destination! (the reason is that an algorithm is as generic as possible; here `std::copy_if` is expected to work as well with `std::set`... and `std::vector` and `std::set` don't use the same API to allocate the memory).
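A first fix is therefore to size the destination before the call. A sketch of this approach, in which the function name is hypothetical and the number of elements to copy is computed with `std::count_if`:

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

std::vector<int> CopyOddValues(const std::vector<int>& source)
{
    const auto is_odd = [](int value)
    {
        return value % 2 != 0;
    };

    // First pass: find out how many elements the destination must hold.
    const auto Nodd = std::count_if(source.cbegin(), source.cend(), is_odd);

    // The memory is allocated (and the elements value-initialized) before the copy.
    std::vector<int> ret(static_cast<std::size_t>(Nodd));

    // Second pass: `ret.begin()` now points to valid elements which are overwritten.
    std::copy_if(source.cbegin(), source.cend(), ret.begin(), is_odd);

    return ret;
}
```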
Of course, in some cases it is tricky to know in advance what you need, and here computing it previously with `std::count_if` adds an additional operation. There is actually a way to tell the program to insert the values by `push_back` with `std::back_inserter`; it might be a good idea to reserve enough memory to use this method without recopy:
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <algorithm>
#include <iostream>
{
std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 };
std::vector<int> odd_only;
odd_only.reserve(int_vec.size()); // at most all elements of int_vec will be there
std::copy_if(int_vec.cbegin(), int_vec.cend(),
std::back_inserter(odd_only),
[](int value)
{
return value % 2 != 0;
}
);
// And if you're afraid to have used too much memory with your `reserve()` call,
// you may call shrink_to_fit() method here.
std::cout << "The odd values are: ";
for (auto item : odd_only)
std::cout << item << " ";
}
```
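%% Cell type:markdown id: tags:
If you would rather provide the memory beforehand, as the first attempt above intended, here is a minimal sketch of how it might look. `CopyOddPreallocated` is a hypothetical helper written for this illustration, not something from the STL: the destination is sized for the worst case, and the iterator returned by `std::copy_if` indicates where the written values stop so that the unused tail can be erased.
%% Cell type:code id: tags:
``` c++
#include <algorithm>
#include <vector>

// Hypothetical helper (name chosen for this illustration only).
std::vector<int> CopyOddPreallocated(const std::vector<int>& source)
{
    // The memory is provided beforehand: at most all elements of `source` are copied.
    std::vector<int> result(source.size());

    // std::copy_if returns an iterator just past the last element written.
    auto end_of_written = std::copy_if(source.cbegin(), source.cend(),
                                       result.begin(),
                                       [](int value)
                                       {
                                           return value % 2 != 0;
                                       });

    // Drop the default-initialized elements that were never overwritten.
    result.erase(end_of_written, result.end());
    return result;
}
```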
%% Cell type:markdown id: tags:
## The different kinds of iterators
`std::back_inserter` works only with containers that provide a `push_back()` method. This may be generalized: the fact that algorithms rely upon iterators to make them as generic as possible doesn't mean each algorithm will work on any container.
There are actually several kinds of iterators:
* **[Forward iterators](http://www.cplusplus.com/reference/iterator/ForwardIterator/)**, which you may only iterate forward. For instance `std::forward_list` or `std::unordered_map` provide such iterators.
* **[Bidirectional iterators](http://www.cplusplus.com/reference/iterator/BidirectionalIterator/)**, which you may also iterate backward. For instance `std::list` or `std::map` provide such iterators.
* **[Random-access iterators](http://www.cplusplus.com/reference/iterator/RandomAccessIterator/)**, which are bidirectional iterators that additionally provide random access (through an index). Think of `std::vector` or `std::string`.
When you go on [cppreference](https://en.cppreference.com/w/) (or in [The C++ Standard Library: A Tutorial and Reference](../bibliography.ipynb#The-C++-Standard-Library:-A-Tutorial-and-Reference)), the name of the template parameter explicitly describes which kind of iterator is actually used.
Besides this classification, algorithms also distinguish between **input iterators** (which are read-only) and **output iterators**, which assume you will write new content there.
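The category of an iterator can be queried at compile time through `std::iterator_traits`. The sketch below is only an illustration: `HasIteratorCategory` is a hypothetical helper introduced here, and the `static_assert` lines simply restate the classification given above for a few containers.
%% Cell type:code id: tags:
``` c++
#include <forward_list>
#include <iterator>
#include <list>
#include <type_traits>
#include <vector>

// Hypothetical helper: true if the iterators of `ContainerT` are at least of category `TagT`.
// The standard category tags inherit from one another, hence the use of std::is_base_of_v.
template<class ContainerT, class TagT>
constexpr bool HasIteratorCategory()
{
    using category =
        typename std::iterator_traits<typename ContainerT::iterator>::iterator_category;
    return std::is_base_of_v<TagT, category>;
}

// std::vector: random access (and therefore bidirectional and forward as well).
static_assert(HasIteratorCategory<std::vector<int>, std::random_access_iterator_tag>());
// std::list: bidirectional but not random access.
static_assert(HasIteratorCategory<std::list<int>, std::bidirectional_iterator_tag>());
static_assert(!HasIteratorCategory<std::list<int>, std::random_access_iterator_tag>());
// std::forward_list: forward only.
static_assert(HasIteratorCategory<std::forward_list<int>, std::forward_iterator_tag>());
static_assert(!HasIteratorCategory<std::forward_list<int>, std::bidirectional_iterator_tag>());
```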
%% Cell type:markdown id: tags:
## Algorithm: read the documentation first!
You should really **carefully read the documentation** before using an algorithm: it might not behave as you believe...
I will provide two examples:
%% Cell type:markdown id: tags:
### std::remove_if
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <algorithm>
#include <iostream>
{
std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 };
std::remove_if(int_vec.begin(), int_vec.end(),
[](int value)
{
return value % 2 != 0;
});
std::cout << "The even values are (or not...): ";
for (auto item : int_vec)
std::cout << item << " ";
std::cout << std::endl;
}
```
%% Cell type:markdown id: tags:
So what happens? [cplusplus.com](http://www.cplusplus.com/reference/algorithm/remove_if/) states that it _Transforms the range \[first,last) into a range with all the elements for which pred returns true removed, and returns an iterator to the new end of that range_, where _pred_ is the third parameter of `std::remove_if` (the lambda function in our case).
In other words, `std::remove_if`:
* Places at the beginning of the vector the values to be kept.
* Returns an iterator to the **logical end** of the expected series...
* But does not deallocate the memory! (and keeps the container's `size()` - see below)
So to print the relevant values only, you should do:
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <algorithm>
#include <iostream>
{
std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 };
auto logical_end = std::remove_if(int_vec.begin(), int_vec.end(),
[](int value)
{
return value % 2 != 0;
});
std::cout << "The even values are: ";
for (auto it = int_vec.cbegin(); it != logical_end; ++it) // see the use of `logical_end` here!
std::cout << *it << " ";
std::cout << std::endl;
std::cout << "But the size of the vector is still " << int_vec.size() << std::endl;
}
```
%% Cell type:markdown id: tags:
And if you want to reduce this size, you should use the `std::vector::erase()` method:
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <algorithm>
#include <iostream>
{
std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 };
auto logical_end = std::remove_if(int_vec.begin(), int_vec.end(),
[](int value)
{
return value % 2 != 0;
});
int_vec.erase(logical_end, int_vec.end());
std::cout << "The even values are: ";
for (auto item : int_vec)
std::cout << item << " ";
std::cout << std::endl;
std::cout << "And the size of the vector is correctly " << int_vec.size() << std::endl;
}
```
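%% Cell type:markdown id: tags:
Both calls are commonly chained in a single statement. Here is a minimal sketch wrapping them in a helper; `EraseIf` is a name chosen for this illustration only. As far as I know, C++ 20 provides `std::erase_if` for the standard containers, which plays the same role.
%% Cell type:code id: tags:
``` c++
#include <algorithm>
#include <vector>

// Hypothetical helper (name chosen for this illustration only).
template<class T, class PredicateT>
void EraseIf(std::vector<T>& vector, PredicateT predicate)
{
    // std::remove_if moves the values to keep to the front and returns the logical end;
    // erase() then actually shrinks the vector.
    vector.erase(std::remove_if(vector.begin(), vector.end(), predicate),
                 vector.end());
}
```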
%% Cell type:markdown id: tags:
### std::unique
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <algorithm>
#include <iostream>
{
std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 };
std::unique(int_vec.begin(), int_vec.end());
std::cout << "The unique values are (or not...): ";
for (auto item : int_vec)
std::cout << item << " ";
std::cout << std::endl;
}
```
%% Cell type:markdown id: tags:
So what happened this time? If you look at [cplusplus.com](http://www.cplusplus.com/reference/algorithm/unique/) you may see the headline is _Remove **consecutive** duplicates in range_.
So to make it work you need to sort the container first (or use a home-made algorithm if you need to preserve the original ordering):
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <algorithm>
#include <iostream>
{
std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 };
std::sort(int_vec.begin(), int_vec.end());
std::unique(int_vec.begin(), int_vec.end());
std::cout << "The unique values are (really this time?): ";
for (auto item : int_vec)
std::cout << item << " ";
std::cout << std::endl;
}
```
%% Cell type:markdown id: tags:
We still get too many values... but if you understood the `remove_if` example the reason should be obvious; the fix is exactly the same:
%% Cell type:code id: tags:
``` c++
#include <vector>
#include <algorithm>
#include <iostream>
{
std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 };
std::sort(int_vec.begin(), int_vec.end());
auto logical_end = std::unique(int_vec.begin(), int_vec.end());
std::cout << "The unique values are (really this time!): ";
for (auto it = int_vec.begin(); it != logical_end; ++it)
std::cout << *it << " ";
std::cout << std::endl;
}
```
%% Cell type:markdown id: tags:
Personally I have in my Utilities library a function `EliminateDuplicate()` which calls both in a row:
%% Cell type:code id: tags:
``` c++
#include <algorithm>
#include <vector>
template<class T>
void EliminateDuplicate(std::vector<T>& vector)
{
std::sort(vector.begin(), vector.end());
vector.erase(std::unique(vector.begin(), vector.end()), vector.end());
}
```
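%% Cell type:markdown id: tags:
If the original ordering must be preserved, sorting is not an option and a home-made algorithm is needed, as mentioned earlier. The following is only a sketch under stated assumptions: `EliminateDuplicatePreservingOrder` is a hypothetical name, and the approach assumes `T` can be stored in a `std::unordered_set` (i.e. it is hashable and comparable with `==`).
%% Cell type:code id: tags:
``` c++
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical helper: keeps only the first occurrence of each value, in the original order.
template<class T>
void EliminateDuplicatePreservingOrder(std::vector<T>& vector)
{
    std::unordered_set<T> already_seen;
    std::vector<T> result;
    result.reserve(vector.size()); // at most all elements are kept

    for (const auto& value : vector)
    {
        // insert() returns a pair; its second member is true only if the value was not there yet.
        if (already_seen.insert(value).second)
            result.push_back(value);
    }

    vector = std::move(result);
}
```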
%% Cell type:markdown id: tags:
## Parallel execution of algorithms
C++ 17 introduced an optional first argument for many STL algorithms that enables their use in a parallel and/or vectorized context; see [here](https://en.cppreference.com/w/cpp/algorithm/execution_policy_tag) on Cppreference for more details.
As indicated there, there are currently four policies available:
- `std::execution::sequenced_policy`
- `std::execution::parallel_policy`
- `std::execution::parallel_unsequenced_policy`
- `std::execution::unsequenced_policy`
with others under study (`std::parallel::cuda` and `std::parallel::opencl` are mentioned).
I have no direct experience with it as it is not widely supported yet by compilers (my very recent Apple Clang 15.0.0 from February 2024 doesn't support it yet) and it relies upon other libraries under the hood such as tbb.
However, it is a compelling argument to use algorithms instead of hand-made loops and functions: it should provide an easy way to optimize your code (even if as noted [on Cppreference](https://en.cppreference.com/w/cpp/algorithm/execution_policy_tag_t) you need to take usual precautions against data races).
%% Cell type:markdown id: tags:
## Conclusion
My point was absolutely not to tell you not to use the STL algorithms; on the contrary it is better not to reinvent the wheel, especially considering you would likely end up with a less efficient version of the algorithm!
You need however to be very careful: sometimes the names are unfortunately misleading, and you should always check a function does the job you have in mind. Algorithms were written to be as generic as possible, and can't do some operations such as allocate or deallocate memory as it would break this genericity.
I have barely scratched the surface; many algorithms are extremely useful. So whenever you want to proceed with a transformation that is likely common (check a range is sorted, partition a list in a specific way, finding minimum and maximum, etc...) it is highly likely the STL has something in store for you.
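As a small illustration of the operations just mentioned, here is a sketch combining `std::is_sorted`, `std::minmax_element` and `std::partition`. The `RangeSummary` struct and the `Summarize` function are hypothetical names introduced for this example only; the function assumes a non-empty input.
%% Cell type:code id: tags:
``` c++
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical aggregate used only for this illustration.
struct RangeSummary
{
    bool is_sorted;
    int min;
    int max;
    std::size_t even_count;
};

// `values` is taken by copy on purpose: std::partition reorders the elements.
RangeSummary Summarize(std::vector<int> values)
{
    assert(!values.empty() && "Iterators returned by std::minmax_element are dereferenced.");

    RangeSummary ret;
    ret.is_sorted = std::is_sorted(values.cbegin(), values.cend());

    // Read min and max before the partition below shuffles the elements.
    const auto [min_it, max_it] = std::minmax_element(values.cbegin(), values.cend());
    ret.min = *min_it;
    ret.max = *max_it;

    // Even values are moved to the front; the returned iterator points to the first odd one.
    const auto first_odd = std::partition(values.begin(), values.end(),
                                          [](int value)
                                          {
                                              return value % 2 == 0;
                                          });
    ret.even_count = static_cast<std::size_t>(std::distance(values.begin(), first_odd));
    return ret;
}
```
%% Cell type:markdown id: tags: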
The reading of [Functional programming in C++](../bibliography.ipynb#Functional-Programming-in-C++) should provide more incentive to use them.
It is also important to highlight that while the STL algorithms may provide you efficiency (this library is written by highly skilled engineers after all), this is not its main draw: the algorithms are written to be as generic as possible. The primary reason to use them is to allow you to think at a higher level of abstraction, not to get the fastest possible implementation. So if your ~~intuition~~ benchmarking has shown that the standard library is causing a critical slowdown, you are free to explore classic alternatives such as [loop unrolling](https://en.wikipedia.org/wiki/Loop_unrolling) - that's one of the strengths of the language (and the STL itself opens up this possibility directly for some of its constructs - you may for instance use your own memory allocator when defining a container). For most purposes however that will not be necessary.
FYI, C++ 20 introduces a completely new way to deal with algorithms, which does not rely on direct use of iterators but instead on a range library. This leads to a syntax which is more akin to what is done in other languages:
%% Cell type:code id: tags:
``` c++
%%cppmagics clang
// std::views not yet well handled by cppyy
#include <cstdlib> // for EXIT_SUCCESS
#include <iostream>
#include <ranges>
#include <vector>
int main([[maybe_unused]] int argc, [[maybe_unused]] char** argv)
{
std::vector<int> numbers = { 3, 5, 12, 17, 21, 27, 28 };
auto results = numbers | std::views::filter([](int n){ return n % 3 == 0; })
| std::views::transform([](int n){ return n / 3; });
for (auto v: results)
std::cout << v << " "; // 1 4 7 9
return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
Having no first-hand experience of it, I really can't say more about it, but don't be astonished if you meet such a syntax in a C++ program; you may learn a bit more for instance [here](https://www.modernescpp.com/index.php/c-20-the-ranges-library).
%% Cell type:markdown id: tags:
[© Copyright](../COPYRIGHT.md)
%% Cell type:markdown id: tags:
# [Getting started in C++](./) - [C++ in a real environment](./0-main.ipynb) - [Set up environment](./1-SetUpEnvironment.ipynb)
%% Cell type:markdown id: tags:
## Introduction
I will present here briefly how to set up a minimal development environment... only in Unix-like systems: sorry for Windows developers, but I have never set up a Windows environment for development. You may have a look at WSL, which is gaining traction and enables you to use Linux inside Windows.
This will explain installation for two mainstream compilers: [GNU compiler for C++](https://en.wikipedia.org/wiki/GNU_Compiler_Collection) and [clang++](https://en.wikipedia.org/wiki/Clang).
## Installing a compiler
**Note:** Compilers themselves will be addressed more in depth in an [upcoming notebook](3-Compilers.ipynb).
### Ubuntu / Debian
%% Cell type:markdown id: tags:
#### Clang
To install `clang` you need to specify explicitly the required version, e.g.:
%% Cell type:code id: tags:
``` c++
// In a terminal
sudo apt-get install -y clang++-13
```
%% Cell type:markdown id: tags:
#### Default g++
%% Cell type:code id: tags:
``` c++
// In a terminal
sudo apt-get install -y g++
```
%% Cell type:markdown id: tags:
#### More recent gcc
However, Ubuntu is rather conservative and the version you get might be a bit dated, which might be problematic if you intend to use the bleeding-edge features from the latest C++ standard (even if it's now better than what it used to be).
In February 2024, default Ubuntu provided gcc 12; if you want gcc 13 you need to use the following [PPA](https://launchpad.net/ubuntu/+ppas):
(_disclaimer_: the instructions below have been tested in a Docker image - with `RUN` instead of `sudo` of course - so all the lines might not be necessary in a full-fledged Ubuntu distro)
%% Cell type:code id: tags:
``` c++
// In a terminal
// Update the Ubuntu environment
sudo apt-get update && sudo apt-get upgrade -y -q && sudo apt-get dist-upgrade -y -q && sudo apt-get -y -q autoclean && sudo apt-get -y -q autoremove
// To enable `add-apt-repository` command; probably not required in a full-fledged installation
// but you will need it in a Docker image for instance
sudo apt-get install --no-install-recommends -y software-properties-common gpg-agent wget
// Adding PPA and making its content available to `apt`.
sudo add-apt-repository ppa:ubuntu-toolchain-r/test
sudo apt-get update -y
// Installing the more recent gcc
sudo apt-get install -y g++-13
```
%% Cell type:markdown id: tags:
And to tell the system which version to use, the command is:
%% Cell type:code id: tags:
``` c++
// In a terminal
sudo update-alternatives --install /usr/bin/g++ g++ /usr/bin/g++-13 100
```
%% Cell type:markdown id: tags:
More realistically, you will install gcc and perhaps gfortran as well; the following command makes sure all are kept consistent (you do not want to mix gcc 12 with g++ 13 for instance...):
%% Cell type:code id: tags:
``` c++
// In a terminal
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-13 100 \
--slave /usr/bin/g++ g++ /usr/bin/g++-13 \
--slave /usr/bin/gfortran gfortran /usr/bin/gfortran-13
```
%% Cell type:markdown id: tags:
### Fedora
For development purposes I rather like Fedora, which provides more recent versions than Ubuntu and also a simple clang installation.
%% Cell type:markdown id: tags:
#### g++
%% Cell type:code id: tags:
``` c++
// In a terminal
sudo dnf install -y gcc gcc-c++ gcc-gfortran
```
%% Cell type:markdown id: tags:
#### clang++
%% Cell type:code id: tags:
``` c++
// In a terminal
sudo dnf install -y clang
```
%% Cell type:markdown id: tags:
### macOS
* Install XCode from the AppStore
* Install if requested the command line tools
This will install _Apple Clang_ which is a customized version of `clang`. Unfortunately, Apple stopped providing the version of mainstream clang it is built upon; it's therefore more difficult to check if a new standard feature is already supported or not.
It is also possible to install gcc on macOS; I personally use [Homebrew](https://brew.sh) to do so. However using it is not always reliable (on the STL side at least - see below), especially shortly after a new version of XCode / Command Line Tools has been published; I would advise using primarily clang.
%% Cell type:markdown id: tags:
## STL
Besides the compiler, you may also choose which implementation of the STL you want to use. There are two mainstream choices:
- `libstdc++`, which is the STL provided along with gcc by GNU. This is the default choice for many Linux distros, and there is a good chance the binaries, libraries and shared libraries in your package system were compiled with this one.
- `libc++`, which is the STL provided along with clang by LLVM. It is the default choice on macOS, and was until recently a pain to use with Ubuntu (according to Laurent it is much better now in Ubuntu 20.04 and more recent versions).
Both are pretty solid choices:
- Going with `libstdc++` is not a bad idea if you're using with your code libraries from your package manager that use this STL implementation (likely in a Linux distro).
- Going with `libc++` along with clang++ seems rather natural as well.
%% Cell type:markdown id: tags:
Just a note for compatibility: the header files of `libc++` tend to include more of the other STL headers than those of `libstdc++`. So don't be astonished if your code compiles well with `libc++` but complains about an unknown symbol from the STL with `libstdc++` (the fix is simply to add the missing include - a tool such as IncludeWhatYouUse would have flagged the missing include even when using `libc++`).
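If you are unsure which implementation a given build actually uses, the implementation-specific macros can tell you. The sketch below is an illustration only: `StandardLibraryName` is a hypothetical function, and it relies on `_LIBCPP_VERSION` (defined by `libc++`) and `__GLIBCXX__` (defined by `libstdc++`) being visible once any standard header has been included. With clang, the implementation may be selected with the `-stdlib=libc++` or `-stdlib=libstdc++` flag.
%% Cell type:code id: tags:
``` c++
#include <string> // including any standard header makes the implementation macros visible

// Hypothetical helper: name of the STL implementation this file was compiled against.
std::string StandardLibraryName()
{
#if defined(_LIBCPP_VERSION)
    return "libc++";
#elif defined(__GLIBCXX__)
    return "libstdc++";
#else
    return "unknown";
#endif
}
```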
%% Cell type:markdown id: tags:
## Editor
%% Cell type:markdown id: tags:
You will need a [source code editor](https://en.wikipedia.org/wiki/Source_code_editor) to write your code; usually your system will provide a very basic one and you may be able to install another on your own.
From my experience, there are essentially two types of developers:
- Those that revel in using [VIM](https://en.wikipedia.org/wiki/Vim_(text_editor)) **or** [emacs](https://en.wikipedia.org/wiki/Emacs), which are lightweight editors that have been around for decades and are rather powerful once you've climbed a (very) steep learning curve.
- Those that prefer an [integrated development environment](https://en.wikipedia.org/wiki/Integrated_development_environment), which provides some facilities more easily (facilities that can often be configured the hard way in the aforementioned venerable text editors) but is more resource-intensive.
I suggest you take whichever you're the most comfortable with and don't bother about the zealots that tell you their way is absolutely the best one.
_Vim_ and _emacs_ are often either already installed or available easily with a distribution package (_apt_, _dnf_, etc...); for IDEs here are a few of them:
* [Visual Studio Code](https://code.visualstudio.com/), which gained traction in the last few years and is one of the most sizeable GitHub projects. This is an open-source and multi-platform editor maintained by Microsoft, not to be confused with [Visual Studio](https://visualstudio.microsoft.com/?rr=https%3A%2F%2Fwww.google.com%2F) - also provided by Microsoft on Windows (and with a fee).
* [CLion](https://www.jetbrains.com/clion/) by JetBrains is also a rising star in IDEs; a free version is available for students (you may know PyCharm by the same company)
* [Eclipse CDT](https://www.eclipse.org/cdt/) and [NetBeans](https://netbeans.org/) are other IDEs with more mileage.
* [QtCreator](https://www.qt.io/qt-features-libraries-apis-tools-and-ide) is not just for Qt edition and might be used as a C++ IDE as well.
* [XCode](https://developer.apple.com/xcode) is the editor provided by Apple on macOS.
* [KDevelop](https://www.kdevelop.org/) is the IDE from the KDE project.
* [JupyterLab](https://jupyter.org/): this very same notebook environment can be used as an IDE thanks to recent improvements and extensions; [see this](https://towardsdatascience.com/jupyter-is-now-a-full-fledged-ide-c99218d33095) and how to include the [VS Code Monaco Editor](https://imfing.medium.com/bring-vs-code-to-your-jupyterlab-187e59dd1c1b). I wouldn't advise it, but it is an option if you're really keen on it.
%% Cell type:markdown id: tags:
## Software configuration manager
A [software configuration manager](https://en.wikipedia.org/wiki/Software_configuration_management), sometimes abbreviated as **SCM**, is important when you're writing code that is meant to stay at least a while (and very handy even if that is not the case).
It is useful not only when you're working with someone else: if at some point you're lost in your code and don't understand why what was working perfectly a few hours ago is now utterly broken, it is really helpful to be able to compare what has changed since the last commit.
The most obvious choice for an SCM is [git](https://git-scm.com), which is now widely adopted and has become the _de facto_ standard. _git_ is extremely versatile but you can already do a lot of version control with around 10 commands, so the learning curve is not as steep as you may fear.
git is generally already installed on your system or readily available through your package manager (or by installing XCode and its tools on macOS).
There are many tutorials about how to use Git; including some from Inria:
- [Support](https://gitlab.inria.fr/aabadie/formation-git) used in the training sessions by Alexandre Abadie (SED Paris).
- A [nice tutorial](https://tutorial.gitlabpages.inria.fr/git/) from Mathias Malandain (SED Rennes) to learn to use Git and Gitlab at the same time.
There are also fairly frequent training sessions organized in the Inria centers; ask your local SED if you're interested by one!
%% Cell type:markdown id: tags:
## Build system
Properly handling the compilation of the code is not an easy task: many tutorials skip the topic entirely or just show a very basic example that is very far removed from a real project with potentially many third-party dependencies. This is understandable (and I will mostly do the same): using a build system properly is not trivial and may be the topic of a full lecture of its own.
The usual possibilities are:
* Build system provided by your IDE. It might be easier to use (definitely the case for XCode, which I'm familiar with, once you grasp how it is intended to work) but you bind your potential users to the same IDE (even if some IDEs now rely upon CMake).
* [Makefile](https://en.wikipedia.org/wiki/Makefile) is the venerable ancestor, which is really too painful to write and not automated enough for my taste.
* [Ninja](https://ninja-build.org) is presented on its website as _a small build system with a focus on speed. It differs from other build systems in two major respects: it is designed to have its input files generated by a higher-level build system, and it is designed to run builds as fast as possible_. It is my favorite generator to use with CMake; meson also enables usage of Ninja under the hood.
* [CMake](https://cmake.org) is probably the build system with the most traction now; it is a cross-platform build system which is rather powerful but not that easy to learn. Official documentation is terse; you may try [this](https://cliutils.gitlab.io/modern-cmake/) or [that](https://cgold.readthedocs.io/en/latest/) to understand it better. Please note CMake changed heavily between version 2 and version 3; pick recent documentation if you want to learn "modern" CMake. The principle of CMake is to provide a generic configuration that may be used for different build tools: by default you generate a Makefile, but you may choose another generator such as Ninja (see above) or a specific IDE.
* [meson](https://mesonbuild.com/) is a more recent alternative which aims to be simpler to use than CMake. Never used it so can't say much about it.
* [SCons](https://www.scons.org/) is a build system built upon Python which lets you write your own Python functions in the build system. The concept is appealing, but in practice it is dreadful to use and the build is much slower than what other build systems provide. Avoid it!
We will illustrate in the next notebook a basic use of CMake.
**Important:** Nowadays build systems can leverage the powerful computer on which they run and use several processors at the same time. Depending on the tool you use, the default build might be sequential or parallel (on the few I have used, only `ninja` assumes a parallel build by default). Make sure you know how your build tool works and that you're leveraging parallel builds!
%% Cell type:markdown id: tags:
[© Copyright](../COPYRIGHT.md)
%% Cell type:markdown id: tags:
# [Getting started in C++](./) - [C++ in a real environment](/notebooks/6-InRealEnvironment/0-main.ipynb) - [File structure in a C++ program](/notebooks/6-InRealEnvironment/2-FileStructure.ipynb)
%% Cell type:markdown id: tags:
## Library and program
Contrary to for instance Python or Ruby, C++ is not a scripting language: it is intended to build either an **executable** or **library**.
To summarize:
* An **executable** runs the content of the [`main() function`](../1-ProceduralProgramming/4-Functions.ipynb#A-very-special-function:-main). There should be exactly one such function in all the compiled files; the file with this `main` must be compiled.
* A **library** is a collection of functions, classes and so on that might be used in a program. A library may be **header-only**: in this case it is just an ensemble of header files with no file compiled. In this case all the definitions must be either **inline** or **template** (and possibly both of course).
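To make the header-only case more concrete, here is a minimal sketch of what such a header might contain. The file name and both functions are hypothetical and only serve the illustration: the non-template function is marked `inline` so that including the header from several source files does not lead to multiple definitions at link time, whereas the template needs no such keyword.
%% Cell type:code id: tags:
``` c++
// File tiny_math.hpp - hypothetical header-only library, for illustration only

// Non-template function defined in a header: must be inline.
inline int Square(int value)
{
    return value * value;
}

// Template function: may be defined in a header without the inline keyword.
template<class T>
T Twice(T value)
{
    return value + value;
}
```
%% Cell type:markdown id: tags: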
### Static and shared libraries
A (non header-only) library may be built as one of the following types:
* A **static** library, usually with a **.a** extension, is actually included directly into any executable that requires it. The advantage is that you just need the bare executable to run your code: the library is no longer required at runtime. The drawback is that the storage space may balloon rather quickly: each executable will contain the whole library!
* A **shared** library, whose extension may vary wildly from one OS to another (**.dylib**, **.so**, **.dll**, etc...), is on the other hand required at runtime by the executable that was built with it. The advantage is that executables are thus much smaller. Shared libraries are often described on the Web as the way to go; my personal experience with them is however less rosy as each OS handles them differently (notably, the way to indicate where the dynamic libraries should be looked for differs rather wildly...)
The best if possible is to enable generation of your library in either type... but it requires a bit of work with your build system.
## Source file
Contrary to most more modern languages, C++ relies upon two very specific kinds of files, each with its own extension schemes. We will first introduce the source file, with which basic programs might be written, and then show why header files are also needed.
### Compilation of _Hello world!_
A source file is a type of file intended to be **compiled**.
Let's consider the seminal _Hello world_ in a dedicated source file named _hello.cpp_ (all the examples here are made available in `2c-Demo` directory; this one is `01-HelloWorld`):
%% Cell type:code id: tags:
``` c++
// File hello.cpp - I put "Code" as cell type in Jupyter to get nice colors but it's not intended
// to be executed in the cell!
#include <cstdlib> // for EXIT_SUCCESS
#include <iostream>
int main(int argc, char** argv)
{
std::cout << "Hello world!" << std::endl;
return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
To compile it on a Unix system, you will need to type in your terminal a line that looks like the following (it works at least with [GNU compiler for C++](https://en.wikipedia.org/wiki/GNU_Compiler_Collection) and [clang++](https://en.wikipedia.org/wiki/Clang)):
%% Cell type:code id: tags:
``` c++
// In a terminal
g++ -std=c++17 hello.cpp -o hello
```
%% Cell type:markdown id: tags:
where:
- `g++` is the name of the compiler. You may provide `clang++` if you wish.
- `-std=c++17` tells the compiler to use this version of the standard. If not specified, the default depends on the compiler and its version (it is often an older standard), and the compiler may issue warnings or errors if features introduced by a more recent standard are used.
- `hello.cpp` is the name of the source file.
- `hello` is the name of the executable produced. If `-o hello` is omitted, the executable is named `a.out` by default, exactly as in C.
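To check which standard a given command line actually selects, you may look at the predefined `__cplusplus` macro. The following is a minimal sketch: `StandardName` is a hypothetical helper written for this illustration, while the numeric values are those set by the successive standards.
%% Cell type:code id: tags:
``` c++
#include <string>

// Hypothetical helper: human-readable name for a value of the __cplusplus macro.
// By default it reports the standard used to compile this very file.
std::string StandardName(long value = __cplusplus)
{
    // Comparisons with >= so that intermediate values reported by compilers
    // with partial support for an upcoming standard fall in the previous bucket.
    if (value >= 202002L)
        return "C++20 or more recent";
    if (value >= 201703L)
        return "C++17";
    if (value >= 201402L)
        return "C++14";
    if (value >= 201103L)
        return "C++11";
    return "C++98/03";
}
```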
%% Cell type:markdown id: tags:
The executable may then be used with:
%% Cell type:code id: tags:
``` c++
// In a terminal
./hello
```
%% Cell type:markdown id: tags:
The `./` is there to specify that the executable is to be looked for in the current directory; it may be omitted if `.` is present in the system `PATH` environment variable.
Please notice that the name of the file with the `main()` function and the name of the executable are entirely up to you; there is no requirement on the names of files and executables.
%% Cell type:markdown id: tags:
If your current machine has the compilers installed, it is possible to execute these compilation commands directly from the notebook instead of opening a terminal, by using the ! symbol as follows:
%% Cell type:code id: tags:
``` c++
!g++ -std=c++17 ./2c-Demo/01-HelloWorld/hello.cpp -o hello
```
%% Cell type:code id: tags:
``` c++
!./hello
```
%% Cell type:markdown id: tags:
### Source files extensions
The plural is not a mistake: unfortunately, contrary to many languages, there is no universal convention upon the extensions to use for C++ files. There are widely spread conventions, but a library may choose not to follow them.
Editors and IDE know the most common ones and usually provide a way to add your own spin so that they may provide language recognition and all that goes with it (colored syntax, completion helper and so on).
The most common extensions are **.cpp**, **.cc**, **.C** and more seldom **.cxx**.
My advice would be to choose one and stick to it; the only one I warn against is **.C** because some file systems (such as the default one on macOS) are case-insensitive and **.c** is the usual convention for C programs.
%% Cell type:markdown id: tags:
### Expanding our hello program with two source files: one for main, one for the function
This code is not very subtle: everything is in the same file, so we are in a very simplistic case in which only one file is compiled, and there is no need to find ways to specify how several files relate to each other.
As you may imagine, working with a single file is not a very common option: it hinders reusability, and it would be cumbersome to navigate in a file with thousands or more lines of code (if you're curious about an extreme case, have a look at the amalgamation ([2.28 MB zip here](https://www.sqlite.org/2020/sqlite-amalgamation-3310100.zip)) of the sqlite code, in which all the code is put in a single source file...)
We now want to separate the `main()` function from the actual content of the code (also in `2c-Demo/02-InTwoFilesWithoutHeader`):
%% Cell type:code id: tags:
``` c++
// File hello.cpp - no main inside
#include <iostream>
void hello()
{
std::cout << "Hello world!" << std::endl;
}
```
%% Cell type:code id: tags:
``` c++
// File main.cpp
#include <cstdlib> // for EXIT_SUCCESS
int main(int argc, char** argv)
{
hello();
return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
This brute force method does not work: a line in a terminal like:
%% Cell type:code id: tags:
``` c++
// In a terminal
clang++ -std=c++17 hello.cpp main.cpp -o hello
```
%% Cell type:markdown id: tags:
would yield something like:
```verbatim
main.cpp:5:5: error: use of undeclared identifier 'hello'
hello();
^
1 error generated.
```
%% Cell type:markdown id: tags:
## Header file
The issue above is that when the compiler attempts to compile `main.cpp`, it has to be told that a `hello()` function exists. We need to **declare** it in a dedicated **header file** and **include** this file in each source file that needs it (also in `2c-Demo/03-InTwoFilesWithHeader`):
%% Cell type:code id: tags:
``` c++
// File hello.hpp
void hello();
```
%% Cell type:code id: tags:
``` c++
// File main.cpp
#include <cstdlib> // for EXIT_SUCCESS
#include "hello.hpp"
int main(int argc, char** argv)
{
hello();
return EXIT_SUCCESS;
}
```
%% Cell type:code id: tags:
``` c++
// File hello.cpp - no main inside
#include <iostream>
#include "hello.hpp"
void hello()
{
std::cout << "Hello world!" << std::endl;
}
```
%% Cell type:markdown id: tags:
With these few changes, the command line:
%% Cell type:code id: tags:
``` c++
// In a terminal
clang++ -std=c++17 hello.cpp main.cpp -o hello
```
%% Cell type:markdown id: tags:
works as expected and creates a valid `hello` executable (note also that the header file does not appear explicitly in this build command).
As in the previous case, we may compile directly from the notebook using the `!` symbol as follows (provided compilers are present in the environment):
%% Cell type:code id: tags:
``` c++
!g++ -std=c++17 2c-Demo/03-InTwoFilesWithHeader/hello.cpp 2c-Demo/03-InTwoFilesWithHeader/main.cpp -o hello
```
%% Cell type:code id: tags:
``` c++
!./hello
```
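%% Cell type:markdown id: tags:
To see why a mere declaration is enough for the compiler, here is a minimal single-file sketch of what a translation unit looks like once the header content has been pasted in. The names `Greeting()` and `BuildMessage()` are hypothetical and not part of the demo files; the function returns a string rather than printing so that the result is easy to check.

```cpp
#include <string>

// What a header would contribute once included: a declaration only.
std::string Greeting();

// A caller can be compiled with just the declaration in sight.
std::string BuildMessage()
{
    return Greeting() + "!";
}

// What the matching source file would contribute: the definition.
// In a real project it lives in another translation unit and is found at link time.
std::string Greeting()
{
    return "Hello world";
}
```

The compiler only needs the prototype to check the call in `BuildMessage()`; finding the actual body is the linker's job.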
%% Cell type:markdown id: tags:
### Header location
You may have noticed that in the previous call to compile the executable the header file wasn't provided explicitly.
`hello.hpp` was found because it was in the same folder as the source files that include it. Let's suppose now we want to put include files in a directory named `incl`; there are two ways to make it work:
* Either by modifying the path in the source file. We would get
```c++
#include "incl/hello.hpp"
```
in both `hello.cpp` and `main.cpp`.
* Or by passing the `-I` option on the command line to indicate which directory to search (`2c-Demo/04-SpecifyHeaderDirectory`):
%% Cell type:code id: tags:
``` c++
// In a terminal
clang++ -std=c++17 -Iincl hello.cpp main.cpp -o hello
```
%% Cell type:markdown id: tags:
As many `-I` options as you wish may be provided on the command line; I would recommend not providing too many, as it increases the risk of ambiguity if two header files at different paths share the same name:
```verbatim
incl/foo.hpp
bar/incl/foo.hpp
```
and
```shell
clang++ -Iincl -Ibar/incl main.cpp
```
leads to an ambiguity if there is `#include "foo.hpp"` in `main.cpp`: the compiler does not complain, it silently uses the first `foo.hpp` it finds along its search path...
%% Cell type:markdown id: tags:
### `""` or `<>`?
You may have noticed I sometimes used `<>` and sometimes `""` to specify the path for the include.
The details don't matter that much in most cases, but it is better to:
* Use `<>` only for system libraries; typically STL or C headers should use this form.
* Use `""` for your own headers or for third-party libraries installed in specific locations.
If you want a bit more detail (the exact lookup is implementation-defined, but mainstream compilers behave this way):
* `""` will look first in the directory of the file that contains the `#include`, and then in the header search directories.
* `<>` will look only in the header search directories.
%% Cell type:markdown id: tags:
### Header guards and #pragma once
During compilation, the `#include` directive is actually replaced by the content of the file whose path is provided. We may therefore easily end up including the same content twice (`2c-Demo/05-NoHeaderGuards`):
%% Cell type:code id: tags:
``` c++
// File foo.hpp
class Foo
{ };
```
%% Cell type:code id: tags:
``` c++
// File main.cpp
#include <cstdlib>
#include "foo.hpp"
#include "foo.hpp" // Oops...
int main()
{
return EXIT_SUCCESS;
}
```
%% Cell type:code id: tags:
``` c++
// In terminal
clang++ -std=c++17 main.cpp -o does_not_compile
```
%% Cell type:markdown id: tags:
doesn't compile: the translation unit contains two definitions of class `Foo`!
This might seem a simple enough mistake to fix, but in a project with even a few intertwined header files it quickly becomes too much of a hassle (`2c-Demo/06-MoreSubtleNoHeaderGuards`):
%% Cell type:code id: tags:
``` c++
// File foo.hpp
class Foo
{ };
```
%% Cell type:code id: tags:
``` c++
// File bar.hpp
#include "foo.hpp"
struct Bar
{
Foo foo_;
};
```
%% Cell type:code id: tags:
``` c++
// File main.cpp
#include <cstdlib>
#include "foo.hpp"
#include "bar.hpp" // Compilation error: "foo.hpp" is sneakily included here as well!
int main()
{
Bar bar;
return EXIT_SUCCESS;
}
```
%% Cell type:code id: tags:
``` c++
// In terminal
clang++ -std=c++17 main.cpp -o does_not_compile
```
%% Cell type:markdown id: tags:
The patch is to indicate in each header file that it should be included **only once**.
#### #pragma once
There is the easy but non standard approach that is nonetheless [widely supported](https://en.wikipedia.org/wiki/Pragma_once#Portability) by compilers (`2c-Demo/07-PragmaOnce`):
%% Cell type:code id: tags:
``` c++
// File foo.hpp
#pragma once
class Foo
{ };
```
%% Cell type:code id: tags:
``` c++
// File bar.hpp
#pragma once
#include "foo.hpp"
struct Bar
{
Foo foo_;
};
```
%% Cell type:code id: tags:
``` c++
// File main.cpp
#include <cstdlib>
#include "foo.hpp"
#include "bar.hpp"
int main()
{
return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
This prevents the inclusion of `foo.hpp` twice; and now `clang++ -std=c++17 main.cpp -o do_nothing` compiles correctly.
%% Cell type:markdown id: tags:
#### Header guards
The "official" way to protect files - the use of so-called **header guards** - fully supported by the standard, is much more clunky (`2c-Demo/08-HeaderGuards`):
%% Cell type:code id: tags:
``` c++
// File foo.hpp
#ifndef FOO_HPP // If this macro is not yet defined, proceed to the rest of the file.
#define FOO_HPP // Define it immediately so that the next inclusion skips the file content.
class Foo
{ };
#endif // FOO_HPP // End of the conditional block that began with #ifndef
```
%% Cell type:code id: tags:
``` c++
// File bar.hpp
#ifndef BAR_HPP // If this macro is not yet defined, proceed to the rest of the file.
#define BAR_HPP // Define it immediately so that the next inclusion skips the file content.
#include "foo.hpp"
struct Bar
{
    Foo foo_;
};
#endif // BAR_HPP // End of the conditional block that began with #ifndef
```
%% Cell type:markdown id: tags:
You may check that `clang++ -std=c++17 main.cpp -o do_nothing` compiles properly as well.
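As `#include` is a textual paste, the effect of a guard can be sketched in a single file by writing the guarded content twice by hand. This is only an illustration: the `Value()` method is a hypothetical addition to `Foo`, present just so that there is something to call.

```cpp
// First "inclusion" of foo.hpp: FOO_HPP is not defined yet, so the content is kept.
#ifndef FOO_HPP
#define FOO_HPP
class Foo
{
public:
    int Value() const { return 42; }
};
#endif // FOO_HPP

// Second "inclusion": FOO_HPP is now defined, so the preprocessor drops
// everything up to the matching #endif and Foo is not defined a second time.
#ifndef FOO_HPP
#define FOO_HPP
class Foo
{
public:
    int Value() const { return 42; }
};
#endif // FOO_HPP
```

Removing the two `#ifndef`/`#define`/`#endif` triplets brings back the redefinition error seen earlier.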
%% Cell type:markdown id: tags:
##### **[WARNING]** Ensure uniqueness of header guards
There is however a catch with header guards: you must ensure that the macro chosen for a given file is not used by any other file. Let's consider the previous case, but with a bug (`2c-Demo/09-HeaderGuardsBug`):
%% Cell type:code id: tags:
``` c++
// File foo.hpp
#ifndef FOO_HPP // If this macro is not yet defined, proceed to the rest of the file.
#define FOO_HPP // Define it immediately so that the next inclusion skips the file content.
class Foo
{ };
#endif // FOO_HPP // End of the conditional block that began with #ifndef
```
%% Cell type:code id: tags:
``` c++
// File bar.hpp
#ifndef FOO_HPP // bug here!
#define FOO_HPP
#include "foo.hpp"
struct Bar
{
Foo foo_;
};
#endif // FOO_HPP
```
%% Cell type:markdown id: tags:
`clang++ -std=c++17 main.cpp` does not compile, with the terse message:
```shell
main.cpp:7:5: error: unknown type name 'Bar'
Bar bar;
```
%% Cell type:markdown id: tags:
And in a larger codebase, it might be a nightmare to identify this kind of bug...
A common strategy is to derive the header guard name from the location of the file in the source tree; this avoids clashes when two files share the same name (quite common in a large codebase...).
One of us (Sébastien) uses a [Python script](https://gitlab.inria.fr/MoReFEM/CoreLibrary/MoReFEM/raw/master/Scripts/header_guards.py) which iterates through all the C++ files in his library, identifies the header guard of each header file and checks that it is a mix of the project name and the path of the file. This is definitely much more clunky than **#pragma once**!
But as we said, the latter is non-standard and there are heated discussions about whether it is safe for all set-ups (at some point it was complicated to use if there were symbolic or hard links in the project).
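The path-based naming strategy can be sketched as a small helper. This is not the Python script mentioned above, only a hypothetical illustration: the function name `MakeHeaderGuard`, the project name and the exact mangling rule (upper-case letters and digits, everything else turned into `_`) are assumptions.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: builds a guard name from a project name and the path
// of the header relative to the root of the source tree.
std::string MakeHeaderGuard(const std::string& project, const std::string& relative_path)
{
    std::string ret;

    for (char c : project + "_" + relative_path)
    {
        const auto uc = static_cast<unsigned char>(c);
        // Letters and digits are kept (in upper case); separators such as '/' or '.' become '_'.
        ret += std::isalnum(uc) ? static_cast<char>(std::toupper(uc)) : '_';
    }

    return ret;
}
```

With this rule, `MakeHeaderGuard("MyProject", "Core/Utilities/foo.hpp")` gives `MYPROJECT_CORE_UTILITIES_FOO_HPP`, so two `foo.hpp` files in different directories get different guards. A real tool would also have to avoid producing two consecutive underscores, as identifiers containing them are reserved in C++.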
%% Cell type:markdown id: tags:
### Header files extensions
The most common header file extensions are **.hpp**, **.h**, **.hh** and, more rarely, **.hxx**. I definitely do not recommend **.h**: this is also the extension used for C header files, and some compilers even issue a warning if you use it in a C++ context.
#### My personal convention (Sébastien)
Personally I use both **.hpp** and **.hxx**:
* **.hpp** is for the declarations of functions, classes, and so on.
* **.hxx** is for the definitions of inline functions and templates.
The **.hxx** file is included at the end of the **.hpp** file; this way:
* End users just include the **.hpp** files in their code; they **never** need to bother about including the **.hxx** or not.
* The **.hpp** file is not too long and includes only declarations, along with Doxygen comments to explain the API.
And you may have noticed that standard library headers get no extension at all!
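The **.hpp**/**.hxx** split described above can be sketched as follows, with both files shown in one block for brevity. The file names `square.hpp` and `square.hxx` and the `Square()` template are hypothetical examples, not part of the demo directories.

```cpp
// ---- square.hpp (hypothetical): declarations only, plus API documentation ----
#ifndef SQUARE_HPP
#define SQUARE_HPP

//! Returns the square of \a value.
template<class T>
T Square(T value);

// The last line of the real square.hpp would be: #include "square.hxx"
// Its content is pasted below, as the preprocessor would do.

// ---- square.hxx (hypothetical): definitions of templates and inline functions ----
template<class T>
T Square(T value)
{
    return value * value;
}

#endif // SQUARE_HPP
```

A user of the library only ever writes `#include "square.hpp"` and gets the definition for free.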
%% Cell type:markdown id: tags:
## Why a build system: very basic CMake demonstration
Let's return to our mighty "Hello world" example with a slight extension: we want to query the identity of the user and greet them by name instead. We will foolishly add this new function in yet another file for the sake of illustration only (`2c-Demo/10-CMake`):
%% Cell type:code id: tags:
``` c++
// File hello.hpp
#ifndef HELLO_HPP
#define HELLO_HPP
void Hello();
#endif // HELLO_HPP
```
%% Cell type:code id: tags:
``` c++
// File who-are-you.hpp
#ifndef WHO_ARE_YOU_H
#define WHO_ARE_YOU_H
#include <string>
std::string WhoAreYou();
#endif // WHO_ARE_YOU_H
```
%% Cell type:code id: tags:
``` c++
// File hello.cpp
#include <iostream>
#include "hello.hpp"
#include "who-are-you.hpp"
void Hello() // must match the name declared in hello.hpp and called in main.cpp
{
    auto identity = WhoAreYou();
    std::cout << "Hello " << identity << '!' << std::endl;
}
```
%% Cell type:code id: tags:
``` c++
// File who-are-you.cpp
#include <iostream>
#include "who-are-you.hpp"
std::string WhoAreYou()
{
std::string name;
std::cout << "What's your name? ";
std::cin >> name; // not much safety here but this is not the current point!
return name;
}
```
%% Cell type:code id: tags:
``` c++
// File main.cpp
#include <cstdlib> // For EXIT_SUCCESS
#include "hello.hpp"
int main(int argc, char** argv)
{
Hello();
return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
Up to now, we would have compiled such a program manually with:
%% Cell type:code id: tags:
``` c++
// In terminal
clang++ -std=c++17 -c hello.cpp
clang++ -std=c++17 -c main.cpp
clang++ -std=c++17 -c who-are-you.cpp
clang++ -std=c++17 *.o -o hello
```
%% Cell type:markdown id: tags:
The issue with that is that it's not robust at all: either you recompile everything all the time (and let's face it: it's tedious even with our limited number of files...) or you have to keep track of which files should be recompiled. For instance if `who-are-you.hpp` is modified, both `hello.cpp` and `who-are-you.cpp` include it and must be recompiled, but if it is `hello.hpp`, `who-are-you.cpp` does not need to be recompiled.
It is to handle this automatically and limit the compilation to only what is required that build systems (which we talked about briefly [here](./1-SetUpEnvironment.ipynb#Build-system)) were introduced. Let's see a brief CMake configuration file, named by convention `CMakeLists.txt`:
%% Cell type:code id: tags:
``` c++
# CMakeLists.txt
# Ensure the cmake used is compatible with the CMake functions that are used
cmake_minimum_required(VERSION 3.20)
# A project name is mandatory, preferably right after cmake_minimum_required call
project(Hello)
set(CMAKE_CXX_STANDARD 17 CACHE STRING "C++ standard; at least 17 is expected.")
add_executable(hello
main.cpp
hello.cpp
who-are-you.cpp)
```
%% Cell type:code id: tags:
``` c++
// In terminal
mkdir build // create a directory to keep build artifacts separate from the source files
cd build
cmake .. // generates the build files; as no generator was specified with -G, the platform default is used (Unix Makefiles on Linux and macOS).
// The directory indicated by .. MUST include the main CMakeLists.txt of the project.
make
```
%% Cell type:markdown id: tags:
These commands create the executable in the current directory (here `build`); now if we modify one file, the build system will rebuild all that needs it and nothing more.
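To make "all that needs it and nothing more" concrete, here is a toy sketch of the bookkeeping a build system does for us. It is in no way how CMake or make are implemented: the names `IncludeMap`, `SourcesToRebuild` and `HelloProjectIncludes` are hypothetical, and only direct includes of project headers are considered, whereas a real build system also follows indirect ones.

```cpp
#include <map>
#include <set>
#include <string>

// For each source file, the project headers it includes directly.
using IncludeMap = std::map<std::string, std::set<std::string>>;

// Returns the source files that must be recompiled when `modified_header` changes.
std::set<std::string> SourcesToRebuild(const IncludeMap& includes, const std::string& modified_header)
{
    std::set<std::string> ret;

    for (const auto& [source, headers] : includes)
    {
        if (headers.count(modified_header) > 0)
            ret.insert(source);
    }

    return ret;
}

// Include relationships of the small project above.
IncludeMap HelloProjectIncludes()
{
    return {
        { "main.cpp", { "hello.hpp" } },
        { "hello.cpp", { "hello.hpp", "who-are-you.hpp" } },
        { "who-are-you.cpp", { "who-are-you.hpp" } }
    };
}
```

Calling `SourcesToRebuild(HelloProjectIncludes(), "who-are-you.hpp")` selects `hello.cpp` and `who-are-you.cpp` but leaves `main.cpp` alone.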
%% Cell type:markdown id: tags:
If `hello.cpp` and `who-are-you.cpp` may also be used jointly for another executable, they may be put together in a library; replace the former `add_executable` command by:
%% Cell type:code id: tags:
``` c++
add_library(hello_lib
SHARED
hello.cpp
who-are-you.cpp)
add_executable(hello
main.cpp)
target_link_libraries(hello
hello_lib)
```
%% Cell type:markdown id: tags:
SHARED may be replaced by STATIC to use a static library instead.
%% Cell type:markdown id: tags:
You can run these commands directly with the ! symbol as follows:
%% Cell type:code id: tags:
``` c++
!cd ./2c-Demo/10-CMake/ && mkdir build && cd build && cmake .. && make
```
%% Cell type:code id: tags:
``` c++
!cd ./2c-Demo/10-CMake/build && ./hello
```
%% Cell type:markdown id: tags:
## Where should the headers be included?
* Each time a header is modified, all the source files that include it directly or indirectly are recompiled.
* Each time a source file is modified, only this source file is recompiled; some relinking of the libraries and executables that depend on it will also occur (linking is the step that glues together the object files and libraries; the term _compilation_ is often - including in this very tutorial - loosely used to encompass both the compilation and link phases).
Thus it might seem a good idea to put as many `#include` directives as possible in the source files **rather than in the header files**, hence limiting the compilation time. This is generally very good advice... provided we do not go too far and still put enough in the header file:
%% Cell type:code id: tags:
``` c++
// File foo.hpp
#ifndef FOO_HPP
# define FOO_HPP
#include <string>
void Print(std::string text);
#endif // FOO_HPP
```
%% Cell type:code id: tags:
``` c++
// File foo.cpp
#include <iostream>
#include "foo.hpp"
void Print(std::string text)
{
std::cout << "The text to be printed is: \"" << text << "\"." << std::endl;
}
```
%% Cell type:code id: tags:
``` c++
// File main.cpp
#include <cstdlib>
#include "foo.hpp"
int main()
{
Print("Hello world!");
return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
You may have noticed `string` and `iostream` are not dealt with the same way... and rightly so:
* `#include <iostream>` is only in the source file: it is actually needed only for `std::cout` and `std::endl`, which are implementation details of the `Print()` function: neither appears in its signature.
* `#include <string>` is present in `foo.hpp` as it is required to make sense of the types used in the prototype. If you do not do that, each time you include `foo.hpp` you would need to include `string` as well; doing so leads to unmaintainable code, as you would have to track down all the includes that are required by each include...
So, in a nutshell:
* Put in the header files all the includes that are mandatory to make the prototypes understandable. A rule of thumb is that a source file that includes only the header file should compile; the following header fails that test:
%% Cell type:code id: tags:
``` c++
// File foo.hpp
std::string Print();
```
%% Cell type:code id: tags:
``` c++
// File check_foo.cpp
#include <cstdlib>
#include "foo.hpp"
int main(int, char**)
{
return EXIT_SUCCESS;
} // DOES NOT COMPILE => header is ill-formed!
```
%% Cell type:markdown id: tags:
* Includes that are there only for implementation details should on the other hand preferably go in source files. Of course, you may not be able to do that in every case: for instance, templates are by construction defined in header files!
%% Cell type:markdown id: tags:
Some tools such as [include-what-you-use](https://include-what-you-use.org/) are rather helpful to prune the unneeded includes in a file, but they need a bit of time to configure and set up properly, especially on an already large codebase.
%% Cell type:markdown id: tags:
## Forward declaration
There is actually an exception to the first rule I've just given: **forward declaration**. This is really a trick that may be used to reduce compilation time, with some caveats.
The idea is that if a type appears in a header file **only as a reference and/or as a (smart) pointer**, it may be forward-declared: only its name is declared in the header, without its definition (`2c-Demo/11-Forward`):
%% Cell type:code id: tags:
``` c++
// File foo.hpp
#ifndef FOO_HPP
# define FOO_HPP
// Forward declaration: we say a class Bar is meant to exist...
class Bar;
struct Foo
{
Foo(int n);
void Print() const;
Bar* bar_ = nullptr;
};
#endif // FOO_HPP
```
%% Cell type:code id: tags:
``` c++
// File check_header_ok.cpp
#include <cstdlib>
#include "foo.hpp"
int main(int, char**)
{
return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
and `clang++ -std=c++17 check_header_ok.cpp` compiles properly (you may try commenting out the forward declaration line to check that it no longer does without it).
This is not without cost: obviously, in a file where `Bar` is actually needed you will need to include its header properly: with just `#include "foo.hpp"` you can't, for instance, call a method of the `Bar` class.
Typically the `#include "bar.hpp"` will be located in the `foo.cpp` file, in which you will probably need the `Bar` interface to define your `Foo` class (if not, you may question why you chose to put the `bar_` data attribute there in the first place).
It is nonetheless a very nice trick to know; there is even an idiom called the [Pimpl idiom](https://arne-mertz.de/2019/01/the-pimpl-idiom/) that relies upon forward declaration.
This is not its only use, though: to define a shared_ptr/weak_ptr you [also need](../7-Appendix/WeakPtr.ipynb) to use this capability.
The tool [include-what-you-use](https://include-what-you-use.org/) mentioned earlier is also able to suggest what should be forward-declared.
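The interplay between the forward declaration in the header and the full include in the source file can be sketched in one block. This is an illustration under assumptions, not the content of `2c-Demo/11-Forward`: the `Value()` and `Get()` methods are hypothetical, and a `std::unique_ptr` replaces the raw pointer used above.

```cpp
#include <memory>

// ---- what would live in foo.hpp ----
class Bar; // forward declaration: enough to declare a (smart) pointer to Bar

class Foo
{
public:
    explicit Foo(int n);
    ~Foo(); // only declared here; defined where Bar is a complete type
    int Value() const;

private:
    std::unique_ptr<Bar> bar_;
};

// ---- what would live in bar.hpp ----
class Bar
{
public:
    explicit Bar(int n) : n_(n) { }
    int Get() const { return n_; }

private:
    int n_;
};

// ---- what would live in foo.cpp, after #include "bar.hpp" ----
Foo::Foo(int n) : bar_(std::make_unique<Bar>(n)) { }
Foo::~Foo() = default; // Bar is complete here, so unique_ptr can delete it
int Foo::Value() const { return bar_->Get(); }
```

Code that only manipulates `Foo` objects never sees the definition of `Bar`, so a change in `bar.hpp` triggers the recompilation of `foo.cpp` only.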
%% Cell type:markdown id: tags:
[© Copyright](../COPYRIGHT.md)