Commit 9e4c8cce authored by GILLES Sebastien

Add explanation about c_str(); also commit without cells executed for the notebook.

parent 2a5f9cb4
%% Cell type:markdown id: tags:
# [Getting started in C++](/) - [Procedural programming](/notebooks/1-ProceduralProgramming/0-main.ipynb) - [Predefined types](/notebooks/1-ProceduralProgramming/3-Types.ipynb)
%% Cell type:markdown id: tags:
<h1>Table of contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Boolean" data-toc-modified-id="Boolean-1">Boolean</a></span></li><li><span><a href="#Enumerations" data-toc-modified-id="Enumerations-2">Enumerations</a></span><ul class="toc-item"><li><span><a href="#Historical-enumerations" data-toc-modified-id="Historical-enumerations-2.1">Historical enumerations</a></span></li><li><span><a href="#New-enumerations" data-toc-modified-id="New-enumerations-2.2">New enumerations</a></span></li></ul></li><li><span><a href="#Numerical-types" data-toc-modified-id="Numerical-types-3">Numerical types</a></span><ul class="toc-item"><li><ul class="toc-item"><li><span><a href="#List-of-numerical-types" data-toc-modified-id="List-of-numerical-types-3.0.1">List of numerical types</a></span></li><li><span><a href="#Numeric-limits" data-toc-modified-id="Numeric-limits-3.0.2">Numeric limits</a></span></li><li><span><a href="#Conversions-between-digital-types" data-toc-modified-id="Conversions-between-digital-types-3.0.3">Conversions between digital types</a></span></li></ul></li><li><span><a href="#Explicit-conversions-inherited-from-C" data-toc-modified-id="Explicit-conversions-inherited-from-C-3.1">Explicit conversions inherited from C</a></span></li><li><span><a href="#Explicit-conversions-by-static_cast" data-toc-modified-id="Explicit-conversions-by-static_cast-3.2">Explicit conversions by static_cast</a></span></li><li><span><a href="#Other-explicit-conversions" data-toc-modified-id="Other-explicit-conversions-3.3">Other explicit conversions</a></span></li></ul></li><li><span><a href="#Characters-and-strings" data-toc-modified-id="Characters-and-strings-4">Characters and strings</a></span><ul class="toc-item"><li><span><a href="#Historical-strings" data-toc-modified-id="Historical-strings-4.1">Historical strings</a></span></li><li><span><a href="#std::string" data-toc-modified-id="std::string-4.2">std::string</a></span></li></ul></li><li><span><a href="#Renaming-types" 
data-toc-modified-id="Renaming-types-5">Renaming types</a></span></li><li><span><a href="#decltype-and-auto" data-toc-modified-id="decltype-and-auto-6"><code>decltype</code> and <code>auto</code></a></span></li></ul></div>
%% Cell type:markdown id: tags:
## Boolean
Variables with type `bool` may be set to true or false.
Note that this type did not originally exist, and that C++ conditions do not strictly require boolean values: integers are accepted as well.
There is a form of equivalence between booleans and integers: the integer 0 is equivalent to `false`, and any other value is equivalent to `true`.
%% Cell type:code id: tags:
``` C++17
#include <iostream>
{
bool a = true, b(false), c{true};
bool d; // UNDEFINED !!
if (d)
std::cout << "This text might appear or not - it's truly undefined and may vary from "
"one run/compiler/architecture/etc... to another!" << std::endl;
int n = 5;
if (n)
std::cout << "Boolean value of " << n << " is true." << std::endl;
n = 0;
if (!n) // ! is the not operator: the condition is true if n is false.
std::cout << "Boolean value of " << n << " is false." << std::endl;
n = -2;
if (n)
std::cout << "Boolean value of " << n << " is true." << std::endl;
}
```
%% Cell type:markdown id: tags:
## Enumerations
### Historical enumerations
The historical enumerations `enum` of C++ allow you to define constants that are treated as integers, and that can be initialized from integers. By default the first value is 0 and each following value is incremented by one, but it is possible to bypass these default values and provide the desired numerical value yourself.
%% Cell type:code id: tags:
``` C++17
#include <iostream>
{
enum color { red, green, blue } ;
std::cout << red << " " << green << " " << blue << " (expected: 0, 1, 2)" << std::endl;
enum shape { circle=10, square, triangle=20 };
std::cout << circle << " " << square << " " << triangle << " (expected: 10, 11, 20)"<< std::endl; // 10 11 20
}
```
%% Cell type:markdown id: tags:
These `enum` are placeholders for integers and might be used as such:
%% Cell type:code id: tags:
``` C++17
#include <iostream>
{
enum color { red, green, blue } ;
int a { 5 };
color c = green;
int b = a + c;
std::cout << "b = " << b << " (expected: 6)" << std::endl;
enum shape { circle=10, square, triangle=20 };
shape s = triangle;
int d = s + c;
std::cout << "d = " << d << " (expected: 21... but we've just added a shape to a color without ado!)" << std::endl;
}
```
%% Cell type:markdown id: tags:
A shortcoming of historical `enum` is that the same word can't be used in two different `enum` declared in the same scope:
%% Cell type:code id: tags:
``` C++17
{
enum is_positive { yes, no };
enum is_colored { yes, no }; // COMPILATION ERROR!
}
```
%% Cell type:markdown id: tags:
### New enumerations
To overcome the two limitations we have just mentioned, C++11 makes it possible to declare new `enum class` enumerations, each constituting a separate type, not implicitly convertible into an integer. This type protects against previous errors at the cost of a little more writing work.
%% Cell type:code id: tags:
``` C++17
enum class is_positive { yes, no };
enum class is_colored { yes, no }; // OK
```
%% Cell type:code id: tags:
``` C++17
yes; // COMPILATION ERROR: an `enum class` value must be prefixed! (see below)
```
%% Cell type:code id: tags:
``` C++17
is_positive p = is_positive::yes; // OK
```
%% Cell type:code id: tags:
``` C++17
int is_positive_p = is_positive::no; // COMPILATION ERROR: not implicitly convertible into an integer
```
%% Cell type:code id: tags:
``` C++17
is_positive::yes + is_colored::no; // COMPILATION ERROR: addition of two unrelated types
```
%% Cell type:code id: tags:
``` C++17
{
enum class color { red, green, blue } ;
color c = color::green;
bool is_more_than_red = (c > color::red); // Both belong to the same type and therefore might be compared
}
```
%% Cell type:markdown id: tags:
These enum types are really handy to make code more expressive, especially in function calls:
````
f(print::yes, perform_checks::no);
````
is much more expressive (and less error-prone) than:
````
f(true, false);
````
for which you will probably need to go check the prototype to figure out what each argument stands for.
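A minimal sketch of this idea is given below. The enumerations `print` and `perform_checks` and the function `describe_run` are hypothetical names introduced only for this illustration:

```cpp
#include <string>

// Hypothetical enumerations, introduced only for this illustration.
enum class print { yes, no };
enum class perform_checks { yes, no };

// Hypothetical function: returns a description of the options it received.
std::string describe_run(print do_print, perform_checks do_checks)
{
    std::string ret;
    ret += (do_print == print::yes) ? "print" : "no print";
    ret += ", ";
    ret += (do_checks == perform_checks::yes) ? "checks" : "no checks";
    return ret;
}
```

With these definitions, `describe_run(print::yes, perform_checks::no)` returns `"print, no checks"`, whereas swapping the two arguments by mistake is rejected by the compiler, since each `enum class` is a distinct type.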
%% Cell type:markdown id: tags:
## Numerical types
#### List of numerical types
The Fortran correspondences below are given as examples. The
size of the C++ numeric types can vary depending on the processor used. The
C++ standard only imposes `short <= int <= long` and `float <= double <= long double` on their sizes. This makes these predefined types non-portable. Like many things
in C, and therefore in C++, performance is given priority over any other consideration.
The default integer and real types, `int` and `double`, are assumed
to match the size of the processor registers and be the fastest (for more details see [the article on cppreference](http://en.cppreference.com/w/cpp/language/types)).
| C++ | Fortran | Observations | 0 notation |
|:------------- |:---------:|:-------------------:|:----------:|
| `short` | INTEGER*2 | At least 16 bits | None |
| `int` | INTEGER*4 | At least 16 bits | 0 |
| `long` | INTEGER*8 | At least 32 bits | 0l |
| `long long` | INTEGER*16| At least 64 bits | 0ll |
| `float` | REAL*4 | - | 0.f |
| `double` | REAL*8 | - | 0. |
| `long double` | REAL*16 | - | 0.l |
All integer types (`short`, `int`, `long` and `long long`) also have an unsigned variant, for example
`unsigned int`, which only takes non-negative values.
It should also be noted that the type `char` is the equivalent of one byte,
and depending on the context will be interpreted as a number or as a
character.
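As a small illustration of this dual nature, the sketch below treats a `char` as a number. The helper `digit_value` is a hypothetical name; it relies on the guarantee that the characters `'0'` to `'9'` have consecutive codes:

```cpp
// Hypothetical helper: returns the numerical value of a digit character.
// `digit` is interpreted as a number in the subtraction; this works because
// the characters '0' to '9' are guaranteed to have consecutive codes.
int digit_value(char digit)
{
    return digit - '0';
}
```

For instance `digit_value('7')` returns the integer 7, even though `'7'` printed on screen is a character.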
If you need an integer type of a defined size, regardless of the type of processor or platform used, you should use those defined in `<cstdint>` since C++11 (for more details click [here](http://en.cppreference.com/w/cpp/types/integer)).
The _0 notation column_ shows how to state explicitly the type of a literal in an expression; of course any value might be used instead of 0. A `u` might be added to signal the unsigned status of an integer literal; for instance `3ul` means 3 as an _unsigned long_. The `auto` notation below will illustrate a case in which such a notation is useful.
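The sketch below checks some of these statements at compile time. The constant names `int32_bits` and `uint64_bits` are hypothetical, and `decltype` (presented at the end of this notebook) is used here only to query the type of a literal:

```cpp
#include <climits>      // CHAR_BIT
#include <cstddef>      // std::size_t
#include <cstdint>      // std::int32_t, std::uint64_t
#include <type_traits>  // std::is_same

// Sizes in bits of two fixed-width types: identical on every platform providing them.
constexpr std::size_t int32_bits = sizeof(std::int32_t) * CHAR_BIT;
constexpr std::size_t uint64_bits = sizeof(std::uint64_t) * CHAR_BIT;
static_assert(int32_bits == 32, "std::int32_t is exactly 32 bits wide");
static_assert(uint64_bits == 64, "std::uint64_t is exactly 64 bits wide");

// The predefined types only come with an ordering of their sizes.
static_assert(sizeof(short) <= sizeof(int) && sizeof(int) <= sizeof(long), "size ordering");

// The suffix of a literal determines its type.
static_assert(std::is_same<decltype(0), int>::value, "no suffix: int");
static_assert(std::is_same<decltype(0l), long>::value, "l: long");
static_assert(std::is_same<decltype(0ll), long long>::value, "ll: long long");
static_assert(std::is_same<decltype(3ul), unsigned long>::value, "ul: unsigned long");
static_assert(std::is_same<decltype(0.f), float>::value, "f: float");
static_assert(std::is_same<decltype(0.), double>::value, "no suffix: double");
static_assert(std::is_same<decltype(0.l), long double>::value, "l: long double");
```

If one of these assumptions were wrong on a given platform, compilation would stop with the associated message.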
#### Numeric limits
Always keep in mind that the types of the computer don't match the abstract concepts you may use in mathematics. In particular, the values they can store don't range from minus infinity to infinity:
%% Cell type:code id: tags:
``` C++17
#include <iostream>
#include <limits> // for std::numeric_limits
{
std::cout << "int [min, max] = [" << std::numeric_limits<int>::lowest() << ", "
<< std::numeric_limits<int>::max() << "]" << std::endl;
std::cout << "unsigned int [min, max] = [" << std::numeric_limits<unsigned int>::lowest() << ", "
<< std::numeric_limits<unsigned int>::max() << "]" << std::endl;
std::cout << "short [min, max] = [" << std::numeric_limits<short>::lowest() << ", "
<< std::numeric_limits<short>::max() << "]" << std::endl;
std::cout << "long [min, max] = [" << std::numeric_limits<long>::lowest() << ", "
<< std::numeric_limits<long>::max() << "]" << std::endl;
std::cout << "float [min, max] = [" << std::numeric_limits<float>::lowest() << ", "
<< std::numeric_limits<float>::max() << "]" << std::endl;
std::cout << "double [min, max] = [" << std::numeric_limits<double>::lowest() << ", "
<< std::numeric_limits<double>::max() << "]" << std::endl;
std::cout << "long double [min, max] = [" << std::numeric_limits<long double>::lowest() << ", "
<< std::numeric_limits<long double>::max() << "]" << std::endl;
}
```
%% Cell type:markdown id: tags:
If an initial value is not in the range, the compiler will yell:
%% Cell type:code id: tags:
``` C++17
#include <iostream>
{
short s = -33010; // triggers a warning: outside the range
std::cout << s << std::endl;
}
```
%% Cell type:markdown id: tags:
However, if you go beyond the numeric limit during a computation you're on your own:
%% Cell type:code id: tags:
``` C++17
#include <iostream>
#include <limits> // for std::numeric_limits
{
unsigned int max = std::numeric_limits<unsigned int>::max();
std::cout << "Max = " << max << std::endl;
std::cout << "Max + 1 = " << max + 1 << "!" << std::endl;
}
```
%% Cell type:markdown id: tags:
When an unsigned integer goes past the end of its range, a modulo is applied to put it back into the range! (For signed integers it is worse: overflow is undefined behavior.)
Don't worry, for most computations you shouldn't run into this kind of trouble, but if you are dealing with large values it is important to keep this kind of issue in mind.
The most obvious way to avoid this is to choose appropriate types: if your integer might be huge a `long` is more appropriate than an `int`.
Other languages such as Python have an underlying integer model that is resilient to this kind of issue, but it comes at a cost; types such as those used in C++ are tailored to favor optimization on your hardware.
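When large values are expected, one possible precaution is to test before the operation whether the result would leave the range. The sketch below does so for the addition of two `unsigned int`; the helper `addition_wraps` is a hypothetical name, not a standard function:

```cpp
#include <limits> // for std::numeric_limits

// Hypothetical helper: tells whether a + b would exceed the largest unsigned int,
// in which case the computed sum would wrap around instead of being the expected value.
// The test is written as a subtraction so that it never goes out of range itself.
bool addition_wraps(unsigned int a, unsigned int b)
{
    return a > std::numeric_limits<unsigned int>::max() - b;
}
```

For instance `addition_wraps(std::numeric_limits<unsigned int>::max(), 1u)` is true, while `addition_wraps(1u, 2u)` is false.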
%% Cell type:markdown id: tags:
#### Conversions between digital types
Since C++11, the compiler behaves differently depending on the syntax used for a conversion. Accuracy losses (so-called narrowing conversions) are allowed during an assignment or an initialization with `=` or parentheses, but not during an initialization between braces:
%% Cell type:code id: tags:
``` C++17
{
float f = 1.12345678901234567890;
double d = 2.12345678901234567890;
float f_d(d);
float f_dd = d;
}
```
%% Cell type:markdown id: tags:
The conversions above are accepted, whereas the brace initialization below doesn't compile:
%% Cell type:code id: tags:
``` C++17
{
double d = 2.12345678901234567890;
float f_d{d};
}
```
%% Cell type:markdown id: tags:
Accuracy losses are detected during conversion:
* from a floating point type (`long double`, `double` and `float`) into an integer type.
* from a `long double` into a `double` or a `float`, unless the source is constant and its value fits into the type of the destination.
* from a `double` into a `float`, unless the source is constant and its value fits into the type of the destination.
* from an integer type or a historical enumerated type to a floating point type, unless the source is constant and its value fits into the type of the destination.
* from an integer type or a historical enumerated type to another integer type that cannot represent all the values of the source type, unless the source is constant and its value fits into the type of the destination.
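The "unless the source is constant" exception can be sketched as follows; the variable names are hypothetical and chosen only for this illustration:

```cpp
// Each brace initialization below is accepted because the source is a constant
// whose value fits into the type of the destination.
constexpr float half {0.5};            // double constant that fits into a float
constexpr unsigned char small {200};   // int constant within the range of unsigned char
constexpr double three {3};            // int constant exactly representable as a double

// The following lines are diagnosed as narrowing conversions if uncommented:
// int truncated {3.7};                  // floating point type into an integer type
// double d = 0.5; float f {d};          // non-constant double into a float
// int n = 200; unsigned char c {n};     // non-constant int into a narrower integer type
```

The three accepted lines hold the expected values (0.5, 200 and 3 respectively); only the commented lines fall under the cases listed above.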
%% Cell type:markdown id: tags:
### Explicit conversions inherited from C
In the case of an explicit conversion, the programmer explicitly says which conversion to use.
C++ inherits the type-casting mechanism of C:
%% Cell type:code id: tags:
``` C++17
{
unsigned short i = 42000 ;
short j = short(i) ;
unsigned short k = (unsigned short)(j) ;
}
```
%% Cell type:markdown id: tags:
It is **not recommended** to use this type of conversion: even if it is clearly faster to type, it is less precise about the intended conversion and does not stand out clearly when reading code; it is preferable to use the other conversion modes mentioned below.
%% Cell type:markdown id: tags:
### Explicit conversions by static_cast
C++ has also defined a family of cast operators,
more verbose but more precise. The most common type of explicit conversion is the `static_cast`:
%% Cell type:code id: tags:
``` C++17
{
unsigned short i = 42000;
short j = static_cast<short>(i);
unsigned short k = static_cast<unsigned short>(j);
}
```
%% Cell type:markdown id: tags:
Another advantage of this more verbose syntax is that you may find it more easily in your code with the search functionality of your editor.
%% Cell type:markdown id: tags:
### Other explicit conversions
There are 3 other types of C++ conversions:
* `const_cast`, to add or remove constness to a reference or a pointer (obviously to be used with great caution!)
* `dynamic_cast`, which will be introduced when we deal with polymorphism.
* `reinterpret_cast`, a very brutal cast which changes the type into any other type, regardless of the compatibility of the two types considered. It is a dangerous one that should be considered only as a very last resort (usually when interacting with a C library).
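As an illustration of a cautious use of `const_cast`, the sketch below assumes a legacy function whose signature lacks a `const` although it never modifies its argument. Both `legacy_length` and `length_of` are hypothetical names written for this example:

```cpp
#include <cstddef> // for std::size_t

// Hypothetical legacy function: its signature lacks `const`
// even though it never modifies the characters it reads.
std::size_t legacy_length(char* text)
{
    std::size_t n = 0;
    while (text[n] != '\0')
        ++n;
    return n;
}

// Wrapper with a proper signature.
std::size_t length_of(const char* text)
{
    // The const_cast is acceptable here only because legacy_length() does not write to `text`.
    return legacy_length(const_cast<char*>(text));
}
```

Had `legacy_length` actually modified the characters, calling it this way on constant data would lead to undefined behavior, hence the caution mentioned above.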
%% Cell type:markdown id: tags:
## Characters and strings
### Historical strings
In C, a character string is literally an array of `char` variables, the last character of which is by convention the symbol `\0`.
The `strlen` function returns the length of a string, which is the number of characters between the very first character and the first occurrence of `\0` (the latter is not counted).
The `strcpy` function copies a character string to a new memory location; care must be taken to ensure that the destination is large enough to avoid any undefined behavior.
The `strncpy` function allows you to copy only the first <b>n</b> characters, where <b>n</b> is the third parameter of the function. The same remark applies about the need for a large enough destination; in addition, no terminating `\0` is appended when the source holds <b>n</b> characters or more.
%% Cell type:code id: tags:
``` C++17
#include <iostream>
#include <cstring> // For strlen, strcpy, strncpy
{
char bonjour[] = {'b','o','n','j','o','u','r','\0'};
char coucou[] = "coucou";
char* salut = "salut"; // To avoid in C++ (see warning)
char copy[10] = {}; // = {'\0','\0','\0','\0','\0','\0','\0','\0','\0','\0'};
strcpy(copy, bonjour);
std::cout << "String '" << copy << "' is " << strlen(copy) << " characters long." << std::endl;
strncpy(copy, coucou, strlen(coucou));
copy[strlen(coucou)] = '\0'; // Don't forget to terminate the string!
std::cout << "String '" << copy << "' is " << strlen(copy) << " characters long." << std::endl;
}
```
%% Cell type:markdown id: tags:
There are several other functions related to historical strings; for more information, do not hesitate to consult [this reference page](http://www.cplusplus.com/reference/cstring/).
%% Cell type:markdown id: tags:
### std::string
In modern C++, rather than bothering with character arrays
which come from the C language, it's easier to use the type `std::string`, provided
by the standard library, which offers a much simpler syntax:
%% Cell type:code id: tags:
``` C++17
#include <iostream>
#include <cstring> // For strlen
#include <string> // For std::string
{
const char* bonjour_str = "bonjour";
std::string bonjour = bonjour_str;
std::string salut("salut");
std::string coucou {"coucou"};
std::string copy {};
copy = bonjour;
std::cout << "String '" << copy << "' is " << copy.length() << " characters long." << std::endl;
copy = coucou; // please notice assignment is much more straightforward
std::cout << "String '" << copy << "' is " << copy.length() << " characters long." << std::endl;
const char* copy_str = bonjour.data(); // Returns a classic C-string (from C++11 onward)
const char* old_copy_str = &bonjour[0]; // Same before C++11...
std::cout << "String '" << copy_str << "' is " << strlen(copy_str) << " characters long." << std::endl;
std::string dynamic {"dynamic std::string"};
std::cout << "String '" << dynamic << "' is " << dynamic.length() << " characters long." << std::endl;
dynamic = "std::string is dynamical and flexible";
std::cout << "String '" << dynamic << "' is " << dynamic.length() << " characters long." << std::endl;
}
```
%%%% Output: stream
String 'bonjour' is 7 characters long.
String 'coucou' is 6 characters long.
String 'bonjour' is 7 characters long.
String 'dynamic std::string' is 19 characters long.
String 'std::string is dynamical and flexible' is 37 characters long.
%% Cell type:markdown id: tags:
If needed (for instance to interact with a C library) you may access the underlying array with `c_str()`:
%% Cell type:code id: tags:
``` C++17
#include <string>
{
std::string cplusplus_string("C++ string!");
const char* c_string = cplusplus_string.c_str(); // notice the `const`
}
```
%% Cell type:markdown id: tags:
The `const` here is important: you may access the content but should not modify it; this functionality is provided for read-only access.
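A typical read-only use is sketched below: the content of a `std::string` is handed to `std::strtol`, a function inherited from C which expects a null-terminated `const char*`. The wrapper `to_long` is a hypothetical name introduced for this example:

```cpp
#include <cstdlib> // for std::strtol
#include <string>  // for std::string

// Hypothetical helper: converts the text held by a std::string into a long
// with the C function std::strtol, which only reads the characters it is given.
long to_long(const std::string& text)
{
    return std::strtol(text.c_str(), nullptr, 10); // 10: the text is read as a decimal number
}
```

For instance `to_long("42")` returns 42; the string itself is left untouched.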
%% Cell type:markdown id: tags:
Please note that C++17 introduced [std::string_view](https://en.cppreference.com/w/cpp/header/string_view), which is more efficient than `std::string` for some operations; it is however beyond the scope of this lecture.
%% Cell type:markdown id: tags:
## Renaming types
Sometimes it may be handy to rename a type, for instance if you want to be able to easily change the numeric precision used throughout the code. The historical syntax (the only one before C++11, and still valid) is `typedef`:
%% Cell type:code id: tags:
``` C++17
#include <iostream>
#include <iomanip> // For std::setprecision
{
typedef double real; // notice the ordering: new typename comes after its value
real radius {1.};
real area = 3.1415926535897932385 * radius * radius;
std::cout <<"Area = " << std::setprecision(15) << area << std::endl;
}
```
%%%% Output: stream
Area = 3.14159265358979
%% Cell type:markdown id: tags:
In more modern C++ (C++11 and above), another syntax relying on `using` keyword was introduced; it is advised to use it as this syntax is more powerful in some contexts (see later with templates...):
%% Cell type:code id: tags:
``` C++17
#include <iostream>
#include <iomanip> // For std::setprecision
{
using real = float; // notice the ordering: more in line with what we're accustomed to when
// initialising variables.
real radius {1.};
real area = 3.1415926535897932385 * radius * radius;
std::cout <<"Area = " << std::setprecision(15) << area << std::endl;
}
```
%%%% Output: stream
Area = 3.14159274101257
%% Cell type:markdown id: tags:
## `decltype` and `auto`
C++11 introduced new keywords that are very handy to deal with types:
* `decltype`, which is able to determine at compile time the type of a variable or of an expression.
* `auto`, which automatically deduces the type of a variable from the expression used to initialize it.
%% Cell type:code id: tags:
``` C++17
#include <vector>
{
auto i = 5; // i is here an int.
auto j = 5u; // j is an unsigned int
decltype(j) k; // decltype(j) is interpreted by the compiler as an unsigned int.
}
```
%% Cell type:markdown id: tags:
On such trivial examples it might not seem much, but in practice it might prove incredibly useful. Consider for instance the following C++03 code (the details don't matter: we'll deal with `std::vector` in a [later notebook](/notebooks/5-UsefulConceptsAndSTL/3-Containers.ipynb)):
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
std::vector<unsigned int> primes { 2, 3, 5, 7, 11, 13, 17, 19 }; // I'm cheating: it's C++ 11 notation...
// In C++ 03 you would have to push_back each of the elements one by one!
for (std::vector<unsigned int>::const_iterator it = primes.cbegin();
it != primes.cend();
++it)
{
std::cout << *it << " is prime." << std::endl;
}
}
```
%%%% Output: stream
2 is prime.
3 is prime.
5 is prime.
7 is prime.
11 is prime.
13 is prime.
17 is prime.
19 is prime.
%% Cell type:markdown id: tags:
It's very verbose; we could of course use an alias:
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
std::vector<unsigned int> primes { 2, 3, 5, 7, 11, 13, 17, 19 }; // I'm cheating: it's C++ 11 notation...
using iterator = std::vector<unsigned int>::const_iterator;
for (iterator it = primes.cbegin();
it != primes.cend();
++it)
{
std::cout << *it << " is prime." << std::endl;
}
}
```
%%%% Output: stream
2 is prime.
3 is prime.
5 is prime.
7 is prime.
11 is prime.
13 is prime.
17 is prime.
19 is prime.
%% Cell type:markdown id: tags:
But with `decltype` we may write instead:
%% Cell type:code id: tags:
``` C++17
#include <vector>
#include <iostream>
{
std::vector<unsigned int> primes { 2, 3, 5, 7, 11, 13, 17, 19 }; // I'm cheating: it's C++ 11 notation...
for (decltype(primes.cbegin()) it = primes.cbegin();
it != primes.cend();
++it)
{
std::cout << *it << " is prime." << std::endl;
}
}
```