Commit c34b0267 authored by GILLES Sebastien

Cosmetics.

parent 0cc2b24d
%% Cell type:markdown id: tags:
# [Getting started in C++](/) - [C++ in a real environment](/notebooks/6-InRealEnvironment/0-main.ipynb) - [File structure in a C++ program](/notebooks/6-InRealEnvironment/2-FileStructure.ipynb)
%% Cell type:markdown id: tags:
<h1>Table of contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Library-and-program" data-toc-modified-id="Library-and-program-1">Library and program</a></span><ul class="toc-item"><li><span><a href="#Static-and-shared-libraries" data-toc-modified-id="Static-and-shared-libraries-1.1">Static and shared libraries</a></span></li></ul></li><li><span><a href="#Source-file" data-toc-modified-id="Source-file-2">Source file</a></span><ul class="toc-item"><li><span><a href="#Compilation-of-Hello-world!" data-toc-modified-id="Compilation-of-Hello-world!-2.1">Compilation of <em>Hello world!</em></a></span></li><li><span><a href="#Source-files-extensions" data-toc-modified-id="Source-files-extensions-2.2">Source files extensions</a></span></li><li><span><a href="#Expanding-our-hello-program-with-two-source-files:-one-for-main,-one-for-the-function" data-toc-modified-id="Expanding-our-hello-program-with-two-source-files:-one-for-main,-one-for-the-function-2.3">Expanding our hello program with two source files: one for main, one for the function</a></span></li></ul></li><li><span><a href="#Header-file" data-toc-modified-id="Header-file-3">Header file</a></span><ul class="toc-item"><li><span><a href="#Header-location" data-toc-modified-id="Header-location-3.1">Header location</a></span></li><li><span><a href="#&quot;&quot;--or-<>?" 
data-toc-modified-id="&quot;&quot;--or-<>?-3.2"><code>""</code> or <code>&lt;&gt;</code>?</a></span></li><li><span><a href="#Header-guards" data-toc-modified-id="Header-guards-3.3">Header guards</a></span></li><li><span><a href="#Header-files-extensions" data-toc-modified-id="Header-files-extensions-3.4">Header files extensions</a></span><ul class="toc-item"><li><span><a href="#My-personal-convention" data-toc-modified-id="My-personal-convention-3.4.1">My personal convention</a></span></li></ul></li></ul></li><li><span><a href="#Why-a-build-system:-very-basic-CMake-demonstration" data-toc-modified-id="Why-a-build-system:-very-basic-CMake-demonstration-4">Why a build system: very basic CMake demonstration</a></span></li><li><span><a href="#Where-should-the-headers-be-included?" data-toc-modified-id="Where-should-the-headers-be-included?-5">Where should the headers be included?</a></span></li><li><span><a href="#Forward-declaration" data-toc-modified-id="Forward-declaration-6">Forward declaration</a></span></li></ul></div>
%% Cell type:markdown id: tags:
## Library and program
Contrary to Python or Ruby for instance, C++ is not a scripting language: it is intended to build either an **executable** or a **library**.
To summarize:
* An **executable** runs the content of the [`main() function`](http://localhost:8888/notebooks/1-ProceduralProgramming/4-Functions.ipynb#A-very-special-function:-main). There should be exactly one such function in all the compiled files; the file with this `main` must be compiled.
* A **library** is a collection of functions, classes and so on that might be used in a program. A library may be **header-only**: in this case it is just an ensemble of header files with no file compiled. In this case all the definitions must be either **inline** or **template**.
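As an illustration, a header-only library can be as small as the single file sketched below. The file name `tiny_math.hpp`, the namespace and both functions are made up for this example; the `#ifndef`/`#define`/`#endif` lines are header guards, which are explained later in this notebook.

```cpp
// File tiny_math.hpp - hypothetical header-only library (all names are illustrative)
#ifndef TINY_MATH_HPP
# define TINY_MATH_HPP

namespace tiny_math
{

    // 'inline' allows this definition to appear in every source file
    // that includes the header without breaking the link step.
    inline int Twice(int value)
    {
        return 2 * value;
    }

    // A template is instantiated where it is used, so its definition
    // must be visible in the header as well.
    // 'constexpr' is optional: it just allows compile-time evaluation.
    template<class T>
    constexpr T Square(T value)
    {
        return value * value;
    }

} // namespace tiny_math

#endif // TINY_MATH_HPP
```

A program that uses it only needs `#include "tiny_math.hpp"`: there is no library file to compile or to link against.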
### Static and shared libraries
A (non header-only) library may be built as one of the following types:
* A **static** library, usually with a **.a** extension, is actually included directly into any executable that requires it. The advantage is that you just need the bare executable to run your code: the library is no longer required at runtime. The drawback is that the storage space may balloon rather quickly: each executable embeds its own copy of the library code it uses!
* A **shared** library, whose extension varies from one OS to another (**.dylib** on macOS, **.so** on Linux, **.dll** on Windows, etc...), is on the other hand required at runtime by the executable that was built with it. The advantage is that executables are thus much smaller. Shared libraries are often described on the Web as the way to go; my personal experience with them is however less rosy, as each OS handles them differently (notably, the way to indicate in which locations the dynamic libraries should be looked for differs rather wildly...)
The best, if possible, is to enable generation of your library in either form... but it is a bit of work in your build system.
## Source file
Contrary to most more modern languages, C++ relies upon two very specific kinds of files, each with its own extension scheme. We will first introduce the source file, with which basic programs may be written, and then show why header files are also needed.
### Compilation of _Hello world!_
A source file is a type of file intended to be **compiled**.
Let's consider the seminal _Hello world_ in a dedicated source file named _hello.cpp_:
%% Cell type:code id: tags:
``` C++17
// File hello.cpp - I put "Code" as cell type in Jupyter to get nice colors but it's not intended
// to be executed in the cell!
#include <cstdlib> // for EXIT_SUCCESS
#include <iostream>
int main(int argc, char** argv)
{
    std::cout << "Hello world!" << std::endl;
    return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
To compile it on a Unix system, you will need to type in your terminal a line that looks like the following one (it works at least with the [GNU compiler for C++](https://en.wikipedia.org/wiki/GNU_Compiler_Collection) and [clang++](https://en.wikipedia.org/wiki/Clang)):
%% Cell type:code id: tags:
``` C++17
// In a terminal
g++ -std=c++17 hello.cpp -o hello
```
%% Cell type:markdown id: tags:
where:
- `g++` is the name of the compiler. You may provide clang++ if you wish.
- `-std=c++17` tells the compiler to use this version of the standard (at the time of writing `-std=c++20` begins to appear, but as its name indicates that standard is not yet published...). If not specified, compilers fall back to a default that depends on their version (an older standard for older releases) and may issue warnings if features introduced in a later standard are used.
- `hello.cpp` is the name of the source file.
- `hello` is the name of the executable produced. If the `-o hello` is omitted, the executable is arbitrarily named `a.out`, exactly as in C.
%% Cell type:markdown id: tags:
The executable may then be used with:
%% Cell type:code id: tags:
``` C++17
// In a terminal
./hello
```
%% Cell type:markdown id: tags:
The `./` is there to specify that the executable is to be looked for in the current directory; it may be omitted if `.` is present in the system `PATH` environment variable.
Please notice that the name of the file with the `main()` function and the name of the executable are entirely up to you; there is no requirement on the names of the files or of the executable.
%% Cell type:markdown id: tags:
### Source files extensions
The plural is not a mistake: unfortunately, contrary to many languages, there is no universal convention upon the extension to use for C++ files. There are widely spread conventions, but a library may choose not to follow them.
Editors and IDEs know the most common ones and usually provide a way to add your own, so that they may provide language recognition and all that goes with it (syntax coloring, completion helper and so on).
The most common extensions are **.cpp**, **.cc**, **.C** and more rarely **.cxx**.
My advice would be to choose one and stick to it; the only one I warn against is **.C** because some file systems (such as the default one on macOS) are case-insensitive and **.c** is the usual extension for C programs.
%% Cell type:markdown id: tags:
### Expanding our hello program with two source files: one for main, one for the function
This code is not very subtle: everything is in the same file, so we are in a very simplistic case in which only one file is compiled, and there is no need to find ways to specify how several files relate to each other.
You may imagine working in a single file is not a very common option: it hinders reusability, and it would be cumbersome to navigate in a file with thousands of lines of code or more.
We now want to separate the `main()` and the actual content of the code:
%% Cell type:code id: tags:
``` C++17
// File hello.cpp - no main inside
#include <iostream>
void hello()
{
    std::cout << "Hello world!" << std::endl;
}
```
%% Cell type:code id: tags:
``` C++17
// File main.cpp
#include <cstdlib> // for EXIT_SUCCESS
int main(int argc, char** argv)
{
    hello();
    return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
This brute force method does not work: a line in a terminal like:
%% Cell type:code id: tags:
``` C++17
// In a terminal
clang++ -std=c++17 hello.cpp main.cpp -o hello
```
%% Cell type:markdown id: tags:
would yield something like:
````verbatim
main.cpp:5:5: error: use of undeclared identifier 'hello'
    hello();
    ^
1 error generated.
````
%% Cell type:markdown id: tags:
## Header file
The issue above is that we need to inform the compiler, when it attempts to compile `main.cpp`, that the `hello()` function is something that exists. We need to **declare** it in a dedicated **header file** and **include** this file in each source file that needs it:
%% Cell type:code id: tags:
``` C++17
// File hello.hpp
void hello();
```
%% Cell type:code id: tags:
``` C++17
// File main.cpp
#include <cstdlib> // for EXIT_SUCCESS
#include "hello.hpp"
int main(int argc, char** argv)
{
    hello();
    return EXIT_SUCCESS;
}
```
%% Cell type:code id: tags:
``` C++17
// File hello.cpp - no main inside
#include <iostream>
#include "hello.hpp"
void hello()
{
    std::cout << "Hello world!" << std::endl;
}
```
%% Cell type:markdown id: tags:
With these few changes, the command line:
%% Cell type:code id: tags:
``` C++17
// In a terminal
clang++ -std=c++17 hello.cpp main.cpp -o hello
```
%% Cell type:markdown id: tags:
works as expected and creates a valid `hello` executable.
%% Cell type:markdown id: tags:
### Header location
In the example above `hello.hpp` was found because it was in the current folder. Let's suppose now we want to put include files in a directory named `incl`; to make it work there are actually two ways:
* Either modifying the path in the source file. We would get
````#include "incl/hello.hpp"```` in both hello.cpp and main.cpp.
* Or by giving the compiler the `-I` option on the command line to indicate in which path to look:
%% Cell type:code id: tags:
``` C++17
// In a terminal
clang++ -std=c++17 -Iincl hello.cpp main.cpp -o hello
```
%% Cell type:markdown id: tags:
As many `-I` as you wish may be provided on the command line; I would recommend not providing too many as it increases the risk of an ambiguity if two header files at different paths have the same name:
````verbatim
incl/foo.hpp
bar/incl/foo.hpp
````
and
````
clang++ -Iincl -Ibar/incl main.cpp
````
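leads to an ambiguity if there is `#include "foo.hpp"` in `main.cpp`: the compiler does not complain and silently picks the first match in the order of the `-I` options (here `incl/foo.hpp`), which may not be the file you meant.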
%% Cell type:markdown id: tags:
### `""` or `<>`?
You may have noticed I sometimes used `<>` and sometimes `""` to specify the path for the include.
The details don't matter that much in most cases, but it is better to:
* Use `<>` only for the system libraries: typically STL or C headers should be included in this form.
* Use `""` for your headers or for third-party libraries installed in specific locations.
If you want a bit more detail (this is the behaviour of the usual compilers):
* `""` will look first in the directory of the file that contains the `#include` directive, and then in the header file directories (those given with `-I` and the system ones).
* `<>` will look only in the header file directories.
%% Cell type:markdown id: tags:
### Header guards
During compilation, the `#include` directive is actually replaced by the content of the file whose path is provided. We may therefore quickly end up including the same content twice:
%% Cell type:code id: tags:
``` C++17
// File foo.hpp
class Foo
{ };
```
%% Cell type:code id: tags:
``` C++17
// File main.cpp
#include <cstdlib>
#include "foo.hpp"
#include "foo.hpp"
int main()
{
    return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
This doesn't compile: the translation unit ends up with two definitions of class `Foo`!
This might seem a simple enough mistake to fix, but in a project with even a few header files that include one another it quickly becomes too much of a hassle:
%% Cell type:code id: tags:
``` C++17
// File foo.hpp
class Foo
{ };
```
%% Cell type:code id: tags:
``` C++17
// File bar.hpp
#include "foo.hpp"
struct Bar
{
    Foo foo_;
};
```
%% Cell type:code id: tags:
``` C++17
// File main.cpp
#include <cstdlib>
#include "foo.hpp"
#include "bar.hpp" // Compilation error: "foo.hpp" is sneakily included here as well!
int main()
{
    return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
The fix is to indicate in each header file that it should be included only once. There is the easy but non standard approach, which I honestly didn't know until now was [so widely supported](https://en.wikipedia.org/wiki/Pragma_once#Portability) by compilers:
%% Cell type:code id: tags:
``` C++17
// File foo.hpp - Fix 1
#pragma once
class Foo
{ };
```
%% Cell type:markdown id: tags:
And the more tedious one called **header guards**, which relies only on standard features but is much more clunky:
%% Cell type:code id: tags:
``` C++17
// File foo.hpp - Fix 2
#ifndef FOO_H // If this macro is not yet defined, proceed to the rest of the file.
# define FOO_H // Immediately define it so the next inclusion won't include the file content again.
class Foo
{ };
#endif // FOO_H // End of the conditional block that began with #ifndef
```
%% Cell type:markdown id: tags:
To make that work in a program, you have to ensure that:
* Each macro name is unique: if `bar.hpp` also uses `FOO_H` as its guard, the content of one of the two files will never be included!
* The macros have not been defined elsewhere in another context.
In my code, to ensure the first issue never happens, I have written a [Python script](https://gitlab.inria.fr/MoReFEM/CoreLibrary/MoReFEM/raw/master/Scripts/header_guards.py) which iterates through all the C++ files in my library, identifies the header guards of each header file and checks that they are a mix of the project name and the path of the file. So definitely much more clunky than **#pragma once**! But as I said the latter is non standard and there are heated discussions about whether it is safe or not for all set-ups (at some point it was complicated to use if there were symbolic or hard links in the project).
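To see what the guard actually does, one can paste the content of a guarded header twice in a single file, which is what the preprocessor ends up with when the header is included twice. The sketch below does so for a hypothetical `MyProject/Geometry/point.hpp`: the project name, the path and the `Point` struct are assumptions made up for the example, and the macro name follows the "project name plus path" scheme mentioned above.

```cpp
// ---- First inclusion of MyProject/Geometry/point.hpp (hypothetical file) ----
#ifndef MYPROJECT_GEOMETRY_POINT_HPP // not defined yet: the block is kept
# define MYPROJECT_GEOMETRY_POINT_HPP

struct Point
{
    int x;
    int y;
};

#endif // MYPROJECT_GEOMETRY_POINT_HPP

// ---- Second inclusion of the very same file ----
#ifndef MYPROJECT_GEOMETRY_POINT_HPP // already defined: everything up to #endif is skipped
# define MYPROJECT_GEOMETRY_POINT_HPP

struct Point
{
    int x;
    int y;
};

#endif // MYPROJECT_GEOMETRY_POINT_HPP

// Point was defined exactly once, so it may be used normally.
constexpr int SumOfCoordinates(Point point)
{
    return point.x + point.y;
}
```

If the `#ifndef`/`#define`/`#endif` lines are removed, the compiler reports a redefinition of `Point`.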
%% Cell type:markdown id: tags:
### Header files extensions
The most common header file extensions are **.hpp**, **.h**, **.hh** and more rarely **.hxx**. I definitely do not recommend **.h**: this is also the extension used for C header files, and some compilers even issue a warning if you're using it in a C++ context.
#### My personal convention
Personally I am using both **.hpp** and **.hxx**:
* **.hpp** is for the declaration of functions, classes, and so on.
* **.hxx** is for the definitions of inline functions and templates.
The **.hxx** is included at the end of **.hpp** file; this way:
* End-users just include the **.hpp** files in their code; they **never** need to bother about including **.hxx** or not.
* The **.hpp** file is not too long and includes only declarations, along with Doxygen comments to explain the API.
And you may have noticed that standard library headers get no extension at all!
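A possible layout for the **.hpp** / **.hxx** split is sketched below, with both files shown in a single block so that it compiles on its own. The names `square.hpp`, `square.hxx` and `Square` are hypothetical, and the exact place where the **.hxx** is included (here just before the closing `#endif`) is an assumption about one reasonable way to do it.

```cpp
// ---- File square.hpp (hypothetical): declarations and their documentation ----
#ifndef SQUARE_HPP
# define SQUARE_HPP

/*!
 * \brief Returns the square of \a value.
 *
 * 'constexpr' is optional here; it just allows compile-time evaluation.
 */
template<class T>
constexpr T Square(T value);

// In the real file, the definitions below would be replaced by:
//     #include "square.hxx"

// ---- File square.hxx (hypothetical): definitions of templates and inline functions ----
template<class T>
constexpr T Square(T value)
{
    return value * value;
}

#endif // SQUARE_HPP
```

The end-user writes `#include "square.hpp"` only; the definitions come along automatically.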
%% Cell type:markdown id: tags:
## Why a build system: very basic CMake demonstration
Let's go back to our mighty "Hello world" example with a slight extension: we want to query the identity of the user and print it instead of "world". We will foolishly add this new function in yet another file for the sake of illustration only:
%% Cell type:code id: tags:
``` C++17
// File hello.hpp
#ifndef HELLO_HPP
#define HELLO_HPP
void hello();
#endif // HELLO_HPP
```
%% Cell type:code id: tags:
``` C++17
// File who-are-you.hpp
#ifndef WHO_ARE_YOU_H
#define WHO_ARE_YOU_H
#include <string>
std::string WhoAreYou();
#endif // WHO_ARE_YOU_H
```
%% Cell type:code id: tags:
``` C++17
// File hello.cpp
#include <iostream>
#include "hello.hpp"
#include "who-are-you.hpp"
void hello()
{
    auto identity = WhoAreYou();
    std::cout << "Hello " << identity << std::endl;
}
```
%% Cell type:code id: tags:
``` C++17
// File who-are-you.cpp
#include <iostream>
#include "who-are-you.hpp"
std::string WhoAreYou()
{
    std::string name;
    std::cout << "What's your name? ";
    std::cin >> name;
    return name;
}
```
%% Cell type:code id: tags:
``` C++17
// File main.cpp
#include <cstdlib> // For EXIT_SUCCESS
#include "hello.hpp"
int main(int argc, char** argv)
{
    hello();
    return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
Up to now, we would compile such a program manually with:
%% Cell type:code id: tags:
``` C++17
// In terminal
clang++ -std=c++17 -c hello.cpp
clang++ -std=c++17 -c main.cpp
clang++ -std=c++17 -c who-are-you.cpp
clang++ -std=c++17 *.o -o hello
```
%% Cell type:markdown id: tags:
The issue with that is that it's not robust at all: either you recompile everything all the time (and let's face it: it's tedious even with our limited number of files...) or you have to keep track of which files should be recompiled. For instance if `who-are-you.hpp` is modified, `hello.cpp` and `who-are-you.cpp` include it and must be recompiled, whereas if it is `hello.hpp`, `who-are-you.cpp` does not need to be.
Build systems (which we talked about briefly [here](/notebooks/6-InRealEnvironment/1-SetUpEnvironment.ipynb#Build-system)) were introduced to handle this automatically and limit the compilation to only what is required. Let's see a brief CMake configuration file, named by convention `CMakeLists.txt`:
%% Cell type:code id: tags:
``` C++17
// CMakeLists.txt
set(CMAKE_CXX_STANDARD 17 CACHE STRING "C++ standard; at least 17 is expected.")
add_executable(hello
               main.cpp
               hello.cpp
               who-are-you.cpp)
```
%% Cell type:code id: tags:
``` C++17
// In terminal
// Create a directory to separate the build artifacts from the source files
mkdir build
cd build
// Create the Makefile; as no generator was provided with -G, "Unix Makefiles" is chosen on a Unix system
cmake ..
make
```
%% Cell type:markdown id: tags:
These commands create the executable in the current directory (`build`); from now on, if we modify one file and run `make` again, the build system will rebuild all that needs it and nothing more.
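What "all that needs it" means can be made concrete with a toy model of the bookkeeping a build system does for us. The sketch below is not how CMake or make are implemented: it only hard-codes which project header is directly included by which source file in the example above, and the names `Include`, `project_includes` and `MustRecompile` are invented for the illustration. A real build system also follows indirect includes and compares file timestamps.

```cpp
#include <array>
#include <string_view>

// One "#include" relationship between a source file and a project header.
struct Include
{
    std::string_view source;
    std::string_view header;
};

// Direct includes of project headers in the "Hello" example of this section.
constexpr std::array<Include, 4> project_includes
{{
    { "main.cpp", "hello.hpp" },
    { "hello.cpp", "hello.hpp" },
    { "hello.cpp", "who-are-you.hpp" },
    { "who-are-you.cpp", "who-are-you.hpp" }
}};

// Tells whether 'source' must be recompiled when 'modified_header' changes.
// Only direct includes are considered in this toy model.
constexpr bool MustRecompile(std::string_view source, std::string_view modified_header)
{
    for (const auto& item : project_includes)
    {
        if (item.source == source && item.header == modified_header)
            return true;
    }

    return false;
}
```

For instance `MustRecompile("hello.cpp", "who-are-you.hpp")` is true whereas `MustRecompile("main.cpp", "who-are-you.hpp")` is false: touching `who-are-you.hpp` leaves the object file of `main.cpp` valid.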
%% Cell type:markdown id: tags:
If `hello.cpp` and `who-are-you.cpp` may also be used jointly for another executable, they may be put together in a library:
%% Cell type:code id: tags:
``` C++17
set(CMAKE_CXX_STANDARD 17 CACHE STRING "C++ standard; at least 17 is expected.")
add_library(hello_lib
            SHARED
            hello.cpp
            who-are-you.cpp)
add_executable(hello
               main.cpp)
target_link_libraries(hello
                      hello_lib)
```
%% Cell type:markdown id: tags:
SHARED may be replaced by STATIC to use a static library instead.
%% Cell type:markdown id: tags:
## Where should the headers be included?
* Each time a header is modified, all the source files that include it directly or indirectly are recompiled.
* Each time a source file is modified, only this source file is recompiled; some relinking for the libraries and executables that depend on it will also occur (linking is the step that glues together the object files and libraries; the term _compilation_ is often - including in this very tutorial - loosely used to encompass both compilation and link phases).
Thus it might seem a good idea to put as many `#include` directives as possible in the source files rather than in the headers, hence limiting the compilation time. This is generally very good advice... provided we do not err on the wrong side and still put enough in the header file:
%% Cell type:code id: tags:
``` C++17
// File foo.hpp
#ifndef FOO_HPP
# define FOO_HPP
#include <string>
void Print(std::string text);
#endif // FOO_HPP
```
%% Cell type:code id: tags:
``` C++17
// File foo.cpp
#include <iostream>
#include "foo.hpp"
void Print(std::string text)
{
    std::cout << "The text to be printed is: \"" << text << "\"." << std::endl;
}
```
%% Cell type:code id: tags:
``` C++17
// File main.cpp
#include <cstdlib>
#include "foo.hpp"
int main()
{
    Print("Hello world!");
    return EXIT_SUCCESS;
}
```
%% Cell type:markdown id: tags:
You may have noticed `string` and `iostream` are not dealt with the same way... and rightly so:
* `#include <iostream>` is only in the source file: it is actually needed only for `std::cout` and `std::endl`, which are implementation details of the `Print()` function: neither appears in the signature of the function.
* `#include <string>` is present in `foo.hpp` as it is required to give the information about the types used in the prototype. If you did not do that, each time you include `foo.hpp` you would also need to include `string` beforehand; doing so leads to unmaintainable code as you would have to track down all the includes that are required with each include...
So to put it in a nutshell:
* Put in the header files all the includes that are mandatory to make the prototypes understandable. A rule of thumb is that a source file that would only include the header file should be compilable:
%% Cell type:code id: tags:
``` C++17
// File foo.hpp
std::string Print();
```
%% Cell type:code id: tags:
``` C++17
// File check_foo.cpp
#include "foo.hpp" // DOES NOT COMPILE => header is ill-formed!
```
%% Cell type:markdown id: tags:
* Includes that are there for implementation details should on the other hand preferably be in source files. Of course, you may not be able to do that in every case: for instance templates are by construction defined in header files!
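The template case mentioned in the last item can be sketched as follows: since the body of the template lives in the header, the include it needs has to be there too. The file name `largest.hpp` and the function `Largest` are hypothetical and only serve the illustration.

```cpp
// File largest.hpp (hypothetical)
#ifndef LARGEST_HPP
# define LARGEST_HPP

#include <algorithm> // std::max: an implementation detail, but the body below is in the header
#include <array>     // std::array appears in the signature
#include <cstddef>   // std::size_t appears in the signature

// Returns the largest element of a non-empty array.
template<class T, std::size_t N>
constexpr T Largest(const std::array<T, N>& values)
{
    static_assert(N > 0, "At least one value is required.");

    T result = values[0];

    for (const auto& value : values)
        result = std::max(result, value);

    return result;
}

#endif // LARGEST_HPP
```

Had `Largest` been an ordinary function defined in a source file, `#include <algorithm>` could have moved to that source file, exactly as `<iostream>` did for `Print()`.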
%% Cell type:markdown id: tags:
The static code checker [cpplint](https://github.com/cpplint/cpplint) provides an _include what you use_ warning which tells you if you're using in a file a type whose header is not included there.
%% Cell type:markdown id: tags:
## Forward declaration