Commit 78c5f854 authored by GILLES Sebastien

- #22 Fix the `std::unique` example which was broken.

- #21: a basic C++ 20 example for ranges has been crafted instead of lifted from a blog post.
parent 69c20283
...@@ -354,11 +354,13 @@ ...@@ -354,11 +354,13 @@
"#include <iostream>\n", "#include <iostream>\n",
"\n", "\n",
"{\n", "{\n",
" std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 }; \n", " std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 };\n",
" \n",
" std::cout << \"Initially there are \" << int_vec.size() << \" values.\" << std::endl;\n",
" \n", " \n",
" std::unique(int_vec.begin(), int_vec.end());\n", " std::unique(int_vec.begin(), int_vec.end());\n",
" \n", " \n",
" std::cout << \"The unique values are (or not...): \";\n", " std::cout << \"The \" << int_vec.size() << \" unique values are (or not...): \";\n",
" \n", " \n",
" for (auto item : int_vec)\n", " for (auto item : int_vec)\n",
" std::cout << item << \" \";\n", " std::cout << item << \" \";\n",
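Review note on the hunk above: `std::unique` only collapses runs of *consecutive* equal elements, which is why the call on the unsorted vector leaves every value in place. A minimal sketch of the difference, assuming the same nine input values as the notebook (`logical_size_after_unique` is a hypothetical helper used for illustration, not something from the notebook):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only: returns how many elements lie
// before the iterator returned by std::unique, optionally sorting first.
// The vector is taken by value so the caller's data is left untouched.
std::size_t logical_size_after_unique(std::vector<int> values, bool sort_first)
{
    if (sort_first)
        std::sort(values.begin(), values.end());

    auto logical_end = std::unique(values.begin(), values.end());
    return static_cast<std::size_t>(logical_end - values.begin());
}
```

With the notebook's values, the unsorted call reports 9 (no two equal values are adjacent) and the sorted call reports 7.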
...@@ -388,11 +390,13 @@ ...@@ -388,11 +390,13 @@
"\n", "\n",
"{\n", "{\n",
" std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 }; \n", " std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 }; \n",
" \n",
" std::cout << \"Initially there are \" << int_vec.size() << \" values.\" << std::endl; \n",
" \n", " \n",
" std::sort(int_vec.begin(), int_vec.end());\n", " std::sort(int_vec.begin(), int_vec.end());\n",
" std::unique(int_vec.begin(), int_vec.end());\n", " std::unique(int_vec.begin(), int_vec.end());\n",
" \n", " \n",
" std::cout << \"The unique values are (really this time): \";\n", " std::cout << \"The \" << int_vec.size() << \" unique values are (really this time?): \";\n",
" \n", " \n",
" for (auto item : int_vec)\n", " for (auto item : int_vec)\n",
" std::cout << item << \" \";\n", " std::cout << item << \" \";\n",
...@@ -405,14 +409,25 @@ ...@@ -405,14 +409,25 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"Personally I have in my Utilities library a function `EliminateDuplicate()` which calls both in a row." "You may have noticed here that we're not done yet, but on closer inspection it is nonetheless better:\n",
"\n",
"- The list has been sorted properly (except the last two elements).\n",
"- Most of the duplicates have been properly removed (-21 for instance...), but there are new ones ('87' and '100', which happen to be the last two elements)?\n"
] ]
}, },
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"### std::remove_if" "It seems that the first 7 elements provide the behaviour you sought (provided of course you didn't mind the ordering; if you do, `std::unique` is not the answer to your requirement) and the last 2 elements are garbage...\n",
"\n",
"In fact, when looking into the definition of `std::unique`, we neglected the return value of the function, which is an iterator to the **logical end** of the expected series...\n",
"\n",
"Algorithms are written to be **as generic as possible**: they only see iterators, not the container itself, and thus can't perform operations that change its size, such as allocating memory (seen earlier in this notebook with `std::back_inserter`) or deallocating it.\n",
"\n",
"To achieve what you seek you have basically two options:\n",
"\n",
"1. Do not modify the vector but leverage the iterator information you have:\n"
] ]
}, },
{ {
...@@ -427,17 +442,16 @@ ...@@ -427,17 +442,16 @@
"\n", "\n",
"{\n", "{\n",
" std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 }; \n", " std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 }; \n",
" \n",
" std::remove_if(int_vec.begin(), int_vec.end(),\n",
" [](int value)\n",
" { \n",
" return value % 2 != 0;\n",
" });\n",
" \n", " \n",
" std::cout << \"The even values are (or not...): \";\n", " std::cout << \"Initially there are \" << int_vec.size() << \" values.\" << std::endl; \n",
" \n",
" std::sort(int_vec.begin(), int_vec.end());\n",
" auto logical_end = std::unique(int_vec.begin(), int_vec.end());\n",
" \n",
" std::cout << \"The \" << logical_end - int_vec.begin() << \" unique values are: \"; // size() is still 9!\n",
" \n", " \n",
" for (auto item : int_vec)\n", " for (auto it = int_vec.cbegin(); it != logical_end; ++it)\n",
" std::cout << item << \" \";\n", " std::cout << *it << \" \";\n",
" \n", " \n",
" std::cout << std::endl;\n", " std::cout << std::endl;\n",
"}" "}"
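Review note: to make the "logical end" idea concrete, here is a simplified sketch of how an algorithm in the spirit of `std::unique` can be written using only iterators. It is an illustration under that assumption, not the standard library's actual implementation, and `unique_sketch` is a hypothetical name. Since it never touches the container, it can only overwrite elements and report where the kept range stops.

```cpp
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for std::unique (illustration only).
// Kept elements are compacted toward the front; the container is never
// resized because the function only ever sees iterators.
template<class ForwardIt>
ForwardIt unique_sketch(ForwardIt first, ForwardIt last)
{
    if (first == last)
        return last;

    ForwardIt result = first; // position of the last element kept so far

    while (++first != last)
    {
        // A value differing from the last kept one starts a new run:
        // advance 'result' and move the value there unless it is already in place.
        if (!(*result == *first) && ++result != first)
            *result = std::move(*first);
    }

    return ++result; // one past the last kept element: the logical end
}
```

Applied to the notebook's values once sorted, this sketch returns an iterator 7 positions past `begin()` while the vector still holds 9 elements. For `std::unique` itself, the standard leaves the values past the returned iterator unspecified.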
...@@ -447,15 +461,7 @@ ...@@ -447,15 +461,7 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"So what happens this time? [cplusplus.com](http://www.cplusplus.com/reference/algorithm/remove_if/) tells that it _Transforms the range \\[first,last) into a range with all the elements for which pred returns true removed, and returns an iterator to the new end of that range_.\n", "2. Or use the container method to erase properly the unneeded elements:"
"\n",
"In other words, `std::remove_if`:\n",
"\n",
"* Place at the beginning of the vector the values to be kept.\n",
"* Returns an iterator to the **logical end** of the expected series...\n",
"* But does not deallocate the memory! (and keeps the container's `size()` - see below)\n",
"\n",
"So to print the relevant values only, you should do:\n"
] ]
}, },
{ {
...@@ -470,20 +476,19 @@ ...@@ -470,20 +476,19 @@
"\n", "\n",
"{\n", "{\n",
" std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 }; \n", " std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 }; \n",
" \n",
" auto logical_end = std::remove_if(int_vec.begin(), int_vec.end(),\n",
" [](int value)\n",
" { \n",
" return value % 2 != 0;\n",
" });\n",
" \n", " \n",
" std::cout << \"The even values are: \";\n", " std::cout << \"Initially there are \" << int_vec.size() << \" values.\" << std::endl; \n",
" \n",
" std::sort(int_vec.begin(), int_vec.end());\n",
" auto logical_end = std::unique(int_vec.begin(), int_vec.end());\n",
" int_vec.erase(logical_end, int_vec.end()); // Actually removes the trailing elements: size() shrinks, capacity() does not.\n",
" \n",
" std::cout << \"The \" << int_vec.size() << \" unique values are (really this time!): \"; // size() may therefore be used.\n",
" \n", " \n",
" for (auto it = int_vec.cbegin(); it != logical_end; ++it)\n", " for (auto item : int_vec)\n",
" std::cout << *it << \" \";\n", " std::cout << item << \" \";\n",
" \n", " \n",
" std::cout << std::endl;\n", " std::cout << std::endl;\n",
" std::cout << \"But the size of the vector is still \" << int_vec.size() << std::endl;\n",
"}" "}"
] ]
}, },
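Review note: `erase` destroys the trailing elements and lowers `size()`, but a `std::vector` keeps its allocated storage afterwards. A small sketch that records both quantities around the sort/unique/erase sequence (`SizeReport` and `sort_unique_erase` are hypothetical names introduced for this illustration):

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical record of the vector's state before and after the operation.
struct SizeReport
{
    std::size_t size_before;
    std::size_t size_after;
    std::size_t capacity_before;
    std::size_t capacity_after;
};

// Hypothetical helper: sorts, removes duplicates, and reports how size()
// and capacity() evolved.
SizeReport sort_unique_erase(std::vector<int>& values)
{
    SizeReport report{};
    report.size_before = values.size();
    report.capacity_before = values.capacity();

    std::sort(values.begin(), values.end());
    values.erase(std::unique(values.begin(), values.end()), values.end());

    report.size_after = values.size();
    report.capacity_after = values.capacity();
    return report;
}
```

If the memory itself must be handed back, `shrink_to_fit()` can be called afterwards, keeping in mind that it is only a non-binding request.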
...@@ -491,7 +496,14 @@ ...@@ -491,7 +496,14 @@
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
"source": [ "source": [
"And if you want to reduce this size, you should use the `std::vector::erase()` method:" "It should be noted that there are no requirements in the standard on the values past the logical end: we could have ended up with values other than '87' and '100'. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Of course, you may encapsulate the second possibility in a custom-made function if you need it often (document it properly and specify that it reorders the elements in the container):"
] ]
}, },
{ {
...@@ -502,29 +514,81 @@ ...@@ -502,29 +514,81 @@
"source": [ "source": [
"#include <vector>\n", "#include <vector>\n",
"#include <algorithm>\n", "#include <algorithm>\n",
"#include <iostream>\n",
"\n", "\n",
"template<class T>\n",
"void EliminateDuplicate(std::vector<T>& vector)\n",
"{\n",
" std::sort(vector.begin(), vector.end());\n",
" vector.erase(std::unique(vector.begin(), vector.end()),\n",
" vector.end());\n",
"}"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"{\n", "{\n",
" std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 }; \n", " std::vector<int> int_vec { -9, 87, 11, 0, -21, 100, -21, 17, -21 }; \n",
" \n",
" auto logical_end = std::remove_if(int_vec.begin(), int_vec.end(),\n",
" [](int value)\n",
" { \n",
" return value % 2 != 0;\n",
" });\n",
" \n",
" int_vec.erase(logical_end, int_vec.end());\n",
" \n", " \n",
" std::cout << \"The even values are: \";\n", " std::cout << \"Initially there are \" << int_vec.size() << \" values.\" << std::endl; \n",
" \n",
" EliminateDuplicate(int_vec); \n",
" \n",
" std::cout << \"The \" << int_vec.size() << \" unique values are (really this time!): \";\n",
" \n", " \n",
" for (auto item : int_vec)\n", " for (auto item : int_vec)\n",
" std::cout << item << \" \";\n", " std::cout << item << \" \";\n",
" \n", " \n",
" std::cout << std::endl;\n", " std::cout << std::endl;\n",
" std::cout << \"And the size of the vector is correctly \" << int_vec.size() << std::endl;\n",
"}" "}"
] ]
}, },
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### std::remove_if"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The situation is exactly the same for [`std::remove_if`](http://www.cplusplus.com/reference/algorithm/remove_if/), which despite its name can't actually remove elements from the container, for the same reasons. So you also need to take care of the removal through the dedicated container method:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"#include <list>\n",
"#include <algorithm>\n",
"#include <iostream>\n",
"\n",
"\n",
"std::list<int> list { -9, 87, 11, 0, -21, 100, -21, 17, -21 }; \n",
"\n",
"auto logical_end = std::remove_if(list.begin(), list.end(),\n",
" [](int value)\n",
" { \n",
" return value % 2 != 0;\n",
" });\n",
"\n",
"list.erase(logical_end, list.end());\n",
"\n",
"std::cout << \"The even values are: \";\n",
"\n",
"for (auto item : list)\n",
" std::cout << item << \" \";\n",
"\n",
"std::cout << std::endl;"
]
},
{ {
"cell_type": "markdown", "cell_type": "markdown",
"metadata": {}, "metadata": {},
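Review note: for `std::list` specifically, the container also has a `remove_if` member function that unlinks the matching nodes itself, so no separate `erase` call is needed. A minimal sketch using the same predicate as the cell above (`drop_odd_values` is a hypothetical helper name):

```cpp
#include <list>

// Hypothetical helper: removes the odd values in place through the list's
// own member function, which shrinks the list directly.
void drop_odd_values(std::list<int>& values)
{
    values.remove_if([](int value)
                     {
                         return value % 2 != 0;
                     });
}
```

On the notebook's values this leaves `0` and `100` in the list. For containers without such a member, C++20 also provides the free function `std::erase_if`, which covers the remove-then-erase sequence in one call.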
...@@ -541,7 +605,7 @@ ...@@ -541,7 +605,7 @@
"\n", "\n",
"It is also important to highlight that while the STL algorithms may provide you with efficiency (this library is written by highly skilled engineers after all), this is not their main draw: the algorithms are written to be as generic as possible. The primary reason to use them is to allow you to think at a higher level of abstraction, not to get the fastest possible implementation. So if your ~~intuition~~ benchmarking has shown that the standard library is causing a critical slowdown, you are free to explore classic alternatives such as [loop unrolling](https://en.wikipedia.org/wiki/Loop_unrolling) - that's one of the strengths of the language (and the STL itself opens up this possibility directly for some of its constructs - you may for instance use your own memory allocator when defining a container). For most purposes however that will not be necessary.\n", "It is also important to highlight that while the STL algorithms may provide you with efficiency (this library is written by highly skilled engineers after all), this is not their main draw: the algorithms are written to be as generic as possible. The primary reason to use them is to allow you to think at a higher level of abstraction, not to get the fastest possible implementation. So if your ~~intuition~~ benchmarking has shown that the standard library is causing a critical slowdown, you are free to explore classic alternatives such as [loop unrolling](https://en.wikipedia.org/wiki/Loop_unrolling) - that's one of the strengths of the language (and the STL itself opens up this possibility directly for some of its constructs - you may for instance use your own memory allocator when defining a container). For most purposes however that will not be necessary.\n",
"\n", "\n",
"FYI, C++ 20 introduces a completely new way to deal with algorithms, which does not rely on direct use of iterators but instead on a range library. This leads to a syntax which is more akin to what is done in other languages - see for instance this example lifted from this [blog post](https://www.modernescpp.com/index.php/c-20-the-ranges-library):\n" "FYI, C++ 20 introduces a completely new way to deal with algorithms, which does not rely on direct use of iterators but instead on a range library. This leads to a syntax which is more akin to what is done in other languages (see [@ Coliru](https://coliru.stacked-crooked.com/a/23b256ac459633ff)):\n"
] ]
}, },
{ {
...@@ -559,12 +623,13 @@ ...@@ -559,12 +623,13 @@
"int main(int argc, char** argv) \n", "int main(int argc, char** argv) \n",
"{\n", "{\n",
"\n", "\n",
" std::vector<int> numbers = {1, 2, 3, 4, 5, 6};\n", " std::vector<int> pi_digits = { 3, 1, 4, 1, 5, 9};\n",
" \n", " \n",
" auto results = numbers | std::views::filter([](int n){ return n % 2 == 0; })\n", " auto results = pi_digits | std::views::filter([](int n){ return n % 2 != 0; })\n",
" | std::views::transform([](int n){ return n * 2; });\n", " | std::views::reverse;\n",
" \n", " \n",
" for (auto v: results) std::cout << v << \" \"; // 4 8 12\n", " for (auto v: results) \n",
" std::cout << v << \" \"; // 9 5 1 1 3\n",
"\n", "\n",
" return EXIT_SUCCESS;\n", " return EXIT_SUCCESS;\n",
"}" "}"
......