Aggregate database measures to increase reliability of results

With #3 (closed), we will cache benchmarks' IPC in a database for future use.

While doing that, we might as well require each measurement to be made several times (twice, three times, or some other configurable count). Then, when fetching the cached result:

either we do not have enough datapoints yet, in which case we take another measurement and commit it to the database;
or we have enough measurements, in which case we aggregate them into a single value.
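
The two cases above could be sketched roughly as follows. This is only an illustration: the dict standing in for the database, the `measure` callback, the `MIN_SAMPLES` threshold, the use of the mean as the aggregate, and returning `None` while datapoints are still missing are all assumptions, not decisions made in this issue.

```python
import statistics
from typing import Callable, Dict, List, Optional

MIN_SAMPLES = 3  # hypothetical threshold; the issue leaves the exact count open


def fetch_ipc(
    db: Dict[str, List[float]],        # stand-in for the real database
    benchmark: str,
    measure: Callable[[str], float],   # hypothetical function that runs the benchmark
    min_samples: int = MIN_SAMPLES,
) -> Optional[float]:
    """Return the aggregated IPC, or None while datapoints are still missing."""
    samples = db.setdefault(benchmark, [])

    # Case 1: not enough datapoints yet, so measure once more and commit it.
    if len(samples) < min_samples:
        samples.append(measure(benchmark))

    # Still short after this fetch: nothing reliable to report yet (assumed behaviour).
    if len(samples) < min_samples:
        return None

    # Case 2: enough measurements, aggregate them into a single value.
    return statistics.mean(samples)
```

With this shape, each fetch adds at most one measurement, so the cost of the repeated runs is spread across several fetches rather than paid up front.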

The main benefit is that we could check the variance across the measurements of an experiment and decide whether they are reliable.
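
One possible form for such a reliability check is sketched below, using the coefficient of variation (sample standard deviation divided by the mean). The `is_reliable` name and the 5% threshold are assumptions chosen for illustration; the issue does not specify a criterion.

```python
import statistics
from typing import Sequence


def is_reliable(samples: Sequence[float], max_cv: float = 0.05) -> bool:
    """Accept the samples if their relative spread is at most max_cv (assumed 5%)."""
    if len(samples) < 2:
        return False  # a spread cannot be estimated from fewer than two datapoints

    mean = statistics.mean(samples)
    if mean == 0:
        return False  # relative spread is undefined for a zero mean

    # Coefficient of variation: sample standard deviation relative to the mean.
    cv = statistics.stdev(samples) / abs(mean)
    return cv <= max_cv
```

A relative measure is used here so that the same threshold can apply to benchmarks with very different IPC values; an absolute variance bound would work as well if that is preferred.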
