Fix the test scripts (!38) · Merge requests · POTTIER Francois / inferno

SCHERER Gabriel requested to merge gscherer/inferno:better-tests into master Feb 01, 2022

This MR fixes two bugs that were introduced recently in the test scripts.

The random testsuite would behave unexpectedly on certain bugs, aborting with incomplete output and no clear error report. This is due to the use of exit 1 in FTypeCheckerWrapper that prevents the testing framework from catching the error and behaving correctly. Instead we re-raise the exception.
make quick, which was introduced to avoid 20s of testing time, would incorrectly not run the cram tests (see client/test/suite.t/run.t). We fix this by re-introducing the INFERNO_SLOW_TESTS option and its use in TestMLRandom. The make test and make quick targets were kept and work as intended, and we changed the default setting to run slow tests when INFERNO_SLOW_TESTS is not set (@fpottier: we assume that this is your preference, and the reason why you ditched that code in the first place). We checked that the "fast tests" part of TestMLRandom is still effective at finding silly bugs in Infer.ml, so having it is much better than skipping this test entirely.

On FTypecheckerWrapper

The situation before this PR is completely broken. Even with the PR, the situation is only semi-satisfying, I think. It's not nice that FTypecheckerWrapper re-raises an exception that does not appear in its module interface, but in some other module.

I propose to get rid of FTypecheckerWrapper, and have a typeof variant in FTypechecker that prints an error message in the error case. Then we need to decide which interface to give to this function:

we could keep raising the Type_error exception, and let downstream users deal with that; in the common case, doing nothing is the correct thing for them
or we could re-raise a simpler exception with less information; I see no benefit in that approach
or we could not raise, and instead return a (debruijn_type, type_error) result

If we go for (1), you may still want to handle the exception on each callsite, as a sort of programming discipline. If that is your preference, I think that (3) would actually be the better API.

On `make quick`

Here would be my recommended testing policy:

Keep dune runtest as the main test driver. Some control of which tests are run is possible (dune runtest client/test does something), even if badly documented. Do not try to execute programs in _build directly. We don't use the make targets in practice (in particular because we don't need the --force behavior which increases testing time without a strong reason), it should be possible to do the right thing with invocations of dune runtest. (But of course it's fine if you (@fpottier) like the Makefile target better, we keep them working as they should.)
Have each test program in charge of deciding which tests to run in "fast" mode (INFERNO_SLOW_TESTS is 0), so that we can systematically run them all and get the expected behavior.

(If you don't like using an environment variable, we can discuss other control interfaces such as command-line flags.)

Edited Feb 01, 2022 by SCHERER Gabriel

Admin message

Fix the test scripts

On FTypecheckerWrapper

On make quick

Merge request reports

On `make quick`