Debug non-terminating Melissa runs
The dummy simulation tests are timing out with increasing frequency (cf. issue #71). The goal of this issue is to rule out that @cconrads' changes in !29 are responsible for this.
In the list below, a commit is called new if it is at least as recent as commit 46234bbe (i.e., when !29 was merged); a commit is called old if it is not new (i.e., it is from before !29 was merged). Before !29 was merged, there were multiple `option.py` files for the heat PDE example, and they asked Melissa to compute different things, e.g., not the Sobol' indices. This issue assumes that the options from `examples/heat_example/local/options.py.in` are used. The dummy simulation refers to the code in `tests/dummy-simulation.c`.
Edit:
Check for termination in the following cases:
- Melissa server, launcher from new commit; Melissa client from old commit
  - build Melissa server, launcher from a new commit
  - build Melissa client from an old commit
  - run the heat PDE example [TERMINATING]
  - run the dummy simulation [NOT TERMINATING]
  - Reasoning: The changes made to the launcher should not affect the experiment outcome.
- Melissa server, launcher from old commit; Melissa client from new commit
  - run the heat PDE example [TERMINATING]
  - run the dummy simulation [TERMINATING but all non-root MPI jobs crash in `src/api/melissa_api.c` with a segmentation fault]
  - Reasoning: The changes made to the source files, including the former `api/melissa_api.c`, should not affect the experiment outcome.
- Melissa server, launcher, and client from old commit
  - run the heat PDE example (tests `TestHeatc1`, `TestHeatc2`) [TERMINATING]
  - Reasoning: determine problems caused by the computer running the experiments
- Modify options
  - The dummy simulation test and the heat PDE example compute the same statistic but the dummy simulation
    - computes only two time steps instead of 100,
    - computes only a field of dimension two.
  - Reasoning: @cconrads can make the dummy simulation test pass reliably on his computer by inserting sleep statements, but this approach does not work with the heat PDE example. Hence the options must make a difference.
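To record the [TERMINATING] / [NOT TERMINATING] outcomes above in a reproducible way, each run could be wrapped in a timeout. The following is only a minimal sketch: the helper name `check_termination` and the timeout value are assumptions, not part of Melissa, and the command to run has to be replaced by however the heat PDE example or the dummy simulation is started locally.

```python
import subprocess
import sys


def check_termination(cmd, timeout_s):
    """Run cmd and report whether it finished within timeout_s seconds."""
    try:
        proc = subprocess.run(cmd, timeout=timeout_s)
    except subprocess.TimeoutExpired:
        # subprocess.run kills the direct child before re-raising the exception.
        return "NOT TERMINATING"
    if proc.returncode == 0:
        return "TERMINATING"
    # A non-zero code means the run ended, but not cleanly. On POSIX a negative
    # code means the child was killed by a signal (e.g. -11 for SIGSEGV).
    return f"TERMINATING (exit code {proc.returncode})"


if __name__ == "__main__":
    # Placeholder command; substitute the actual test invocation here.
    print(check_termination([sys.executable, "-c", "pass"], timeout_s=10))
```

Note that the timeout only kills the process started directly by `subprocess.run`. Processes spawned by it (for example MPI ranks started through a launcher) may survive and need to be cleaned up separately.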
Further possibilities:
- Attaching valgrind to the server and/or client
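
For the valgrind idea, a possible starting point is to run every MPI rank under valgrind and write one log file per process. This is a sketch under assumptions: the helper `valgrind_mpi_command` and the binary path are hypothetical, and it presumes the binary can be started directly with `mpirun` rather than through the Melissa launcher.

```python
import shlex


def valgrind_mpi_command(binary, ranks, log_prefix="valgrind"):
    """Build an mpirun command line that runs each rank of binary under valgrind."""
    return [
        "mpirun", "-n", str(ranks),
        "valgrind",
        "--track-origins=yes",               # report where uninitialised values come from
        "--error-exitcode=1",                # make memory errors visible in the exit status
        f"--log-file={log_prefix}.%p.log",   # %p expands to the PID: one log per process
        binary,
    ]


if __name__ == "__main__":
    # "./dummy-simulation" is a hypothetical path to the compiled test binary.
    print(shlex.join(valgrind_mpi_command("./dummy-simulation", ranks=2)))
```

Separate log files per process should make it easier to see which non-root rank hits the segmentation fault first.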