Fix task pruning on WAR dependencies between nodes
As discussed regarding the TPDS paper, we don't have WAR dependencies between nodes yet, but they could happen, so we need to check for them too.
Activity
@thibault Can you please give an example of the errors that the current handling of WAR dependencies could cause?
Assuming that data A has owner node 0, and data B and C have owner node 1:
starpu_mpi_insert_task(&foo, STARPU_W, A, 0);
starpu_mpi_insert_task(&bar, STARPU_R, A, STARPU_W, B, 0);
starpu_mpi_insert_task(&foo, STARPU_W, A, 0);
starpu_mpi_insert_task(&bar, STARPU_R, A, STARPU_W, C, 0);
1st and 3rd tasks will thus run on node 0, and 2nd and 4th will run on node 1.
If the insert_task wrapper for foo only looks at the owner of A, it will prune the 3rd task insertion on node 1. The 4th task insertion on node 1 will then think that it still has the right value for A, and will not post a receive for it. Meanwhile, node 0 will send the new value of A (since it knows that node 1's copy is outdated), and thus we have a send/receive mismatch.
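The mismatch described above can be reproduced with a toy model. The sketch below is not StarPU code: `struct task`, `replay()` and the two "thinks cached" arrays are hypothetical names introduced only for illustration, and the model assumes two nodes with each task executing on the owner of the data it writes. It replays the four insertions and counts the sends posted by the owner of A and the receives posted by the reader.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { DATA_A, DATA_B, DATA_C, NDATA };

/* Hypothetical task description: at most one read and exactly one write. */
struct task {
    int exec_node; /* node executing the task (owner of the written data) */
    int reads;     /* data read, or -1 if none */
    int writes;    /* data written */
};

/* A is owned by node 0, B and C by node 1, as in the example. */
static const int owner[NDATA] = { 0, 1, 1 };

static const struct task example[] = {
    { 0, -1,     DATA_A }, /* foo(W A)      runs on node 0 */
    { 1, DATA_A, DATA_B }, /* bar(R A, W B) runs on node 1 */
    { 0, -1,     DATA_A }, /* foo(W A)      runs on node 0 */
    { 1, DATA_A, DATA_C }, /* bar(R A, W C) runs on node 1 */
};

struct counts { int sends; int recvs; };

/* Replay the insertions as seen by both nodes.
 * prune_on_owner_only: the node that does not execute a task skips the
 * insertion entirely, so it never learns that its cached copy is stale. */
struct counts replay(bool prune_on_owner_only)
{
    bool owner_thinks_cached[NDATA] = { false };  /* owner's view of the remote cache */
    bool reader_thinks_cached[NDATA] = { false }; /* remote node's view of its own cache */
    struct counts c = { 0, 0 };

    for (size_t i = 0; i < sizeof example / sizeof example[0]; i++) {
        const struct task *t = &example[i];
        assert(t->writes >= 0 && t->writes < NDATA);

        /* Remote read: each side decides independently whether a transfer is needed. */
        if (t->reads >= 0 && owner[t->reads] != t->exec_node) {
            if (!owner_thinks_cached[t->reads]) {
                c.sends++;
                owner_thinks_cached[t->reads] = true;
            }
            if (!reader_thinks_cached[t->reads]) {
                c.recvs++;
                reader_thinks_cached[t->reads] = true;
            }
        }

        /* The writer always knows that remote copies are now outdated. */
        owner_thinks_cached[t->writes] = false;
        /* The other node only learns this if it did not prune the insertion. */
        if (!prune_on_owner_only)
            reader_thinks_cached[t->writes] = false;
    }
    return c;
}
```

In this model, `replay(true)` ends with one more send than receives, which is the mismatch from the example, while `replay(false)` keeps both counts equal because node 1 invalidates its copy of A when it sees the 3rd insertion.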
Edited by THIBAULT Samuel
Now, to make the code less hairy, I was thinking that we could have helpers to write task insertion wrappers this way:
foo() {
    BEGIN_ACCESS
    ACCESS_R(A)
    ACCESS_W(B)
    END_ACCESS
    task_insert(foo, STARPU_R, A, STARPU_W, B, 0);
}
and ACCESS_R/W will do whatever accounting is needed, and END_ACCESS will potentially return if it is known that none of the data is owned by the node, and none of them is cached.
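One possible shape for such helpers is sketched below. Everything here is an assumption made for illustration: `struct data_view`, the macro bodies, the `inserted_tasks` counter and the cache updates standing in for the real insertion are hypothetical, not StarPU definitions. The only rule taken from the proposal is that END_ACCESS returns when none of the data is owned by the node and none of it is cached.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-node view of one piece of data; not a StarPU type. */
struct data_view {
    bool owned;  /* this node owns the data */
    bool cached; /* this node holds a valid cached copy */
};

/* Counts the wrapper calls that were not pruned on this node. */
static int inserted_tasks;

/* BEGIN_ACCESS opens a scope, each ACCESS_* records whether this node is
 * concerned by the data, and END_ACCESS returns early when it is not. */
#define BEGIN_ACCESS { bool involved_ = false;
#define ACCESS_R(d)  if ((d)->owned || (d)->cached) involved_ = true;
#define ACCESS_W(d)  if ((d)->owned || (d)->cached) involved_ = true;
#define END_ACCESS   if (!involved_) return; }

/* Wrapper for a task that writes a. */
static void foo(struct data_view *a)
{
    assert(a != NULL);
    BEGIN_ACCESS
    ACCESS_W(a)
    END_ACCESS
    /* Stand-in for starpu_mpi_insert_task(&foo, STARPU_W, a, 0). */
    inserted_tasks++;
    if (!a->owned)
        a->cached = false; /* a remote write invalidates our cached copy */
}

/* Wrapper for a task that reads a and writes b. */
static void bar(struct data_view *a, struct data_view *b)
{
    assert(a != NULL && b != NULL);
    BEGIN_ACCESS
    ACCESS_R(a)
    ACCESS_W(b)
    END_ACCESS
    /* Stand-in for starpu_mpi_insert_task(&bar, STARPU_R, a, STARPU_W, b, 0). */
    inserted_tasks++;
    if (b->owned && !a->owned)
        a->cached = true;  /* we execute the task, so we receive a */
    if (!b->owned)
        b->cached = false; /* a remote write invalidates our cached copy */
}
```

Seen from node 1 in the four-task example (A not owned, B and C owned), the first `foo(&A)` is pruned, but the second is not, because A is cached by then, so the cached copy is invalidated before the last `bar` reads A again.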
Edited by THIBAULT Samuel
- Regarding your example:
I am certainly missing the point but isn't the problem more general than WAR? Said differently, is the problem only for WAR or is it potentially the case for WAW too?
In your example, what happens if the 2nd task is in write mode instead: starpu_mpi_insert_task(&bar, STARPU_W, A, STARPU_W, B, 0)? In this case we can even remove the 1st task. Wouldn't it lead to the same problem? Wouldn't the third task be pruned on node 1 as well?
- If a helper like the one you suggest fits, it would indeed help readability a lot. If it could furthermore optionally do the stats, it would be excellent.
- Overall, although this is a conceptually trivial problem, I think it is easy to get lost (I am already lost with a 4-task example :-/ ). My feeling is that if we want to design advanced numerical algorithms (with bulge-chasing for instance, ...) and/or innovative data mappings, these checks will be a precious time saver.
See !27 (merged)
mentioned in merge request !27 (merged)