starpu · Merge request !66

Replicated tasks

Open. Antoine Jego requested to merge replicated_tasks into master on Sep 29, 2022.

This MR adds functionality to StarPU-MPI so that a task can be replicated across several nodes and the results of the replicated tasks can be used by other nodes.

At the time this MR was created, the feature is implemented as follows, after discussions with the StarPU team:

  • A node n that is not the home_node h of a handle D can execute a task without sending the result back to h by using the STARPU_SAME flag. This sets the cache for D as if n had received a coherent value. We expect the application to have h execute the same task (h should be the last of the replicating nodes to insert the task, because of sequential consistency).
  • For this data to be used, i.e. for n to be a source of D, other nodes should set n as a source for D through the set_source function.
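The two points above can be illustrated with a small toy model. This is only a sketch of the intended semantics, not StarPU-MPI code: the `toy_*` names, the `struct toy_handle` layout and `MAX_NODES` are hypothetical and do not reflect the actual signatures introduced by this MR. The model also assumes that a plain write by the home node resets every alternative source, which is one possible reading of the first test scenario in the to-do list.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_NODES 8 /* hypothetical bound, for the toy model only */

/* Toy model of a single handle D. Not the StarPU data structure. */
struct toy_handle {
    int home_node;         /* h: the node that owns D */
    int source[MAX_NODES]; /* source[p]: node that p asks for D */
    bool valid[MAX_NODES]; /* valid[p]: p holds a coherent copy of D */
};

static void toy_init(struct toy_handle *d, int home_node)
{
    assert(home_node >= 0 && home_node < MAX_NODES);
    d->home_node = home_node;
    for (int p = 0; p < MAX_NODES; p++) {
        d->source[p] = home_node;       /* by default, everyone asks h */
        d->valid[p] = (p == home_node); /* only h starts with a valid copy */
    }
}

/* Models the STARPU_SAME idea: n ran the replicated task itself, so its
 * copy is marked coherent and nothing is sent back to h. */
static void toy_mark_same(struct toy_handle *d, int n)
{
    assert(n >= 0 && n < MAX_NODES);
    d->valid[n] = true;
}

/* Models the set_source idea: reader will fetch D from src, not from h. */
static void toy_set_source(struct toy_handle *d, int reader, int src)
{
    assert(reader >= 0 && reader < MAX_NODES);
    assert(src >= 0 && src < MAX_NODES);
    d->source[reader] = src;
}

/* A plain (non-replicated) write by h: every other copy becomes stale.
 * Assumption of this sketch: alternative sources are reset as well. */
static void toy_home_write(struct toy_handle *d)
{
    for (int p = 0; p < MAX_NODES; p++) {
        d->valid[p] = (p == d->home_node);
        d->source[p] = d->home_node;
    }
}

/* Returns the node that supplies D to reader: reader itself when its
 * copy is already valid, else its configured source, falling back to h
 * when that source has no valid copy. */
static int toy_read(struct toy_handle *d, int reader)
{
    assert(reader >= 0 && reader < MAX_NODES);
    if (d->valid[reader])
        return reader;
    int src = d->source[reader];
    if (!d->valid[src])
        src = d->home_node;
    d->valid[reader] = true;
    return src;
}
```

In this model, if P0 is the home node, P1 calls `toy_mark_same` and P2 calls `toy_set_source(&d, 2, 1)`, then a read by P2 is served by P1. After `toy_home_write`, reads by P1 and P2 are both served by P0 again.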

To-do list:

  • examples
    • (re)set_source
    • use STARPU_SAME
    • combine both
  • tests (assuming H is owned by P0)
    • P1 becomes an alternative source of H for P2, then P0 writes H: P1 and P2 should both receive H from P0 if they read it
    • P2 reads H, then P1 becomes an alternative source of H for P2, then P2 reads H again: either P2 forgets that it holds a valid value from P0 and the second read happens anyway, or P1 receives the cache state of P0 so that the unnecessary send is avoided
    • P1 becomes an alternative source of H for P2, then P2 reads H without P0 knowing, then P2 resets its alternative source before reading H: either P2 receives H from P0 (reset = flush) or P1 sends its cache to P0 (reset = exchange)
  • Fortran interface
  • docs
Edited Dec 14, 2022 by Antoine Jego
Source branch: replicated_tasks