Commit 7d5c47e5 authored by THIBAULT Samuel

document on static scheduling

<div class="section">
<h3>How to generate static scheduling</h3>
<p>The examples above were using the StarPU dynamic schedulers. One can instead
inject a static schedule by adding a <tt>sched.rec</tt> file to the replay.</p>
<p>The <tt>tasks.rec</tt> file follows the recutils format: paragraphs are
separated by an empty line. Each paragraph represents a task to be executed,
with various pieces of information, some of which come from the native execution
that was performed when recording the trace:</p>
<ul>
<li>The <tt>Model</tt> field identifies the performance model to be used
(see more in the next paragraph).</li>
<li>The <tt>JobId</tt> field uniquely identifies the task.</li>
<li>The <tt>SubmitOrder</tt> field also uniquely identifies the task,
but according to the task submission order, which is thus stable.</li>
<li>The <tt>DependsOn</tt> field provides the list of the
identifiers of the tasks that this task depends on.</li>
<li>The <tt>Priority</tt> field provides the priority as set by the
application (higher is more urgent).</li>
<li>The <tt>WorkerId</tt> field provides the worker on which the task
was executed when the trace was recorded.</li>
<li>The <tt>MemoryNode</tt> field provides the corresponding memory node
on which the task was executed.</li>
<li>The <tt>SubmitTime</tt> field provides the time when the task was
submitted by the application. The scheduler usually does not
care about this.</li>
<li>The <tt>StartTime</tt> and <tt>EndTime</tt> fields provide the times
when the task was started and finished.</li>
<li>The <tt>GFlop</tt> field provides the number of billions of floating-point
operations performed by the task.</li>
<li>The <tt>Parameters</tt> field provides a description of the task
parameters.</li>
<li>The <tt>Handles</tt> field provides the pointers of the task
parameters. These can be used to relate the data input and output of
tasks.</li>
<li>The <tt>Modes</tt> field provides the access mode of the task
parameters: (R)ead-only, (R)ead-and-(W)rite, or
(W)rite-only.</li>
<li>The <tt>Sizes</tt> field provides the sizes of the task parameters,
in bytes.</li>
</ul>
<p>The performance of tasks on the different execution units can be obtained by
running <tt>starpu_perfmodel_recdump</tt>:</p>
<pre>
$ STARPU_HOSTNAME=mirage STARPU_PERF_MODEL_DIR=$PWD/sampling starpu_perfmodel_recdump
</pre>
<p>It first emits, in a <tt>%rec: timing</tt> section, a series of paragraphs,
one per set of measurements made for the same kind of task on the same data
size. Each paragraph contains:</p>
<ul>
<li>The <tt>Name</tt> field, which is the name of the performance model,
as referenced by the <tt>Model</tt> field in a task
paragraph.</li>
<li>The <tt>Architecture</tt> field describes the architecture on which
the set of measurements was made.</li>
<li>The <tt>Footprint</tt> field describes the data description
footprint, as referenced by the <tt>Footprint</tt> field in a
task paragraph. It is roughly a summary of the task parameters'
sizes.</li>
<li>The <tt>Size</tt> field provides the total size of the task parameters,
in bytes.</li>
<li>The <tt>Flops</tt> field provides the number of floating-point
operations that were performed by the task.</li>
<li>The <tt>Mean</tt> field provides the average of the measurements in
the set.</li>
<li>The <tt>Stddev</tt> field provides the standard deviation of the
measurements in the set.</li>
<li>The <tt>Samples</tt> field provides the number of measurements that
were made.</li>
</ul>
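<p>One such timing paragraph could for instance look like the following sketch;
the numbers, and the architecture name, are purely illustrative:</p>
<pre>
Name: starpu_sgemm_gemm
Architecture: cpu0_impl0
Footprint: 1234abcd
Size: 3145728
Flops: 2097152
Mean: 1234.56
Stddev: 12.34
Samples: 42
</pre>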
<p>Then the <tt>%rec: worker_count</tt> section describes the target platform, with
one paragraph per kind of execution unit:</p>
<ul>
<li>The <tt>Architecture</tt> field provides the name of the type of
execution unit, as referenced in the <tt>Architecture</tt> field
of the paragraphs mentioned above.</li>
<li>The <tt>NbWorkers</tt> field provides the number of workers of this
kind.</li>
</ul>
<p>Then the <tt>%rec: memory_workers</tt> section describes the memory layout of
the target platform, with one paragraph per memory node:</p>
<ul>
<li>The <tt>MemoryNode</tt> field provides the memory node number.</li>
<li>The <tt>Name</tt> field provides a user-friendly name for the memory
node.</li>
<li>The <tt>Size</tt> field provides the amount of space available in the
memory node (-1 if it is considered unbounded).</li>
<li>The <tt>Workers</tt> field provides the list of IDs of the workers using
the memory node.</li>
</ul>
<p>Worker IDs are numbered starting from 0, according to the order of the
paragraphs in the <tt>%rec: worker_count</tt> section.</p>
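<p>Put together, these two sections could for instance look like the following
sketch of a machine with four CPU workers sharing one unbounded memory node
(all values are illustrative):</p>
<pre>
%rec: worker_count

Architecture: cpu0_impl0
NbWorkers: 4

%rec: memory_workers

MemoryNode: 0
Name: RAM
Size: -1
Workers: 0 1 2 3
</pre>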
<p>A static schedule can then be expressed by producing a <tt>sched.rec</tt> file
containing one paragraph per task. Each of them must contain a
<tt>SubmitOrder</tt> field containing the submission identifier (as referenced
in the <tt>SubmitOrder</tt> field of <tt>tasks.rec</tt>). The reason why the
<tt>JobId</tt> is not used is that StarPU may generate internal tasks, which
change job IDs. The <tt>SubmitOrder</tt>, on the contrary, only depends on
the application submission loop, and is thus completely stable, making it even
possible to inject the static schedule into a native execution with the real
application.</p>
<p>The paragraph can then contain several kinds of scheduling directives, either to
force task placement for instance, or to guide the StarPU dynamic scheduler:</p>
<ul>
<li><tt>Priority</tt> will override the application-provided priority,
and possibly be taken into account by a StarPU dynamic
scheduler.</li>
<li><tt>SpecificWorker</tt> specifies the worker ID on which this task will
be executed.</li>
<li><tt>Workers</tt> provides a list of workers that the task will be
allowed to execute on. This allows one to restrict the
execution location without necessarily deciding it
completely, e.g. to specify a given memory node or worker type.</li>
<li><tt>DependsOn</tt> provides a list of submission IDs of tasks that
this task should be made to depend on, in addition to the
dependencies set in <tt>tasks.rec</tt>. This allows one to
inject artificial dependencies into the task graph.</li>
<li><tt>Workerorder</tt> allows forcing a specific ordering of tasks on
a given worker (whose ID must also be set with
<tt>SpecificWorker</tt>). The tasks will be executed in the
given contiguous order, i.e. the worker will wait for the task
with workerorder 0 to be submitted, then execute it, then wait
for the task with workerorder 1 to be submitted, then execute it,
and so on.</li>
</ul>
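<p>For example, a <tt>sched.rec</tt> paragraph using these directives could look
like the following sketch (all IDs are illustrative): it bumps the priority of
the task submitted 12th, restricts it to a set of workers, and adds an
artificial dependency on the task submitted 10th:</p>
<pre>
SubmitOrder: 12
Priority: 5
Workers: 0 1 2 3
DependsOn: 10
</pre>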
<p>For instance, a completely static schedule can be set by setting, for each task,
both the <tt>SpecificWorker</tt> and the <tt>Workerorder</tt> fields, thus
specifying for each task respectively on which worker it shall run, and its
ordering on that worker.</p>
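<p>Such a completely static schedule could for instance start as follows, pinning
the first two submitted tasks to worker 1 in a fixed order (the IDs are
illustrative):</p>
<pre>
SubmitOrder: 0
SpecificWorker: 1
Workerorder: 0

SubmitOrder: 1
SpecificWorker: 1
Workerorder: 1
</pre>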
<p>When the <tt>SpecificWorker</tt> field is set for a task, or its
<tt>Workers</tt> field corresponds to only one memory node, StarPU will
automatically prefetch the data during execution. One can however also set
prefetches by hand in <tt>sched.rec</tt> by using a paragraph containing:</p>
<ul>
<li>A <tt>Prefetch</tt> field which specifies the submission ID of the
task for which data should be prefetched.</li>
<li>A <tt>MemoryNode</tt> field which specifies the memory node on which
the data should be prefetched.</li>
<li>A <tt>Parameters</tt> field which specifies the indexes of the
task parameters that should be prefetched (currently only one at
a time is supported).</li>
<li>An optional <tt>DependsOn</tt> field to make this prefetch wait for
tasks, whose submission IDs are provided.</li>
</ul>
<p>This for instance allows one not to specify precise task scheduling hints, but
to provide data prefetch hints which will probably guide the scheduler into a
given data placement.</p>
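<p>As a sketch, the following paragraph (with illustrative IDs) would request
that the first parameter of the task submitted 12th be prefetched to memory
node 1, once the task submitted 10th has completed:</p>
<pre>
Prefetch: 12
MemoryNode: 1
Parameters: 0
DependsOn: 10
</pre>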
<div class="section bot">
<p class="updated">
Last updated on 2019/09/06.