Commit d06da593 authored by Nathalie Furmento

Merge branch 'master' into llvm-openmp

parents ca45de6c 6920a593
......@@ -31,6 +31,7 @@ Namyst Raymond, Université de Bordeaux, <raymond.namyst@labri.fr>
Nesi Lucas Leandro, Federal University of Rio Grande do Sul (UFRGS), <llnesi@inf.ufrgs.br>
Pablo Joris, Inria, <joris.pablo@orange.fr>
Pasqualinotto Damien, Université de Bordeaux, <dam.pasqualinotto@wanadoo.fr>
Pinto Vinicius Garcia, <vgpinto@inf.ufrgs.br>
Pitoiset Samuel, Inria, <samuel.pitoiset@inria.fr>
Quôc-Dinh Nguyen, IT Sud-Paris, <nguyen.quocdinh@gmail.com>
Roelandt Cyril, Inria, <cyril.roelandt@inria.fr>
......
......@@ -86,6 +86,7 @@ Small features:
* Add STARPU_SCHED_SORTED_ABOVE and STARPU_SCHED_SORTED_BELOW environment
variables.
* Add STARPU_SCHED_SIMPLE_PRE_DECISION.
* Add starpu_bcsr_filter_canonical_block_get_nchildren.
StarPU 1.3.7
====================================================================
......
......@@ -753,66 +753,90 @@ to a less optimal solution. This increases even more computation time.
\section starvz Trace visualization with StarVZ
Creating views with StarVZ (see: https://github.com/schnorr/starvz) is made up of two steps. The initial
stage consists of a pre-processing of the traces generated by the application.
The second step consists of the analysis itself and is carried out with the
aid of R packages. To download and install StarVZ, it is necessary to have R,
pajeng and the following packages:
Creating views with StarVZ (see: https://github.com/schnorr/starvz) is
made up of two steps. The initial stage consists of a pre-processing
of the traces generated by the application, while the second one
consists of the analysis itself and is carried out with R packages'
aid. StarVZ is available at CRAN
(https://cran.r-project.org/package=starvz) and depends on pj_dump
(from pajeng) and rec2csv (from recutils).
To download and install StarVZ, it is necessary to have R,
pajeng, and recutils:
\verbatim
# For pajeng
apt install -y git cmake build-essential libboost-dev asciidoc flex bison
git clone git://github.com/schnorr/pajeng.git
mkdir -p pajeng/b ; cd pajeng/b
cmake ..
make
# For pj_dump and rec2csv
apt install -y pajeng recutils
# For R tidyverse
# For R
apt install -y r-base libxml2-dev libssl-dev libcurl4-openssl-dev libgit2-dev libboost-dev
\endverbatim
To install the StarVZ the following commands can be used:
To install the StarVZ, the following command can be used:
\verbatim
git clone https://github.com/schnorr/starvz.git
echo "install.packages(c('tidyverse', 'devtools'), repos = 'https://cloud.r-project.org')" | R --vanilla
echo "library(devtools); devtools::install_local(path='./starvz/R_package')" | R --vanilla
echo "install.packages('starvz', repos = 'https://cloud.r-project.org')" | R --vanilla
\endverbatim
To generate traces from an application, it is necessary to set \ref STARPU_GENERATE_TRACE.
and build StarPU with FxT. Then, Step 1 of StarVZ can be used on a folder with
StarPU FxT traces:
To generate traces from an application, it is necessary to set \ref STARPU_GENERATE_TRACE
and build StarPU with FxT. Then, StarVZ can be used on a folder with
StarPU FxT traces to produce a default view:
\verbatim
export PATH=$(Rscript -e 'cat(system.file("tools/", package = "starvz"), sep="\n")'):$PATH
starvz /foo/path-to-fxt-files
\endverbatim
An example of default view:
\image html starvz_visu.png
\image latex starvz_visu.pdf "" width=\textwidth
One can also use existing trace files (paje.trace, tasks.rec,
data.rec, papi.rec and dag.dot) skipping the StarVZ internal call to
starpu_fxt_tool with:
\verbatim
export PATH=starvz/:$PATH
export PATH=pajeng/b:$PATH
export PATH=$STARPU_HOME/bin:$PATH
starvz --use-paje-trace /foo/path-to-trace-files
\endverbatim
./starvz/src/phase1-workflow.sh /tmp/ ""
Alternatively, each StarVZ step can be executed separately. Step 1 can
be used on a folder with:
\verbatim
starvz -1 /foo/path-to-fxt-files
\endverbatim
Then the second step can be executed directly in R, StarVZ enables a set of
different plots that can be configured on a .yaml file. A default file is provided
<c>full_config.yaml</c>; also the options can be changed directly in R.
Then the second step can be
executed directly in R. StarVZ enables a set of different plots that
can be configured on a .yaml file. A default file is provided
(<c>default.yaml</c>); also, the options can be changed directly in
R.
\verbatim
library(starvz)
dtrace <- the_fast_reader_function("./")
library(dplyr)
dtrace <- starvz_read("./", selective = FALSE)
pajer <- config::get(file = "starvz/full_config.yaml")
# show idleness ratio
dtrace$config$st$idleness = TRUE
pajer$starpu$active = TRUE
pajer$submitted$active = TRUE
pajer$st$abe$active = TRUE
# show ABE bound
dtrace$config$st$abe$active = TRUE
plot <- the_master_function(dtrace)
# find the last task with dplyr
dtrace$config$st$tasks$list = dtrace$Application %>% filter(End == max(End)) %>% .$JobId
# show last task dependencies
dtrace$config$st$tasks$active = TRUE
dtrace$config$st$tasks$levels = 50
plot <- starvz_plot(dtrace)
\endverbatim
An example of visualization follows:
\image html starvz_visu.png
\image latex starvz_visu.eps "" width=\textwidth
\image html starvz_visu_r.png
\image latex starvz_visu_r.pdf "" width=\textwidth
\section MemoryFeedback Memory Feedback
......
......@@ -470,7 +470,7 @@ starpu_mpi_barrier(MPI_COMM_WORLD);
\section MPIInsertTaskUtility MPI Insert Task Utility
To save the programmer from having to explicit all communications, StarPU
provides an "MPI Insert Task Utility". The principe is that the application
provides an "MPI Insert Task Utility". The principle is that the application
decides a distribution of the data over the MPI nodes by allocating it and
notifying StarPU of this decision, i.e. tell StarPU which MPI node "owns"
which data. It also decides, for each handle, an MPI tag which will be used to
......@@ -571,7 +571,7 @@ to provide a dynamic policy.
A function starpu_mpi_task_build() is also provided with the aim to
only construct the task structure. All MPI nodes need to call the
function, which posts the required send/recv on the various nodes which have to.
function, which posts the required send/recv on the various nodes as needed.
Only the node which is to execute the task will then return a
valid task structure, others will return <c>NULL</c>. This node must submit the task.
All nodes then need to call the function starpu_mpi_task_post_build() -- with the same
......@@ -637,7 +637,7 @@ saves, a quick and easy way is to measure the submission time of just one of the
MPI nodes. This can be achieved by running the application on just one MPI node
with the following environment variables:
\code
\code{.sh}
export STARPU_DISABLE_KERNELS=1
export STARPU_MPI_FAKE_RANK=2
export STARPU_MPI_FAKE_SIZE=1024
......@@ -1095,6 +1095,7 @@ disabled in NewMadeleine by compiling it with the profile
To build NewMadeleine, download the latest version from the website (or,
better, use the Git version to use the most recent version), then:
\code{.sh}
cd pm2/scripts
./pm2-build-packages ./<the profile you chose> --prefix=<installation prefix>
......
......@@ -23,7 +23,7 @@
StarPU can use Simgrid in order to simulate execution on an arbitrary
platform. This was tested with SimGrid from 3.11 to 3.16, and 3.18 to
3.26. SimGrid version 3.25 needs to be configured with -Denable_msg=ON .
3.27. SimGrid version 3.25 needs to be configured with -Denable_msg=ON .
Other versions may have compatibility issues. 3.17 notably does not build at
all. MPI simulation does not work with version 3.22.
......
Binary files a/doc/doxygen/chapters/images/starvz_visu.eps and /dev/null differ
......@@ -29,6 +29,7 @@ program nf_vector
type(c_ptr) :: dh_vb ! a pointer for the 'vb' vector data handle
integer(c_int) :: err ! return status for fstarpu_init
integer(c_int) :: ncpu ! number of cpus workers
integer(c_int) :: bool_ret
allocate(va(5))
va = (/ (i,i=1,5) /)
......@@ -49,6 +50,26 @@ program nf_vector
stop 77
end if
! illustrate use of pause/resume/is_paused
bool_ret = fstarpu_is_paused()
if (bool_ret /= 0) then
stop 1
end if
call fstarpu_pause
bool_ret = fstarpu_is_paused()
if (bool_ret == 0) then
stop 1
end if
call fstarpu_resume
bool_ret = fstarpu_is_paused()
if (bool_ret /= 0) then
stop 1
end if
! allocate an empty perfmodel structure
perfmodel_vec = fstarpu_perfmodel_allocate()
......@@ -73,6 +94,12 @@ program nf_vector
! optionally set 'where' field to CPU only
call fstarpu_codelet_set_where(cl_vec, FSTARPU_CPU)
! set 'type' field to SEQ (for demonstration purpose)
call fstarpu_codelet_set_type(cl_vec, FSTARPU_SEQ)
! set 'max_parallelism' field to 1 (for demonstration purpose)
call fstarpu_codelet_set_max_parallelism(cl_vec, 1)
! add a Read-only mode data buffer to the codelet
call fstarpu_codelet_add_buffer(cl_vec, FSTARPU_R)
......
......@@ -119,11 +119,6 @@ void init_problem_callback(void *arg)
}
}
unsigned get_bcsr_nchildren(struct starpu_data_filter *f, starpu_data_handle_t handle)
{
return (unsigned)starpu_bcsr_get_nnz(handle);
}
void call_filters(void)
{
......@@ -131,7 +126,7 @@ void call_filters(void)
struct starpu_data_filter vector_in_f, vector_out_f;
bcsr_f.filter_func = starpu_bcsr_filter_canonical_block;
bcsr_f.get_nchildren = get_bcsr_nchildren;
bcsr_f.get_nchildren = starpu_bcsr_filter_canonical_block_get_nchildren;
/* the children use a matrix interface ! */
bcsr_f.get_child_ops = starpu_bcsr_filter_canonical_block_child_ops;
......
......@@ -35,13 +35,13 @@ int main()
starpu_worker_get_ids_by_type(STARPU_CPU_WORKER, procs, ncpus);
struct starpu_worker_collection *co = (struct starpu_worker_collection*)malloc(sizeof(struct starpu_worker_collection));
co->has_next = worker_list.has_next;
co->get_next = worker_list.get_next;
co->add = worker_list.add;
co->remove = worker_list.remove;
co->init = worker_list.init;
co->deinit = worker_list.deinit;
co->init_iterator = worker_list.init_iterator;
co->has_next = starpu_worker_list.has_next;
co->get_next = starpu_worker_list.get_next;
co->add = starpu_worker_list.add;
co->remove = starpu_worker_list.remove;
co->init = starpu_worker_list.init;
co->deinit = starpu_worker_list.deinit;
co->init_iterator = starpu_worker_list.init_iterator;
co->type = STARPU_WORKER_LIST;
FPRINTF(stderr, "ncpus %u\n", ncpus);
......
......@@ -44,13 +44,13 @@ int main()
starpu_worker_get_ids_by_type(STARPU_CPU_WORKER, procs, ncpus);
struct starpu_worker_collection *co = (struct starpu_worker_collection*)calloc(1, sizeof(struct starpu_worker_collection));
co->has_next = worker_tree.has_next;
co->get_next = worker_tree.get_next;
co->add = worker_tree.add;
co->remove = worker_tree.remove;
co->init = worker_tree.init;
co->deinit = worker_tree.deinit;
co->init_iterator = worker_tree.init_iterator;
co->has_next = starpu_worker_tree.has_next;
co->get_next = starpu_worker_tree.get_next;
co->add = starpu_worker_tree.add;
co->remove = starpu_worker_tree.remove;
co->init = starpu_worker_tree.init;
co->deinit = starpu_worker_tree.deinit;
co->init_iterator = starpu_worker_tree.init_iterator;
co->type = STARPU_WORKER_TREE;
FPRINTF(stderr, "ncpus %u \n", ncpus);
......
......@@ -103,6 +103,10 @@ module fstarpu_mod
type(c_ptr), bind(C) :: FSTARPU_NL_REGRESSION_BASED
type(c_ptr), bind(C) :: FSTARPU_MULTIPLE_REGRESSION_BASED
type(c_ptr), bind(C) :: FSTARPU_SEQ
type(c_ptr), bind(C) :: FSTARPU_SPMD
type(c_ptr), bind(C) :: FSTARPU_FORKJOIN
! (some) portable iso_c_binding types
type(c_ptr), bind(C) :: FSTARPU_SZ_C_DOUBLE
type(c_ptr), bind(C) :: FSTARPU_SZ_C_FLOAT
......@@ -199,6 +203,12 @@ module fstarpu_mod
subroutine fstarpu_resume() bind(C,name="starpu_resume")
end subroutine fstarpu_resume
! int starpu_is_paused(void);
function fstarpu_is_paused() bind(C,name="starpu_is_paused")
use iso_c_binding, only: c_int
integer(c_int) :: fstarpu_is_paused
end function fstarpu_is_paused
! void starpu_shutdown(void);
subroutine fstarpu_shutdown () bind(C,name="starpu_shutdown")
end subroutine fstarpu_shutdown
......@@ -713,6 +723,18 @@ module fstarpu_mod
type(c_ptr), value, intent(in) :: where ! C function expects an intptr_t
end subroutine fstarpu_codelet_set_where
subroutine fstarpu_codelet_set_type (cl, type_constant) bind(C)
use iso_c_binding, only: c_ptr
type(c_ptr), value, intent(in) :: cl
type(c_ptr), value, intent(in) :: type_constant ! C function expects an intptr_t
end subroutine fstarpu_codelet_set_type
subroutine fstarpu_codelet_set_max_parallelism (cl, max_parallelism) bind(C)
use iso_c_binding, only: c_ptr,c_int
type(c_ptr), value, intent(in) :: cl
integer(c_int), value, intent(in) :: max_parallelism
end subroutine fstarpu_codelet_set_max_parallelism
function fstarpu_perfmodel_allocate () bind(C)
use iso_c_binding, only: c_ptr
type(c_ptr) :: fstarpu_perfmodel_allocate
......@@ -2475,6 +2497,13 @@ module fstarpu_mod
FSTARPU_MULTIPLE_REGRESSION_BASED = &
fstarpu_get_constant(C_CHAR_"FSTARPU_MULTIPLE_REGRESSION_BASED"//C_NULL_CHAR)
FSTARPU_SEQ = &
fstarpu_get_constant(C_CHAR_"FSTARPU_SEQ"//C_NULL_CHAR)
FSTARPU_SPMD = &
fstarpu_get_constant(C_CHAR_"FSTARPU_SPMD"//C_NULL_CHAR)
FSTARPU_FORKJOIN = &
fstarpu_get_constant(C_CHAR_"FSTARPU_FORKJOIN"//C_NULL_CHAR)
! Initialize size constants as 'c_ptr'
FSTARPU_SZ_C_DOUBLE = sz_to_p(c_sizeof(FSTARPU_SZ_C_DOUBLE_dummy))
FSTARPU_SZ_C_FLOAT = sz_to_p(c_sizeof(FSTARPU_SZ_C_FLOAT_dummy))
......
......@@ -536,6 +536,11 @@ void starpu_pause(void);
*/
void starpu_resume(void);
/**
Return !0 if task processing by workers is currently paused, 0 otherwise.
*/
int starpu_is_paused(void);
/**
Value to be passed to starpu_get_next_bindid() and
starpu_bind_thread_on() when binding a thread which will
......
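The contract documented for starpu_is_paused() above (non-zero while task processing is paused, 0 otherwise) is what the Fortran example in this commit checks around fstarpu_pause and fstarpu_resume. The sketch below is a toy model of that contract in plain C. It assumes that pause requests nest through a depth counter, which this diff does not state, and every toy_* name is hypothetical; it is not StarPU's implementation.

```c
#include <assert.h>

/* Toy model only: a nesting depth counter, NOT StarPU's implementation.
 * All toy_* names are hypothetical and exist only for this illustration. */
static int toy_pause_depth = 0;

/* Each pause request increases the nesting depth (assumed behaviour). */
void toy_pause(void)
{
	toy_pause_depth++;
}

/* Each resume request undoes one pause request.
 * Resuming while not paused is treated here as a caller bug. */
void toy_resume(void)
{
	assert(toy_pause_depth > 0);
	toy_pause_depth--;
}

/* Same convention as the documented return value:
 * non-zero while paused, 0 otherwise. */
int toy_is_paused(void)
{
	return toy_pause_depth > 0;
}
```

With this model, the sequence exercised by the Fortran example (query, pause, query, resume, query) yields 0, non-zero, 0, which is why that example stops with an error code whenever a query disagrees.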
......@@ -314,8 +314,16 @@ void starpu_data_partition_not_automatic(starpu_data_handle_t handle);
Partition a block-sparse matrix into dense matrices.
starpu_data_filter::get_child_ops needs to be set to
starpu_bcsr_filter_canonical_block_child_ops()
and starpu_data_filter::get_nchildren set to
starpu_bcsr_filter_canonical_block_get_nchildren().
*/
void starpu_bcsr_filter_canonical_block(void *father_interface, void *child_interface, struct starpu_data_filter *f, unsigned id, unsigned nparts);
/**
Return the number of children obtained with starpu_bcsr_filter_canonical_block().
*/
unsigned starpu_bcsr_filter_canonical_block_get_nchildren(struct starpu_data_filter *f, starpu_data_handle_t handle)
;
/**
Return the child_ops of the partition obtained with starpu_bcsr_filter_canonical_block().
*/
......
......@@ -670,6 +670,7 @@ struct starpu_task
when using ::STARPU_R and alike.
*/
starpu_data_handle_t handles[STARPU_NMAXBUFS];
/**
Array of Data pointers to the memory node where execution
will happen, managed by the DSM.
......@@ -677,6 +678,7 @@ struct starpu_task
This is filled by StarPU.
*/
void *interfaces[STARPU_NMAXBUFS];
/**
Used only when starpu_codelet::nbuffers is \ref
STARPU_VARIABLE_NBUFFERS.
......@@ -764,6 +766,9 @@ struct starpu_task
already executing. The callback is passed
the value contained in the starpu_task::epilogue_callback_arg field.
No callback is executed if the field is set to <c>NULL</c>.
With starpu_task_insert() and alike this can be specified thanks to
::STARPU_EPILOGUE_CALLBACK followed by the function pointer.
*/
void (*epilogue_callback_func)(void *);
......@@ -835,7 +840,8 @@ struct starpu_task
*/
void *prologue_callback_arg;
/** Optional field, the default value is <c>NULL</c>. This is a
/**
Optional field, the default value is <c>NULL</c>. This is a
function pointer of prototype <c>void (*f)(void*)</c>
which specifies a possible callback. If this pointer is
non-<c>NULL</c>, the callback function is executed on the host
......@@ -848,6 +854,7 @@ struct starpu_task
::STARPU_PROLOGUE_CALLBACK_POP followed by the function pointer.
*/
void (*prologue_callback_pop_func)(void *);
/**
Optional field, the default value is <c>NULL</c>. This is
the pointer passed to the prologue_callback_pop function. This
......
......@@ -88,14 +88,15 @@ extern "C"
#define STARPU_EXECUTE_ON_DATA (7<<STARPU_MODE_SHIFT)
/**
Used when calling starpu_task_in_sert(), must be followed by an array of
Used when calling starpu_task_insert(), must be followed by an array of
handles and the number of elements in the array (as int). This is equivalent
to passing the handles as separate parameters with STARPU_R/W/RW.
to passing the handles as separate parameters with ::STARPU_R,
::STARPU_W or ::STARPU_RW.
*/
#define STARPU_DATA_ARRAY (8<<STARPU_MODE_SHIFT)
/**
Used when calling starpu_task_in_sert(), must be followed by an array of
Used when calling starpu_task_insert(), must be followed by an array of
struct starpu_data_descr and the number of elements in the array (as int).
This is equivalent to passing the handles with the corresponding modes.
*/
......@@ -322,8 +323,7 @@ extern "C"
/**
Used when calling starpu_task_insert() and alike, must be followed
by a void* specifying the value to be set in the sched_data field of the
task.
by a void* specifying the value to be set in starpu_task::sched_data
*/
#define STARPU_TASK_SCHED_DATA (41<<STARPU_MODE_SHIFT)
......@@ -375,7 +375,6 @@ int starpu_task_set(struct starpu_task *task, struct starpu_codelet *cl, ...);
starpu_task_set((task), (cl), STARPU_TASK_FILE, __FILE__, STARPU_TASK_LINE, __LINE__, ##__VA_ARGS__)
#endif
/**
Create a task corresponding to \p cl with the following arguments.
The argument list must be zero-terminated. The arguments
......
......@@ -95,12 +95,30 @@ extern "C"
#endif
/**
When building with a GNU C Compiler, defined to __attribute__((visibility ("internal")))
When building with a GNU C Compiler, defined to __attribute__((visibility ("default")))
*/
#ifdef __GNUC__
# define STARPU_ATTRIBUTE_INTERNAL __attribute__ ((visibility ("internal")))
# define STARPU_ATTRIBUTE_VISIBILITY_DEFAULT __attribute__ ((visibility ("default")))
#else
# define STARPU_ATTRIBUTE_INTERNAL
# define STARPU_ATTRIBUTE_VISIBILITY_DEFAULT
#endif
/**
When building with a GNU C Compiler, defined to #pragma GCC visibility push(hidden)
*/
#ifdef __GNUC__
# define STARPU_VISIBILITY_PUSH_HIDDEN #pragma GCC visibility push(hidden)
#else
# define STARPU_VISIBILITY_PUSH_HIDDEN
#endif
/**
When building with a GNU C Compiler, defined to #pragma GCC visibility pop
*/
#ifdef __GNUC__
# define STARPU_VISIBILITY_POP #pragma GCC visibility pop
#else
# define STARPU_VISIBILITY_POP
#endif
/**
......
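A note on the visibility macros in the hunk above: in standard C, the result of a macro expansion is never processed as a preprocessing directive, so a macro whose replacement list is a literal #pragma line does not act as a pragma at its point of use. C99 provides the _Pragma operator for this purpose. The sketch below shows one possible way to write such macros. The DEMO_* macro names and the demo_* functions are hypothetical and are not part of StarPU.

```c
#include <stdio.h>

/* Hypothetical DEMO_* macros, written with the C99 _Pragma operator so
 * that they can legally appear in a macro replacement list. */
#ifdef __GNUC__
#  define DEMO_VISIBILITY_PUSH_HIDDEN _Pragma("GCC visibility push(hidden)")
#  define DEMO_VISIBILITY_POP _Pragma("GCC visibility pop")
#  define DEMO_ATTRIBUTE_VISIBILITY_DEFAULT __attribute__((visibility("default")))
#else
#  define DEMO_VISIBILITY_PUSH_HIDDEN
#  define DEMO_VISIBILITY_POP
#  define DEMO_ATTRIBUTE_VISIBILITY_DEFAULT
#endif

/* Declarations between push and pop get hidden visibility by default. */
DEMO_VISIBILITY_PUSH_HIDDEN
int demo_internal_helper(int x);
/* An explicit attribute takes precedence over the pushed default. */
DEMO_ATTRIBUTE_VISIBILITY_DEFAULT int demo_public_entry(int x);
DEMO_VISIBILITY_POP

/* Hidden helper: callable inside the library, not exported from it. */
int demo_internal_helper(int x)
{
	return 2 * x;
}

/* Exported entry point that relies on the hidden helper. */
int demo_public_entry(int x)
{
	return demo_internal_helper(x) + 1;
}
```

Visibility only affects which symbols a shared object exports, so both functions behave identically when called from the same translation unit; the difference shows up in the dynamic symbol table of a shared library built from this code.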