- Dec 01, 2016
-
-
PRUVOST Florent authored
  - use the (starpu_cpu_func_t) 1 trick, same as for cuda_func
  - CPU functions are no longer defined, which avoids the dependency on coreblas
  - add #if !defined(CHAMELEON_SIMULATION) where it is needed
  - remove the dependency on the coreblas library (it has become useless)
  - remove the useless simucblas and simulapacke libraries
  - remove the CHAMELEON_SIMULATION_MAGMA cmake variable and definition
  - keep using CHAMELEON_USE_CUDA and CHAMELEON_USE_MAGMA to consider CUDA kernels
  - this avoids introducing useless new variables
  - work on messages
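A minimal sketch of the placeholder-pointer idea named in the first item. The typedef and struct below are local stand-ins, not the StarPU definitions: `cpu_func_t` is assumed to have the same shape as `starpu_cpu_func_t`, and `codelet_sketch`, `DUMMY_CPU_FUNC`, `make_simulated_codelet` and `codelet_has_cpu_impl` are hypothetical names introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-in for the CPU kernel pointer type (assumption: same shape as
 * starpu_cpu_func_t, i.e. a function taking the buffers and an argument). */
typedef void (*cpu_func_t)(void **buffers, void *cl_arg);

/* Hypothetical, minimal codelet: only the field needed for the illustration. */
struct codelet_sketch {
    cpu_func_t cpu_func;
};

/* Non-NULL placeholder that must never be called. The integer-to-pointer
 * cast is implementation-defined in C. */
#define DUMMY_CPU_FUNC ((cpu_func_t)(uintptr_t)1)

/* Build a codelet that advertises a CPU implementation without defining
 * (or linking) any real kernel. */
struct codelet_sketch make_simulated_codelet(void)
{
    struct codelet_sketch cl;
    cl.cpu_func = DUMMY_CPU_FUNC;
    return cl;
}

/* A scheduler-style check: "is there a CPU implementation?" */
int codelet_has_cpu_impl(const struct codelet_sketch *cl)
{
    assert(cl != NULL);
    return cl->cpu_func != NULL;
}
```

The point of the placeholder is that the codelet still looks like it has a CPU implementation while no kernel symbol has to exist, which is presumably what removes the link-time dependency on coreblas.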
-
- Nov 30, 2016
-
-
Mathieu Faverge authored
-
- Oct 29, 2016
-
-
Guillaume Sylvand authored
to change the default values of tag_width (=31) and tag_sep (=24). Currently, it is used only with StarPU.
-
- Oct 12, 2016
-
-
Guillaume Sylvand authored
-
Guillaume Sylvand authored
To use your own progress indicator:
  - Define a function with the prototype "void my_update_progress(int currentValue, int maximumValue)"
  - Pass it to Chameleon with "ierr = MORSE_Set_update_progress_callback(my_update_progress)"
  - Activate the progress indicator with MORSE_Enable(MORSE_PROGRESS)
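The steps above can be sketched as follows. Only the callback prototype and the two registration calls come from the commit message; the `last_percent` variable and the output format are assumptions made for this example, and the registration is left as a comment because it needs the Chameleon headers and library.

```c
#include <assert.h>
#include <stdio.h>

/* Last percentage reported; kept only so the sketch is easy to inspect. */
int last_percent = -1;

/* Callback with the prototype given in the commit message. */
void my_update_progress(int currentValue, int maximumValue)
{
    assert(maximumValue > 0);
    /* Use a wider type so 100 * currentValue cannot overflow an int. */
    last_percent = (int)((100LL * currentValue) / maximumValue);
    fprintf(stderr, "\rprogress: %3d%%", last_percent);
}

/* Registration, as described in the commit message:
 *   ierr = MORSE_Set_update_progress_callback(my_update_progress);
 *   MORSE_Enable(MORSE_PROGRESS);
 */
```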
-
Guillaume Sylvand authored
morse_desc_init_user(): if one of the get_* functions is NULL, we fall back to the default, as in morse_desc_init(). This makes it possible to change only one or two of the three functions and keep the others unchanged. For example, in OOC, only the first one is modified (to always return NULL).
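A minimal sketch of this fallback pattern, not the actual morse_desc_init_user() code. All names and signatures below (`desc_sketch`, `get_addr`, `get_ld`, `get_rank`, the `default_*` functions, `ooc_get_addr`) are hypothetical and chosen only to illustrate "NULL means keep the default".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical accessor types standing in for the three get_* functions. */
typedef void *(*get_addr_fn)(int m, int n);
typedef int   (*get_ld_fn)(int m);
typedef int   (*get_rank_fn)(int m, int n);

struct desc_sketch {
    get_addr_fn get_addr;
    get_ld_fn   get_ld;
    get_rank_fn get_rank;
};

/* Placeholder defaults; the real ones would compute tile addresses, etc. */
static char dummy_tile[1];
void *default_get_addr(int m, int n) { (void)m; (void)n; return dummy_tile; }
int   default_get_ld(int m)          { (void)m; return 1; }
int   default_get_rank(int m, int n) { (void)m; (void)n; return 0; }

/* Any accessor passed as NULL is replaced by its default. */
void desc_init_user_sketch(struct desc_sketch *d,
                           get_addr_fn addr, get_ld_fn ld, get_rank_fn rank)
{
    assert(d != NULL);
    d->get_addr = addr ? addr : default_get_addr;
    d->get_ld   = ld   ? ld   : default_get_ld;
    d->get_rank = rank ? rank : default_get_rank;
}

/* OOC-style override: only the first accessor changes, always returning NULL. */
void *ooc_get_addr(int m, int n) { (void)m; (void)n; return NULL; }
```

With this shape, a caller overriding only the first accessor writes `desc_init_user_sketch(&d, ooc_get_addr, NULL, NULL)` and the other two stay at their defaults.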
-
Guillaume Sylvand authored
Remove large blocks of duplicated code by calling morse_desc_init_user() from morse_desc_init() and morse_desc_init_diag()
-
- Sep 22, 2016
-
-
Guillaume Sylvand authored
to avoid the reverse dependency libcoreblas->libchameleon. set_coreblas_gemm3m_enabled()/get_coreblas_gemm3m_enabled() allow setting/getting this variable.
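A minimal sketch of the setter/getter idea: the flag lives in the lower-level library, and the upper layer only calls the accessors, so the lower library never has to reference a symbol from the upper one. The `_sketch` function names, the `int` signatures and the initial value of 0 are assumptions for this example, not the actual coreblas declarations.

```c
#include <assert.h>

/* File-local flag owned by the lower-level library (assumed to start at 0). */
static int gemm3m_enabled = 0;

/* Called by the upper layer to switch the flag on or off. */
void set_gemm3m_enabled_sketch(int value)
{
    gemm3m_enabled = (value != 0);
}

/* Called by the kernels to decide which product routine to use. */
int get_gemm3m_enabled_sketch(void)
{
    return gemm3m_enabled;
}
```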
-
- Sep 20, 2016
-
-
Guillaume Sylvand authored
This routine, available in MKL, computes a product in 6n^3 ops instead of 8n^3, but is worthwhile only for "large enough" matrices (to be tested...). Potentially, we gain 25% in all complex computations. It could be interesting to look for it, or implement it, in CUDA.
!!! Note that the flop counters are not updated !!!
!!! In C/Z precision, most flop counters should be multiplied by 0.75 !!!
It is OFF by default. It is activated with MORSE_Enable(MORSE_GEMM3M). In the timing routines, it is activated with --gemm3m.
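The 6n^3 versus 8n^3 count can be seen on scalars: a complex product normally needs four real multiplications, but it can be rearranged to use three. Applied to matrices, four real products of 2n^3 ops each become three, hence 8n^3 becomes 6n^3. The sketch below shows the classic "3M" rearrangement; whether MKL uses exactly these combinations is an assumption, and `cplx` and `mul_3m` are names introduced only for this example.

```c
#include <assert.h>

/* Minimal complex type for the illustration. */
struct cplx {
    double re;
    double im;
};

/* (a + bi)(c + di) with three real multiplications instead of four:
 *   t1 = c (a + b), t2 = a (d - c), t3 = b (c + d)
 *   real part      = t1 - t3 = ac - bd
 *   imaginary part = t1 + t2 = ad + bc
 */
struct cplx mul_3m(struct cplx x, struct cplx y)
{
    double t1 = y.re * (x.re + x.im);
    double t2 = x.re * (y.im - y.re);
    double t3 = x.im * (y.re + y.im);
    struct cplx r;
    r.re = t1 - t3;
    r.im = t1 + t2;
    return r;
}
```

The saved multiplication is paid for with extra additions, which cost O(n^2) on matrices against O(n^3) for a product. That is consistent with the remark that the routine pays off only for large enough matrices.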
-
Guillaume Sylvand authored
It is OFF by default. It is activated with MORSE_Enable(MORSE_PROGRESS). In the timing routines, it is activated with --progress. No progress is printed for tasks that take less than 10 seconds.
-
- Sep 09, 2016
-
-
PRUVOST Florent authored
-
- Sep 08, 2016
-
-
Guillaume Sylvand authored
The new codelet is added for all 3 runtimes but only tested with StarPU ;-) An example, test7, is added: it is a copy of test6 that uses the new function to build the matrices.
-
- Aug 25, 2016
-
-
PRUVOST Florent authored
-
- May 25, 2016
-
-
PRUVOST Florent authored
-
- Mar 17, 2016
-
-
THIBAULT Samuel authored
-
- Feb 25, 2016
-
-
THIBAULT Samuel authored
-
- Dec 09, 2015
-
-
PRUVOST Florent authored
-
- Dec 01, 2015
-
-
PRUVOST Florent authored
-
- Nov 15, 2015
-
-
Mathieu Faverge authored
-
- Nov 04, 2015
-
-
Mathieu Faverge authored
-
- Nov 03, 2015
-
-
Mathieu Faverge authored
-
- Oct 05, 2015
-
-
PRUVOST Florent authored
-
- Oct 03, 2015
-
-
Mathieu Faverge authored
Fix a large number of warnings; some mistakes remain that should be fixed before the SC release.
-
Mathieu Faverge authored
-
Mathieu Faverge authored
-
- Sep 17, 2015
-
-
THIBAULT Samuel authored
-
THIBAULT Samuel authored
instead of introducing RUNTIME_distributed_barrier
-
PRUVOST Florent authored
Add the cudablas library to make calls to CUDA kernels (MAGMA here, cuBLAS will follow); no more calls to MAGMA in the runtime/starpu codelets.
-
- Sep 16, 2015
-
-
THIBAULT Samuel authored
-
THIBAULT Samuel authored
MORSE_Distributed_size, MORSE_Distributed_rank so that applications do not hardcode the use of MPI. Introduce RUNTIME_distributed_rank, RUNTIME_distributed_size, RUNTIME_distributed_barrier, so that MORSE does not hardcode the use of MPI either. This allows the use of simgrid-mpi.
-
- Jul 28, 2015
-
-
PRUVOST Florent authored
-
PRUVOST Florent authored
-
- Jun 19, 2015
-
-
PRUVOST Florent authored
-
- Jun 18, 2015
-
-
PRUVOST Florent authored
-
- May 22, 2015
-
-
PRUVOST Florent authored
-
- May 19, 2015
-
-
PRUVOST Florent authored
Add MORSE_Pause/Resume functions to avoid CPU consumption when there are no tasks to execute: they call starpu_pause/resume and do nothing for QUARK (no need).
-
- Feb 11, 2015
-
-
PRUVOST Florent authored
-
PRUVOST Florent authored
In MORSE_Desc_Create_User, the matrix in the mat field can be used without allocating it, so avoid freeing it in this case: users may still need it and will manage deallocation themselves.
-
- Feb 05, 2015
-
-
PRUVOST Florent authored
Change the way we include our own header files: paths are now relative to the root. When PLASMA is in the same environment, Chameleon can pick up headers that do not belong to it (e.g. #include descriptor.h, a file that also exists in the PLASMA install directory), which causes compilation errors.
-
- Feb 03, 2015
-
-
PRUVOST Florent authored
-