Commit f960e4d8 authored by hhakim

Try to optimize GivensFGFTParallel::update_L() by using OpenMP to parallelize...

Tried to optimize GivensFGFTParallel::update_L() by using OpenMP to parallelize the product of each parallel Givens matrix with L, but the attempt was unfruitful.

The idea was to give each concurrent thread one submatrix of the Givens matrix to multiply (the rotation part), and to continue until all submatrices have been processed, so that the whole product is computed in parallel.
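As a rough illustration of that idea only (not the actual update_L() code), the sketch below applies the rotations of one parallel Givens factor to the rows of L, one rotation per loop iteration. The Rotation struct, the apply_rotations() name and the row-major std::vector storage are assumptions made for this example. It also covers only the left product, and relies on the rotations acting on disjoint row pairs so that iterations are independent.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical type for this sketch: one 2x2 rotation acting on rows p and q.
struct Rotation {
    std::size_t p, q;  // row indices, assumed disjoint across all rotations
    double c, s;       // cosine and sine of the rotation angle
};

// Left-multiply L (n x n, row-major, an assumption of this sketch) by the
// rotations of one parallel Givens factor. Each iteration only reads and
// writes its own two rows, so iterations can run on different threads.
void apply_rotations(std::vector<double>& L, std::size_t n,
                     const std::vector<Rotation>& rots)
{
// The directive is only compiled in when OPT_UPDATE_L_OMP is defined.
#ifdef OPT_UPDATE_L_OMP
#pragma omp parallel for
#endif
    for (long k = 0; k < static_cast<long>(rots.size()); ++k) {
        const Rotation& r = rots[static_cast<std::size_t>(k)];
        double* row_p = &L[r.p * n];
        double* row_q = &L[r.q * n];
        for (std::size_t j = 0; j < n; ++j) {
            const double a = row_p[j];
            const double b = row_q[j];
            row_p[j] = r.c * a - r.s * b;  // new row p
            row_q[j] = r.s * a + r.c * b;  // new row q
        }
    }
}
```

Without OPT_UPDATE_L_OMP defined the loop runs sequentially and gives the same result. In this sketch each row is contiguous because of the row-major layout; with a column-major layout the same row updates would become strided accesses.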

The OpenMP directives are kept in the code but disabled. The compilation constant OPT_UPDATE_L_OMP must be defined to enable the use of OpenMP.
In addition, the -fopenmp flag has to be passed to cmake (or set in CMakeCache.txt for both the compile and linker flags), and likewise to setup.py when compiling the Python wrapper (extra_compile_flags, extra_link_flags), with something similar for the compilation of the MATLAB wrapper.

The probable cause of the inefficiency of OpenMP here is the memory workload of each thread (non-contiguous accesses), whose cost outweighs the gain from parallelizing the matrix product.
parent 3acd33ab