Checklist for the PaStiX 6 implementation of the CERFACS-customized features of PaStiX 5
To integrate a sequential, threadsafe version of PaStiX 5 as a routine called from an OpenMP region of a hybrid MPI+OpenMP application, we had to customize both the sources and the compilation options of PaStiX 5.
To make sure that no such customization is needed in PaStiX 6, here is the list of what we had to do:
-
purely sequential version of PaStiX 5
This was obtained by setting
-DFORCE_NOMPI
-DFORCE_NOSMP
and removing
-DCUDA_SM_VERSION=...
at compilation.
Is it now enough to set iparm(IPARM_THREAD_NBR) = 1
and iparm(IPARM_VERBOSE) = PastixVerboseNot
to avoid any interference or race condition?
-
activation of multiple RHS
We had to explicitly activate
-DMULT_SMX
at compilation. I guess this is no longer necessary (see issue #13, closed).
-
algebra on multiple RHS
Moreover, while working with @faverge on this specific topic, we concluded that using BLAS2 for the operations on the multiple RHS was counterproductive if nrhs was actually set to 1. Is this specific case now handled separately?
-
memory management for multiple RHS
On the same occasion, we noticed a large performance improvement when the STORAGE mode was activated, which was NOT the default. How has this aspect been ported to PaStiX 6? Is it a configurable parameter?
-
dependence on the non threadsafe section of Scotch 6.0.4
A single treatment inside Scotch is not threadsafe. We made it critical with an OpenMP pragma, while in PaStiX 6 it is explicitly handled as atomic. Has this feature been tested in an intensive OpenMP application?
My tests with up to 32 threads each passed once, but the bug is not systematic, so an extensive validation, including of the impact on performance, is required.
By the way, is there any release announcement for a threadsafe Scotch?
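To make the critical-section approach and the kind of repeated validation discussed above concrete, here is a minimal sketch in C with OpenMP. It is not the actual Scotch or PaStiX code: `non_threadsafe_step`, `run_guarded`, `count_failed_trials` and the critical-section name are hypothetical names introduced only for illustration, and the racy routine is a simple stand-in for the real treatment.

```c
/* Build with OpenMP enabled (e.g. -fopenmp). Without it the pragmas are
 * ignored and the loop runs sequentially, which is still correct. */

/* Hypothetical stand-in for the single non-threadsafe treatment:
 * an unprotected read-modify-write on shared static state. */
static long shared_state = 0;

static void non_threadsafe_step(void)
{
    long tmp = shared_state;  /* read */
    tmp += 1;                 /* modify */
    shared_state = tmp;       /* write: racy if two threads interleave */
}

/* Call the routine `calls` times from an OpenMP parallel loop, serializing
 * every call with a named critical section. Returns the final state,
 * which must equal `calls` if no update was lost. */
long run_guarded(int nthreads, long calls)
{
    shared_state = 0;
    #pragma omp parallel for num_threads(nthreads)
    for (long i = 0; i < calls; i++) {
        #pragma omp critical(non_threadsafe_step_guard)
        {
            non_threadsafe_step();
        }
    }
    return shared_state;
}

/* Since such a bug is not systematic, repeat the experiment several times.
 * Returns the number of trials that produced a wrong result. */
int count_failed_trials(int nthreads, long calls, int trials)
{
    int failed = 0;
    for (int t = 0; t < trials; t++) {
        if (run_guarded(nthreads, calls) != calls) {
            failed++;
        }
    }
    return failed;
}
```

Calling, for instance, `count_failed_trials(32, 100000, 50)` should return 0 as long as the critical section is in place. Removing the pragma makes lost updates possible, although not on every run, which is why a single passing run proves little. A critical section serializes the whole call, so timing the same loop with and without the guard gives a first idea of the performance impact.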