Add an option so that CUDA workers do not perform slow allocations on other memory nodes

This creates the option STARPU_CUDA_ONLY_FAST_ALLOC_OTHER_MEMNODES. When it is set to 1, CUDA workers will not perform slow allocations (RAM and pinned). This also changes the behavior of may_alloc in some functions by adding the value 2, which means that allocations are allowed, but only fast ones.
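
The three may_alloc values described above can be sketched as follows. This is only an illustration, not the code from this merge request: the function names allocation_permitted and may_alloc_for_cuda_worker are hypothetical, the meaning of 0 and 1 (no allocation, any allocation) is assumed from the wording "adding the value 2", and reading the option from the environment is also an assumption.

```c
#include <stdlib.h>

/* Hypothetical helper, not a StarPU function.
 * may_alloc: 0 = no allocation (assumed), 1 = any allocation (assumed),
 *            2 = only fast allocations (the value added by this change).
 * is_slow:   non-zero when the requested allocation is a slow one,
 *            e.g. RAM or pinned memory. */
int allocation_permitted(int may_alloc, int is_slow)
{
	switch (may_alloc)
	{
	case 0:
		return 0;        /* never allocate */
	case 1:
		return 1;        /* allocate whatever is needed */
	case 2:
		return !is_slow; /* allocate only when it is fast */
	default:
		return 0;        /* unknown value: be conservative */
	}
}

/* Hypothetical helper: chooses the may_alloc value a CUDA worker would
 * pass when working on another memory node. Assumes the option is an
 * environment variable holding 0 or 1. */
int may_alloc_for_cuda_worker(void)
{
	const char *value = getenv("STARPU_CUDA_ONLY_FAST_ALLOC_OTHER_MEMNODES");
	int only_fast = (value != NULL && atoi(value) == 1);
	return only_fast ? 2 : 1;
}
```

With the option unset, may_alloc_for_cuda_worker() returns 1 in this sketch and every allocation is permitted. With the option set to 1 it returns 2, so allocation_permitted(2, 1) rejects a slow allocation while allocation_permitted(2, 0) still allows a fast one.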
3 jobs for !16 with data_requests_more_prio in 130 minutes and 38 seconds (queued for 21 minutes and 44 seconds)
detached

Status  Job ID    Stage   Name     Duration
passed  #1000539  Build   build    00:07:11
failed  #1000540  Deploy  check    01:05:00
passed  #1000541  Deploy  simgrid  00:58:26

Failed jobs:
Name   Stage   Failure
check  Deploy  The script exceeded the maximum execution time set for the job

PASS: mpi_lu/plu_outofcore_example_double
PASS: matrix_decomposition/mpi_cholesky
PASS: matrix_decomposition/mpi_cholesky_distributed
PASS: cg/cg
PASS: matrix_mult/mm

Session terminated, killing shell… … killed.
WARNING: Timed out waiting for the build to finish
ERROR: Job failed: execution took longer than 1h0m0s seconds