Added support for HIP and hipBLAS (CUDA and ROCm backends)
- We should review (among other things) the codelet definitions in `runtime_codelets.h`, as I wasn't sure how we'd want to add HIP support in the macro
- Only trsm and gemm have been ported to HIP, but more can easily be added
- trsm and gemm are forced onto HIP devices for now (no perfmodel support for HIP in StarPU yet)
- I'm not sure what `runtime_zlocality.c` is used for, so I didn't add HIP support there for now
- This merge request has been tested against the hip_device_async StarPU branch (not merged yet, but a merge request is in progress here)