Added support for HIP and hipBLAS (CUDA and ROCm backends)
- We should review (among other things) the codelet definitions in `runtime_codelets.h`, as I wasn't sure how we'd want to add HIP support in the macro
- Only trsm and gemm have been ported to HIP, but more can easily be added
- trsm and gemm are forced onto HIP devices for now (no perfmodel support in StarPU yet)
- I'm not sure what `runtime_zlocality.c` is used for, so I didn't add HIP support there for now
- This merge request has been tested against the hip_device_async StarPU branch (not merged yet, but a merge request is in progress here)