Added SPMV for CUDA
Implemented SPMV for a CSR matrix (A) and dense vectors (X, Y). It relies on cuSPARSE, similarly to how other kernels rely on cuBLAS. Accesses are implemented as segments: a CSR matrix is 3 intervals (rows_nnz, col_indices, values).
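
For reviewers less familiar with CSR, here is a minimal CPU reference of the operation the kernel computes, y = alpha * A * x + beta * y. It is only a sketch for checking results, not the code in this PR: the `CsrMatrix` struct, its field names, and `spmv_reference` are hypothetical. It also assumes the first array holds row offsets of length rows + 1 (the layout cuSPARSE expects for CSR); whether `rows_nnz` in this PR is stored exactly that way is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container: three contiguous arrays, matching the
// "3 intervals" described above. Names are illustrative only.
struct CsrMatrix {
    std::size_t rows;
    std::size_t cols;
    std::vector<int> row_offsets;  // assumed layout: rows + 1 entries, last one equals nnz
    std::vector<int> col_indices;  // nnz entries, column of each stored value
    std::vector<double> values;    // nnz entries, the stored values
};

// CPU reference for y = alpha * A * x + beta * y with A in CSR form.
void spmv_reference(const CsrMatrix& a, double alpha,
                    const std::vector<double>& x,
                    double beta, std::vector<double>& y) {
    assert(a.row_offsets.size() == a.rows + 1);
    assert(a.col_indices.size() == a.values.size());
    assert(x.size() == a.cols);
    assert(y.size() == a.rows);

    for (std::size_t i = 0; i < a.rows; ++i) {
        double acc = 0.0;
        // Row i owns the half-open range [row_offsets[i], row_offsets[i + 1]).
        for (int k = a.row_offsets[i]; k < a.row_offsets[i + 1]; ++k) {
            acc += a.values[k] * x[a.col_indices[k]];
        }
        y[i] = alpha * acc + beta * y[i];
    }
}
```

A reference like this can be compared against the GPU result on small matrices, keeping in mind that floating-point sums may differ slightly because the GPU can accumulate in a different order.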