Add a generic lacpy codelet on CPU/CUDA workers
Add a generic copy codelet to be used in the case m == n and displA == displB == 0, to perform copies on CPU and GPU through the interface data copy function.
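The dispatch condition described above can be sketched in plain C. This is only an illustration, not the code from this merge request: `can_use_generic_copy` and `tile_copy` are hypothetical names, the tile is assumed to be column-major `double` data, and a single `memcpy` stands in for the runtime's interface data copy function. The extra check that the leading dimensions equal `m` is an assumption of this sketch, needed so the flat copy is valid.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical predicate: the generic copy applies only to a full
 * square tile with no displacement on either side. */
static int can_use_generic_copy(int m, int n, int displA, int displB)
{
    return (m == n) && (displA == 0) && (displB == 0);
}

/* Hypothetical copy of an m-by-n column-major block from A to B.
 * displA and displB are element offsets into A and B. */
static void tile_copy(int m, int n,
                      const double *A, int displA, int lda,
                      double *B, int displB, int ldb)
{
    /* Fast path: one flat copy of the whole tile. The memcpy is a
     * stand-in for the interface data copy function; lda == m and
     * ldb == m are assumptions of this sketch (contiguous storage). */
    if (can_use_generic_copy(m, n, displA, displB) && lda == m && ldb == m) {
        memcpy(B, A, (size_t)m * (size_t)n * sizeof(double));
        return;
    }

    /* General path: lacpy-style copy, one column at a time. */
    A += displA;
    B += displB;
    for (int j = 0; j < n; j++) {
        memcpy(B + (size_t)j * (size_t)ldb,
               A + (size_t)j * (size_t)lda,
               (size_t)m * sizeof(double));
    }
}
```

The point of the split is that the fast path needs no knowledge of the matrix layout beyond the tile size, which is what makes it a candidate for a single codelet shared by CPU and CUDA workers.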
Edited by Mathieu Faverge