Allocate panel memory manually if it is the read node
This prevents StarPU from allocating this particular handle itself. Because we know this data will never reside on a GPU, a plain malloc can be used instead of CUDA_HOST_MALLOC, which is slow. The free method is changed accordingly: it submits a callback for the handle, then frees the memory and requests a StarPU invalidate.
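
For illustration only, a minimal self-contained sketch of the ownership pattern described above. It does not call StarPU: panel_t, panel_handle_t, panel_alloc, panel_free and panel_free_cb are hypothetical stand-ins, and the callback runs immediately here, whereas a real runtime would presumably defer it until pending tasks on the handle have completed. It also assumes one reading of the description, in which the callback is what releases the buffer and marks the handle invalid.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a runtime data handle (NOT the StarPU type). */
typedef struct {
    void *ptr;          /* buffer handed to the runtime, NULL if it allocates */
    bool  invalidated;  /* set once an invalidate has been requested */
} panel_handle_t;

/* Hypothetical panel descriptor. */
typedef struct {
    panel_handle_t handle;
    void *buf;          /* user-owned buffer, NULL if the runtime owns memory */
    bool  user_owned;
} panel_t;

/* If the panel is known to stay on the host, allocate it with plain malloc
 * and hand the pointer to the handle; otherwise leave the pointer NULL so
 * the runtime would allocate on demand. Returns 0 on success, -1 on failure. */
int panel_alloc(panel_t *p, size_t bytes, bool cpu_only)
{
    p->buf = NULL;
    p->user_owned = false;
    p->handle = (panel_handle_t){ NULL, false };

    if (cpu_only) {
        p->buf = malloc(bytes);
        if (p->buf == NULL)
            return -1;
        p->user_owned = true;
    }
    p->handle.ptr = p->buf;
    return 0;
}

/* Callback attached to the handle: releases the user-owned buffer and
 * marks the handle as invalidated. */
static void panel_free_cb(void *arg)
{
    panel_t *p = arg;
    assert(p->user_owned);
    free(p->buf);
    p->buf = NULL;
    p->handle.ptr = NULL;
    p->handle.invalidated = true;
}

/* Free method: only manually allocated panels need the callback. */
void panel_free(panel_t *p)
{
    if (p->user_owned) {
        panel_free_cb(p);   /* a real runtime would run this asynchronously */
        p->user_owned = false;
    }
}
```

Panels created with cpu_only set to false are left alone by panel_free in this sketch, since their memory would belong to the runtime.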