The answer is already in the cuBLAS documentation, as you posted it: for cublas<t>geam(), if C overlaps A or B, then the behaviour is undefined. NVIDIA won't guarantee that this will work if C == A.
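To make the overlap concern concrete, here is a minimal CPU-only sketch of why writing the output over an input can go wrong. It is not how cuBLAS is implemented and says nothing about what cuBLAS actually does on the GPU; `naive_geam_transA` and `in_place_matches_out_of_place` are hypothetical names made up for this illustration. It assumes square, column-major matrices and computes C = alpha*A^T + beta*B with a plain loop, once into a separate buffer and once with C aliasing A.

```cpp
#include <cassert>
#include <vector>

// Naive CPU reference for C = alpha * A^T + beta * B on n x n column-major
// matrices. Hypothetical helper for illustration only, not a cuBLAS routine.
void naive_geam_transA(int n, float alpha, const float* A, float beta,
                       const float* B, float* C) {
    assert(n > 0);
    for (int col = 0; col < n; ++col) {
        for (int row = 0; row < n; ++row) {
            // Element (row, col) of A^T is element (col, row) of A.
            C[row + col * n] = alpha * A[col + row * n] + beta * B[row + col * n];
        }
    }
}

// Returns true when the in-place call (C aliasing A) produces the same
// result as the out-of-place call.
bool in_place_matches_out_of_place() {
    const int n = 2;
    const std::vector<float> A = {1, 2, 3, 4};      // column-major 2 x 2
    const std::vector<float> B = {10, 20, 30, 40};  // column-major 2 x 2

    // Out-of-place: the output goes to its own buffer.
    std::vector<float> expected(n * n);
    naive_geam_transA(n, 1.0f, A.data(), 1.0f, B.data(), expected.data());

    // In-place: the same buffer is passed as both input A and output C,
    // so some elements of A are overwritten before they are read.
    std::vector<float> aliased = A;
    naive_geam_transA(n, 1.0f, aliased.data(), 1.0f, B.data(), aliased.data());

    return expected == aliased;
}
```

In this sketch `in_place_matches_out_of_place()` returns false, because one element of A has already been overwritten by the time the transposed read reaches it. A conservative pattern is therefore to write the result into a separate buffer and copy it back into A afterwards.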
Can input matrices also be used to store the output matrix with CUBLAS?
Question
For instance, cublas<t>geam() will do: C = alpha*op(A) + beta*op(B)
But what if I want to store the result in A anyway? Can I call it with pointers *C = *A so that: A = alpha*op(A) + beta*op(B), without fear that I may be writing output to a matrix that is still being read as an input?
If so, is it guaranteed that we can do this safely with all other CUBLAS matrix operations?
Solution
Licensed under: CC-BY-SA with attribution