سؤال

I am trying to use CUBLAS to sum two big matrices of unknown size. I need a fully optimized code (if possible) so I chose not to rewrite the matrix addition code (simple) but using CUBLAS, in particular the cublasSgemm function which allows to sum A and C (if B is a unit matrix): *C = alpha*op(A)*op(B)+beta*c*

The problem is: C and C++ store the matrices in row-major format, cublasSgemm is intended (for fortran compatibility) to work in column-major format. You can specify whether A and B are to be transposed first, but you can NOT indicate to transpose C. So I'm unable to complete my matrix addition..

I can't transpose the C matrix by myself because the matrix is something like 20000x20000 maximum size.

Any idea on how to solve please?

لا يوجد حل صحيح

مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى StackOverflow
scroll top