CUDA - Simple matrix addition/sum operation
Question
This should be very simple, but I could not find a complete answer:
I need to compute C = A + B, where A and B are two matrices whose size is not known in advance (anywhere from 2x2 up to 20,000x20,000).
Should I use the cuBLAS Sgemm function for this?
I need the maximum achievable speed, so I thought of the cuBLAS library, which should be well optimized.
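For context, Sgemm computes a matrix product (C = alpha*op(A)*op(B) + beta*C), not an elementwise sum. cuBLAS does offer cublasSgeam for C = alpha*op(A) + beta*op(B), but a plain sum is also easy to write by hand. Below is a minimal sketch, assuming both matrices are stored contiguously as float arrays of rows*cols elements and that all three fit in device memory (at 20,000x20,000 that is about 1.6 GB per matrix). The names addKernel and addMatrices are hypothetical, chosen only for this illustration.

```cuda
#include <cstddef>
#include <cuda_runtime.h>

// Elementwise c[i] = a[i] + b[i] over n floats.
// The grid-stride loop lets a capped grid cover any n.
__global__ void addKernel(const float* a, const float* b, float* c, size_t n)
{
    size_t stride = (size_t)blockDim.x * gridDim.x;
    for (size_t i = (size_t)blockIdx.x * blockDim.x + threadIdx.x; i < n; i += stride)
        c[i] = a[i] + b[i];
}

// Hypothetical host helper (not part of any library): copies A and B to the
// device, launches the kernel, and copies C back. Returns the first CUDA error.
cudaError_t addMatrices(const float* hA, const float* hB, float* hC,
                        size_t rows, size_t cols)
{
    size_t n = rows * cols;
    if (n == 0) return cudaSuccess;          // nothing to add
    size_t bytes = n * sizeof(float);

    float *dA = nullptr, *dB = nullptr, *dC = nullptr;
    cudaError_t err = cudaMalloc((void**)&dA, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void**)&dB, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void**)&dC, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);

    if (err == cudaSuccess) {
        const int threads = 256;             // assumed block size, tune per GPU
        size_t needed = (n + threads - 1) / threads;
        int blocks = (int)(needed < 65535 ? needed : 65535);
        addKernel<<<blocks, threads>>>(dA, dB, dC, n);
        err = cudaGetLastError();            // catches launch configuration errors
    }

    // This blocking copy also waits for the kernel on the default stream.
    if (err == cudaSuccess) err = cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dA);                            // cudaFree(nullptr) is a no-op
    cudaFree(dB);
    cudaFree(dC);
    return err;
}
```

Since each element is read once and written once, the addition is limited by memory bandwidth, and for a one-off sum the host-to-device copies will likely dominate the run time. Keeping the data on the GPU across several operations is where most of the speed would come from.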
No correct solution
Licensed under: CC-BY-SA with attribution