CUDA - Simple matrix addition/sum operation
Frage
This should be very simple but I could not find an exhaustive answer:
I need to perform A+B = C with matrices, where A and B are two matrices of unknown size (they could be 2x2 or 20.000x20.000 as greatest value)
Should I use CUBLAS with Sgemm function to calculate?
I need the maximum speed achievable so I thought of CUBLAS library which should be well-optimized
Keine korrekte Lösung
Lizenziert unter: CC-BY-SA mit Zuschreibung
Nicht verbunden mit StackOverflow