Use the Kahan summation algorithm
- It has the same algorithmic complexity as a naive summation
- It will greatly increase the accuracy of a summation, without requiring you to switch data types to double.
By rewriting your code to implement it:
for(int row = 0; row < tan_hd.rows; row++) {
float sum = 0.0, c = 0.0;
#pragma omp parallel for reduction(+:sum, +:c)
for(int column = 0; column < tan_hd.cols; column++) {
float y = tan_hd.data[column * tan_hd.rows + row] - c;
float t = sum + y;
c = (t - sum) - y;
sum = t;
}
sum = sum - c;
printf("row %d: %f", row, sum);
}
You can additionally switch all float
to double
to achieve a higher precision, but since your array is a float
array, there should only be differences in the number of signficant numbers at the end.