I finally figured it out myself.
FLOPs are counted a little different than algorithmic complexity in the sense that lower order terms are still ignored but the coefficient in front of the highest order term does matter.
In this specific example, since we ignore lower order terms, we only look at +, -, *, /
operations in the triple nested loop and ignore other floating point operations in the rest of the algorithm. i.e. the following line
A(i,k+1:n) = A(i,k+1:n)-mult*A(k,k+1:n);
- first loop runs from 1 to n
- second loop runs from k to n
- third loop runs from k to n (implicit in Matlab code with the use of
:
)
Thus, this line runs almost n^3
times and precisely n*n + (n-1)*(n-1) + ... + 2*2 + 1*1
times which is equivalent to (1/3)*n^3
flops when ignoring lower order terms.
However, this line has two floating point operations: a -
operation and a *
operation.
Therefore, this gives (2/3)*n^3
.