Question

Take the following insertion sort algorithm:

[Image: the standard CLRS insertion sort pseudocode with its per-line cost/times columns, reconstructed here since the answers below refer to its line numbers. t_j is the number of times the while test on line 5 executes for that value of j.]

```
INSERTION-SORT(A)                        cost   times
1  for j = 2 to A.length                 c1     n
2      key = A[j]                        c2     n - 1
3      // insert A[j] into A[1 .. j-1]   0      n - 1
4      i = j - 1                         c4     n - 1
5      while i > 0 and A[i] > key        c5     sum_{j=2..n} t_j
6          A[i+1] = A[i]                 c6     sum_{j=2..n} (t_j - 1)
7          i = i - 1                     c7     sum_{j=2..n} (t_j - 1)
8      A[i+1] = key                      c8     n - 1
```

I can see fairly easily that it's O(n^2) by examining it. But as far as proving it's O(n^2), how would I go about doing that? I could add up all the operations, but an expression like n plus a sum from j = 2 to n doesn't obviously work out to n^2 as far as I know.

I don't really see how to prove this exactly. Could someone try to clearly explain how I would prove this, in a way that would work for an O(n^3) algorithm as well?


Solution

You prove big O complexity by considering how many operations are performed in the worst case. You've done the counting part and entered the results in the right hand column of your image, so what remains to be proven is that the dominant term is O(n^2).

Apart from the terms that involve the sum, your program consists of instructions that get executed n or n - 1 times, so these all contribute O(n) terms.
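Spelled out using the table above (this is the standard CLRS accounting; the c_k are the constant per-line costs), the total running time is:

T(n) = c1*n + (c2 + c4 + c8)*(n - 1) + c5 * sum_{j=2..n} t_j + (c6 + c7) * sum_{j=2..n} (t_j - 1)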

Now for the terms with the sum. In the worst case, t_j can be j, because you may end up decrementing i, which starts at j - 1, all the way down to 0, so the test on line 5 runs j times. In this worst case we have t_j = j, and the term with the sum of j from 2 to n is O(n^2). This follows from the mathematical identity:

1 + 2 + 3 + ... + n = n(n + 1)/2

This can be proven by adding two copies of the series together, pairing the terms so that each pair sums to n + 1, and then dividing by two. Have a look at the proof on Wolfram MathWorld.
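To make that pairing argument concrete, here is the classic derivation written out:

```latex
S  = 1     + 2     + \dots + (n-1) + n
S  = n     + (n-1) + \dots + 2     + 1
% adding the two lines column by column, each column sums to (n+1):
2S = \underbrace{(n+1) + (n+1) + \dots + (n+1)}_{n \text{ terms}} = n(n+1)
S  = \frac{n(n+1)}{2}
```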

Finally, since O((n^2 + n)/2) = O(n^2), the terms that include the sum dominate the running time, and that is why the algorithm is O(n^2).
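If you want to sanity-check the counting empirically, here is a minimal Python sketch (the function and variable names are mine, not from the question) that counts how many times the shift on line 6 runs; on a reverse-sorted input it comes out to exactly n(n - 1)/2:

```python
def insertion_sort_counting(A):
    """Insertion sort that also counts how often line 6 (the shift) runs."""
    shifts = 0
    for j in range(1, len(A)):            # outer loop: n - 1 iterations
        key = A[j]
        i = j - 1
        while i >= 0 and A[i] > key:      # inner loop: t_j - 1 shifts
            A[i + 1] = A[i]               # shift the larger element right
            i -= 1
            shifts += 1
        A[i + 1] = key                    # insert key into its place
    return shifts

n = 10
print(insertion_sort_counting(list(range(n, 0, -1))))  # 45
print(n * (n - 1) // 2)                                # 45: the quadratic term
```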

OTHER TIPS

It's O(n^2) because the nested loops give you a multiplication of iteration counts, not a sum. Take, for example, an array sorted in reverse order:

10 9 8 7 6 5 4 3 2 1

In this case, every iteration of the outer loop has to scan and shift the entire sorted subsection of the array before inserting the next element. This gives insertion sort a quadratic running time, i.e., O(n^2).
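Here is a short Python sketch (hypothetical, just to visualize this) that prints the array after each pass of insertion sort on that input; each pass shifts the entire sorted prefix one position to the right:

```python
A = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
for j in range(1, len(A)):
    key, i = A[j], j - 1
    while i >= 0 and A[i] > key:   # scans the whole sorted prefix
        A[i + 1] = A[i]
        i -= 1
    A[i + 1] = key
    print(A)                       # j shifts happened on this pass
```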


The best way to understand the complexity is to find a worst-case example and follow the algorithm's steps.

Assumption 1: If a procedure P is run no more than T times, and P runs in O(f(N)) time each time, then the overall runtime is O(T * f(N)).

Assumption 2: If a procedure P is run in O(f(N)) time, and a procedure Q runs in O(g(N)) time, then running P followed by Q requires O(f(N) + g(N)) time.

To be pedantic, Assumption 1 follows from repeated application of Assumption 2, if you prefer.

Assumption 3: Assignments and arithmetic are O(1) operations.

Combining these three assumptions, you get the result.
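As a small illustration of how the assumptions compose (a hypothetical example, not from the original answer):

```python
def sum_array(A):
    total = 0            # O(1) assignment (Assumption 3)
    for x in A:          # the body runs len(A) times
        total += x       # O(1) arithmetic (Assumption 3)
    return total         # loop: O(len(A) * 1) = O(len(A)) (Assumption 1)
```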

Lines 6 and 7 each run in O(1) time, so together they run in O(1 + 1) = O(1) time (by Assumptions 3 and 2).

Line 5 determines that {6, 7} will run no more than A.Length times, so the runtime of {5, 6, 7} is O(A.Length * 1) = O(A.Length) (by Assumption 1).

Lines 2, 4, and 8 run in O(1) time, so {2, 4, 5, 6, 7, 8} runs in O(1 + 1 + A.Length + 1) = O(A.Length) time (by Assumptions 3 and 2 again).

Line 1 determines that {2..8} will run no more than A.Length times, so the runtime of lines {1..8} is O(A.Length * A.Length) = O(A.Length^2) (by Assumption 1).
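The same reasoning handles the O(n^3) case the question asked about: a third level of nesting multiplies in one more factor of n. A hypothetical Python sketch:

```python
def triple_loop(n):
    ops = 0
    for _ in range(n):          # outer loop: n iterations
        for _ in range(n):      # middle loop: n iterations each
            for _ in range(n):  # inner loop: n iterations each
                ops += 1        # O(1) work (Assumption 3)
    return ops                  # n * n * n by Assumption 1, applied twice

print(triple_loop(10))          # 1000 == 10 ** 3
```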

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow