Question

I've been applying for jobs, and every time I hear questions about algorithm time/space complexity, I cringe and stumble. No matter how much I read, my brain seems to be programmed not to get any of it, and I think the reason comes down to the fact that I have a very weak mathematical background from skipping school. This may not be the usual S.O. question, and it may even be removed for being fundamentally about the maths, but at least I'm hoping to figure out where to go next with this.

Solution

I don't know how deep interviewers would go into this, so here are just a few examples. The whole "complexity" thing is just there to give an indication of how much time (or memory) an algorithm uses.

Now, if you have an array with values, accessing the value at a given index is O(1) -- constant. It doesn't matter how many elements are in the array; if you have an index, you can get the element directly.

If, on the other hand, you are looking for a specific value, you have no choice but to look at every element (at least until you find the one you want, but that doesn't matter for the complexity). Thus, searching an unsorted array is O(n): the runtime grows in proportion to the number of elements.
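
To make this concrete, here is a minimal Python sketch (the list contents and the value searched for are just made-up examples):

    values = [42, 7, 19, 3, 88]          # example data

    # O(1): indexing jumps straight to the element,
    # no matter how long the list is
    third = values[2]

    # O(n): in the worst case every element has to be inspected
    def linear_search(items, target):
        for i, item in enumerate(items):
            if item == target:
                return i
        return -1

    linear_search(values, 3)   # walks past 42, 7 and 19 before finding 3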

If, on the other hand, you have a sorted array, then you can do a "binary search", which is O(log n). Here "log n" means the base-2 logarithm, which is basically the inverse of 2^n. For example, 2^10 is 2*2*2*2...*2 ten times = 1024, and log2(1024) is 10. Algorithms that are O(log n) are generally considered pretty good: to find an element in a sorted array of up to 1024 elements, binary search only has to look at about 10 of them. For 1025-2048 elements it is 11 peeks at most, for 2049-4096 it would be 12, and so on. So adding more elements only slowly increases the runtime.
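
If you want to see that halving in code, here is a sketch of binary search over a sorted Python list (the numbers are arbitrary):

    def binary_search(sorted_items, target):
        lo, hi = 0, len(sorted_items) - 1
        while lo <= hi:
            mid = (lo + hi) // 2              # peek at the middle element
            if sorted_items[mid] == target:
                return mid
            elif sorted_items[mid] < target:
                lo = mid + 1                  # discard the lower half
            else:
                hi = mid - 1                  # discard the upper half
        return -1

    # Each iteration halves the remaining range, so a 1024-element list
    # needs only about 10 halvings before the range is exhausted.
    binary_search(list(range(1024)), 777)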

Of course, things can get a lot worse. A trivial sorting algorithm tends to be O(n^2), meaning that it needs roughly 2^2 = 4 "operations" for an array with just 2 elements, 3^2 = 9 if the array has 3, 4^2 = 16 if the array has 4 elements and so on. Pretty bad, actually, considering that an array with just 1000 elements would already require about 1000*1000 = 1 million comparisons to sort. This is called quadratic growth (it is polynomial, not exponential, despite how quickly it blows up), and of course it can get even worse: O(n^3), O(n^4) and so on. Increasingly bad.
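
You can see that quadratic blow-up directly by counting how many times a naive pair of nested loops runs (just an illustrative sketch, not any particular sorting algorithm):

    def count_pairwise_operations(items):
        count = 0
        for i in range(len(items)):        # outer loop: n passes
            for j in range(len(items)):    # inner loop: n steps per pass
                count += 1
        return count

    count_pairwise_operations(range(1000))   # 1000 * 1000 = 1,000,000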

A "good" sorting algorithm is O(n*log n). Assuming an array with 1024 elements, this would be 1024*10=10240 compares --- much better than the 1 million we had before.

Just take these O(...) expressions as indicators of runtime behaviour (or memory footprint, when applied to memory). I plugged in real numbers so you can see how they change, but the exact numbers are not important, and usually these complexities describe the worst case. Still, just by looking at the numbers, "constant time" is obviously best, and anything quadratic or worse gets bad quickly because the runtime (or memory use) skyrockets.

EDIT: also, you are not really interested in constant factors; you don't usually see "O(2n)". This would still be "O(n)" -- the runtime relates directly to the number of elements.

OTHER TIPS

To analyze the time/space complexity of an algorithm, high-school-level maths should be fine. I studied this at university during my first semester and I was just fine.

The fields of interest for the basics are:

The above is true for analyzing the complexity of algorithms. Calculating the complexity of problems is a much deeper field, which is still an active area of research: complexity theory. It requires extensive knowledge of set theory, theory of computation, advanced calculus, linear algebra and much more.

While calculus, summing series, and discrete mathematics are all Good Things to know, judging from your question and from my limited experience in industry, I doubt that your interviewers are expecting that level of understanding.

In practice, you can make useful big-O statements about time and space complexity without having to do much mathematical thinking. Here are the basics, which I'll describe in terms of time complexity just to make the language less abstract.

A big-O time complexity tells you how the worst-case running time of your algorithm scales with the size of its input. The actual numbers you get from the big-O function are an indication of the number of constant time operations your algorithm will perform on a given size of input.

A big-O function, therefore, simply counts the number of constant time operations your algorithm will perform.

  • A constant time operation is said to be O(1). [Note that any fixed length sequence of constant time operations is also O(1) since the sequence also takes a constant amount of time.]

O(k) = O(1), for any constant k.

  • If your algorithm performs several operations in series, you sum their costs.

O(f) + O(g) = O(f + g)

  • If your algorithm performs an operation multiple times, you multiply the cost of the operation by the number of times it is performed.

n * O(f) = O(n * f)

O(f) * O(f) * ... * O(f) = O(f^n), where there are n terms on the left hand side

  • A classic big-O function is log(n), which invariably corresponds to "the height of the balanced tree containing n items". You can get away with just knowing that sorting is O(n log(n)).

  • Finally, you only report the fastest growing term in a big-O function since, as the size of the input grows, this will dominate all the other terms. Any constant factors are also discarded, since we're only interested in the scaling properties of the result.

E.g., O(2(n^2) + n) = O(n^2).
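
Putting these rules together, here is a sketch of how they apply to a small (invented) function:

    def example(items):             # n = len(items)
        total = 0                   # O(1) constant-time operation
        for x in items:             # n iterations ...
            total += x              # ... of an O(1) body  ->  n * O(1) = O(n)
        for x in items:             # n iterations ...
            for y in items:         # ... of an O(n) body  ->  n * O(n) = O(n^2)
                total += x * y
        return total                # O(1)

    # Summing the pieces: O(1) + O(n) + O(n^2) + O(1) = O(n^2 + n + 2) = O(n^2),
    # keeping only the fastest growing term and dropping constant factors.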

Here are two examples.

Bubble Sorting n Items

Each traversal of the items sorts (at least) one item into place. We therefore need n traversals to sort all the items.

O(bubble-sort(n)) = n * O(traversal(n))
                  = O(n * traversal(n))

Each traversal of the items involves n - 1 adjacent compare-and-swap operations.

O(traversal(n)) = (n - 1) * O(compare-and-swap)
                = O((n - 1) * compare-and-swap)

Compare-and-swap is a constant time operation.

O(compare-and-swap) = O(1)

Collecting our terms, we get:

O(bubble-sort(n)) = O(n * (n - 1) * 1)
                  = O(n^2 - n)
                  = O(n^2)
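
For reference, here is what that bubble sort might look like in Python (a plain version without the usual early-exit optimisation, so the counting above applies directly):

    def bubble_sort(items):
        n = len(items)
        for _ in range(n):                     # n traversals
            for i in range(n - 1):             # n - 1 compare-and-swaps per traversal
                if items[i] > items[i + 1]:
                    # constant-time compare-and-swap
                    items[i], items[i + 1] = items[i + 1], items[i]
        return items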

Merge Sorting n Items

Merge-sort works bottom-up, merging items into pairs, pairs into fours, fours into eights, and so on until the list is sorted. Call each such set of operations a "merge-traversal". There can be at most log_2(n) merge traversals since n = 2 ^ log_2(n) and at each level we are doubling the sizes of the sub-lists being merged. Therefore,

O(merge-sort(n)) = log_2(n) * O(merge-traversal(n))
                 = O(log_2(n) * merge-traversal(n))

Each merge-traversal traverses all the input data once. Each input item is the subject of at most one compare-and-select operation, and each compare-and-select operation chooses one of a pair of items to "emit". Hence

O(merge-traversal(n)) = n * O(compare-and-select)
                      = O(n * compare-and-select)

Each compare-and-select operation takes constant time:

O(compare-and-select) = O(1)

Collecting terms, we get

O(merge-sort(n)) = O(log_2(n) * n * 1)
                 = O(n * log_2(n))
                 = O(n * log(n)), since change of log base is 
                                  multiplication by a constant.

Ta daaaa!
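
If you want to see the same structure in code, here is a bottom-up merge sort sketch along the lines described above (an illustration of the counting argument, not a tuned implementation):

    def merge(left, right):
        # Compare-and-select: each step emits one item, so merging two runs
        # with k items in total takes at most k constant-time steps.
        merged, i, j = [], 0, 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                merged.append(right[j])
                j += 1
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged

    def merge_sort(items):
        runs = [[x] for x in items]            # start with n runs of length 1
        while len(runs) > 1:                   # each pass halves the number of runs,
            next_runs = []                     # so there are at most log2(n) passes
            for k in range(0, len(runs) - 1, 2):
                next_runs.append(merge(runs[k], runs[k + 1]))
            if len(runs) % 2:                  # odd run left over, carry it forward
                next_runs.append(runs[-1])
            runs = next_runs
        return runs[0] if runs else []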

Licensed under: CC-BY-SA with attribution