My understanding of big-O massively improved when I began thinking of orders as sets. How do I apply the same approach to big-Theta?

cs.stackexchange https://cs.stackexchange.com/questions/118671

Question

Today I revisited the topic of runtime complexity orders – big-O and big-$\Theta$. I finally fully understood what the formal definition of big-O means, but, more importantly, I realised that big-O orders can be considered sets.

For example, $n^3 + 3n + 1$ can be considered an element of the set $O(n^3)$. Moreover, $O(1)$ is a subset of $O(n)$, which is a subset of $O(n^2)$, and so on.

This got me thinking about big-Theta, which is obviously also a set. What I found confusing is how the big-Theta orders relate to each other; i.e. I believe that $\Theta(n^3)$ is not a subset of $\Theta(n^4)$. I played around with Desmos (a graph visualiser) for a while but failed to find how each big-Theta order relates to the others. A simple example shows that although $f(n) = 2n$ is in $\Theta(n)$ and $g(n) = 2n^2$ is in $\Theta(n^2)$, the functions in $\Theta(n)$ are obviously not in $\Theta(n^2)$. I kind of understand this visually, if I think about what different graphs and bounds might look like, but I am having a hard time getting a solid explanation of why it is the way it is.

So, my questions are:

  1. Is what I wrote about big-O correct?
  2. How do big-Theta sets relate to each other, if they relate at all?
  3. Why do they relate to each other the way they do? The explanation is probably derivable from the formal definition of big-Theta (I might be wrong here), and it would be great if someone could relate the explanation back to that definition.
  4. Is this also the reason why big-O is better for analysing complexity? Because it is easier to compare it to other runtimes?

Solution

Is what I wrote about big-O correct?

Yes.

How do big-Theta sets relate to each other, if they relate at all?

They form a partition of the space of functions: if $\Theta(f)\cap \Theta(g)\neq \emptyset$, then $\Theta(f)=\Theta(g)$. Moreover, $\Theta(f)\subseteq O(f)$.
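For a concrete illustration (an example of my own, not from the question): $2n$ lies in both $\Theta(n)$ and $\Theta(2n)$, so those two classes intersect and are therefore the same set, while $\Theta(n)$ and $\Theta(n^2)$ share no element, so neither is a subset of the other. Unlike the big-$O$ classes, which nest inside one another, two $\Theta$ classes are either identical or disjoint.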

Why do they relate to each other the way they do? The explanation is probably derivable from the formal definition of big-Theta (might be wrong here) and if someone could relate the explanation back to that definition it would be great.

A function $f$ is in $\Theta(g)$ if and only if there are constants $c_1,c_2>0$ such that $c_1 g(n)\leq f(n) \leq c_2 g(n)$ for all sufficiently large $n$. Seeing that the relation above holds is just a matter of a few substitutions:

Suppose there is some $a\in \Theta(f)\cap \Theta(g)$ and some $b\in \Theta(f)$. Then we know there exist constants such that (for sufficiently large $n$)

$c_1 f(n)\leq a(n) \leq c_2f(n)$

$c_3 g(n)\leq a(n) \leq c_4g(n)$

$c_5 f(n)\leq b(n) \leq c_6f(n)$

then

$c_5 c_3 g(n)/c_2 \leq c_5 a(n)/c_2 \leq c_5 f(n)\leq b(n)\leq c_6f(n)\leq c_6 a(n)/c_1\leq c_6c_4g(n)/c_1$

and thus $b\in \Theta(g)$, with witnessing constants $c_5c_3/c_2$ and $c_6c_4/c_1$.
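If it helps to see the substitutions with actual numbers, here is a small sanity check in Python; the functions $f(n)=n^2$, $g(n)=3n^2+n$, $a(n)=2n^2$, $b(n)=5n^2+7$ and the constants below are my own illustrative choices, not anything canonical. The three assumed sandwiches hold for $n\ge 3$, and the derived constants $c_5c_3/c_2$ and $c_6c_4/c_1$ then sandwich $b$ between multiples of $g$, exactly as the chain above claims.

```python
# Numeric sanity check of the substitution argument above.
# The example functions and constants are illustrative choices only.
f = lambda n: n ** 2          # a and b are both in Theta(f)
g = lambda n: 3 * n ** 2 + n  # a is also in Theta(g)
a = lambda n: 2 * n ** 2
b = lambda n: 5 * n ** 2 + 7

c1, c2 = 2, 2      # c1*f(n) <= a(n) <= c2*f(n)
c3, c4 = 0.5, 1    # c3*g(n) <= a(n) <= c4*g(n)
c5, c6 = 5, 6      # c5*f(n) <= b(n) <= c6*f(n)   (holds for n >= 3)

for n in range(3, 10_000):
    assert c1 * f(n) <= a(n) <= c2 * f(n)
    assert c3 * g(n) <= a(n) <= c4 * g(n)
    assert c5 * f(n) <= b(n) <= c6 * f(n)
    # Derived sandwich: b is in Theta(g) with constants c5*c3/c2 and c6*c4/c1.
    assert (c5 * c3 / c2) * g(n) <= b(n) <= (c6 * c4 / c1) * g(n)

print("all sandwich inequalities hold on the tested range")
```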

Is this also the reason why big-O is better for analysing complexity? Because it is easier to compare it to other runtimes?

It is not "better". You could say it is worse, because an algorithm being $\Theta(f)$ implies that it is $O(f)$ (but not vice versa), so "$\Theta$" is a strictly stronger statement than "$O$".

The reason "$O$" is more popular is that "$O$" expresses an upper bound on the running time of an algorithm, i.e., it is a guarantee that it will run in at most a given time. "$\Theta$" expresses the same upper bound but, in addition, also expresses that this upper bound is the best possible one for the given algorithm. E.g., an algorithm running in time $O(n^3)$ can actually turn out to also run in $O(n^2)$, but an algorithm running in time $\Theta(n^3)$ cannot also run in $\Theta(n^2)$ time.
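To make that last sentence concrete, here is a toy routine of my own (not something from the original discussion). Reading the nested loops naively gives an $O(n \log n)$ bound: the outer loop runs about $\log_2 n$ times and the inner loop does at most $n$ steps. Summing the geometric series $n + n/2 + n/4 + \dots < 2n$ shows the running time is actually $\Theta(n)$, so the $O(n \log n)$ bound is correct but not tight, which is exactly the kind of slack a $\Theta$ statement rules out.

```python
def halving_scan(n):
    """Toy loop whose naive bound is O(n log n) but whose tight bound is Theta(n)."""
    work = 0
    m = n
    while m >= 1:           # about log2(n) + 1 iterations
        for _ in range(m):  # m steps this round: n, n//2, n//4, ...
            work += 1
        m //= 2
    return work             # total work is n + n//2 + n//4 + ... < 2n

# Spot check: the work really is linear, staying below 2n.
for n in (1, 10, 1_000, 100_000):
    assert halving_scan(n) < 2 * n
```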

From a practical perspective, if we want to know whether an algorithm is fast enough for some purpose, knowing that it runs in $O(n^2)$ time is enough to decide that. The information that it runs in $\Theta(n^2)$ time is not really important for practical use. If we have determined that an $O(n^2)$-time algorithm is fast enough for our application, then who cares if the algorithm that was claimed to be $O(n^2)$ is actually $\Theta(n)$?

Obviously, if you are going to give an upper bound on the running time of an algorithm, you will endeavor to give the best possible one (there is no sense in saying your algorithm is $O(n^3)$ when you could also say it is $O(n^2)$). For this reason, when people say "$O$" they often implicitly mean "$\Theta$". The reason people write "$O$" is that it is easier to type on a normal keyboard, it is customary, it conveys the most important information about the algorithm (the upper bound on its running time), and people often can't be bothered to formally prove the lower bound.

OTHER TIPS

I'd just like to add a few things to the already posted answer and comments.

I'm currently studying CS, and every single time big-O notation came up, the lecturer at some point mentioned that these orders are in fact sets, and that $f(n) = O(g(n))$ is really just incorrect notation kept for historical reasons; $\in$ is what would be correct.

In fact, the way we defined big-O notation from day 0 was explicitly as sets of functions.

Another thing I'd like to mention is using big O for runtimes.

Getting upper bounds for problems tends to be way easier than getting lower bounds. For an upper bound, all you need to do is find an algorithm. For a lower bound, you need to prove that no algorithm that solves the problem can run faster than some limit.

As an example, proving that arrays can be sorted by comparing elements in $O(n \log n)$ time is easy: just point at mergesort. Proving the $\Omega(n \log n)$ lower bound is not nearly as easy, though it can be done (and has been done; remember, I'm talking about comparison-based sorting).
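To make the upper-bound half of that concrete, here is a minimal mergesort sketch (the standard textbook algorithm; the code itself is just my illustration). Its mere existence is the $O(n \log n)$ upper-bound proof: each call splits the array in half, sorts the halves recursively, and merges them with a linear number of comparisons, giving the recurrence $T(n) = 2T(n/2) + O(n)$.

```python
def mergesort(xs):
    """Sorts a list using O(n log n) comparisons: T(n) = 2*T(n/2) + O(n)."""
    if len(xs) <= 1:
        return list(xs)
    mid = len(xs) // 2
    left, right = mergesort(xs[:mid]), mergesort(xs[mid:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):   # linear-time merge step
        if left[i] <= right[j]:
            merged.append(left[i]); i += 1
        else:
            merged.append(right[j]); j += 1
    return merged + left[i:] + right[j:]

assert mergesort([5, 2, 9, 1, 5, 6]) == [1, 2, 5, 5, 6, 9]
```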

Licensed under: CC-BY-SA with attribution
Not affiliated with cs.stackexchange