Question

Let's say I have a database:

  • k articles
  • each article has a title with l words in it
  • my search query has m tokens

To search for each of my tokens in the titles of all articles is m * k * l

Am I right in thinking that this is O(n) ?

Was it helpful?

Solution

In what follows, I'm going to assume that each word has length w or less.

It's not mathematically valid to talk about O(n) time unless n is defined somewhere. If you interpret n to be the total length of the input given to you (the amount of bits needed to write out all the articles and the search query), then you'd get that n = (kl + m)w.

Notice that your algorithm does not run in time O(mlk) unless w is a constant. More accurately, it's O(mlkw). Since n = lkw + mw, your runtime would not be O(n) under this interpretation of what n is.

That said - you can improve the runtime of your algorithm significantly by using better data structures. If you build a trie containing all of your words (which takes time mw), then you can look up each word in time O(w). This means that since there are kl total words to consider, your search time would be O(mw + lkw), which is linear.

Hope this helps!

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top