Question

I have the following algorithm, which counts duplicates:

public static int numDuplicatesB(int[] arr) {
    Sort.mergesort(arr);
    int numDups = 0;
    for (int i = 1; i < arr.length; i++) {
        if (arr[i] == arr[i - 1]) {
            numDups++;
        }
    }
    return numDups;
}

I am trying to find the worst-case time complexity of this. I know merge sort is O(n log(n)), and my for loop iterates over the entire data set, so that counts as O(n). I am unsure what to do with these two bounds, though. Should I just sum them together? If so, how?


Solution

O(n) + O(n log(n)) = O(n log(n))

For Big O complexity, all you care about is the dominant term: n log(n) dominates n, so that is the only term you keep.
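To make the dominance concrete, here is a small sketch (not from the original answer; the class and method names are made up for illustration) that computes what fraction of n + n·log₂(n) the linear term accounts for as n grows:

```java
public class DominantTerm {
    // Fraction of the total cost n + n*log2(n) contributed by the
    // linear term. As n grows, this fraction shrinks toward zero,
    // which is exactly what "n log n dominates n" means.
    static double linearShare(long n) {
        double nLogN = n * (Math.log(n) / Math.log(2));
        return n / (n + nLogN);
    }

    public static void main(String[] args) {
        for (long n : new long[] {1 << 10, 1 << 20, 1 << 30}) {
            System.out.printf("n = %d, linear share = %.4f%n", n, linearShare(n));
        }
    }
}
```

At n = 2^30 the linear term contributes about 3% of the total, and the share keeps falling as n grows, so asymptotically only the n log(n) term matters.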

Other tips

Let's reason our way through it, remembering the definition of O. The one I'm going to use here is the limit at infinity.

You are correct in stating that you perform two operations with corresponding asymptotic bounds of O(n) and O(n log(n)), but combining them into a single bound is not as simple as adding the two functions. You know your function takes on the order of n time for the scan and on the order of n log(n) time for the sort. Since O(n log(n)) is a superset of O(n), the linear term is absorbed, and the complexity class of your function is just O(n log(n)).
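`Sort.mergesort` is not shown in the question, so as a sketch here is a hypothetical stand-in: a top-down merge sort instrumented to count element comparisons. Adding the n − 1 comparisons of the duplicate-counting loop shows the linear pass is only a small fraction of the total work:

```java
import java.util.Random;

public class OpCount {
    static long comparisons = 0;

    // Hypothetical stand-in for Sort.mergesort: a top-down merge sort
    // on the half-open range [lo, hi) that counts element comparisons.
    static void mergesort(int[] a, int lo, int hi) {
        if (hi - lo < 2) return;
        int mid = (lo + hi) / 2;
        mergesort(a, lo, mid);
        mergesort(a, mid, hi);
        int[] merged = new int[hi - lo];
        int i = lo, j = mid, k = 0;
        while (i < mid && j < hi) {
            comparisons++;
            merged[k++] = (a[i] <= a[j]) ? a[i++] : a[j++];
        }
        while (i < mid) merged[k++] = a[i++];
        while (j < hi) merged[k++] = a[j++];
        System.arraycopy(merged, 0, a, lo, merged.length);
    }

    public static void main(String[] args) {
        int n = 1 << 16;
        int[] arr = new int[n];
        Random rnd = new Random(42);
        for (int i = 0; i < n; i++) arr[i] = rnd.nextInt(1000);

        mergesort(arr, 0, n);
        long sortOps = comparisons;
        long scanOps = n - 1; // the duplicate-counting loop does n - 1 comparisons
        double nLog2n = n * (Math.log(n) / Math.log(2));

        System.out.printf("sort ~ %d ops, scan = %d ops, n log2 n = %.0f%n",
                sortOps, scanOps, nLog2n);
    }
}
```

The comparison count of the sort stays within n·log₂(n), while the scan adds only n − 1 on top, so the sum is still bounded by a constant times n log(n).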

If you were going to set it out longhand it would look roughly like this:

Suppose the total time is a·n + b·n·log(n), where a and b are constants (ignoring lower-order terms).

As n goes to infinity, (a·n + b·n·log(n)) / (n·log(n)) → a/log(n) + b → b.

So the total time is O(b·n·log(n)) = O(n·log(n)).
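The limit argument above, written out:

```latex
\lim_{n \to \infty} \frac{a\,n + b\,n \log n}{n \log n}
  = \lim_{n \to \infty} \left( \frac{a}{\log n} + b \right)
  = b ,
```

so the ratio to n log n tends to a finite constant, which is exactly the statement that a·n + b·n·log(n) ∈ O(n log n).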

Start with the definition of O ():

O(n log n) means "at most C·n·log n for some constant C, once n is large".

O(n) means "at most D·n for some constant D, once n is large".

If you add both, the result is at most C·n·log n + D·n < C·n·log n + D·n·log n = (C + D)·n·log n, which is O(n log n).

In general, if f(n) > C·g(n) for large n and some C > 0, then O(f(n)) + O(g(n)) = O(f(n)). After working through a few such cases with the definition of O(), you'll know what you can and can't do.

The big O notation is defined as a set:

O(g) = { f | ∃ C > 0, n₀ : f(n) ≤ C·g(n) for all n ≥ n₀ }

So O(g) contains all functions that are, from some arbitrarily large point n₀ onward, always smaller than C·g.

Now, when you have a function that is in O(g) and then execute another one that grows more slowly than g, the combination certainly grows more slowly than 2g. So executing anything slower than g will not change the complexity class.

More formally:

f, h ∈ O(g) ⇒ (f + h) ∈ O(g)

You can easily prove that.
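The proof is a one-liner from the definition:

```latex
% Assume f, h \in \mathcal{O}(g): there exist constants such that
%   f(n) \le C_1 g(n) \text{ for } n \ge n_1, \quad
%   h(n) \le C_2 g(n) \text{ for } n \ge n_2.
% Then for all n \ge \max(n_1, n_2):
f(n) + h(n) \;\le\; C_1 g(n) + C_2 g(n) \;=\; (C_1 + C_2)\, g(n),
% so (f + h) \in \mathcal{O}(g), witnessed by the constant C_1 + C_2.
```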

TL;DR

It is still O(n log(n)).

License: CC BY-SA with attribution
Not affiliated with softwareengineering.stackexchange