Nested loop complexity

https://softwareengineering.stackexchange.com/questions/416129

16-03-2021
|

Pergunta

I am working through MIT 6.006 OpenCourseWare as taught in Fall 2011. Problem 1.2c asks for the time complexity of an algorithm¹ which finds a peak element (i.e. all neighbors are less than or equal) of an M x N matrix. My complexity analysis does not match theirs and appears to hinge on the complexity of a nested loop.

The algorithm creates a cross which divides the matrix into four "subproblems". It finds the max on the cross, checks neighbors, and recurses as needed:

def algorithm3(problem, bestSeen = None, trace = None):
    # if it's empty, we're done 
    if problem.numRow <= 0 or problem.numCol <= 0:
        return None

    midRow = problem.numRow // 2
    midCol = problem.numCol // 2

    # first, get the list of all subproblems
    subproblems = []

    (subStartR1, subNumR1) = (0, midRow)
    (subStartR2, subNumR2) = (midRow + 1, problem.numRow - (midRow + 1))
    (subStartC1, subNumC1) = (0, midCol)
    (subStartC2, subNumC2) = (midCol + 1, problem.numCol - (midCol + 1))

    subproblems.append((subStartR1, subStartC1, subNumR1, subNumC1))
    subproblems.append((subStartR1, subStartC2, subNumR1, subNumC2))
    subproblems.append((subStartR2, subStartC1, subNumR2, subNumC1))
    subproblems.append((subStartR2, subStartC2, subNumR2, subNumC2))

    # find the best location on the cross (the middle row combined with the
    # middle column)
    cross = []

    cross.extend(crossProduct([midRow], range(problem.numCol)))
    cross.extend(crossProduct(range(problem.numRow), [midCol]))

    crossLoc = problem.getMaximum(cross, trace)
    neighbor = problem.getBetterNeighbor(crossLoc, trace)

    # update the best we've seen so far based on this new maximum
    if bestSeen is None or problem.get(neighbor) > problem.get(bestSeen):
        bestSeen = neighbor
        if not trace is None: trace.setBestSeen(bestSeen)

    # return if we can't see any better neighbors
    if neighbor == crossLoc:
        if not trace is None: trace.foundPeak(crossLoc)
        return crossLoc

    # figure out which subproblem contains the largest number we've seen so
    # far, and recurse
    sub = problem.getSubproblemContaining(subproblems, bestSeen)
    newBest = sub.getLocationInSelf(problem, bestSeen)
    if not trace is None: trace.setProblemDimensions(sub)
    result = algorithm3(sub, newBest, trace)
    return problem.getLocationInSelf(sub, result)

The instructor provides the complexity for getMaximum as O(len(locations)), getBetterNeighbor and getLocationInSelf as O(1), getSubproblemContaining as O(len(boundList)), and all trace calls as O(1). The crossProduct is computed as:

def crossProduct(list1, list2):
    answer = []
    for a in list1:
        for b in list2:
            answer.append ((a, b))
    return answer

The solution states, "a single call of the function (not counting the recursive call) does work proportional to m + n." I don't understand this.

Isn't crossProduct O(mn)?

My reasoning is that for an M x N matrix, getMaximum must traverse the dividing cross (one row, one column) which contributes O(m + n). The getSubproblemContaining contributes something linear, O(m) or O(n). Everything else besides crossProduct is O(1), the complexity of crossProduct not being provided, so that the recurrence relation is

T(m, n) = O(mn) + O(m + n) + cO(n) + T(m/2, n/2)   for some constant c
        = O(mn) + T(m/2, n/2)

The recurrence reduces via the geometric series to O(m + n),

T(m, n) = O(mn) + O(m + n)
        = O(mn)

which yields T(n,n) = O(n^2). The solution provided is O(n). The crossProduct term appears to be the discrepancy.

¹ The algorithm/code implementation is written by the instructor. All Pythonic style errors are theirs (and likely made for pedagogical reasons).

Solução

Don't forget what n and m actually are.

When you say that this function:

def crossProduct(list1, list2):
    answer = []
    for a in list1:
        for b in list2:
            answer.append ((a, b))
    return answer

takes O(mn) time, what are m and n? Well, m is the size of list1 and n is the size of list2 (or vice versa).

When you say that algorithm3 takes O(mn) time, what are m and n? Well, m is the number of rows and n is the number of columns (or vice versa).

Hang on! Those aren't the same thing! We should use different names for different variables. It would be easier if we said that crossProduct has O(ab) complexity where a is the size of list1 and b is the size of list2 (or vice versa). You can't give two different things the same variable name and then just assume they are the same thing.

So what are a and b?

cross.extend(crossProduct([midRow], range(problem.numCol)))
cross.extend(crossProduct(range(problem.numRow), [midCol]))

Okay, so we call it once with a=1 and b=n, and we call it again with a=m and b=1. So these two calls together have O(1n + m1) = O(n + m) complexity.

Licenciado em: CC-BY-SA com atribuição

Não afiliado a softwareengineering.stackexchange