Question

I have a partially ordered set, say A = [x1, x2, ...], meaning that for each xi and xj in the set, (exactly) one of four possibilities is true: xi < xj, xi == xj, xi > xj, or xi and xj are incomparable.

I want to find the maximal elements (i.e., those elements xi for which there are no elements xj with xi < xj). What is an efficient algorithm to do this (minimize the number of comparisons)? I tried building a DAG and doing a topological sort, but just building the graph requires O(n^2) comparisons, which is too many.

I'm doing this in Python, but if you don't know it I can read other languages, or pseudocode.

Solution

It seems the worst case is O(n^2) no matter what you do. For example, if no elements are comparable, then you need to compare every element to every other element in order to determine that they are all maximal.

If O(n^2) is acceptable, transitivity lets you make a single pass through the set while keeping a list of the elements that are maximal so far: each new element knocks out any listed elements that are < it, and is added to the list unless it is < some listed element.
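
A minimal sketch of that one-pass approach in Python. The comparator convention here is an assumption, not something from the question: `compare(a, b)` is a hypothetical black box returning -1 if a < b, 0 if a == b, 1 if a > b, and None if the two are incomparable. `divides_compare` is just an illustrative order (divisibility on positive integers).

```python
from typing import Callable, Iterable, List, Optional, TypeVar

T = TypeVar("T")

# Assumed comparator convention (hypothetical):
#   -1 -> a < b, 0 -> a == b, 1 -> a > b, None -> incomparable.
Compare = Callable[[T, T], Optional[int]]


def maximal_elements(items: Iterable[T], compare: Compare) -> List[T]:
    """One pass over items, keeping the elements that are maximal so far."""
    maximal: List[T] = []
    for x in items:
        survivors: List[T] = []
        dominated = False
        for m in maximal:
            c = compare(x, m)
            if c == -1:
                # x < m, so x is not maximal. By transitivity x cannot have
                # knocked out any earlier m, so the list stays unchanged.
                dominated = True
                break
            if c == 1:
                continue  # m < x: m is knocked out
            survivors.append(m)  # equal or incomparable: m stays
        if not dominated:
            survivors.append(x)
            maximal = survivors
    return maximal


def divides_compare(a: int, b: int) -> Optional[int]:
    """Illustrative partial order: a < b when a properly divides b."""
    if a == b:
        return 0
    if b % a == 0:
        return -1
    if a % b == 0:
        return 1
    return None


print(maximal_elements([2, 3, 4, 6, 12, 5, 10], divides_compare))  # [12, 10]
```

Each element is compared only against the current maximal list, so the cost is O(n * w), where w is the largest size that list reaches. That is still O(n^2) in the worst case, but much less when few elements are maximal.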

OTHER TIPS

As other answers have pointed out, the worst case complexity is O(n^2).

However, there are heuristics that can help a lot in practice. For example if the set A is a subset of Z^2 (integer pairs), then we can eliminate a lot of points upfront by:

  1. For each x-value, keep the point with the largest y-value (for example, by sorting along the x-axis). This gives a candidate set of maximals; call it y-maximals.
  2. Similarly, for each y-value keep the point with the largest x-value to get the set x-maximals.
  3. Intersect the two to get the final candidate set, xy-maximals.

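This costs O(n) if you group the points by hashing (O(n log n) if you actually sort). Any maximal point is present in xy-maximals, because a maximal point must have the largest y among points with its x and the largest x among points with its y. However, xy-maximals can also contain non-maximal points. For example, in the set {(1,0), (0,1), (2,2)} all three points survive the filter, but only (2,2) is maximal.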

Depending on your situation, this may be a good enough heuristic. You can follow this up with the exhaustive algorithm on the smaller set xy-maximals.
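
The filter followed by the exhaustive check could be sketched like this in Python. It assumes the componentwise order on integer pairs (p <= q when both coordinates of p are <= those of q); the function names are made up for illustration.

```python
from typing import Dict, Iterable, Set, Tuple

Point = Tuple[int, int]


def xy_maximal_candidates(points: Iterable[Point]) -> Set[Point]:
    """Keep points that have the largest y for their x AND the largest x for their y."""
    pts = set(points)
    best_y: Dict[int, int] = {}  # x-value -> largest y seen at that x
    best_x: Dict[int, int] = {}  # y-value -> largest x seen at that y
    for x, y in pts:
        if x not in best_y or y > best_y[x]:
            best_y[x] = y
        if y not in best_x or x > best_x[y]:
            best_x[y] = x
    return {(x, y) for (x, y) in pts if best_y[x] == y and best_x[y] == x}


def pareto_maximal(points: Iterable[Point]) -> Set[Point]:
    """Exhaustive check, run only on the (hopefully small) candidate set."""
    cands = xy_maximal_candidates(points)
    return {
        p for p in cands
        if not any(q != p and q[0] >= p[0] and q[1] >= p[1] for q in cands)
    }


pts = [(1, 0), (0, 1), (2, 2)]
print(xy_maximal_candidates(pts) == {(1, 0), (0, 1), (2, 2)})  # True: filter keeps all
print(pareto_maximal(pts))  # {(2, 2)}
```

Running the exhaustive step on the candidates alone is safe because every maximal point survives the filter, so any non-maximal candidate is still dominated by some other candidate.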

More generally, this problem is called the 'Pareto Frontier' calculation problem. Here are good references:

http://www.cs.yorku.ca/~jarek/papers/vldbj06/lessII.pdf

https://en.wikipedia.org/wiki/Pareto_efficiency#Use_in_engineering_and_economics

In particular the BEST algorithm from the first reference is quite useful.

In the worst case, you can't be faster than O(n^2). Indeed, to check that all elements are maximal in the poset where no two elements are comparable, you need to compare every pair of elements. So it's definitely quadratic in the worst case.

Let me clarify to answer the comment below: I'm claiming that the worst case is attained when the poset is the trivial poset where no two elements are comparable. In this case, all elements are maximal. To check that this is indeed the case, any comparison-based algorithm must perform all n(n-1)/2 comparisons, one per unordered pair. Indeed, if some comparison, say a <-> b, is not performed, then the algorithm can't distinguish the trivial poset from the poset where the only relation is a < b, so it can't give the correct answer. So any algorithm must be at least quadratic in the worst case.
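
As a small sanity check of that count, here is a sketch that runs a one-pass "keep the maximal list" scan on an antichain and counts calls to the comparator. The comparator convention (-1, 0, 1, or None for incomparable) and the function name are assumptions made for this illustration.

```python
def comparisons_on_antichain(n: int) -> int:
    """Count comparator calls when no two of the n elements are comparable."""
    calls = 0

    def compare(a: int, b: int):
        nonlocal calls
        calls += 1
        return None  # every pair is incomparable

    maximal = []
    for x in range(n):
        dominated = False
        survivors = []
        for m in maximal:
            c = compare(x, m)
            if c == -1:      # x < m: never happens on an antichain
                dominated = True
                break
            if c != 1:       # m is not below x, so it stays
                survivors.append(m)
        if not dominated:
            maximal = survivors + [x]

    assert len(maximal) == n  # on an antichain everything is maximal
    return calls


print(comparisons_on_antichain(5))  # 10, i.e. 5 * 4 / 2
```

The scan compares each new element against all earlier ones, so it makes 0 + 1 + ... + (n-1) = n(n-1)/2 calls, which matches the lower bound on this input.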

Suppose you have looked at all (n choose 2) comparisons except for one, between xi and xj, i != j. In some scenarios, the only two candidates for being maximal are exactly these two, xi and xj.

If you do not compare xi and xj, you cannot definitively say whether they are both maximal, or whether only one of them is.

Therefore, in the worst case you must perform all (n choose 2) = O(n^2) comparisons.


Note that this assumes your partially ordered set is specified by a black box that performs comparisons. If the partially ordered set is given as a graph to start with, you can find the set of maximal elements in sub-O(n^2) time.
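
For the graph case, a minimal sketch, assuming the poset is handed over as a list of nodes plus directed edges (a, b) meaning a < b, and that the edges include at least the covering relations (a Hasse diagram or anything larger). Under that assumption the maximal elements are exactly the nodes with no outgoing edge. The function name is hypothetical.

```python
from typing import Hashable, Iterable, List, Tuple


def maximal_from_graph(
    nodes: Iterable[Hashable],
    edges: Iterable[Tuple[Hashable, Hashable]],
) -> List[Hashable]:
    """Return the nodes with no outgoing edge; an edge (a, b) means a < b."""
    has_greater = {a for a, _ in edges}  # nodes that lie below something
    return [v for v in nodes if v not in has_greater]


# 1 < 2 < 3, and 4 is incomparable to everything
print(maximal_from_graph([1, 2, 3, 4], [(1, 2), (2, 3)]))  # [3, 4]
```

This takes time linear in the number of nodes plus edges, with no comparisons at all.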

Licensed under: CC-BY-SA with attribution