Binary-ish search through partially ordered set

https://cs.stackexchange.com/questions/128133

29-09-2020
|

Question

I have an interesting function. It takes subsets of {1,...,N} to positive integers, i.e. $f:P([N]) \rightarrow Z^+$. I know that if S is a subset of S', $f(S) < f(S')$. Also, if S and S' have the same cardinality, the ordering induced by f is lexicographic, so for example $f(\{1,2,4\}) < f(\{1,3,4\})$. Given a value z, I'd like to find S such that $f(S) <= z$ and $f(S) <= f(T) <= z$ implies $f(S)=f(T)$ -- that is, I want to do a search on the lattice of subsets of [N].

If I knew the ordering were perfectly lexicographic, I'd use a simple binary search. I don't know that, and I believe it is not (e.g., $f(\{1,2,3,4,5,6\})$ is possibly greater than $f(\{7\})$). Is there a good O(N) algorithm to do this search on the poset? Obviously for N of any appreciable size I have to compute f on-the-fly and can't rely on in-memory storage.

Clarification after a discussion in the comments: The particular $f$ I am dealing with is additive -- specifically, $f(S) = \sum_{k\in S} g(k) + f(\emptyset)$, with $g$ a monotonically increasing function. This may be easier than the general case (which is also interesting, but not my particular problem).

Solution

Here is a simple algorithm that runs in $O(N^2)$ time and $O(N)$ space, assuming that $f(\emptyset)$, $f(\{1\})$, $f(\{2\})$, $\cdots$, $f(\{N\})$ are given in an array.

The starting idea is about the same as what has been given by the OP in his comment. "We will search on subsets of size K using the lexicographic order, for each $K$ from $0$ to $N$. Retain the one with the best value of $f$."

The problem is then how to search the best value of $f$ on subsets of size $K$, named $b_K$, in $O(N)$ time. Instead of binary search, we will check whether $N$, $N-1$, \cdots, $1$ should be included in the best subset one by one, by taking the real advantage of the lexicographic order on subsets.

Initialize $b_K = f(\emptyset)$. $\ b_K$ will be the best value on subsets of size $K$ at the end of this procedure.
Initialize $count = 0.$ $\ count$ is the number of elements that we have included in the best subset so far.
Check $f(\{N\})$. If $b_K + f(\{N\}) + f(\{1, 2, \cdots, K-count -1\})\le z$, $N$ must be included. Add $f(\{N\})$ to $b_K$ and add 1 to $count$.
Check $f(\{N-1\})$. If $b_K + f(\{N-1\}) + f(\{1, 2, \cdots, K-count-1\})\le z$, $N-1$ must be include. Add $f(\{N-1\})$ to $b_K$ and add 1 to $count$.
And so on.
Until either we have checked $f(\{1\})$ or $count == K$.

We might wonder, if it will take $O(N)$ to compute each $f(\{1,2, \cdots, K-count-1\})$, computing each $b_K$ alone will take $O(N * N)$ time. However, since $f$ is additive, we can compute all prefix sums of $f(\{1\})$, $f(\{2\})$, $\cdots$, $f(\{N\})$ upfront in $O(N)$ time. Then it takes $O(1)$ to access each prefix sum.

Since searching $b_K$ takes $O(N)$ time, for each $K$ from $0$ to $N$, the total running time is $O(N^2)$.

The description above of the algorithm skips the easiest case when $f(\emptyset)\gt z$. In that case, the algorithm should return that there is no such subset.

Licensed under: CC-BY-SA with attribution

Not affiliated with cs.stackexchange