Minimum Subsequence Sum Algorithm Verification

https://cs.stackexchange.com/questions/125166

29-09-2020
|

Pergunta

I have an algorithm which is meant to solve the following computational problem:

Input: Sequence of positive integers
Output: A Sub-sequence of some desired length derived from original Sequence such that the sum of the sub-sequence's elements are the minimum sum possible

The Algorithm I have to solve this problem is implemented as a method in Java:

public int[] runA(int k) {

        //k is desired length of sub-sequence

        /*Sequence S is the original sequence provided as input, it is available as a class variable
        for the class 'MinimumSubsequence' which this method 'runA' is apart of*/                       

        //A is Sequence S copy with pseudo-boolean values attached to each element in S
        int[][] A = new int[this.S.length][2];
        //B is size K array storing the k smallest elements found in S along with their indexes
        int[][] B = new int[k][2];

        //initialization
        for (int i = 0; i < this.S.length; i++) {
            A[i][0] = this.S[i];
            A[i][1] = False;
        }
        for (int i = 0; i < k; i++) {
            B[i][0] = this.S[i];
            B[i][1] = i;
        }

        //Execution
        //search for k smallest elements in sequence S
        for (int i = 0; i < k; i++) {
            for (int j = 0; j < this.S.length; j++) {
                if (A[j][0] <= B[i][0] && A[j][1] == False) {
                    B[i][0] = A[j][0];
                    B[i][1] = j;
                }
                A[(B[i][1])][1] = True;
            }
        }

        //build subsequence
        int[][] C = new int[this.S.length][2];
        for (int i = 0; i < C.length; i++) {
            C[i][1] = False;
        }
        for (int i = 0; i < C.length; i++) {
            for (int j = 0; j < B.length; j++) {
                if (B[j][1] == i && (C[i][1] == False)) {
                    C[i][0] = B[j][0];
                    C[i][1] = True;
                }
            }
        }
        int[] D = new int[k];
        int j = 0;
        for (int i = 0; i < C.length; i++) {
            if (C[i][1] == True) {
                D[j] = C[i][0];
                j++;
            }
        }

        return D;
    }

The algorithm basically works by scanning the original sequence provided, each time scanning for the next smallest number. Once it has found the k smallest numbers in the sequence it builds and returns a sub-sequence made from those smallest elements in the original sequence

I believe that this algorithm does correctly solve the computational problem and that the running time of this algorithm is in O(N) (N being size of input sequence). I would just like some verification about the correctness and efficiency of this algorithm. I am also wondering if there exists are more efficient algorithm to solve this computational problem as I'm just not very satisfied with this one but can think of no other approach.

Solução

The fact that your algorithm is in Java or the source itself are not really relevant. That said, your algorithm requires time $\Theta(n \cdot k)$, where $n$ is the length of the input sequence, and $k$ is the length of the desired subsequence. This can be as bad as $\Theta(n^2)$, when $k=\Theta(n)$.

You can improve the time complexity of your algorithm to $O(n)$ by first finding the $k$-th smallest element $x$ in $O(n)$ time, and then selecting all elements smaller than $x$, plus the necessary number of copies of $x$ (there might be multiple copies of $x$ in the input sequence and not necessarily all of them belong to the output).

An easy selection algorithm is quickselect with randomized pivot selection. This algorithm has complexity $O(n)$ with high probability. If you want a $O(n)$ worst-case complexity you'll have to be a bit more careful with the pivot selection. You can start by reading the Wikipedia pages on selection, quickselect, and median of medians.

If you want to be lazy and you don't care about losing a $O(\log n)$-factor over the optimal time complexity, then you can simply sort the input sequence in time $O(n \log n)$ (e.g., using Mergesort) and then return the first $k$ elements.

Another solution that is easy to implement if you have access to priority queues and has complexity $O(n \log k)$ is iterating over the input sequence while maintaining a max-Heap $H$ of the smallest $k$ elements encountered so far. Initially $H$ is empty and, for each element $x$ of the input you: (i) push $x$ into $H$, (ii) if $H$ contains $k+1$ elements, you extract one element from $H$. After examining all elements, $H$ contains exactly the $k$ smallest elements of the whole input sequence.

Licenciado em: CC-BY-SA com atribuição

Não afiliado a cs.stackexchange