Subarrays allocation [closed]

Question

You can solve this problem in O(NM) time by using a union find data structure. Follow these steps:

Step1: Generate continuous horizontal lines in the matrix for adjacent filled elements. And label these lines with different numbers.

For this altered matrix

0 х х х 0 х х х 0 0 0 0 0 0
0 x х х х x х х 0 0 0 х х х
0 0 х х х х 0 0 0 0 0 х х х
0 0 0 0 0 0 0 0 0 0 х х х х
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 x x x x x 0 0 0
0 0 0 0 0 x x x x 0 0 0 0 0

The different lines of the matrix will become:

0 1 1 1 0 2 2 2 0 0 0 0 0 0
0 3 3 3 3 3 3 3 0 0 0 4 4 4
0 0 5 5 5 5 0 0 0 0 0 6 6 6
0 0 0 0 0 0 0 0 0 0 7 7 7 7
0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 8 8 8 8 8 0 0 0
0 0 0 0 0 9 9 9 9 0 0 0 0 0

Step2: Now initialize a Union Find data structure with all the different lines as separate clusters/sets in the data structure.

Step3: Start iterating over all the columns of the matrix. Whenever you encounter the beginning of a new line, check if there is some line exactly above or below that line. If there exists such a line above or below, then union the two sets that correspond to the adjacent lines in the data-structure.

Step4: Now that you have all the different clusters of lines at the end of the algorithm, you can easily calculate the boundaries of the clusters. And hence you have all the different sub arrays.

Running time: ~~O(NM)~~ considering that we use a weighted quick union-find data structure with path compression providing O(α(n)) time (where α is the inverse Ackermann function) for union operations. And since the inverse ackermann function grows very slowly, I am treating it as a constant.

Edit: The algorithm described will fail for this case (as told by user beaker in the comments)

1 1 1 1 1
2 0 0 0 3
0 0 4 0 0

To handle these types of cases, apart from the mentioned data structures, we will maintain an array that will contain the occupied rectangles of each row. So when we merge line 2 with line 1, the value for the 2nd row will become 0...4 indicating that there is a rectangle that covers indices 0 to 4 of the second row. Now when line 4 is considered in step 3 it will check if the points above or below are covered by some rectangle or not. And hence it will merge line 4's cluster with line 2's cluster.

The other parts of the algorithm will remain the same except that we now check if the adjacent points are covered by some rectangle or not.

The extra information that we are maintaing for all the rows will increase the running time. Whenever we consider the matrix column wise in Step3, we may have to do O(n) extra work. So the running time rises to O(N^2 * M)