Can we compute this in less than O(n*n) …( nlogn or n)

https://stackoverflow.com/questions/3659506

01-10-2019
|

Pergunta

This is a question asked to me by a very very famous MNC. The question is as follows ...

Input an 2D N*N array of 0's and 1's. If A(i,j) = 1, then all the values corresponding to the ith row and the jth column are going to be 1. If there is a 1 already, it remains as a 1.

As an example , if we have the array

we should get the output as

The input matrix is sparsely populated.

Is this possible in less than O(N^2)?

No additional space is provided was another condition. I would like to know if there's a way to achieve the complexity using a space <= O(N).

P.S : I don't need answers that give me a complexity of O(N*N). This is not a homework problem. I have tried much and couldn't get a proper solution and thought I could get some ideas here.Leave the printing aside for the complexity

My rough idea was to may be dynamically eliminate the number of elements traversed restricting them to around 2N or so. But I couldn't get a proper idea.

Solução 7

Hii guys ,

thanks to the comment from mb14 i think i could get it solved in less than O(NN) time... The worst would take O(NN)...

Actually , we have the given array suppose

Lets have 2 arrays of size N (this would be the worst case) ... One is dedicated for indexing rows and other columns... Put those with a[i][1] = 0 in one array and then a[1][j] =0 in another..

Then take those values only and check for the second row and colums...In this manner , we get the values of rows and colums where there are only 0;'s entirely...

The number of values in the row array gives number of 0's in the result array and the points a[row-array values][column array value] gives you those points ....

We could solve it in below O(NN) and worst is O(NN) ... As we can seee , the arrays ( of size N) diminishes ....

I did this for a few arrays and got the result for all of them ... :)

Please correct me if i am wrong anywhere...

Thanx for all your comments guys...You are all very helpful and i did learn quite a few things along the way ... :)

Outras dicas

In the worst case, you may need to toggle N * N - N bits from 0 to 1 to generate the output. It would seem you're pretty well stuck with O(N*N).

I would imagine that you can optimize it for the best case, but I'm tempted to say that your worst case is still O(N*N): Your worst case will be an array of all 0s, and you will have to examine every single element.

The optimization would involve skipping a row or column as soon as you found a "1" (I can provide details, but you said you don't care about O(N*N)", but unless you have metadata to indicate that an entire row/column is empty, or unless you have a SIMD-style way to check multiple fields at once (say, if every row is aligned by 4, and you can read 32 bits worth data, or if your data is in form of a bitmask), you will always have to deal with the problem of an all-zero array.

Clearly, nor the output matrix nor its negated version has to be sparse (take a matrix with half of the first row set to 1 and anything else to 0 to see), so time depends on what format you are allowed to use for the output. (I'm assuming the input is a list of elements or something equivalent, since otherwise you couldn't take advantage of the matrix being sparse.)

A simple solution for O(M+N) space and time (M is the number of ones in the input matrix): take two arrays of length N filled with ones, iterate through all ones in the input, and for each drop the X coordinate from the first array and the Y from the second one. The output is the two arrays, which clearly define the result matrix: its (X,Y) coordinate is 0 iff the X coordinate of the first array and the Y coordinate of the second are 0.

Update: depending on the language, you could use some trickery to return a normal 2D array by referencing the same row multiple times. For example in PHP:

// compute N-length arrays $X and $Y which have 1 at the column 
// and row positions which had no 1's in the input matrix
// this is O(M+N)
$result = array();
$row_one = array_fill(0,N,1);
for ($i=0; $i<N; $i++) {
    if ($Y[$i]) {
         $result[$i] = &$row_one;
    } else {
         $result[$i] = &$X;
    }
}
return $result;

Of course this is a normal array only as long as you don't try to write it.

Since every entry of the matrix has to be checked, your worst case is always going to be N*N.

With a small 2*N extra storage, you can perform the operation in O(N*N). Just create a mask for each row and another for each column - scan the array and update the masks as you go. Then scan again to populate the result matrix based on the masks.

If you're doing something where the input matrix is changing, you could store a count of non-zero entries for each row and column of the input (rather than a simple mask). Then when an entry in the input changes, you update the counts accordingly. At that point, I would drop the output matrix entirely and query the masks/counts directly rather than even maintaining the output matrix (which could also be updated as thing change in less than NN time if you really wanted to keep it around). So loading the initial matrix would still be O(NN) but updates could be much less.

The input matrix may be sparse, but unless you can get it in a sparse format (i.e. a list of (i,j) pairs that are initially set), just reading your input will consume Ω(n^2) time. Even with sparse input, it's easy to end up with O(n^2) output to write. As a cheat, if you were allowed to output a list of set rows and set columns, then you could get down to linear time. There's no magic to be had when your algorithm actually has to produce a result more substantial than 'yes' or 'no'.

Mcdowella's comment on another answer suggests another alternative input format: run-length encoding. For a sparse input, that clearly requires no more than O(n) time to read it (consider how many transitions there are between 0 and 1). However, from there it breaks down. Consider an input matrix structured as follows:

0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 . . . 
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 . . .
.     .
.       .
.         .

That is, alternating 0 and 1 on the first row, 0 everywhere else. Clearly sparse, since there are n/2 ones in total. However, the RLE output has to repeat this pattern in every row, leading to O(n^2) output.

You say:

we should get the output as...

So you need to output the entire matrix, which has N^2 elements. This is O(N*N).

The problem itself is not O(N*N): you dont have to compute and store the entire matrix: you only need two vectors, L and C, each of size N:

L[x] is 1 if line x is a line of ones, 0 otherwise;

C[x] is 1 if line x is a line of ones, 0 otherwise;

You can construct these vectors in O(N), because the initial matrix is sparse; your input data will not be a matrix, but a list containing the coordinates(line,column) of each non-zero element. While reading this list, you set L[line]=1 and C[column]=1, and the problem is solved: M[l,c] == 1 if L[l]==1 OR C[c]==1

There is clearly up to O(N^2) work to do. In the matrix

all bits have to be set to 1, and N*(N-1) are not set to one (20, in this 5x5 case).

Conversely, you can come up with an algorithm that always does it in O(N^2) time: sum along the top row and let column, and if the row or column gets a nonzero answer, fill in the entire row or column; then solve the smaller (N-1)x(N-1) problem.

So there exist cases that must take at least N^2 and any case can be solved in N^2 without extra space.

If your matrix is sparse, the complexity depends much on the input encoding and its in particular not well measured in N N² or something like that but in terms of N your input complexity Mⁱⁿ and your output complexity M^out. I'd expect something like O(N + Mⁱⁿ + M^out) but much depending on the encoding and the tricks that you can play with it.

That depends entirely of your input data structure. If you pass your matrix (1s and 0s) as a 2D array you need to traverse it and that is O(N^2). But as your data is sparse, if you only pass the 1's as input, you can do it so the ouptut is O(M), where M is not the number of cells but the number of 1 cells. It would be something similar to this (pseudocode below):

list f(list l) {
   list rows_1;
   list cols_1;

    for each elem in l {
        rows_1[elem.row] = 1;
        cols_1[elem.col] = 1;
    }

    list result;
    for each row in rows_1 {
        for each col in cols_1 {
             if (row == 1 || col == 1) {
                 add(result, new_elem(row, col));
             }
        }
    } 
   return result;
}

Don't fill the center of the matrix when you're checking values. As you go through the elements, when you have 1 set the corresponding element in the first row and the first column. Then go back and fill down and across.

edit: Actually, this is the same as Andy's.

It depends on your data structure.

There are only two possible cases for rows:

A row i is filled with 1's if there is an element (i,_) in the input
All other rows are the same: i.e. the j-th element is 1 iff there is an element (_,j) in the input.

Hence the result could be represented compactly as an array of references to rows. Since we only need two rows the result would also only consume O(N) memory. As an example this could be implemented in python as follows:

def f(element_list, N):
  A = [1]*N
  B = [0]*N
  M = [B]*N
  for row, col in element_list:
    M[row] = A
    B[col] = 1
  return M

A sample call would be

 f([(1,1),(2,2),(4,3)],5)

with the result

[[0, 1, 1, 1, 0], [1, 1, 1, 1, 1], [1, 1, 1, 1, 1], [0, 1, 1, 1, 0], [1, 1, 1, 1, 1]]

The important point is that the arrays are not copied here, i.e. M[row]=A is just an assignment of a reference. Hence the complexity is O(N+M), where M is the length of the input.

#include<stdio.h>

include

int main() { int arr[5][5] = { {1,0,0,0,0}, {0,1,1,0,0}, {0,0,0,0,0}, {1,0,0,1,0}, {0,0,0,0,0} }; int var1=0,var2=0,i,j;

for(i=0;i<5;i++)
   var1 = var1 | arr[0][i];

for(i=0;i<5;i++)
   var2 = var2 | arr[i][0];

for(i=1;i<5;i++)
   for(j=1;j<5;j++)
      if(arr[i][j])
         arr[i][0] = arr[0][j] = 1;

for(i=1;i<5;i++)
   for(j=1;j<5;j++)
          arr[i][j] = arr[i][0] | arr[0][j];

for(i=0;i<5;i++)
   arr[0][i] = var1;

for(i=0;i<5;i++)
   arr[i][0] = var2;

for(i=0;i<5;i++)
{
   printf("\n");             
   for(j=0;j<5;j++)
      printf("%d ",arr[i][j]);
}

getch();

}

This program makes use of only 2 4 temporary variables (var1,var2,i and j) and hence runs in constant space with time complexity O(n^2).. I Think it is not possible at all to solve this problem in < O(n^2).

Licenciado em: CC-BY-SA com atribuição

Não afiliado a StackOverflow