Question

Suppose I have a table like this (the table is a 2-D array in C++), where the leading number is the count for each row:

1 a b c
1 a b c
1 c d e
1 b c d
1 b c d

It will be squeezed to:

2 a b c
1 c d e
2 b c d

My algorithm is O(n*n); can someone improve it?

let t1 be the original table;
initialize another table t2;
row_num = 1;
copy the first row of t1 to t2;

foreach row in t1 (1 to n)
    search each row in t2 (0 to row_num)
        if equal, add the counts and break
    if not found, copy the current row of t1 to t2
        row_num++
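The pseudocode above can be sketched in C++ roughly as follows (the `Row` type and function name are illustrative, assuming each row is stored as a count paired with its three characters):

```cpp
#include <array>
#include <utility>
#include <vector>

// Hypothetical row type: a count paired with the row's three characters.
typedef std::pair<int, std::array<char, 3> > Row;

// O(n*n) squeeze: for each row of t1, linearly search t2 for a duplicate.
std::vector<Row> squeeze_quadratic(const std::vector<Row>& t1)
{
    std::vector<Row> t2;
    for (std::size_t i = 0; i < t1.size(); ++i) {
        bool found = false;
        for (std::size_t j = 0; j < t2.size(); ++j) {
            if (t2[j].second == t1[i].second) {
                t2[j].first += t1[i].first;  // equal: add the counts
                found = true;
                break;
            }
        }
        if (!found)
            t2.push_back(t1[i]);  // not found: copy the row to t2
    }
    return t2;
}
```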

Solution 2

Here's a working example of O(N log N) complexity. It first sorts the data, then loops over each element, counting occurrences by looking for the first mismatch, and stores the sum of the counts from the current element in a result vector. Note that you can also have counts different from 1 in your initial arrays. The code works without having to specify a custom comparison function for the arrays because std::array already provides a lexicographic operator<.

The code below uses C++11 features (auto, lambdas) that might not work on your compiler. You might also use initializer lists to initialize the vector in one statement, but with the nested vector of pair of int and array, I got a little confused about how many braces I needed to write :-)

#include <algorithm>
#include <array>
#include <iostream>
#include <numeric> // for std::accumulate
#include <utility>
#include <vector>

typedef std::pair<int, std::array<char, 3> > Element;
std::vector< Element > v;
std::vector< Element > result;

int main()
{
        v.push_back( Element(1, std::array<char, 3>{{'a', 'b', 'c'}}) );
        v.push_back( Element(2, std::array<char, 3>{{'a', 'b', 'c'}}) );
        v.push_back( Element(1, std::array<char, 3>{{'c', 'd', 'e'}}) );
        v.push_back( Element(1, std::array<char, 3>{{'b', 'c', 'd'}}) );
        v.push_back( Element(3, std::array<char, 3>{{'b', 'c', 'd'}}) );

        // O(N log(N) ) complexity
        std::sort(v.begin(), v.end(), [](Element const& e1, Element const& e2){
                // compare the array part of the pair<int, array>
                return e1.second < e2.second; 
        });

        // O(N) complexity
        for (auto it = v.begin(); it != v.end();) {
                // find next element
                auto last = std::find_if(it, v.end(), [=](Element const& elem){
                        return it->second != elem.second;     
                });

                // accumulate the counts
                auto count = std::accumulate(it, last, 0, [](int sub, Element const& elem) {
                    return sub + elem.first;
                });

                // store count in result
                result.push_back( Element(count, it->second) );                  
                it = last;
        }

        for (auto it = result.begin(); it != result.end(); ++it) {
                std::cout << it->first << " ";
                for (std::size_t i = 0; i < 3; ++i)
                        std::cout << it->second[i] << " ";
                std::cout << "\n";
        }
}

Output:

3 a b c
4 b c d
1 c d e

NOTE: the loop over the sorted elements might seem O(N^2) (a linear std::find_if nested inside a linear for loop), but it is O(N) because the statement it = last skips over elements that have already been examined.

Other tips

If your data is already sorted, as in the example, then it's just O(n).

Use std::sort (or any other O(n log n) sort) to order your arrays. Then it's just one more pass and it's done :)
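A minimal sketch of that sort-then-merge approach, assuming the same pair-of-count-and-array representation as in Solution 2 (the function name is illustrative):

```cpp
#include <algorithm>
#include <array>
#include <utility>
#include <vector>

typedef std::pair<int, std::array<char, 3> > Element;

// O(n log n) sort, then one O(n) pass that merges equal adjacent rows.
std::vector<Element> squeeze_sorted(std::vector<Element> v)
{
    std::sort(v.begin(), v.end(), [](Element const& a, Element const& b) {
        return a.second < b.second;  // order by the array part only
    });

    std::vector<Element> result;
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (!result.empty() && result.back().second == v[i].second)
            result.back().first += v[i].first;  // same row: merge counts
        else
            result.push_back(v[i]);             // new row: start a group
    }
    return result;
}
```

Because equal rows are adjacent after sorting, each input row is compared only against the last row of the result, so the merge pass is a single linear scan.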

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow