문제

I have 3 groups of items(names). Group 1 has around 2,1k items, group 2 - around 7,6k and group 3 around 21k. I need to search in these groups. I need a hint which is better. I thought either to put all in a bin tree:

 GTree* t = g_tree_new((GCompareFunc)g_ascii_strcasecmp);
 and search like this:
 goup =  g_tree_lookup(t, (gpointer *)itemName);

Or would it be more efficient to make 3 arrays of strings:

char g1[2300][14];
char g2[8000][14];
char g3[78000][14];

And search like this (not checked, pseudocode):

    int isvalueinarray(char val,  char *g[][14];, int size){
        int i;
        for (i=0; i < size; i++) {
            if (memcmp(val, g[i], strlenth) == 0)
                return true;
        }
        return false;
    }
    int i group=0;
    if (isvalueinarray(itemName, g2, 7800) ) group = 2;
    if (isvalueinarray(itemName, g1, 2300) ) group = 1;

Or is there a better solution for that?

도움이 되었습니까?

해결책 2

If you want the most asymptotically efficient way of finding whether a string is in a set of strings which can be pre-processed, you can consider using a trie. The construction time is linear (O(L), where L is the sum of all the strings lengths in three sets), and the lookup time is linear on the size of a string which you are looking for (which is, in your case, 14).

A binary search tree would also be a nice option, giving you a logarithmic (on the size of the string sets) performance. This can be a bit slower, but probably easier to implement. Note that pre-processing (inserting all the strings of three sets into a tree) takes N * log(N) time, where N is the sum of set sizes.

Do not use linear search in an array, it's too slow.

다른 팁

The mathematically fastest would be to have everything in one tree and do a O(logN) binary search. Because Log(3N) is faster then 3 * Log(N). (N being the size of each array)

But in no case you would search using the pseudocode you wrote, you ALWAYS do binary search in a huge set of data. The one you wrote has complexity O(N) while a binary search has O(logN).

라이센스 : CC-BY-SA ~와 함께 속성
제휴하지 않습니다 StackOverflow
scroll top