سؤال

There are multiple related questions, but I'm looking for a solution specific to my case. There is an array of (usually) 14 integers, each in the range of 1 to 34. How can I quickly tell if each int in a specific, static list appears at least once in this array?

For reference, I'm currently using this code, which was written to resemble the spec as closely as possible, so it can certainly be improved vastly:

if (array.Count < 13) {
    return;
}

var required = new int[] {
    0*9 + 1,
    0*9 + 9,
    1*9 + 1,
    1*9 + 9,
    2*9 + 1,
    2*9 + 9,                
    3*9 + 1,
    3*9 + 2,
    3*9 + 3,
    3*9 + 4,
    3*9 + 5,
    3*9 + 6,
    3*9 + 7,
};

IsThirteenOrphans = !required.Except (array).Any ();

The required list is not dynamic, i.e. it will always be the same during runtime. Using Linq is optional, the main aspect is performance.

Edit:

  • The input array is not sorted.
  • Input values may appear multiple times.
  • The input array will contain at least 14 items, i.e. 1 more than the required array.
  • There is only 1 required array and it is static.
  • The values in required are distinct.
  • You may assume that a histogram is cheap to create.

Update: I am also interested in a solution for a sorted input array.

هل كانت مفيدة؟

المحلول

Idea 1
If you need to compare with several required lists then you might sort the input list and then simply compare it by iterating. But sorting is of course not too fast, but not too slow either. But if you compare with several required lists the overhead of sorting might be amortized quickly.

Once the array is sorted comparing is trivial:

for(int i = 0; i < 14; i++)
  if(arr[i] != required[i]) return false;

return true;

Idea 2
Or if the 14 integers are distinct/unique you can simply make required a HashSet and do

input.Count(i => required.Contains(i)) == 14

But I don't know without actually testing it if that's faster than sorting.

Idea 3
Calculate a quick hash which is invariant under permutations over the 14 ints and compare it to the known value of require. Only if the hash matches do the more expensive comparison.

//Prepare/precalculated
int[] hashes = new int[34];
Random random = new Random();
for(int i = 0; i < 34; i++)
  hashes[i] = random.NextInt();

//On each list
int HashInts(int[] ints)
{
  int result = 0;
  foreach(int i in ints)
    result += hashes[i - 1];

  return result;
}

A wise choice of values for hashes might improve it a bit, but random values should be fine.

Idea 4
Create a Histogram:

int[] CreateHistogram(int[] ints)
{
  int[] counts = new int[34];
  foreach(int i in ints)
  {
    counts[i - 1]++;
  }

  return counts;
}

You can avoid the array creating by reusing an existing array if performance is really important.

نصائح أخرى

If you have a 34+ bit integral type available, and C style bit operations, then you could compute such an integer V from the variable list (if the list is v[0], v[1], ... then V is (1 << v[0]) | (1 << v[1]) ... where 1 is of the same type as V) and have a predefined such integer S for the static list, computed analogously. The test to see if the static list is contained in the variable list is then (S & ~V) == 0.

One possibility is to change the way that you store the data. Since the range of possible values is limited to 1-34, instead of storing a list of numbers, you could store the counts of each number present:

int[] counts = new int[34];

If your list has one 1 and two 3s, then counts[0] == 1 && counts[2] = 2 (you could alternately use 1-based indexing if that makes things faster (fewer subtractions.))

Now to evaluate that each int in a list appears at least once, you simply index into the array sequentially for each x and verify that all counts[x] > 0. There will be overhead associated with transforming the data from the counts form to the list form, but that's only a problem if you also frequently need to view the data in the list form. Another advantage to this storage format is that adding/removing to counts will never involve more than a single array element; in the list form, removing an element in the middle of the list entails a copy of multiple elements.

If you want fast way you shouldn't use linq, If a given list items all are bellow 35 you can remove if (lst[i] < 35) The bellow answer traverse list at most one time, and acts like counting sort:

public bool FindExactMatch(int[] array, List<int> lst)
{
    bool[] a34 = new bool[35];

    foreach(var item in array)
    {
        a34[item] = true;
    }

    int exact = 0;

    for (int i = 0; i < lst.Count; i++)
    {
        if (a34[lst[i]])
        {
            exact++;
            if (exact == array.Length) return true;

            a34[lst[i]] = false;
        }
    }

    return false;
}

for sorted list if list size is big you can do lst.BinarySearch(array[i]) which takes 14 * log(n) * c1 at most, and I think it's efficient enough also if you implement it may be become faster, and I didn't test binary search with my own implementation, but Min, Max, Sort in linq is slower than (between 4 and 10 time) your own (good) implementation. And if sorted list size is small i prefer to use algorithm like above because constant c1 is small in above algorithm and may be larger in binary search algorithm.

This looks like a good candidate for bitwise operations, since the values in required are distinct, static, and between 1 and 34. Instead of saving required as an array, save it as a const ulong. In your arrays to check, create a new ulong populated by left-shift each value and bitwise or.

const ulong comparator = (1UL << 1) | (1UL << 9) | (1UL << 10) | (1UL << 18) | (1UL << 19) | (1UL << 27) | (1UL << 28) | (1UL << 29) | (1UL << 30) | (1UL << 31) | (1UL << 32) | (1UL << 33) | (1UL << 34);

public static bool ContainsDistinct13Values(int[] array)
{
    ulong word = 0;
    foreach (int i in array)
    {
        word |= (1UL << i);
    }
    return word == comparator;
}

Edit: So I understood your question and probably have some nice over-complicated solution. Another question is, how good performance it has.

static void Main(string[] args)
{
    var required = new int[]
                       {
                           0*9 + 1,
                           0*9 + 9,
                           1*9 + 1,
                           1*9 + 9,
                           2*9 + 1,
                           2*9 + 9,
                           3*9 + 1,
                           3*9 + 2,
                           3*9 + 3,
                           3*9 + 4,
                           3*9 + 5,
                           3*9 + 6,
                           3*9 + 7,
                       };

    precomputed = required.Select((x, i) => new { Value = x, Offset = (UInt16)(1 << i) }).ToDictionary(x => x.Value, x => x.Offset);

    for (int i = 0; i < required.Length; i++)
    {
        precomputedResult |= (UInt16)(1 << i);
    }

    int[] array = new int[34]; // your array goes here..
    Console.WriteLine(ContainsList(array));

    Console.ReadKey();
}

// precompute dictionary
private static Dictionary<int, UInt16> precomputed;
// precomputed result
private static UInt16 precomputedResult = 0;

public static bool ContainsList(int[] values)
{
    UInt16 result = 0;
    for (int i = 0; i < values.Length; i++)
    {
        UInt16 v;
        if (precomputed.TryGetValue(values[i], out v))
            result |= v;
    }

    return result == precomputedResult;
}

You could easily loop through the items and find out is there any item missing. By your example i understand you just wanted to know if there is any item from required is missing in array. so you could write

bool notContains = false;

foreach (var iddd in required)
{
    foreach (var ar in array)
    {
        if (iddd == ar) notContains=true;
    }

    if (notContains == false) break;
}

This is lot faster than your approach. Check out the below code with timer added. Your approach took 5 ms, but the new one take 0 ms

System.Diagnostics.Stopwatch aTimer = new System.Diagnostics.Stopwatch();
aTimer.Start();

var IsThirteenOrphans = !required.Except(array).Any();
aTimer.Stop();

System.Diagnostics.Stopwatch bTimer = new System.Diagnostics.Stopwatch();
bTimer.Start();
bool notContains = false;

foreach (var iddd in required)
{
    foreach (var ar in array)
    {
        if (iddd == ar) notContains=true;            
    }

    if (notContains == false) break;
}

bTimer.Stop();
مرخصة بموجب: CC-BY-SA مع الإسناد
لا تنتمي إلى StackOverflow
scroll top