Question

This will probably be an extremely simple question. I'm simply trying to remove duplicate byte[]s from a collection.

Since the default behaviour is to compare references, I thought that creating an IEqualityComparer would work, but it doesn't.

I've tried using a HashSet and LINQ's Distinct().

Sample code:

using System;
using System.Collections.Generic;
using System.Linq;

namespace cstest
{
    class Program
    {
        static void Main(string[] args)
        {
            var l = new List<byte[]>();
            l.Add(new byte[] { 5, 6, 7 });
            l.Add(new byte[] { 5, 6, 7 });
            Console.WriteLine(l.Distinct(new ByteArrayEqualityComparer()).Count());
            Console.ReadKey();
        }
    }

    class ByteArrayEqualityComparer : IEqualityComparer<byte[]>
    {
        public bool Equals(byte[] x, byte[] y)
        {
            return x.SequenceEqual(y);
        }

        public int GetHashCode(byte[] obj)
        {
            return obj.GetHashCode();
        }
    }
}

Output:

2

Solution

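Distinct (like HashSet) compares hash codes first and only calls Equals when they match. Your GetHashCode delegates to the array's own GetHashCode, which is reference-based, so two arrays with the same contents get different hash codes and Equals is never reached. It won't work "as is"; try something like: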

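// Content-based hash: equal byte sequences always yield equal hash codes.
unchecked // let overflow wrap, even if the project is compiled with /checked
{
    int result = 13 * obj.Length;
    for (int i = 0; i < obj.Length; i++)
    {
        result = (17 * result) + obj[i];
    }
    return result;
}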

which ensures that arrays with equal contents produce equal hash codes, as Distinct and HashSet require.
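To see why the hash code matters, here is a rough sketch that counts how often Equals is actually called. The names CountingComparer, HashDemo and ContentHash are hypothetical, introduced only for this illustration; the hash formula is the one shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: takes a pluggable hash function and counts Equals calls.
class CountingComparer : IEqualityComparer<byte[]>
{
    private readonly Func<byte[], int> _hash;
    public int EqualsCalls;

    public CountingComparer(Func<byte[], int> hash) { _hash = hash; }

    public bool Equals(byte[] x, byte[] y)
    {
        EqualsCalls++;
        return x.SequenceEqual(y);
    }

    public int GetHashCode(byte[] obj) { return _hash(obj); }
}

static class HashDemo
{
    // Content-based hash, same formula as in the answer.
    public static int ContentHash(byte[] obj)
    {
        unchecked
        {
            int result = 13 * obj.Length;
            for (int i = 0; i < obj.Length; i++)
            {
                result = (17 * result) + obj[i];
            }
            return result;
        }
    }

    public static void Main()
    {
        var l = new List<byte[]> { new byte[] { 5, 6, 7 }, new byte[] { 5, 6, 7 } };

        // Reference-based hash: the two arrays will almost certainly hash
        // differently, so Equals is typically never reached.
        var byReference = new CountingComparer(a => a.GetHashCode());
        Console.WriteLine(l.Distinct(byReference).Count());
        Console.WriteLine(byReference.EqualsCalls);

        // Content-based hash: equal contents give equal hashes, so Equals decides.
        var byContent = new CountingComparer(ContentHash);
        Console.WriteLine(l.Distinct(byContent).Count()); // prints 1
        Console.WriteLine(byContent.EqualsCalls);
    }
}
```

With the reference-based hash the Equals counter should normally stay at zero, which is exactly the symptom in the question: the comparer's Equals is correct but never gets a chance to run.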

Personally, I would also write the equality test as an explicit loop for performance:

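if (ReferenceEquals(x, y)) return true;      // same instance, or both null
if (x == null || y == null) return false;    // exactly one of them is null
if (x.Length != y.Length) return false;      // different lengths can never match
for (int i = 0; i < x.Length; i++)
{
    if (x[i] != y[i]) return false;          // stop at the first differing byte
}
return true;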
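
Putting the two pieces together, a minimal sketch of the whole comparer might look like this. It reuses the class name from the question; the null check in GetHashCode and the Demo class are assumptions added for illustration, not something the question's code had.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

class ByteArrayEqualityComparer : IEqualityComparer<byte[]>
{
    public bool Equals(byte[] x, byte[] y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        if (x.Length != y.Length) return false;
        for (int i = 0; i < x.Length; i++)
        {
            if (x[i] != y[i]) return false;
        }
        return true;
    }

    public int GetHashCode(byte[] obj)
    {
        // Assumption: treat null as hash 0 so the comparer never throws on null.
        if (obj == null) return 0;
        unchecked
        {
            int result = 13 * obj.Length;
            for (int i = 0; i < obj.Length; i++)
            {
                result = (17 * result) + obj[i];
            }
            return result;
        }
    }
}

// Hypothetical driver, mirroring the list from the question.
static class Demo
{
    public static void Main()
    {
        var l = new List<byte[]>();
        l.Add(new byte[] { 5, 6, 7 });
        l.Add(new byte[] { 5, 6, 7 });
        var comparer = new ByteArrayEqualityComparer();

        Console.WriteLine(l.Distinct(comparer).Count());           // prints 1
        Console.WriteLine(new HashSet<byte[]>(l, comparer).Count); // prints 1
    }
}
```

The same comparer instance works for both approaches mentioned in the question, since Distinct and HashSet both accept an IEqualityComparer<byte[]>.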
Licensed under: CC-BY-SA with attribution