Question

I have two employee lists and I want only the records that are unique between them, but there is a twist. Each list holds instances of this Employee class:

public class Employee
{
    // I want to completely ignore ID in the comparison
    public int ID { get; set; }
    // I want to use FirstName and LastName in comparison
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

The only properties I want to compare on for a match are FirstName and LastName. I want to completely ignore ID in the comparison. The allFulltimeEmployees list has 3 employees in it and the allParttimeEmployees list has 3 employees in it. The first name and last name match on two items in the lists - Sally Jones and Fred Jackson. There is one item in the list that does not match because FirstName is the same, but LastName differs:

var fulltimeJoe = new Employee();
// ID is left unpopulated and is not used in the comparison
fulltimeJoe.FirstName = "Joe"; // same
fulltimeJoe.LastName = "Smith"; // different

allFulltimeEmployees.Add(fulltimeJoe);

var parttimeJoe = new Employee();
parttimeJoe.ID = 3; // not used in comparison
parttimeJoe.FirstName = "Joe"; // a match
parttimeJoe.LastName = "Williams"; // not a match - different last name

allParttimeEmployees.Add(parttimeJoe);

So I want to ignore the ID property in the class during the comparison of the two lists. I want to flag Joe Williams as a non-match since the last names of Smith and Williams in the two lists do not match.

// finalResult should only have Joe Williams in it 

var finalResult = allFulltimeEmployees.Except(allParttimeEmployees);

I've tried using an IEqualityComparer but it doesn't work since it is using a single Employee class in the parameters rather than an IEnumerable list:

public class EmployeeEqualityComparer : IEqualityComparer<Employee>
{
    public bool Equals(Employee x, Employee y)
    {
        if (x.FirstName == y.FirstName && x.LastName == y.LastName)
        {
            return true;
        }
        else
        {
            return false;
        }
    }

    public int GetHashCode(Employee obj)
    {
        return obj.GetHashCode();
    }
}

How can I successfully do what I want and perform this operation? Thanks for any help!

Solution

Your idea of using an IEqualityComparer is fine; it's your execution that is wrong, notably your GetHashCode method.

public int GetHashCode(Employee obj) 
{ 
    return obj.GetHashCode(); 
} 

IEqualityComparer defines both Equals and GetHashCode because both are important. Do not ignore GetHashCode when you implement this interface! It plays a pivotal role in equality comparisons. Matching hash codes do not prove that two items are equal, but differing hash codes do indicate that they are not. Two equal elements must return the same hash code; if they do not, hash-based operations such as Except will never treat them as equal. If the hash codes do match, the elements might be equal, and only then is Equals called to decide.

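Because your implementation delegates to the GetHashCode method of the employee object itself, you are relying on whatever implementation the Employee class uses. Your class does not override it, so you get the default from Object, which is not based on FirstName and LastName; two separate instances with the same names will generally produce different hash codes. Delegating is only useful if Employee overrides GetHashCode, and only if that override uses your key fields. And if it does, then it is very likely that you did not need to define your own external comparer in the first place!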
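To make the failure concrete, here is a small sketch that runs Except with a comparer shaped like the one in the question. The names DelegatingHashComparer, HashCodeDemo and EqualsCalls, and the ID values, are made up for illustration. It only prints what it observes, since default hash codes are not something to predict exactly.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public int ID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Mirrors the comparer from the question: Equals looks at the names, but
// GetHashCode falls back to whatever Employee itself provides.
public class DelegatingHashComparer : IEqualityComparer<Employee>
{
    // Hypothetical counter, added only to show whether Equals is ever reached.
    public int EqualsCalls { get; private set; }

    public bool Equals(Employee x, Employee y)
    {
        EqualsCalls++;
        return x.FirstName == y.FirstName && x.LastName == y.LastName;
    }

    public int GetHashCode(Employee obj)
    {
        return obj.GetHashCode();
    }
}

public static class HashCodeDemo
{
    public static void Main()
    {
        // Two separate objects that both describe Sally Jones (IDs are sample values).
        var first = new List<Employee> { new Employee { ID = 1, FirstName = "Sally", LastName = "Jones" } };
        var second = new List<Employee> { new Employee { ID = 7, FirstName = "Sally", LastName = "Jones" } };

        var comparer = new DelegatingHashComparer();
        var leftover = first.Except(second, comparer).ToList();

        // When the two default hash codes differ, Equals is skipped and
        // Sally Jones is reported as unique even though she is in both lists.
        Console.WriteLine("Same hash code: " + (first[0].GetHashCode() == second[0].GetHashCode()));
        Console.WriteLine("Equals calls:   " + comparer.EqualsCalls);
        Console.WriteLine("Left over:      " + leftover.Count);
    }
}
```

If the hash codes differ, which is what you should normally expect for two distinct instances, Equals is never called and the record is left over.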

Build a GetHashCode method that factors in your key fields and you will be set.

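public int GetHashCode(Employee obj)
{
    // null handling omitted for brevity, but you will want to
    // handle null values appropriately

    // unchecked: let the arithmetic wrap around instead of throwing
    // if the project is compiled with overflow checking enabled
    unchecked
    {
        return obj.FirstName.GetHashCode() * 117
             + obj.LastName.GetHashCode();
    }
}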
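For the null handling that was left out above, one possible sketch follows. The class name EmployeeNameComparer is hypothetical, and treating a null name as hash 0 is just one reasonable convention, not the only one.

```csharp
using System;
using System.Collections.Generic;

public class Employee
{
    public int ID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

// Hypothetical null-tolerant variant of the comparer; ID is ignored throughout.
public class EmployeeNameComparer : IEqualityComparer<Employee>
{
    public bool Equals(Employee x, Employee y)
    {
        if (ReferenceEquals(x, y)) return true;     // same instance, or both null
        if (x == null || y == null) return false;   // exactly one side is null
        return x.FirstName == y.FirstName && x.LastName == y.LastName;
    }

    public int GetHashCode(Employee obj)
    {
        if (obj == null) throw new ArgumentNullException(nameof(obj));

        unchecked
        {
            // Assumption: a missing name contributes 0 to the hash.
            int first = obj.FirstName == null ? 0 : obj.FirstName.GetHashCode();
            int last = obj.LastName == null ? 0 : obj.LastName.GetHashCode();
            return first * 117 + last;
        }
    }
}

public static class NullSafeDemo
{
    public static void Main()
    {
        var comparer = new EmployeeNameComparer();
        var a = new Employee { ID = 1, FirstName = null, LastName = "Jones" };
        var b = new Employee { ID = 2, FirstName = null, LastName = "Jones" };

        // Equal by name (both first names missing), so the hash codes must agree too.
        Console.WriteLine(comparer.Equals(a, b));
        Console.WriteLine(comparer.GetHashCode(a) == comparer.GetHashCode(b));
    }
}
```

The important property is that any two employees for which Equals returns true also get the same hash code, including when a name is null.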

Once you have this method in place, then use the comparer in your call to Except.

var comparer = new EmployeeEqualityComparer();
var results = allFulltimeEmployees.Except(allParttimeEmployees, comparer);
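Putting the pieces together, here is a minimal end-to-end sketch using the names from the question. The ID values and the helper methods BuildFulltime, BuildParttime and Names are invented for the example. One thing to watch: Except returns the elements of the first sequence that are missing from the second, so the order of the two lists decides whether you get Joe Smith or Joe Williams.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee
{
    public int ID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public class EmployeeEqualityComparer : IEqualityComparer<Employee>
{
    public bool Equals(Employee x, Employee y)
    {
        return x.FirstName == y.FirstName && x.LastName == y.LastName;
    }

    public int GetHashCode(Employee obj)
    {
        unchecked
        {
            return obj.FirstName.GetHashCode() * 117 + obj.LastName.GetHashCode();
        }
    }
}

public static class ExceptDemo
{
    // Sample data; the IDs are arbitrary and play no part in the comparison.
    public static List<Employee> BuildFulltime()
    {
        return new List<Employee>
        {
            new Employee { ID = 10, FirstName = "Sally", LastName = "Jones" },
            new Employee { ID = 11, FirstName = "Fred", LastName = "Jackson" },
            new Employee { FirstName = "Joe", LastName = "Smith" }
        };
    }

    public static List<Employee> BuildParttime()
    {
        return new List<Employee>
        {
            new Employee { ID = 1, FirstName = "Sally", LastName = "Jones" },
            new Employee { ID = 2, FirstName = "Fred", LastName = "Jackson" },
            new Employee { ID = 3, FirstName = "Joe", LastName = "Williams" }
        };
    }

    public static List<string> Names(IEnumerable<Employee> employees)
    {
        return employees.Select(e => e.FirstName + " " + e.LastName).ToList();
    }

    public static void Main()
    {
        var comparer = new EmployeeEqualityComparer();
        var allFulltimeEmployees = BuildFulltime();
        var allParttimeEmployees = BuildParttime();

        // Full-time employees with no part-time namesake.
        Console.WriteLine(string.Join(", ", Names(allFulltimeEmployees.Except(allParttimeEmployees, comparer)))); // Joe Smith

        // Part-time employees with no full-time namesake.
        Console.WriteLine(string.Join(", ", Names(allParttimeEmployees.Except(allFulltimeEmployees, comparer)))); // Joe Williams
    }
}
```

If the goal is to end up with Joe Williams, as described in the question, put allParttimeEmployees first.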

OTHER TIPS

You can override Equals and GetHashCode in your Employee class.

For example,

    public class Employee
    {

        // I want to completely ignore ID in the comparison
        public int ID { get; set; }
        // I want to use FirstName and LastName in comparison
        public string FirstName { get; set; }
        public string LastName { get; set; }

        public override bool Equals(object obj)
        {
            // "as" yields null when obj is null or not an Employee
            var other = obj as Employee;
            return other != null
                && this.FirstName == other.FirstName
                && this.LastName == other.LastName;
        }

        public override int GetHashCode()
        {
            return this.FirstName.GetHashCode() ^ this.LastName.GetHashCode();
        }
    }

I tested with the following data set:

var empList1 = new List<Employee>
{
    new Employee{ID = 1, FirstName = "D", LastName = "M"}, 
    new Employee{ID = 2, FirstName = "Foo", LastName = "Bar"}
};
var empList2 = new List<Employee> 
{ 
    new Employee { ID = 2, FirstName = "D", LastName = "M" }, 
    new Employee { ID = 1, FirstName = "Foo", LastName = "Baz" } 
};

var result = empList1.Except(empList2); // Contained "Foo Bar", ID #2.

Your IEqualityComparer should work once its GetHashCode is based on FirstName and LastName, as long as you actually pass it to Except:

var finalResult = allFulltimeEmployees.Except(allParttimeEmployees, new EmployeeEqualityComparer());

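Try implementing the IEquatable<T> interface for your Employee class. You need to provide an implementation of the Equals() method, which you can define however you want (i.e. ignoring employee IDs). For hash-based operations such as Except, also override GetHashCode() so that it uses the same fields as Equals().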

The IEquatable interface is used by generic collection objects such as Dictionary, List, and LinkedList when testing for equality in such methods as Contains, IndexOf, LastIndexOf, and Remove. It should be implemented for any object that might be stored in a generic collection.

Example implementation of the Equals() method:

public bool Equals(Employee other)
{
   return (other != null) && (FirstName == other.FirstName) && (LastName == other.LastName);
}
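As a fuller sketch of this approach, the class below implements IEquatable<Employee> and also overrides Equals(object) and GetHashCode so that all three agree. The hash formula and the EquatableDemo wrapper are assumptions made for the example; the sample data is the set used in the earlier answer.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Employee : IEquatable<Employee>
{
    public int ID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }

    // Typed equality: only the names matter, ID is ignored.
    public bool Equals(Employee other)
    {
        return other != null && FirstName == other.FirstName && LastName == other.LastName;
    }

    // Keep the untyped overload consistent with the typed one.
    public override bool Equals(object obj)
    {
        return Equals(obj as Employee);
    }

    // Must be built from the same fields that Equals compares.
    public override int GetHashCode()
    {
        unchecked
        {
            int first = FirstName == null ? 0 : FirstName.GetHashCode();
            int last = LastName == null ? 0 : LastName.GetHashCode();
            return first * 117 + last;
        }
    }
}

public static class EquatableDemo
{
    public static List<Employee> UniqueToFirst(List<Employee> first, List<Employee> second)
    {
        // No comparer argument: the default comparer picks up IEquatable<Employee>.
        return first.Except(second).ToList();
    }

    public static void Main()
    {
        var empList1 = new List<Employee>
        {
            new Employee { ID = 1, FirstName = "D", LastName = "M" },
            new Employee { ID = 2, FirstName = "Foo", LastName = "Bar" }
        };
        var empList2 = new List<Employee>
        {
            new Employee { ID = 2, FirstName = "D", LastName = "M" },
            new Employee { ID = 1, FirstName = "Foo", LastName = "Baz" }
        };

        foreach (var emp in UniqueToFirst(empList1, empList2))
        {
            Console.WriteLine(emp.FirstName + " " + emp.LastName); // Foo Bar
        }
    }
}
```

The trade-off compared with an external comparer is that name-only equality now applies everywhere an Employee is compared, not just in this one Except call.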

It's not the most elegant solution, but you could make a function like so:

public string GetKey(Employee emp)
{
    return string.Format("{0}#{1}", emp.FirstName, emp.LastName);
}

and then populate everything in allFulltimeEmployees into a Dictionary<string, Employee>, where the key of the dictionary is the result of calling GetKey on each employee object. Then you could loop over allParttimeEmployees, call GetKey on each of those, and probe the dictionary (e.g. using TryGetValue or ContainsKey), taking whatever action is necessary on a duplicate, such as removing it from the dictionary.
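A rough sketch of that dictionary approach might look like the following. The KeyLookupDemo class, the UniqueToFirst helper and the sample IDs are invented for illustration, and the choice to simply remove matches is only one of the possible actions mentioned above.

```csharp
using System;
using System.Collections.Generic;

public class Employee
{
    public int ID { get; set; }
    public string FirstName { get; set; }
    public string LastName { get; set; }
}

public static class KeyLookupDemo
{
    public static string GetKey(Employee emp)
    {
        return string.Format("{0}#{1}", emp.FirstName, emp.LastName);
    }

    // Returns the employees from 'first' whose name key never appears in 'second'.
    public static Dictionary<string, Employee> UniqueToFirst(
        IEnumerable<Employee> first, IEnumerable<Employee> second)
    {
        var byKey = new Dictionary<string, Employee>();
        foreach (var emp in first)
        {
            // The indexer overwrites an existing key instead of throwing,
            // so repeated names within 'first' collapse to one entry.
            byKey[GetKey(emp)] = emp;
        }

        foreach (var emp in second)
        {
            // Remove does nothing (and returns false) when the key is absent.
            byKey.Remove(GetKey(emp));
        }

        return byKey;
    }

    public static void Main()
    {
        var allFulltimeEmployees = new List<Employee>
        {
            new Employee { ID = 10, FirstName = "Sally", LastName = "Jones" },
            new Employee { FirstName = "Joe", LastName = "Smith" }
        };
        var allParttimeEmployees = new List<Employee>
        {
            new Employee { ID = 1, FirstName = "Sally", LastName = "Jones" },
            new Employee { ID = 3, FirstName = "Joe", LastName = "Williams" }
        };

        foreach (var key in UniqueToFirst(allFulltimeEmployees, allParttimeEmployees).Keys)
        {
            Console.WriteLine(key); // Joe#Smith
        }
    }
}
```

One caveat with a concatenated key: if a name can itself contain the separator character, two different employees could produce the same key, so pick a separator that cannot occur in your data.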

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow