Question

I'm trying to get a distinct List<Author> from a List<BlogPost>, where each BlogPost has an Author property. I've found the LINQ Distinct() extension method and I'm trying to use it. First, let me explain my loop and where I want to use it, then I'll explain my classes and where I'm having trouble.

Trying to use distinct here

public List<Author> GetAuthors() {

  List<BlogPost> posts = GetBlogPosts();
  var authors = new List<Author>();

  foreach (var bp in posts) {
    authors.Add(bp.Author);
  }

  return authors.Distinct().ToList();
}

Based on what I've read on MSDN, Distinct() uses either the default comparer or a passed-in comparer. I was hoping (I obviously don't know if this is doable) to write a comparer in one spot and be able to use it for all of my classes, since they all compare by the exact same equality operation (which compares the GUID property of each class).

All of my classes inherit from the BasePage class:

public class BasePage : System.Web.UI.Page, IBaseTemplate, IEquatable<IBaseTemplate>, IEqualityComparer<IBaseTemplate>

public class Author : BasePage

public class BlogPost : BasePage

My Equals method, implemented in BasePage, compares the GUID property, which is unique to each object. When I call Distinct() on a list of Author objects it doesn't seem to work. Is there any way I can wrap up the comparer in one place and always be able to use it, rather than having to write something like class AuthorComparer : IEqualityComparer<Author>? With that approach I'd need to write the same thing for each class, every time I want to use Distinct(). Or can I override the default comparer somehow so I don't have to pass anything to Distinct()?


Solution

The Distinct operation is probably not the best solution here because you end up building a potentially very big list with duplicates only to then immediately shrink it to distinct elements. It's probably better to just start with a HashSet<Author> to avoid building up the large list.

public List<Author> GetAuthors() { 
  HashSet<Author> authorSet = new HashSet<Author>();
  foreach (var author in GetBlogPosts().Select(x => x.Author)) {
    authorSet.Add(author);
  }
  return authorSet.ToList();
}

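If you do want to use Distinct, the best route is to implement IEquatable<Author> on the Author type and override GetHashCode to match. When no explicit IEqualityComparer is given, Distinct and the other LINQ methods fall back to EqualityComparer<T>.Default, which uses the type's IEquatable<T> implementation if there is one and Object.Equals/GetHashCode otherwise. Note that T here is Author, so an IEquatable<IBaseTemplate> implementation inherited from BasePage is not picked up.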
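
A minimal sketch of that route. It assumes Author exposes a Guid property named Id (the question mentions a GUID property but not its name, so Id is a placeholder), and it leaves out the BasePage inheritance so the sample compiles on its own. GetHashCode is overridden as well, because Distinct groups elements by hash code before it calls Equals.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public sealed class Author : IEquatable<Author>
{
    // Hypothetical name for the GUID property mentioned in the question.
    public Guid Id { get; set; }
    public string Name { get; set; }

    // Used by EqualityComparer<Author>.Default because Author implements IEquatable<Author>.
    public bool Equals(Author other)
    {
        return other != null && Id == other.Id;
    }

    // Keep the object-based overload consistent with the typed one.
    public override bool Equals(object obj)
    {
        return Equals(obj as Author);
    }

    // Equal authors must produce equal hash codes.
    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

public static class DistinctDemo
{
    // No comparer is passed, so Distinct falls back to the default comparer.
    public static List<Author> DistinctAuthors(IEnumerable<Author> authors)
    {
        return authors.Distinct().ToList();
    }
}
```

With this in place, two Author instances that carry the same Id collapse into one element, even when they are different object references.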

OTHER TIPS

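An overridden Equals should work for you. One thing that might be going wrong is that GetHashCode is not overridden alongside Equals, which the framework guidelines require. Distinct compares hash codes first, so two objects with different hash codes are never passed to Equals at all.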
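
To keep the comparison in one place, as the question asks, both overrides can live in the shared base class. The following is only a sketch: EntityBase is a simplified stand-in for BasePage (the real class also derives from System.Web.UI.Page, omitted here so the sample is self-contained), and Id is a hypothetical name for the GUID property.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Simplified stand-in for BasePage; the name and the Id property are assumptions.
public abstract class EntityBase
{
    public Guid Id { get; set; }

    // Two entities are equal when they have the same runtime type and the same GUID.
    public override bool Equals(object obj)
    {
        var other = obj as EntityBase;
        return other != null && other.GetType() == GetType() && other.Id == Id;
    }

    // Must agree with Equals, otherwise hash-based operations such as Distinct break.
    public override int GetHashCode()
    {
        return Id.GetHashCode();
    }
}

public class Author : EntityBase
{
    public string Name { get; set; }
}

public class BlogPost : EntityBase
{
    public Author Author { get; set; }
}

public static class Blog
{
    // Author does not implement IEquatable<Author>, so the default comparer
    // calls the Equals(object) and GetHashCode overrides inherited from EntityBase.
    public static List<Author> GetAuthors(IEnumerable<BlogPost> posts)
    {
        return posts.Select(p => p.Author).Distinct().ToList();
    }
}
```

Every class that derives from the base type then works with Distinct() and HashSet<T> without a comparer argument.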

The code only shows the main idea, which, I hope, will be useful.

public class Repository
{
    public List<Author> GetAuthors()
    {
        var authors = new List<Author>
                        {
                            new Author{Name = "Author 1"},
                            new Author{Name = "Author 2"},
                            new Author{Name = "Author 1"}
                        };
        return authors.Distinct(new CustomComparer<Author>()).ToList();
    }

    public List<BlogPost> GetBlogPosts()
    {
        var blogPosts = new List<BlogPost>
        {
            new BlogPost {Text = "Text 1"},
            new BlogPost {Text = "Text 2"},
            new BlogPost {Text = "Text 1"}
        };
        return blogPosts.Distinct(new CustomComparer<BlogPost>()).ToList();
    }
}

// This comparer only needs to be written once.
public class CustomComparer<T> : IEqualityComparer<T> where T : class
{
    public bool Equals(T x, T y)
    {
        if (y == null && x == null)
        {
            return true;
        }
        if (y == null || x == null)
        {
            return false;
        }
        if (x is Author && y is Author)
        {
            return ((Author)(object)x).Name == ((Author)(object)y).Name;
        }
        if (x is BlogPost && y is BlogPost)
        {
            return ((BlogPost)(object)x).Text == ((BlogPost)(object)y).Text;
        }
        //for next class add comparing logic here
        return false;
    }

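    public int GetHashCode(T obj)
    {
        // Must agree with Equals: objects that compare equal need the same hash code.
        var author = obj as Author;
        if (author != null)
        {
            return author.Name == null ? 0 : author.Name.GetHashCode();
        }
        var blogPost = obj as BlogPost;
        if (blogPost != null)
        {
            return blogPost.Text == null ? 0 : blogPost.Text.GetHashCode();
        }
        //for next class add hashing logic here
        return 0;
    }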
}

public class Author
{
    public string Name { get; set; }
}

public class BlogPost
{
    public string Text { get; set; }
}
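
One possible refinement of the idea above, offered as a sketch rather than as part of the original answer: instead of type checks inside the comparer, pass in a function that extracts the value to compare on. KeyComparer and its keySelector parameter are names invented for this example, and the default equality of the key type does the actual comparing and hashing.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: compares two objects by a key extracted from each.
public class KeyComparer<T, TKey> : IEqualityComparer<T> where T : class
{
    private readonly Func<T, TKey> keySelector;
    private readonly IEqualityComparer<TKey> keyComparer = EqualityComparer<TKey>.Default;

    public KeyComparer(Func<T, TKey> keySelector)
    {
        if (keySelector == null) throw new ArgumentNullException("keySelector");
        this.keySelector = keySelector;
    }

    public bool Equals(T x, T y)
    {
        if (ReferenceEquals(x, y)) return true;
        if (x == null || y == null) return false;
        return keyComparer.Equals(keySelector(x), keySelector(y));
    }

    public int GetHashCode(T obj)
    {
        if (obj == null) return 0;
        TKey key = keySelector(obj);
        // A null key hashes to 0; otherwise defer to the key's own comparer.
        return key == null ? 0 : keyComparer.GetHashCode(key);
    }
}

public class Author
{
    public string Name { get; set; }
}

public static class KeyComparerDemo
{
    public static List<Author> DistinctByName(IEnumerable<Author> authors)
    {
        return authors.Distinct(new KeyComparer<Author, string>(a => a.Name)).ToList();
    }
}
```

Supporting another class then only needs a different lambda at the call site, for example one that returns the GUID property, with no change to the comparer itself.
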
Licensed under: CC-BY-SA with attribution