Am I understanding Predicates correctly in regards to LINQ

https://stackoverflow.com/questions/11809304

24-06-2021

Question

I am trying to get my head around Predicates and LINQ.

While the Syntax of LINQ is starting to make sense to me, I am having slightly more trouble with the theory behind LINQ.

Here's what I have so far. When designing LINQ, instead of creating a new interface that defined each member that any object that it would be possible to query using LINQ would need to implement, Microsoft instead decided to take the existing class IEnumerable, and extend this class using extension methods.

I think I understand extension methods. An extension method is a static method inside a static class. The first parameter passed into this method is passed in with the this parameter, and defines the type that is being extended. Then, any instance of this type within the same namespace as the extension method can make use of the method.

So Microsoft created lots of Extension methods extending IEnumerable inside the System.LINQ namespace, and any class which uses the System.LINQ namespace and contains an object implementing IEnumerable can use these extension methods to query that object. Each of these extension methods takes a delegate as their second parameter.

In regards to where, where is an extension method extending IEnumerable and returning a new Object which implements IEnumerable. The next parameter where takes is a predicate (a method returning a boolean) of type Func (a generic func). This is a delegate, which returns true or false and can take up to 16 parameters. However, instead of having to write a method which meets this criteria, create an instance of the Func type and point this towards your method, and pass this variable into the where method, C# allows you to write this on the fly. Everything you put after the word where when constructing your LINQ query becomes your predicate.

Behind the scenes, the members of the object implementing IEnumerable are iterated through and evaluated against your predicate, and if true, are added to the new IEnumerable object using the yield return syntax.

Apologies if this seems a bit disjointed but I have basically dumped everything out of my brain as I understand it and was hoping that somebody who understands this a lot better than myself would come along and tell me what bits I have correct, which bits are wrong and generally expand on what I have written above as I am having a bit of trouble properly understanding what is going on here.

Solution

While I'm not entirely sure what you're after, I think you've got the right of it, for the most part. If I read your question correctly, there are really two things you're pondering here: extension methods, and predicates.

Here's what might help this sink in: you can basically implement the Where operator yourself, step by step, and see where all the pieces slot in. It will seem a lot less magical once you know what's under the hood.

Say we've got an array of Things, and we want to write a method to help us figure out which of those Things are awesome. Here's one way we could do it:

static IEnumerable<Thing> ThingsThatAreAwesome(IEnumerable<Thing> things) {
    // Collect every Thing that passes the check.
    var ret = new List<Thing>();
    foreach (Thing thing in things) {
        if (thing.IsAwesome)
            ret.Add(thing);
    }

    return ret;
}

Which we would then call like this:

var myThings = new List<Thing>();  // fill with whatever Things you have
IEnumerable<Thing> myAwesomeThings = ThingsThatAreAwesome(myThings);

So that's pretty keen. We're just iterating over our list of things, seeing which of them are awesome, and then returning the ones that meet our awesome criteria. But semantically it doesn't really do it for us - our awesome filter is so awesome that we want to be able to just walk up to a list of Things and call our operator on it, as if it were an instance method on IEnumerable itself.

And so that's where extension methods come in. Through some compiler trickery, they give us the ability to "extend" types. So like you said, by putting "this" in front of the IEnumerable parameter of our method (and declaring the method inside a static class), we can now walk up to our list of things and ask it to filter itself like this:

IEnumerable<Thing> myAwesomeThings = myThings.ThingsThatAreAwesome();
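For reference, the extension-method form of the awesome filter might be declared roughly as follows. This is only a sketch: the Thing class with its Name and IsAwesome properties, and the ThingExtensions class name, are assumptions made for illustration, since the answer only implies them.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical type: the answer only implies a Thing with an IsAwesome flag.
public class Thing
{
    public string Name { get; set; } = "";
    public bool IsAwesome { get; set; }
}

// Extension methods must be static methods declared in a static class.
public static class ThingExtensions
{
    // The "this" modifier on the first parameter is what allows
    // the call syntax things.ThingsThatAreAwesome().
    public static IEnumerable<Thing> ThingsThatAreAwesome(this IEnumerable<Thing> things)
    {
        var ret = new List<Thing>();
        foreach (Thing thing in things)
        {
            if (thing.IsAwesome)
                ret.Add(thing);
        }
        return ret;
    }
}

public static class Program
{
    public static void Main()
    {
        var myThings = new List<Thing>
        {
            new Thing { Name = "Rocket", IsAwesome = true },
            new Thing { Name = "Spreadsheet", IsAwesome = false },
        };

        // The compiler turns this into ThingExtensions.ThingsThatAreAwesome(myThings).
        foreach (Thing thing in myThings.ThingsThatAreAwesome())
            Console.WriteLine(thing.Name); // prints "Rocket"
    }
}
```

The instance-style call is purely syntactic sugar. Nothing is actually added to IEnumerable<Thing>, and the method is only visible where the namespace containing ThingExtensions is in scope.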

So - that's where extension methods fit in. Next is the "predicate".

So we've got our magnificent, awesome filter, and it's great, but then we have a brain explosion: with a little bit of abstraction, that method we just wrote could be used to filter anything. Not just lists of Thing objects, and not just filtering on things that are awesome.

Making it work with any type is fairly easy: we just make it a generic method over IEnumerable<T> rather than IEnumerable<Thing>. Making it filter on any criteria is trickier. How is the method supposed to know how to filter any type? It obviously can't - our calling code will have to tell it exactly what kind of filter we want. And so we give it a second parameter, a delegate (C#'s type-safe take on a function pointer), that expresses just what we're after. We'll call that our "predicate", which is just a way of saying a chunk of code that takes an item and returns true or false. When all is said and done, it looks a little like this. We'll rename the method to "Filter", since that better expresses what we're going for now:

// Must be declared inside a static class; <T> makes the method generic.
static IEnumerable<T> Filter<T>(this IEnumerable<T> list, Func<T, bool> predicate) {
    foreach (T item in list) {
        if (predicate(item))
            yield return item;
    }
}

You can see that we're not really doing anything different from our Awesome filter method before - we're still just iterating over a list, performing some kind of check, and returning the items that pass that check. But we've given ourselves a way for calling code to express exactly what that "check" should be.

We're basically still only doing two things: iterating over a list, and running some kind of check over every item in that list - the items that pass the check get passed back. Except now the method doesn't really know what the check it's running looks like - we are telling it, passing that piece of code in as a parameter, our predicate, rather than hard coding it into the method itself. We leave it up to the caller to decide what they want their filtering criteria to be.
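One practical difference from the first version is worth seeing in action: because Filter uses yield return, the check does not run when Filter is called, only when the result is enumerated. The sketch below illustrates this with a counter. The FilterExtensions class name and the numbers used are assumptions for the example.

```csharp
using System;
using System.Collections.Generic;

public static class FilterExtensions
{
    public static IEnumerable<T> Filter<T>(this IEnumerable<T> list, Func<T, bool> predicate)
    {
        foreach (T item in list)
        {
            if (predicate(item))
                yield return item;
        }
    }
}

public static class Program
{
    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4 };
        int checks = 0;

        // Calling Filter only builds the iterator; the predicate has not run yet.
        IEnumerable<int> evens = numbers.Filter(n => { checks++; return n % 2 == 0; });
        Console.WriteLine(checks); // prints 0

        // Enumerating the result is what drives the loop inside Filter.
        foreach (int n in evens)
            Console.WriteLine(n); // prints 2, then 4

        Console.WriteLine(checks); // prints 4
    }
}
```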

So at this point, we've basically got the LINQ Where operator - we can now run queries over any type of collection, all with the same method. If you've not messed around with lambdas yet, don't worry about it - just know it's a very succinct way of expressing a bit of code, which in this case is telling our filter method what we want to filter on:

var myThings = new List<Thing>();  // fill with whatever Things you have
var myCats = new List<Cats>();     // likewise for the cats

var myAwesomeThings = myThings.Filter(thing => thing.IsAwesome);
var myCrazyCats = myCats.Filter(cat => cat.IsCrazy);

foreach (var thing in myAwesomeThings){
    Console.WriteLine("This thing is awesome! {0}", thing);
}

foreach (var cat in myCrazyCats){
    Console.WriteLine("This cat is crazy! {0}", cat);
}
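To tie this back to the question, here is a sketch of the different ways a predicate can be handed to the real Where operator in System.Linq, from the long-hand delegate to the query syntax. The Predicates class and its IsEven method are hypothetical names used only for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Predicates
{
    // A named method whose signature matches Func<int, bool>.
    public static bool IsEven(int n)
    {
        return n % 2 == 0;
    }
}

public static class Program
{
    public static void Main()
    {
        var numbers = new List<int> { 1, 2, 3, 4, 5, 6 };

        // 1. Long-hand: create the delegate explicitly and point it at the method.
        Func<int, bool> predicate = new Func<int, bool>(Predicates.IsEven);
        IEnumerable<int> a = numbers.Where(predicate);

        // 2. Method group: the compiler creates the delegate for us.
        IEnumerable<int> b = numbers.Where(Predicates.IsEven);

        // 3. Lambda: the predicate is written inline.
        IEnumerable<int> c = numbers.Where(n => n % 2 == 0);

        // 4. Query syntax: the compiler rewrites the where clause
        //    into a call to Where with a lambda, as in form 3.
        IEnumerable<int> d = from n in numbers where n % 2 == 0 select n;

        // All four produce the same sequence.
        Console.WriteLine(string.Join(", ", a)); // prints "2, 4, 6"
        Console.WriteLine(string.Join(", ", b)); // prints "2, 4, 6"
        Console.WriteLine(string.Join(", ", c)); // prints "2, 4, 6"
        Console.WriteLine(string.Join(", ", d)); // prints "2, 4, 6"
    }
}
```

Note that the predicate Where expects here is Func<TSource, bool>, which takes exactly one item and returns a bool.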

I hope that helps solidify some concepts - but if you really want to get down and dirty into LINQ, you'll want to check out TekPub's Mastering LINQ screencast. It's an awesome step by step introduction. It gives you all the groundwork and then walks you through pretty much every operator. I can't recommend it enough.

OTHER TIPS

"So Microsoft created lots of Extension methods extending IEnumerable inside the System.LINQ namespace, and any class which uses the System.LINQ namespace and contains an object implementing IEnumerable can use these extension methods to query that object. Each of these extension methods takes a delegate as their second parameter."

The two things I'd note are:

Not every method takes a delegate as a parameter - for example Skip(int count), Take(int count), Concat(IEnumerable<T> second), etc.
System.Linq uses extension methods because it's just one implementation of LINQ. I can swap it out for another provider, such as MongoDB.Driver.Linq, if I'm doing a DB query, etc.
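As a small illustration of the first point, the sketch below chains three System.Linq operators that take no delegate at all. The sample lists are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class Program
{
    public static void Main()
    {
        var first = new List<int> { 1, 2, 3, 4, 5 };
        var second = new List<int> { 6, 7 };

        // Skip and Take receive a count, Concat receives another sequence.
        // None of them needs a predicate.
        IEnumerable<int> result = first.Skip(1).Take(3).Concat(second);

        Console.WriteLine(string.Join(", ", result)); // prints "2, 3, 4, 6, 7"
    }
}
```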

Licensed under: CC-BY-SA with attribution
