Question

Which of these scenarios would be faster?

Scenario 1:

foreach (var file in directory.GetFiles())
{
    if (file.Extension.ToLower() != ".txt" &&
        file.Extension.ToLower() != ".bin")
        continue;

    // Do something cool.
}

Scenario 2:

var files = from file in directory.GetFiles()
            where file.Extension.ToLower() == ".txt" ||
                  file.Extension.ToLower() == ".bin"
            select file;

foreach (var file in files)
{
    // Do something cool.
}

I know that they are logically the same because of delayed execution, but which would be the faster? And why?


Solution

Speed usually isn't the real issue, especially in a scenario like this, where there is not going to be a meaningful performance difference (and in general, if the code is not a bottleneck, it just doesn't matter). The issue is which version is more readable and more clearly expresses the intent of the code.

I think the second block of code more clearly expresses the intent of the code. It reads as "query a collection of files for the ones with a certain property" and then "for each of the files with that property, do something." It declares what is happening, rather than how it is going to happen. Separating the what from the mechanism is what makes the second block of code clearer, and it is where LINQ really shines. Use LINQ to declare the what and let LINQ implement the mechanism. Before LINQ, the what was muddled together with the mechanism.
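As a rough sketch of what separating the what from the mechanism can look like, the filter from the question can be pulled into one named query. This assumes `directory` is a `DirectoryInfo` (the question does not say, but `file.Extension` suggests `FileInfo` items). The names `FileFilterSketch`, `Wanted`, and `WantedFiles` are made up for illustration, and the case-insensitive set is a substitute for the `ToLower()` calls in the question, not something the question itself uses.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class FileFilterSketch
{
    // Hypothetical helper: the extensions we care about, compared case-insensitively
    // so that ".TXT" and ".txt" are treated the same without calling ToLower().
    static readonly HashSet<string> Wanted =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase) { ".txt", ".bin" };

    // The "what": files whose extension is in the wanted set.
    // Where() is deferred, so the filter runs as the caller enumerates the result.
    public static IEnumerable<FileInfo> WantedFiles(DirectoryInfo directory)
    {
        return directory.GetFiles().Where(file => Wanted.Contains(file.Extension));
    }

    static void Main()
    {
        // Assumption: the current directory stands in for the question's `directory`.
        var directory = new DirectoryInfo(Directory.GetCurrentDirectory());

        foreach (var file in WantedFiles(directory))
        {
            Console.WriteLine(file.Name); // Do something cool.
        }
    }
}
```

The loop body no longer contains any filtering logic, which is the readability gain being described here.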

"Is LINQ faster or just more convenient?"

So, to answer the question in your title, LINQ usually does not materially hinder performance, but it makes code clearer by letting the coder declare what they want done instead of having to focus on how it gets done. At the end of the day, we don't care about the how, we care about the what.

I know that they are logically the same because of delayed execution, but which would be the faster?

Probably the imperative version, because there is a tiny amount of overhead in using LINQ. But if you really must know which is faster, use a profiler, and be sure to test on real-world data.
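A profiler is the right tool, but a crude `Stopwatch` harness can give a first impression. The sketch below is only that: the class and method names are hypothetical, the iteration count is arbitrary, and the directory listing is taken once outside the timed region so that disk I/O does not swamp the difference being measured. Both versions just count matching files in place of "do something cool".

```csharp
using System;
using System.Diagnostics;
using System.IO;
using System.Linq;

static class TimingSketch
{
    // Same test as in the question, shared by both versions.
    static bool IsWanted(FileInfo file)
    {
        var ext = file.Extension.ToLower();
        return ext == ".txt" || ext == ".bin";
    }

    // Scenario 1: imperative loop with an early "continue".
    public static int CountImperative(FileInfo[] files)
    {
        int count = 0;
        foreach (var file in files)
        {
            if (!IsWanted(file))
                continue;
            count++;
        }
        return count;
    }

    // Scenario 2: LINQ query, enumerated by a foreach.
    public static int CountLinq(FileInfo[] files)
    {
        var wanted = from file in files
                     where IsWanted(file)
                     select file;
        int count = 0;
        foreach (var file in wanted)
            count++;
        return count;
    }

    // Runs the action once to warm up (JIT), then times the given number of runs.
    static TimeSpan Time(Func<int> action, int iterations)
    {
        action();
        var stopwatch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
            action();
        stopwatch.Stop();
        return stopwatch.Elapsed;
    }

    static void Main(string[] args)
    {
        // Assumption: pass a real directory as the first argument to test on real-world data.
        var path = args.Length > 0 ? args[0] : Directory.GetCurrentDirectory();
        var files = new DirectoryInfo(path).GetFiles();
        const int iterations = 1000; // arbitrary

        Console.WriteLine("imperative: " + Time(() => CountImperative(files), iterations));
        Console.WriteLine("LINQ:       " + Time(() => CountLinq(files), iterations));
    }
}
```

Numbers from a harness like this vary between runs and machines, so treat them as a hint and confirm with a profiler before drawing conclusions.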

And why?

Because LINQ adds a little bit of overhead. But the trade-off is significantly clearer and more maintainable code. That is a huge win compared to the usually irrelevant performance loss.

OTHER TIPS

It would be faster to do a GetFiles("*.txt") and a GetFiles("*.bin") if the directory contains lots of files or is on a network drive.

Compared to that, the extra overhead for LINQ is just noise.
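A minimal sketch of that suggestion, assuming `directory` is a `DirectoryInfo`; `PatternSketch` and `TextAndBinFiles` are hypothetical names. One caveat worth checking on your runtime: the .NET Framework documentation notes that a pattern with a three-character extension such as "*.txt" can also match longer extensions that begin with it, so an exact extension check may still be wanted afterwards.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

static class PatternSketch
{
    // Lets the file system API do the filtering, one search pattern per call,
    // then joins the two result arrays into a single sequence.
    public static IEnumerable<FileInfo> TextAndBinFiles(DirectoryInfo directory)
    {
        return directory.GetFiles("*.txt").Concat(directory.GetFiles("*.bin"));
    }

    static void Main()
    {
        // Assumption: the current directory stands in for the question's `directory`.
        var directory = new DirectoryInfo(Directory.GetCurrentDirectory());

        foreach (var file in TextAndBinFiles(directory))
        {
            Console.WriteLine(file.Name); // Do something cool.
        }
    }
}
```

This lists the directory twice, so whether it wins in practice depends on how many files match versus how many are skipped; measuring on the actual directory is the only way to know.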

LINQ isn't faster, and it's not really about convenience. Rather, LINQ pulls the higher-order functions Fold, Map, and Filter into .NET under different names (Aggregate, Select, and Where). These functions are valuable because they allow us to DRY up our code. Every time you hand-write an iteration that builds a secondary collection or result, you open yourself up to a bug. LINQ allows you to focus on what happens inside the iteration and feel fairly confident that the iteration mechanics are bug-free.

This doesn't mean that LINQ is strictly slower than manual iteration. As others have mentioned, you'll have to benchmark case by case.
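To make the Fold/Map/Filter correspondence above concrete, here is a small sketch using the standard LINQ operators `Where` (Filter), `Select` (Map), and `Aggregate` (Fold). The class name, method names, and sample numbers are invented for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class HigherOrderSketch
{
    // Filter -> Where, Map -> Select, Fold -> Aggregate.
    public static int SumOfSquaresOfEvens(IEnumerable<int> numbers)
    {
        return numbers
            .Where(n => n % 2 == 0)               // keep the even numbers
            .Select(n => n * n)                   // square each one
            .Aggregate(0, (sum, n) => sum + n);   // fold into a running total
    }

    // The same computation with hand-written iteration and an accumulator,
    // which is the kind of loop the operators above replace.
    public static int SumOfSquaresOfEvensManual(IEnumerable<int> numbers)
    {
        int sum = 0;
        foreach (var n in numbers)
        {
            if (n % 2 == 0)
                sum += n * n;
        }
        return sum;
    }

    static void Main()
    {
        var numbers = new[] { 1, 2, 3, 4 };
        Console.WriteLine(SumOfSquaresOfEvens(numbers));       // 2*2 + 4*4 = 20
        Console.WriteLine(SumOfSquaresOfEvensManual(numbers)); // 20
    }
}
```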

I wrote an article on Code Project that benchmarks LINQ against stored procedures, and also covers compiled LINQ.

Please take a look.

http://www.codeproject.com/KB/cs/linqsql2.aspx

I understand you are looking at local file parsing, but the article will give you some idea of what is involved and what LINQ is doing behind the scenes.

Licensed under: CC-BY-SA with attribution