Question

While there are a few cases where I'll write something using the method chains (especially if it's just one or two methods, like foo.Where(..).ToArray()), in many cases I prefer the LINQ query comprehension syntax instead ("query expressions" in the spec), so something like:

var query =
    from filePath in Directory.GetFiles(directoryPath)
    let fileName = Path.GetFileName(filePath)
    let baseFileName = fileName.Split(' ', '_').First()
    group filePath by baseFileName into fileGroup
    select new
    {
        BaseFileName = fileGroup.Key,
        Count = fileGroup.Count(),
    };

In some fairly sizable chunk of those, I need to take the resulting IEnumerable and eager-load it into a data structure (array, list, whatever). This usually means either:

  1. adding another local variable like var queryResult = query.ToArray(); or

  2. wrapping the query with parens and tagging on ToArray (or ToList or whatever).

var query = (
    from filePath in Directory.GetFiles(directoryPath)
    let fileName = Path.GetFileName(filePath)
    let baseFileName = fileName.Split(' ', '_').First()
    group filePath by baseFileName into fileGroup
    select new
    {
        BaseFileName = fileGroup.Key,
        Count = fileGroup.Count(),
    }
).ToArray();

I'm trying to find out what options others are either 1) already using, or 2) can think of as feasible, for getting some additional "contextual keywords" added: things that would transform to extension methods the same way the existing ones do, as if the LINQ keywords were 'natively' extensible :)

I realize that most likely this is going to mean either some sort of preprocessing (not sure what's out there in this realm for C#) or changing the compiler used to something like Nemerle (I think it would be an option, but not really sure?). I don't know enough about what Roslyn does/will support yet, so if someone knows whether it could allow someone to 'extend' C# like this, please chime in!

The ones I'd likely use most (I'm sure there are many others, but this is just to get across the idea of what I'm hoping for):

ascount - transforms to Count()

int zFileCount =
    from filePath in Directory.GetFiles(directoryPath)
    where filePath.StartsWith("z")
    select filePath ascount;

This would "transform" (doesn't matter what the path is, as long as the end result is) into:

int zFileCount = (
    from filePath in Directory.GetFiles(directoryPath)
    where filePath.StartsWith("z")
    select filePath
).Count();

Similarly:

  • asarray - transforms to ToArray()
  • aslist - transforms to ToList()

(you could obviously keep going for First(), Single(), Any(), etc, but trying to keep question scope in check :)

I'm only interested in the extension methods that don't need parameters passed. I'm not looking for trying to do this kind of thing with (for instance) ToDictionary or ToLookup. :)

So, in summary:

  • want to add 'ascount', 'aslist', and 'asarray' into linq query expressions
  • don't know if this has already been solved
  • don't know if Nemerle is a good choice for this
  • don't know if the Roslyn story would support this kind of use

Solution

Not an answer to your question, but rather some musings on your design. We strongly considered adding such a feature to C# 4 but cut it because we did not have the time and resources available.

The problem with the query comprehension syntax is, as you note, that it is ugly to mix the "fluent" and "comprehension" syntaxes. You want to know how many different last names your customers have in London and you end up writing this ugly thing with parentheses:

d = (from c in customers 
     where c.City == "London" 
     select c.LastName)
    .Distinct()
    .Count();

Yuck.

We considered adding a new contextual keyword to the comprehension syntax. Let's say for the sake of argument that the keyword is "with". You could then say:

d = from c in customers 
    where c.City == "London" 
    select c.LastName
    with Distinct() 
    with Count();

and the query comprehension rewriter would rewrite that into the appropriate fluent syntax.
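As a rough illustration of that rewrite, the sketch below shows the fluent form the hypothetical "with" clauses would presumably expand to. The "with" keyword was never shipped, so this is an assumption about the expansion and not compiler output; the class name, method name, and tuple-shaped customer data are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class WithKeywordSketch
{
    // Hypothetical helper: counts the distinct last names of customers in London.
    public static int DistinctLondonLastNames(
        IEnumerable<(string City, string LastName)> customers)
    {
        // Assumed expansion of:
        //   from c in customers
        //   where c.City == "London"
        //   select c.LastName
        //   with Distinct()
        //   with Count()
        return customers
            .Where(c => c.City == "London")
            .Select(c => c.LastName)
            .Distinct()
            .Count();
    }
}
```

Each "with" clause simply becomes one more method call appended to the chain, which is why no parentheses around the query are needed.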

I really like this feature but it did not make the cut for C# 4 or 5. It would be nice to get it into a hypothetical future version of the language.

As always, Eric's musings about hypothetical features of unannounced products that might never exist are for entertainment purposes only.

OTHER TIPS

One idea is that you could write your own query provider that wraps the version in System.Linq and then calls ToArray in its Select method. Then you would just have a using YourNamespace; instead of using System.Linq.
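A minimal sketch of that idea might look like the following. The namespace EagerLinq and the class EagerQueryExtensions are hypothetical names, and only Where and Select are wrapped; a real version would need every query operator the queries use. The System.Linq methods are called by their full names so that the file itself does not import System.Linq.

```csharp
using System;
using System.Collections.Generic;

namespace EagerLinq // hypothetical namespace, imported instead of System.Linq
{
    public static class EagerQueryExtensions
    {
        // Where stays lazy and simply forwards to the System.Linq version.
        public static IEnumerable<T> Where<T>(
            this IEnumerable<T> source, Func<T, bool> predicate)
        {
            return System.Linq.Enumerable.Where(source, predicate);
        }

        // Select forwards to System.Linq and then materializes the result.
        public static TResult[] Select<T, TResult>(
            this IEnumerable<T> source, Func<T, TResult> selector)
        {
            return System.Linq.Enumerable.ToArray(
                System.Linq.Enumerable.Select(source, selector));
        }
    }
}
```

With using EagerLinq; in scope and no using System.Linq;, a query such as from x in numbers where x > 1 select x * 2 would have the static type int[]. Two caveats: the compiler drops a trivial trailing select (as in from x in numbers where x > 1 select x), so that query would still be a lazy IEnumerable, and let clauses also translate into Select calls, so each one would build an intermediate array.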

Roslyn does not allow you to extend the syntax of C#, but you can write a SyntaxRewriter that changes the semantics of a C# program as a pre-build step.

As others said, Roslyn is not what you probably think it is. It can't be used to extend C#.

All of the following code should be considered more brainstorming and less recommendation. It changes how LINQ behaves in unexpected ways and you should think really hard before using anything like it.

One way to solve this would be to modify the select clause like this:

int count = from i in Enumerable.Range(0, 10)
            where i % 2 == 0
            select new Count();

The implementation could look like this:

public class Count
{
}

public static class LinqExtensions
{
    public static int Select<T>(
        this IEnumerable<T> source, Func<T, Count> selector)
    {
        return source.Count();
    }
}

If you put anything that isn't Count in the select, it would behave as usual.

Doing something similar for arrays would take more work, since you need the select to specify both that you want an array and the selector of items you want in there, but it's doable. Or you could use two selects: one chooses the item and the other says you want an array.
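The two-select variant could be sketched roughly as below, using an into continuation so that the first select chooses the items and the second one only carries a marker. The names AsArray, ArrayMarkerExtensions, and ArrayMarkerDemo are hypothetical, and the same warning applies: this bends LINQ in surprising ways and is brainstorming, not a recommendation.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical marker type: selecting it means "give me an array".
public sealed class AsArray
{
}

public static class ArrayMarkerExtensions
{
    // Chosen only when the selector returns AsArray; the selector itself is
    // never invoked, the source is just materialized.
    public static T[] Select<T>(
        this IEnumerable<T> source, Func<T, AsArray> selector)
    {
        return source.ToArray();
    }
}

public static class ArrayMarkerDemo
{
    public static int[] EvenSquares()
    {
        // The first select picks the items (ordinary System.Linq Select);
        // the second select binds to the marker overload above.
        return from i in Enumerable.Range(0, 10)
               where i % 2 == 0
               select i * i into square
               select new AsArray();
    }
}
```

The marker overload is picked for the second select because its Func<T, AsArray> parameter is more specific than the Func<TSource, TResult> parameter of the standard Select, while the first select cannot bind to it since i * i is not an AsArray.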

Another option (similar to Kevin's suggestion) would be to have an extension method like AsCount(), which you could use like this:

int count = from i in Enumerable.Range(0, 10).AsCount()
            where i % 2 == 0
            select i;

You could implement it like this:

public static class LinqExtensions
{
    public static Countable<T> AsCount<T>(this IEnumerable<T> source)
    {
        return new Countable<T>(source);
    }
}

public class Countable<T>
{
    private readonly IEnumerable<T> m_source;

    public Countable(IEnumerable<T> source)
    {
        m_source = source;
    }

    public Countable<T> Where(Func<T, bool> predicate)
    {
        return new Countable<T>(m_source.Where(predicate));
    }

    public Countable<TResult> Select<TResult>(Func<T, TResult> selector)
    {
        return new Countable<TResult>(m_source.Select(selector));
    }

    // other LINQ methods

    public static implicit operator int(Countable<T> countable)
    {
        return countable.m_source.Count();
    }
}

I'm not sure I like it this way. The implicit cast especially feels wrong, but I think there is no other way.

Licensed under: CC-BY-SA with attribution