Question

I'm trying to track down a bug in our code. I've boiled it down to the snippet below. In the example below I have a grid of ints (a list of rows), but I want to find the indexes of the columns that have a 1. The implementation of this is to create an enumerator for each row and step through each column in turn by keeping the enumerators in step.

class Program
{
    static void Main(string[] args)
    {
        var ints = new List<List<int>> {
            new List<int> {0, 0, 1},    // This row has a 1 at index 2
            new List<int> {0, 1, 0},    // This row has a 1 at index 1
            new List<int> {0, 0, 1}     // This row also has a 1 at index 2
        };
        var result = IndexesWhereThereIsOneInTheColumn(ints);

        Console.WriteLine(string.Join(", ", result)); // Expected: "1, 2"
        Console.ReadKey();
    }


    private static IEnumerable<int> IndexesWhereThereIsOneInTheColumn(
        IEnumerable<List<int>> myIntsGrid)
    {
        var enumerators = myIntsGrid.Select(c => c.GetEnumerator()).ToList();

        short i = 0;
        while (enumerators.All(e => e.MoveNext())) {
            if (enumerators.Any(e => e.Current == 1))
                yield return i;
            i++;

            if (i > 1000)
                throw new Exception("You have gone too far!!!");
        }
    }

}

However, I have noticed that the effect of MoveNext() is not remembered from one pass of the while loop to the next: MoveNext() always returns true, and Current is always 0. Is this a deliberate feature of LINQ to make it more side-effect free?

I noticed that this works:

    private static IEnumerable<int> IndexesWhereThereIsOneInTheColumn(
        IEnumerable<List<int>> myIntsGrid)
    {
        var enumerators = myIntsGrid.Select(c => 
            c.ToArray().GetEnumerator()).ToList(); // added ToArray() 

        short i = 0;
        while (enumerators.All(e => e.MoveNext())) {
            if (enumerators.Any(e => (int)e.Current == 1)) // added cast to int
                yield return i;
            i++;
        }
    }

So is this just a problem with List?


Solution 2

As Sriram Sakthivel's answer says, the issue is that the list enumerator happens to be a struct rather than a reference type, and it is never boxed. Usually, one would not expect value-type behavior from an enumerator, as most are either exposed through the IEnumerator/IEnumerator<T> interfaces or are reference types themselves. A quick way to get around this is to change this line

var enumerators = myIntsGrid.Select(c => c.GetEnumerator()).ToList();

to

var enumerators 
    = myIntsGrid.Select(c => (IEnumerator) c.GetEnumerator()).ToList();

instead.

The above code will construct a list of already boxed enumerators, which will be treated as reference type instances, because of the interface cast. From that moment on, they should behave as you expect them to in your later code.


If you need a generic enumerator (to avoid casts when later using the enumerator.Current property), you can cast to the appropriate generic IEnumerator<T> interface:

c => (IEnumerator<int>) c.GetEnumerator()

or even better

c => c.GetEnumerator() as IEnumerator<int>

The as keyword is sometimes said to perform better than direct casts, though any difference is usually negligible, and here the conversion runs only once per row rather than on every pass of the loop. As per Flater's request from the comments: just be careful that as returns null if the cast fails, instead of throwing an InvalidCastException. In the OP's case, it is guaranteed the enumerator implements IEnumerator<int>, so it is safe to go for an as cast.
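Putting the interface cast back into the OP's method gives roughly the following. This is a minimal sketch rather than a definitive implementation: the class name ColumnSearch is hypothetical, and the early exit for an empty grid is an addition, since All() returns true for an empty list and the loop would otherwise never end.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnSearch   // hypothetical wrapper class, not from the original post
{
    public static IEnumerable<int> IndexesWhereThereIsOneInTheColumn(
        IEnumerable<List<int>> myIntsGrid)
    {
        // The cast boxes each List<int>.Enumerator struct once, so the list
        // stores references and MoveNext() mutates the stored enumerator.
        var enumerators = myIntsGrid
            .Select(c => (IEnumerator<int>) c.GetEnumerator())
            .ToList();

        // Added guard (assumption): All() is true for an empty list,
        // which would make the while loop below spin forever.
        if (enumerators.Count == 0)
            yield break;

        int i = 0;
        while (enumerators.All(e => e.MoveNext()))
        {
            if (enumerators.Any(e => e.Current == 1))
                yield return i;
            i++;
        }
    }
}
```

With the grid from the question, string.Join(", ", ColumnSearch.IndexesWhereThereIsOneInTheColumn(ints)) yields "1, 2". The loop stops as soon as any row runs out of elements, so columns beyond the shortest row are not examined.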

OTHER TIPS

It is because the enumerator of List<T> is a struct whereas the enumerator of Array is a class.

So when you call Enumerable.All on a list of these structs, a copy of each enumerator is made and passed as the argument to the Func, since structs are copied by value. So e.MoveNext() is called on the copy, not the original.

Try this:

Console.WriteLine(new List<int>().GetEnumerator().GetType().IsValueType);
Console.WriteLine(new int[]{}.GetEnumerator().GetType().IsValueType);

It prints:

True
False
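The copy-by-value behavior described above can be observed directly. The sketch below is illustrative only (the class and method names are hypothetical): it calls MoveNext() through All() on a list of struct enumerators and on a list of boxed IEnumerator<int> references, then reads Current from the stored element.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class EnumeratorCopyDemo   // hypothetical name, for illustration only
{
    // The list stores List<int>.Enumerator structs, so the lambda
    // parameter e is a copy and the stored enumerator never advances.
    public static int CurrentAfterAllOnStructs()
    {
        var list = new List<int> { 10, 20, 30 };
        var enumerators = new List<List<int>.Enumerator> { list.GetEnumerator() };
        enumerators.All(e => e.MoveNext());
        return enumerators[0].Current;   // still default(int)
    }

    // The list stores boxed enumerators behind the interface, so the lambda
    // receives a reference and MoveNext() advances the stored instance.
    public static int CurrentAfterAllOnBoxed()
    {
        var list = new List<int> { 10, 20, 30 };
        var enumerators = new List<IEnumerator<int>> { list.GetEnumerator() };
        enumerators.All(e => e.MoveNext());
        return enumerators[0].Current;   // first element of the list
    }
}
```

The first method returns 0 and the second returns 10, which matches the symptom in the question: MoveNext() keeps returning true and Current stays at 0.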

Alternatively, you could do it with LINQ extension methods and lambdas:

var ids = Enumerable.Range(0, ints.Max(row => row.Count))
    .Where(col => ints.Any(row => row.Count > col && row[col] == 1));

or

var ids = Enumerable.Range(0, ints.Max(row => row.Count))
    .Where(col => ints.Any(row => row.ElementAtOrDefault(col) == 1));

Here's a simple implementation using loops and yield:

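private static IEnumerable<int> IndexesWhereThereIsOneInTheColumn(
    IEnumerable<List<int>> myIntsGrid)
{
    // Find the widest row once instead of on every iteration.
    // Note: Max throws InvalidOperationException if the grid has no rows.
    int columnCount = myIntsGrid.Max(row => row.Count);

    for (int i = 0; i < columnCount; i++)
    {
        foreach (var row in myIntsGrid)
        {
            // Rows shorter than i + 1 have no cell in this column.
            if (row.Count > i && row[i] == 1)
            {
                yield return i;
                break; // report each column at most once
            }
        }
    }
}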

Alternatively, use this inside the for loop:

if (myIntsGrid.Any(row => row.Count > i && row[i] == 1)) yield return i;

Just for fun, here's a neat LINQ query that won't cause hard-to-trace side effects in your code:

IEnumerable<int> IndexesWhereThereIsOneInTheColumn(IEnumerable<IEnumerable<int>> myIntsGrid)
{
    return myIntsGrid
        // Collapse the rows into a single row holding the column-wise maximum
        // (assumes cells are only 0 or 1; Zip stops at the shortest row)
        .Aggregate((acc, x) => acc.Zip(x, Math.Max))
        // Enumerate the row
        .Select((Value,Index) => new { Value, Index })
        .Where(x => x.Value == 1)
        .Select(x => x.Index);
}

Why can't you just get those indexes like this:

var result = ints.Select (i => i.IndexOf(1)).Distinct().OrderBy(i => i);

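Seems to be much easier, though IndexOf only finds the first 1 in each row and returns -1 for a row that has none, so this only works when every row contains exactly one 1.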
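If rows may contain several 1s, or none at all, a flattened query along these lines avoids relying on IndexOf. It is a sketch under the assumption that only the distinct, sorted column indexes are wanted; the class and method names are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class ColumnQuery   // hypothetical name, for illustration only
{
    public static IEnumerable<int> ColumnsContainingOne(
        IEnumerable<IEnumerable<int>> grid)
    {
        return grid
            // Pair every cell with its column index, across all rows
            .SelectMany(row => row.Select((value, index) => new { value, index }))
            // Keep only the cells that hold a 1
            .Where(cell => cell.value == 1)
            .Select(cell => cell.index)
            // Each column is reported once, in ascending order
            .Distinct()
            .OrderBy(index => index);
    }
}
```

For a grid with rows {1, 0, 1} and {0, 0, 0} this returns 0 and 2, whereas the IndexOf version would return -1 and 0.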

Licensed under: CC-BY-SA with attribution