Question

I have a 2D array similar to this:

string[,] arr = { 
                    { "A", "A", "A", "A", "A", "A", "A", "D", "D", "D", "D", "D", "D", "D", "D" }, 
                    { "1", "1", "1", "1", "1", "1", "1", "0", "0", "0", "0", "0", "0", "0", "0" },
                    { "2", "2", "2", "2", "2", "2", "2", "00", "00", "00", "00", "00", "00", "00", "00" }  
                };

I am trying to get following result from above array:

A 1 2
A 1 2
A 1 2
A 1 2
A 1 2
A 1 2

Get all the "A" values from index 0 of the array, then get the corresponding values from the other rows. This is a big 2D array with over 6k values, but the layout is exactly the same as described above. I have tried 2 ways so far:

1st method: using a for loop to go through all the values:

var myList = new List<string>();
var arrLength = arr.GetLength(1);
for (var i = 0; i < arrLength; i++)
{
    if (arr[0, i].Equals("A"))
        myList.Add(arr[0, i]);
}

2nd method: creating a list and then going through all the values:

var myList = new List<string>();
var list = Enumerable.Range(0, arr.GetLength(1))
                     .Select(i => arr[0, i])
                     .ToList();

var index = Enumerable.Range(0, arr.GetLength(1))
                      .Where(i => arr[0, i].Contains("A"))
                      .ToArray();
var sI = index[0];
var eI = index[index.Length - 1];
myList.AddRange(list.GetRange(sI, eI - sI + 1));

They both seem slow and not efficient enough. Is there a better way of doing this?


Solution 2

As "usr" said: go back to basics if you want raw performance. This also takes into account that the "A" values can start at an index > 0:

var startRow = -1; // "row" in the new array.
var endRow = -1;

var match = "D";

for (int i = 0; i < arr.GetLength(1); i++)
{
    if (startRow == -1 && arr[0, i] == match) startRow = i;
    if (startRow > -1 && arr[0, i] == match) endRow = i + 1;
}

var columns = arr.GetLength(0);
var transp = new string[endRow - startRow, columns]; // transposed array

for (int i = startRow; i < endRow; i++)
{
    for (int j = 0; j < columns; j++)
    {
        transp[i - startRow, j] = arr[j, i];
    }
}

Initializing the new array first (and then setting the cell values) is the main performance boost.
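Putting the two loops above together, a minimal self-contained sketch might look like the following (the `Transposer`/`ExtractTransposed` names are just placeholders chosen here; the logic is the same scan-then-copy approach):

```csharp
using System;

static class Transposer
{
    // Finds the contiguous block of columns whose first row equals `match`
    // and returns that block transposed: one result row per matching column.
    public static string[,] ExtractTransposed(string[,] arr, string match)
    {
        var startRow = -1;
        var endRow = -1;

        // Scan once to find the start and end of the matching range.
        for (int i = 0; i < arr.GetLength(1); i++)
        {
            if (startRow == -1 && arr[0, i] == match) startRow = i;
            if (startRow > -1 && arr[0, i] == match) endRow = i + 1;
        }

        var columns = arr.GetLength(0);
        var transp = new string[endRow - startRow, columns];

        // Copy the matching slice, swapping rows and columns.
        for (int i = startRow; i < endRow; i++)
            for (int j = 0; j < columns; j++)
                transp[i - startRow, j] = arr[j, i];

        return transp;
    }
}
```

For the sample array from the question, `ExtractTransposed(arr, "A")` yields rows of the form `A 1 2`.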

OTHER TIPS

I like to approach these kinds of algorithms in a way that makes my code self-documenting. Describing the algorithm with your code, rather than bloating it with language features, tends to produce pretty good results.

var matchingValues =
    from index in Enumerable.Range(0, arr.GetLength(1))
    where arr[0, index] == "A"
    select Tuple.Create(arr[1, index], arr[2, index]);

Which corresponds to:

// find the tuples produced by
//     mapping along one length of an array with an index
//     filtering to those items whose 0th item on the indexed dimension is "A"
//     reducing index into the non-0th elements on the indexed dimension

This should parallelize extremely well, as long as you keep to the simple "map, filter, reduce" paradigm and refrain from introducing side-effects.
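As a hedged sketch, the same query can be parallelized with PLINQ; `AsParallel()` and `AsOrdered()` are standard LINQ operators, and `AsOrdered()` keeps the results in column order despite the parallel execution (the `ParallelMatch` name is just an illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ParallelMatch
{
    // PLINQ version of the query: map indices, filter on row 0,
    // reduce each matching index to the values of rows 1 and 2.
    public static List<Tuple<string, string>> Find(string[,] arr, string match)
    {
        return Enumerable.Range(0, arr.GetLength(1))
                         .AsParallel()
                         .AsOrdered() // preserve column order in the output
                         .Where(index => arr[0, index] == match)
                         .Select(index => Tuple.Create(arr[1, index], arr[2, index]))
                         .ToList();
    }
}
```

Whether the parallel version actually wins depends on the array size; for only a few thousand columns the partitioning overhead may dominate.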

Edit:

In order to return an arbitrary collection of the columns associated with an "A", you can:

var targetValues = new int[] { 1, 2, 4, 10 };
var matchingValues =
    from index in Enumerable.Range(0, arr.GetLength(1))
    where arr[0, index] == "A"
    select targetValues.Select(x => arr[x, index]).ToArray();

To make it a complete collection, simply use:

var targetValues = Enumerable.Range(1, arr.GetLength(0) - 1).ToArray();
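Combining the pieces above, a minimal sketch (the `ColumnPicker`/`Pick` names are placeholders introduced here for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ColumnPicker
{
    // For every column whose first row equals `match`, returns the
    // values of all remaining rows in that column.
    public static List<string[]> Pick(string[,] arr, string match)
    {
        // All row indices except row 0.
        var targetValues = Enumerable.Range(1, arr.GetLength(0) - 1).ToArray();

        return (from index in Enumerable.Range(0, arr.GetLength(1))
                where arr[0, index] == match
                select targetValues.Select(x => arr[x, index]).ToArray())
               .ToList();
    }
}
```

For the sample array, each element of `Pick(arr, "A")` is the array `{ "1", "2" }`, i.e. one `A 1 2` line per matching column.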
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow