Question

I have a 2D array similar to this:

string[,] arr = { 
                    { "A", "A", "A", "A", "A", "A", "A", "D", "D", "D", "D", "D", "D", "D", "D" }, 
                    { "1", "1", "1", "1", "1", "1", "1", "0", "0", "0", "0", "0", "0", "0", "0" },
                    { "2", "2", "2", "2", "2", "2", "2", "00", "00", "00", "00", "00", "00", "00", "00" }  
                };

I am trying to get following result from above array:

A 1 2
A 1 2
A 1 2
A 1 2
A 1 2
A 1 2

Get all the "A" values from index 0 of the array, then get the corresponding values from the other rows. This is a big 2D array with over 6k values, but the layout is exactly the same as described above. I have tried 2 ways so far:

1st method: using a for loop to go through all the values:

var myList = new List<string>();
var arrLength = arr.GetLength(1);
for (var i = 0; i < arrLength; i++)
{
    if (arr[0, i].Equals("A"))
        myList.Add(arr[0, i]);
}

2nd method: creating a list and then going through all the values:

var myList = new List<string>();
var list = Enumerable.Range(0, arr.GetLength(1))
                     .Select(i => arr[0, i])
                     .ToList();

var index = Enumerable.Range(0, arr.GetLength(1))
                      .Where(i => arr[0, i].Contains("A"))
                      .ToArray();
var sI = index[0];
var eI = index[index.Length - 1];
myList.AddRange(list.GetRange(sI, eI - sI + 1));

They both seem slow and not efficient enough. Is there a better way of doing this?


Solution 2

As "usr" said: go back to basics if you want raw performance. This also takes into account that the "A" values can start at an index > 0:

var startRow = -1; // "row" in the new array.
var endRow = -1;

var match = "D";

for (int i = 0; i < arr.GetLength(1); i++)
{
    if (startRow == -1 && arr[0, i] == match) startRow = i;
    if (startRow > -1 && arr[0, i] == match) endRow = i + 1;
}

var columns = arr.GetLength(0);
var transp = new string[endRow - startRow, columns]; // transposed array

for (int i = startRow; i < endRow; i++)
{
    for (int j = 0; j < columns; j++)
    {
        transp[i - startRow, j] = arr[j, i];
    }
}

Initializing the new array first (and then setting the cell values) is the main performance boost.
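Putting the two loops above together, a minimal self-contained sketch might look like the following (the `Transposer`/`ExtractTransposed` names are just placeholders chosen here; the logic is the same scan-then-copy approach):

```csharp
using System;

static class Transposer
{
    // Finds the contiguous block of columns whose first row equals `match`
    // and returns that block transposed: one result row per matching column.
    public static string[,] ExtractTransposed(string[,] arr, string match)
    {
        var startRow = -1;
        var endRow = -1;

        // Scan once to find the start and end of the matching range.
        for (int i = 0; i < arr.GetLength(1); i++)
        {
            if (startRow == -1 && arr[0, i] == match) startRow = i;
            if (startRow > -1 && arr[0, i] == match) endRow = i + 1;
        }

        var columns = arr.GetLength(0);
        var transp = new string[endRow - startRow, columns];

        // Copy the matching slice, swapping rows and columns.
        for (int i = startRow; i < endRow; i++)
            for (int j = 0; j < columns; j++)
                transp[i - startRow, j] = arr[j, i];

        return transp;
    }
}
```

For the sample array from the question, `ExtractTransposed(arr, "A")` yields rows of the form `A 1 2`.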

OTHER TIPS

I like to approach these kinds of algorithms in a way that makes my code self-documenting. Describing the algorithm with your code, rather than bloating it with language features, tends to produce pretty good results.

var matchingValues =
    from index in Enumerable.Range(0, arr.GetLength(1))
    where arr[0, index] == "A"
    select Tuple.Create(arr[1, index], arr[2, index]);

Which corresponds to:

// find the tuples produced by
//     mapping along one length of an array with an index
//     filtering to those items whose 0th item on the indexed dimension is "A"
//     reducing index into the non-0th elements on the indexed dimension

This should parallelize extremely well, as long as you keep to the simple "map, filter, reduce" paradigm and refrain from introducing side-effects.
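As a hedged sketch, the same query can be parallelized with PLINQ; `AsParallel()` and `AsOrdered()` are standard LINQ operators, and `AsOrdered()` keeps the results in column order despite the parallel execution (the `ParallelMatch` name is just an illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ParallelMatch
{
    // PLINQ version of the query: map indices, filter on row 0,
    // reduce each matching index to the values of rows 1 and 2.
    public static List<Tuple<string, string>> Find(string[,] arr, string match)
    {
        return Enumerable.Range(0, arr.GetLength(1))
                         .AsParallel()
                         .AsOrdered() // preserve column order in the output
                         .Where(index => arr[0, index] == match)
                         .Select(index => Tuple.Create(arr[1, index], arr[2, index]))
                         .ToList();
    }
}
```

Whether the parallel version actually wins depends on the array size; for only a few thousand columns the partitioning overhead may dominate.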

Edit:

In order to return an arbitrary collection of the columns associated with an "A", you can:

var targetValues = new int[] { 1, 2, 4, 10 };
var matchingValues =
    from index in Enumerable.Range(0, arr.GetLength(1))
    where arr[0, index] == "A"
    select targetValues.Select(x => arr[x, index]).ToArray();

To make it a complete collection, simply use:

var targetValues = Enumerable.Range(1, arr.GetLength(0) - 1).ToArray();
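Combining the pieces above, a minimal sketch (the `ColumnPicker`/`Pick` names are placeholders introduced here for illustration):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class ColumnPicker
{
    // For every column whose first row equals `match`, returns the
    // values of all remaining rows in that column.
    public static List<string[]> Pick(string[,] arr, string match)
    {
        // All row indices except row 0.
        var targetValues = Enumerable.Range(1, arr.GetLength(0) - 1).ToArray();

        return (from index in Enumerable.Range(0, arr.GetLength(1))
                where arr[0, index] == match
                select targetValues.Select(x => arr[x, index]).ToArray())
               .ToList();
    }
}
```

For the sample array, each element of `Pick(arr, "A")` is the array `{ "1", "2" }`, i.e. one `A 1 2` line per matching column.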
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow