Question

In an int?[] garunteed to have original values, I wanted to find the index of null, which will only be a single index.

I drilled into the Array.IndexOf(T[] array, T value) .NET and, after all of the contracts and index checks, it comes down to this method:

internal virtual int IndexOf(T[] array, T value, int startIndex, int count){
    int endIndex = startIndex + count;
    for (int i = startIndex; i < endIndex; i++)
        if (Equals(array[i], value)) return i;

    return -1;
}

I have three different instances which attempt a basic loop with array[i] == null, Equals(array[i], null), and using Array.IndexOf(null). I realize ticks and stopwatch time is relative to what's happening on the machine at the time, and the machine in general. But this is the code and benchmarks:

class Program
{
    static void Main(string[] args)
    {
        int?[] jankedArray;
        int missingElement = GenRandomizedArrayWithExtraEmptyElement(10000, out jankedArray);

        var sw = new Stopwatch();
        List<long> DotNetTicks = new List<long>();
        List<long> LikeDotNetTicks = new List<long>();
        List<long> BasicLoopTicks = new List<long>();
        IArrayAnalyzer arrayAnalyzer;
        for (var i = 0; i < 100000; i++)
        {
            arrayAnalyzer = new AnalyzerLikeDotNet();
            sw.Restart(); arrayAnalyzer.GetMissingElement(jankedArray); sw.Stop();
            LikeDotNetTicks.Add(sw.ElapsedTicks);

            arrayAnalyzer = new AnalyzerBasic();
            sw.Restart(); arrayAnalyzer.GetMissingElement(jankedArray); sw.Stop();
            BasicLoopTicks.Add(sw.ElapsedTicks);

            arrayAnalyzer = new AnalyzerUsingDotNet();
            sw.Restart(); arrayAnalyzer.GetMissingElement(jankedArray); sw.Stop();
            DotNetTicks.Add(sw.ElapsedTicks);
        }

        Console.WriteLine("LikeDotNet / DotNet = " + LikeDotNetTicks.Average() / DotNetTicks.Average());
        Console.WriteLine("Basic / DotNet = " + BasicLoopTicks.Average() / DotNetTicks.Average());

        Console.WriteLine("");
        Console.WriteLine("Press the Any key to continue");
        Console.ReadKey();
    }

    public static int GenRandomizedArrayWithExtraEmptyElement(int valueCount, out int?[] incompleteArray)
    {
        incompleteArray = new int?[valueCount + 1];
        Random random = new Random();

        int randomMissingIndex = random.Next(0, valueCount);

        var valueArray = new List<int>();
        for (var i = 1; i <= valueCount; i++) valueArray.Add(i);

        var arrayElementAt = 0;
        while (valueArray.Count > 0)
        {
            if (arrayElementAt != randomMissingIndex)
            {
                var randomElement = random.Next(0, valueArray.Count);
                var valueAtRandom = valueArray.ElementAt(randomElement);
                valueArray.RemoveAt(randomElement);
                incompleteArray[arrayElementAt] = valueAtRandom;
            }
            arrayElementAt++;
        }

        return randomMissingIndex;
    }
}

public interface IArrayAnalyzer
{
    int GetMissingElement(int?[] incompleteArray);
}

public class AnalyzerUsingDotNet : IArrayAnalyzer
{
    public int GetMissingElement(int?[] incompleteArray)
    {
        return Array.IndexOf(incompleteArray, null);
    }
}

public class AnalyzerLikeDotNet : IArrayAnalyzer
{
    public int GetMissingElement(int?[] array)
    {
        for (int i = 0; i < array.Length; i++)
            if (Equals(array[i], null)) return i;

        return -1;
    }
}

public class AnalyzerBasic : IArrayAnalyzer
{
    public int GetMissingElement(int?[] array)
    {
        for (int i = 0; i < array.Length; i++)
            if (array[i] == null) return i;

        return -1;
    }
}
  • Output:
  • LikeDotNet / DotNet = 81.3577324867023
  • Basic / DotNet = 3.29459064916075

What am I missing between AnalyzerLikeDotNet and AnalyzerUsingDotNet which makes the execution time so different?

Was it helpful?

Solution

On my machine, the result of the benchmark (compiled in optimized Release mode) is

 LikeDotNet / DotNet = 101.911379048464

 Basic / DotNet = 0.979227574248443

so the second is near to ratio of 1, which is a strong indication the internal implementation in the framework is very similar to your Basic case. The 2% difference are most probably just from the overhead of the additional method call, or from additional checks inside the framework.

Having a look at the link to the source code you posted it seems Array.IndexOf delegates it's calls to

  EqualityComparer<T>.Default.IndexOf

(line 1406), and the Default comparer is initialized by a CreateComparer method which picks one of several special EqualityComparer<T> derivations, optimized for some standard types. For int? this means the IndexOf method of NullableEqualityComparer<T> will be used, which uses .HasValue for null testing, not Equals. A quick benchmark with

    public int GetMissingElement(int?[] array)
    {
        for (int i = 0; i < array.Length; i++)
            if (!array[i].HasValue) return i;
        return -1;
    }

shows similar performance like your Basic test using == (or similar to calling Array.IndexOf directly). So I think this is what is really happening here - .HasValue has probably a much better performance (>factor 100) than Equals for int?.

OTHER TIPS

Value types do not provide an overload for == by default. However, most of the value types provided by the framework provide their own overload. The default implementation of Equals for a value type is provided by ValueType, and uses reflection to make the comparison, which makes it significantly slower than a type-specific implementation normally would be. This implementation also calls Equals on pairs of references within the two values being compared.

Link

Licensed under: CC-BY-SA with attribution
scroll top