Question

List1 in the following example is a SortedList(Of MyClass) and contains 251 members.

The first two codeblocks execute in 15.5 seconds.

 For cnt As Integer = 1 To 1000000
        For Each TempDE In List1
            Dim F As String = TempDE.Key
            TempDE.Value.x1 = 444
        Next
    Next

 

    For cnt As Integer = 1 To 1000000
        For Each TempDE As KeyValuePair(Of String, phatob) In List2
            Dim F As String = TempDE.Key
            TempDE.Value.x1 = 444
        Next
    Next

This one executes in 5.6 seconds.

    For cnt As Integer = 0 To 999999
        For cnt2 As Integer = 0 To 250
            Dim F As String = List1.Keys(cnt2)
            List1.Values(cnt2).x1 = 444
        Next

    Next

Why are the first two codeblocks so much slower?

Was it helpful?

Solution

The SortedList extends a Collection by implementing IComparer to provide the sort functionality. Internally, it implements 2 arrays to store the elements of the list - one array for the key and one for the values. The .NET Array is optimised for fast in-order and fast random access.

My suspicion why the first 2 are slow is because the foreach statement in a SortedList is a wrapper around the Enumerator. Calling foreach will query for the enumerator, call MoveNext and Current. In addition, going though the generic list can potentially involve boxing and unboxing as you traverse the list, and can create performance overhead that you would not normally get by accessing by Index.

OTHER TIPS

I tried to look around for some documentation on exactly how For Each behaves, but I couldn't find it.

My theory is that using the For Each statements copies the object in the list to another spot in memory and then copies it back into the list when each iteration of the loop ends.

Another possibility is that it calls the constructor at the start of every iteration and then deconstructs and calls the constructor again to reset for the next iteration.

I'm not sure on either of these theories, but the major difference between 3 and (1 or 2) is the lack of For Each.

EDIT: Found some documentation at MSDN.

Here's an excerpt:

When execution of the For Each...Next loop begins, Visual Basic verifies that group refers to a valid collection object. If not, it throws an exception. Otherwise, it calls the MoveNext method and the Current property of the enumerator object to return the first element. If MoveNext indicates that there is no next element, that is, if the collection is empty, then the For Each loop terminates and control passes to the statement following the Next statement. Otherwise, Visual Basic sets element to the first element and runs the statement block.

So overall it sounds like For Each is more "managed" and does a lot of overhead to make sure everything matches up. As a result, it's slower.

I would think the compiler can better optimize block 3 because of the fixed loop range. In blocks one and 2 the compiler won't know what the upper limit of the loop is until it evaluates the List thus making it slower.

My random guess: List1 contains ~750 elements (and not just 250). Your third case is faster, because it does not iterate over each element which List1 has.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top