Question

How good/fast is Excel VBA's Find vs. binary search? My platform is Office 11|2003 and I'll be searching for strings against Column A on three sheets of values. Total number of rows ~140,000

If worth it which Library & functions should I reference to do the sorting and then the binary search? Binary searching strings/text reportedly has potential problems.

... one thing must be noted. Using binary search formulas with sortedtextrequires caution. Aladin A., Excel MVP

Excel Find:

Worksheets(1).Range("A:A").Find("PN-String-K9", LookIn:=xlValues, LookAt:=xlWhole)
Was it helpful?

Solution

Much against my intuition a VBA binary search strongly outperforms an Excel Find. At least with the scenario below where 120,000 6 character strings are distributed evenly over 3 worksheets.

Excel Find takes 1 minute 58 seconds,
VBA binary search takes 36 seconds on my particular machine.

The advantage of knowing that the text is in order obviously outweighs Excel's natural advantage. Note Aladin A's warning about sort order.

Option Explicit

' Call Search to look for a thousand random strings
' in 3 worksheets of a workbook

' requires a workbook with 3 sheets and
' column A populated with values between "00001" to "120000"
' split evenly 40,000 to a worksheet in ascending order.
' They must be text, not numbers.

Private Const NUM_ROWS As Long = 120000
Private Const SHEET_1 As String = "Sheet1"
Private Const SHEET_2 As String = "Sheet2"
Private Const SHEET_3 As String = "Sheet3"

' This uses VBA Binary Search
Public Sub Search()
    Worksheets(SHEET_1).Range("B:B").ClearContents
    Worksheets(SHEET_2).Range("B:B").ClearContents
    Worksheets(SHEET_3).Range("B:B").ClearContents
    DoSearch True       ' change to False to test Excel search
End Sub

' Searches for a thousand values using binary  or excel search depending on
' value of bBinarySearch
Public Sub DoSearch(ByVal bBinarySearch As Boolean)
    Debug.Print Now
    Dim ii As Long

    For ii = 1 To 1000
        Dim rr As Long
        rr = Int((NUM_ROWS) * Rnd + 1)
        If bBinarySearch Then
            Dim strSheetName As String
            Dim nRow As Long
            If BinarySearch(MakeSearchArg(rr), strSheetName, nRow) Then
                Worksheets(strSheetName).Activate
                Cells(nRow, 1).Activate
            End If
        Else
            If Not ExcelSearch(SHEET_1, MakeSearchArg(rr)) Then
                If Not ExcelSearch(SHEET_2, MakeSearchArg(rr)) Then
                    ExcelSearch SHEET_3, MakeSearchArg(rr)
                End If
            End If
        End If
        ActiveCell.Offset(0, 1).Value = "FOUND"
    Next
    Debug.Print Now

End Sub

' look for one cell value using Excel Find
Private Function ExcelSearch(ByVal strWorksheet As String _
  , ByVal strSearchArg As String) As Boolean
    On Error GoTo Err_Exit
    Worksheets(strWorksheet).Activate
    Worksheets(strWorksheet).Range("A:A").Find(What:=strSearchArg, LookIn:=xlValues, LookAt:= 
        xlWhole, SearchOrder:=xlByRows, SearchDirection:=xlNext, MatchCase:=True 
        , SearchFormat:=False).Activate
    ExcelSearch = True
    Exit Function
Err_Exit:
    ExcelSearch = False
End Function

' Look for value using a vba based binary search
' returns true if the search argument is found in the workbook
' strSheetName contains the name of the worksheet on exit and nRow gives the row
Private Function BinarySearch(ByVal strSearchArg As String _
  , ByRef strSheetName As String, ByRef nRow As Long) As Boolean
    Dim nFirst As Long, nLast As Long
    nFirst = 1
    nLast = NUM_ROWS
    Do While True
        Dim nMiddle As Long
        Dim strValue As String
        If nFirst > nLast Then
            Exit Do     ' Failed to find search arg
        End If
        nMiddle = Round((nLast - nFirst) / 2 + nFirst)
        SheetNameAndRowFromIdx nMiddle, strSheetName, nRow
        strValue = Worksheets(strSheetName).Cells(nRow, 1)
        If strSearchArg < strValue Then
            nLast = nMiddle - 1
        ElseIf strSearchArg > strValue Then
            nFirst = nMiddle + 1
        Else
            BinarySearch = True
            Exit Do
        End If
    Loop
End Function

' convert 1 -> "000001", 120000 -> "120000", etc
Private Function MakeSearchArg(ByVal nArg As Long) As String
    MakeSearchArg = Right(CStr(nArg + 1000000), 6)
End Function

' converts some number to a worksheet name and a row number
' This is depenent on the worksheets being named sheet1, sheet2, sheet3

' and containing an equal number of vlaues in each sheet where
' the total number of values is NUM_ROWS
Private Sub SheetNameAndRowFromIdx(ByVal nIdx As Long _
  , ByRef strSheetName As String, ByRef nRow As Long)
    If nIdx <= NUM_ROWS / 3 Then

        strSheetName = SHEET_1
        nRow = nIdx
    ElseIf nIdx > (NUM_ROWS / 3) * 2 Then
        strSheetName = SHEET_3
        nRow = nIdx - (NUM_ROWS / 3) * 2
    Else
        strSheetName = SHEET_2
        nRow = nIdx - (NUM_ROWS / 3)
    End If
End Sub

OTHER TIPS

I find using the AutoFilter works a lot faster than manually searching the records with any method.

I filter, check to see if there are any results, then move on. If any are found (by checking the count of results), I can search the small portion that are filtered manually or return them all.

I used it on around 44,000 records, searching for a list of 100+ parts against it.

Binary searches can easily get stuck in infinite loops if you're not careful.

If you use vlookup with the sorted option, it will likely be faster than your vba.

I have become interested in this because I was using the .Find function, and on one PC it failed to work on some lookups, but on another it was OK! So I've done some testing on timings - I've got a sheet with 985 names sorted into order, and I've written a small subroutine to run through them, and look each one up in the same list using a different method (the times are in milliseconds):

  1. Brute force: 2000
  2. .Find: 750
  3. Application.VLookup: 265
  4. Binary Search: 234

The problem with VLookup is that it can't return the row number unless you include it in your table.

Here is my code for the Binary search, I've assumed that the sheet has a header row, but you could easily modify the header and code to pass that information. The optional Col parameter is used to indicated if you want the row number, or a cell's value. The function returns 0 (zero) if the find fails.

Function Find(Sheet As Worksheet, What As String, Optional Col As Long = 0) As Variant
Dim Top As Long
Dim Mid As Long
Dim Bot As Long 'Bottom
Dim S As String
Dim T As String

   With Sheet
     Top = 2 'Sheet has a header row
     Bot = .UsedRange.Rows.Count
     S = LCase(What)
     Do
       Mid = (Top + Bot) / 2
       T = LCase(.Cells(Mid, 1))
       Select Case True
       Case T > S
         Bot = Mid - 1
       Case T < S
         Top = Mid + 1
       Case Else 'T = S
         If Col = 0 Then
           Find = Mid  'Return the row
         Else
           Find = .Cells(Mid, Col).Value2 'Return the cell's value
         End If
         Exit Function
       End Select
     Loop Until Bot < Top
   End With
   Find = 0
End Function
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top