Question

I'm trying to parse through e-mails in Outlook 2007. I need to streamline it as fast as possible and seem to be having some trouble.

Basically it's:

foreach( Folder fld in outlookApp.Session.Folders )
{
    foreach( MailItem mailItem in fld.Items )
    {
        string body = mailItem.Body;
    }
}

and for 5000 e-mails, this takes over 100 seconds. It doesn't seem to me like this should be taking anywhere near this long.

If I add:

string entry = mailItem.EntryID;

It ends up being an extra 30 seconds.

I'm doing all sorts of string manipulations including regular expressions with these strings and writing out to database and still, those 2 lines take 50% of my runtime.

I'm using Visual Studio 2008

Solution

Doing this kind of thing will take a long time because you have to pull the data from the Exchange store for each item.

I think that you have a couple of options here.

Process this information out of band, using CDO/RDO in some other process. Or use MAPI tables, as this is the fastest way to get properties. There are caveats with this, though, and you may be doing things in your processing that can be brought into a table.

Redemption wrapper - http://www.dimastr.com/redemption/mapitable.htm

MAPI Tables http://msdn.microsoft.com/en-us/library/cc842056.aspx

OTHER TIPS

I do not know if this will address your specific issue, but the latest Office 2007 service pack made a significant performance difference (improvement) for Outlook with large numbers of messages.

Are you just reading in those strings in this loop, or are you reading in a string, processing it, then moving on to the next? You could try reading all the messages into a Hashtable inside your loop and then processing them after they've been loaded; it might buy you some gains.

Any kind of UI updates are extremely expensive; if you're writing out text or incrementing a progress bar it's best to do so sparingly.
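One way to keep UI updates sparse is to report progress only every N items. The following is a minimal Python sketch of that idea, not Outlook code: `process_all`, `handle`, `report`, and `every` are hypothetical names, and `report` stands in for whatever actually updates the progress bar.

```python
def process_all(items, handle, report, every=100):
    """Handle each item, but call report() only every `every` items."""
    total = len(items)
    for index, item in enumerate(items, start=1):
        handle(item)
        # Update the UI rarely: on every `every`-th item and at the very end.
        if index % every == 0 or index == total:
            report(index, total)


if __name__ == "__main__":
    updates = []
    process_all(
        list(range(250)),
        handle=lambda item: None,
        report=lambda done, total: updates.append(done),
    )
    print(updates)
```

With the default of 100, 250 items trigger three progress updates instead of 250.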

We had exactly the same problem even when the folders were local and there was no network delay.

We got a 10x speedup by storing a copy of every email in a local SQL Server CE table tuned for the search we needed. We also used update events to make sure the local database remained in sync with the Outlook/Exchange folders.
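The local-cache idea can be sketched roughly as follows. This is a Python illustration that uses the standard-library `sqlite3` module as a stand-in for SQL Server CE; the table layout and the helpers `open_cache`, `cache_message`, `remove_message`, and `search_subject` are all hypothetical. In a real add-in the rows would be filled from Outlook once and then kept current from the folder's update events.

```python
import sqlite3


def open_cache(path=":memory:"):
    """Open (or create) the local cache; in-memory by default for this sketch."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mail ("
        " entry_id TEXT PRIMARY KEY,"
        " subject TEXT NOT NULL,"
        " body TEXT NOT NULL)"
    )
    return conn


def cache_message(conn, entry_id, subject, body):
    """Insert or refresh one message; intended to be called on add/change events."""
    conn.execute(
        "INSERT OR REPLACE INTO mail (entry_id, subject, body) VALUES (?, ?, ?)",
        (entry_id, subject, body),
    )


def remove_message(conn, entry_id):
    """Drop a message from the cache; intended to be called on delete events."""
    conn.execute("DELETE FROM mail WHERE entry_id = ?", (entry_id,))


def search_subject(conn, text):
    """Return entry IDs whose subject contains `text`.

    Note: '%' and '_' inside `text` act as LIKE wildcards in this sketch.
    """
    rows = conn.execute(
        "SELECT entry_id FROM mail WHERE subject LIKE ? ORDER BY entry_id",
        ("%" + text + "%",),
    )
    return [row[0] for row in rows]
```

Searches then run against the local table only, so no per-item round trip to the mail store is needed at query time.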

To totally eliminate user lag we took the search out of the Outlook thread and put it in its own thread. The perception of lag was worse than the actual delay, it seems.
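Moving the search onto its own thread might look roughly like this. It is a Python sketch using the standard `threading` and `queue` modules; `slow_search` is a hypothetical placeholder for the real lookup, and the calling thread here stands in for the Outlook thread.

```python
import queue
import threading


def slow_search(messages, text):
    """Hypothetical stand-in for the real (slow) search."""
    return [m for m in messages if text.lower() in m.lower()]


def start_search(messages, text):
    """Run the search on a worker thread and return a queue for the result."""
    results = queue.Queue(maxsize=1)

    def worker():
        results.put(slow_search(messages, text))

    threading.Thread(target=worker, daemon=True).start()
    return results


if __name__ == "__main__":
    pending = start_search(["Invoice 42", "Lunch?", "invoice reminder"], "invoice")
    # The calling thread stays free here and collects the result when ready.
    print(pending.get(timeout=5))
```

The caller gets the queue back immediately, so the interface can stay responsive while the worker runs.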

I encountered a similar situation while trying to access Outlook mails via VBA (in Excel). However, it was far slower in my case: 1 e-mail per second! (Maybe it was slower than in your case because I had implemented it in VBA.)

Anyway, I managed to improve the speed by using SetColumns (e.g. https://docs.microsoft.com/en-us/office/vba/api/Outlook.Items.SetColumns).

I know, I know: this only works for a few properties, such as "Subject" and "ReceivedTime", and not for the body. But think again: do you really want to read the body of all your emails, or just a subset, perhaps chosen by 'Subject' line or 'ReceivedTime'? My requirement was to read the body of an email only if its subject matched a specific string.

Hence, I did the below:

I added a second 'Outlook.Items' object called 'myFilterItemCopyForBody' and applied the same filter I had on the first one. So now I have two 'Outlook.Items' collections, 'myFilterItem' and 'myFilterItemCopyForBody', both holding the same e-mail items because the same Restrict conditions are applied to both.

'myFilterItem' holds only the 'Subject' and 'ReceivedTime' properties of the relevant mails (done by using SetColumns).
'myFilterItemCopyForBody' holds all the properties of the mail (including Body).

Both 'myFilterItem' and 'myFilterItemCopyForBody' are then sorted by 'ReceivedTime' so that they are in the same order.

Once sorted, both are looped over together in a nested For Each loop, and the corresponding items are matched up with the help of a counter, as in the code below.

' olFldr, startTime and endTime are assumed to be set earlier.
Dim myItems As Outlook.Items
Dim myFilterItem As Outlook.Items
Dim myFilterItemCopyForBody As Outlook.Items
Dim myItem1 As Object, myItem2 As Object
Dim iCount As Long, jCount As Long
Dim dateFilter As String

Set myItems = olFldr.Items

' Apply the same date filter to both collections so they hold the same mails.
dateFilter = "@SQL=""urn:schemas:httpmail:datereceived"" > '" & startTime & "' AND ""urn:schemas:httpmail:datereceived"" < '" & endTime & "'"
Set myFilterItemCopyForBody = myItems.Restrict(dateFilter)
Set myFilterItem = myItems.Restrict(dateFilter)

' Sort both the same way so that position i refers to the same mail in each.
myFilterItemCopyForBody.Sort "ReceivedTime"
myFilterItem.Sort "ReceivedTime"

' Cache only the cheap properties on the collection used for matching.
myFilterItem.SetColumns "Subject, ReceivedTime"

For Each myItem1 In myFilterItem
    iCount = iCount + 1
    For Each myItem2 In myFilterItemCopyForBody
        jCount = jCount + 1
        If iCount = jCount Then
            ' Display myItem2.Body if myItem1.Subject contains a specific string
            ' MsgBox myItem2.Body
            jCount = 0
            Exit For
        End If
    Next myItem2
Next myItem1

Note1: Notice that the Body property is accessed using the 'myItem2' corresponding to 'myFilterItemCopyForBody'.

Note2: The fewer times the code enters the loop to access the Body property, the better! You can improve efficiency further by tightening the Restrict filter and the matching logic so that fewer items have to be looped over.
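The principle behind both notes (decide from cheap properties, touch the body only for matches) can be illustrated outside Outlook. In this Python sketch, `FakeMail` is a made-up class whose `body` property counts how often it is read, standing in for the expensive fetch; `bodies_matching` is likewise a hypothetical helper.

```python
class FakeMail:
    """Made-up mail item; each read of `body` is counted as an expensive fetch."""

    body_reads = 0

    def __init__(self, subject, body):
        self.subject = subject
        self._body = body

    @property
    def body(self):
        FakeMail.body_reads += 1
        return self._body


def bodies_matching(mails, needle):
    """Read the body only for mails whose subject contains `needle`."""
    return [mail.body for mail in mails if needle in mail.subject]
```

Only the mails that pass the subject check ever pay the cost of a body read.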

Hope this helps, even though this is not something new!

Licensed under: CC-BY-SA with attribution