After a few days of tedious investigation, I found the root cause of this slowness. It was not the database indexes. The problem was the Database.GetItem(<item path>) method calls inside the Sitecore MediaCreator class. (Our data migration includes the creation of image items.)
In the Sitecore tree of our website, some items have a very large number of children (tens of thousands). Although having that many items under a single parent is not recommended in Sitecore, it is the correct design for our project. If we make a GetItem(<item path>) call for one of these child items, it takes a long time to return the item. As expected, GetItem() using the item path is much slower than getting the item by ID. Unfortunately, we have no control over this, because the Sitecore MediaCreator uses item paths to create media items.
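To see the difference on your own tree, a rough timing comparison could look like the sketch below. It is only an illustration: the item path and ID are hypothetical placeholders, the class and method names are my own, and it assumes the code runs inside a Sitecore application with a "master" database.

```csharp
using System.Diagnostics;
using Sitecore.Configuration;
using Sitecore.Data;
using Sitecore.Data.Items;

public static class GetItemTiming
{
    // Hypothetical placeholders: use the path and ID of a real child item
    // that sits under one of the parents with tens of thousands of children.
    private const string ChildPath = "/sitecore/media library/Images/Migrated/sample-image";
    private const string ChildId = "{11111111-2222-3333-4444-555555555555}";

    public static string Compare()
    {
        Database master = Factory.GetDatabase("master");

        // Lookup by path: the path has to be resolved to an item first.
        Stopwatch byPath = Stopwatch.StartNew();
        Item fromPath = master.GetItem(ChildPath);
        byPath.Stop();

        // Lookup by ID: no path resolution is needed.
        // Note: this call may benefit from caches warmed by the lookup above,
        // so swap the order or repeat the run before trusting the numbers.
        Stopwatch byId = Stopwatch.StartNew();
        Item fromId = master.GetItem(ID.Parse(ChildId));
        byId.Stop();

        return string.Format(
            "By path: {0} ms (found: {1}), by ID: {2} ms (found: {3})",
            byPath.ElapsedMilliseconds, fromPath != null,
            byId.ElapsedMilliseconds, fromId != null);
    }
}
```

Calling GetItemTiming.Compare() from a test page or scheduled task and logging the returned string should be enough to confirm whether path lookups are the bottleneck in a given tree.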
Using dotPeek, I was able to inspect the decompiled Sitecore code and create a version of the MediaCreator class that doesn't use item paths for GetItem(), and the data migration began to run fast.
I'm going to ask on the Sitecore forum whether there is any way to overcome this performance issue without duplicating the MediaCreator source code.