Question

In our Sitecore 6.6.0 (rev. 130404) based project, we are required to migrate data from the old system's database to the Sitecore database. We need to migrate around 650,000 objects. Each object from the old database creates around 4 Sitecore items in the master database, so it's a fairly large set of data.

We've hooked up the Sitecore APIs to a Windows application and we run the data migration logic from that app. At the beginning of the data migration, things are fairly fast: around 4 objects per second are transferred to the Sitecore master database. The first 10,000 objects took only 40 minutes. At this rate, one would predict that 100,000 objects will be migrated in about 7 hours.

But the problem is that over time things get increasingly and noticeably slower. With around 100,000 objects migrated, it now takes around 7 hours to migrate just 30,000 objects (roughly 1.2 objects per second). I even rebuilt the Sitecore database indexes from time to time, as mentioned in the performance tuning guide. We also don't perform any Sitecore queries to find where to place the newly created Sitecore items. No Sitecore agents or Lucene index update operations are running while our data migration is happening.

Here's the code at the beginning of the data migration logic:

using (new Sitecore.SecurityModel.SecurityDisabler())
using (new Sitecore.Data.Proxies.ProxyDisabler())
using (new Sitecore.Data.DatabaseCacheDisabler())
using (new Sitecore.Data.BulkUpdateContext())
{
    // data migration logic runs here
}

Could the reason for this slowness be the growth of the Sitecore database indexes? I'm not an SQL expert, but after some reading I got a report on the index operational statistics. I'm not sure whether the numbers indicate the cause of our problem.

Index statistics (some tables were removed from the statistics report to save space)

Can anybody with better Sitecore/SQL knowledge than me help with this?

Edit: after a bit more digging I got statistics for SQL Server latches (I don't really understand those).

SQL Server latch statistics

Thanks


Solution

After a few days of tedious investigation I found the root cause of this slowness. It was not the database indexes. The problem was the Database.GetItem(<item path>) method calls inside the Sitecore MediaCreator class. (Our data migration includes the creation of image items.)

In the Sitecore tree of our website, some items have quite a large number (tens of thousands) of children under them. Although having such a large number of items under one parent is not recommended in Sitecore, that's the correct design for our project. If we make a GetItem(<item path>) call for one of these child items, it takes a long time to return that item. Obviously GetItem() using the item path is much slower than getting the item by ID. Unfortunately we don't have any control over this, because the Sitecore MediaCreator uses item paths to create media items.

Using dotPeek I was able to inspect the Sitecore source code and create a version of the MediaCreator class that doesn't use item paths for GetItem(). With that change the data migration began to run fast.
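To illustrate the difference between path-based and ID-based lookups, here is a minimal sketch. It is not the modified MediaCreator from this answer; MigrationLookup and AddChildById are hypothetical names, and the sketch assumes the parent item's ID is already known to the migration code (for example, because it was kept when the parent was created).

```csharp
using System;
using Sitecore.Data;
using Sitecore.Data.Items;

public static class MigrationLookup
{
    // Hypothetical helper, not part of Sitecore: resolves the parent by ID
    // and adds the child to it, so the calling code never passes an item
    // path to GetItem().
    public static Item AddChildById(Database master, ID parentId, string name, TemplateID templateId)
    {
        // Lookup by ID instead of Database.GetItem("/sitecore/...")
        Item parent = master.GetItem(parentId);
        if (parent == null)
        {
            throw new InvalidOperationException("Parent item not found: " + parentId);
        }

        return parent.Add(name, templateId);
    }
}
```

The master database can be obtained with Sitecore.Configuration.Factory.GetDatabase("master"). Whether this is enough for media items depends on how the media stream is attached afterwards, which the sketch does not cover.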

I'm going to ask on the Sitecore forum whether there is any way to overcome this performance issue without duplicating the MediaCreator source code.

OTHER TIPS

The first things you should look at are:

  1. Disable all indexes during the migration

  2. Wrap your custom logic in: SecurityDisabler(), EventDisabler(), ProxyDisabler()

  3. SQL Server performance might be the problem. Make sure to set proper values for database growth: https://www.simple-talk.com/sql/database-administration/sql-server-database-growth-and-autogrowth-settings/
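The wrappers from tip 2 can be stacked around the migration code, roughly as in the sketch below. MigrationRunner and migrateBatch are hypothetical names standing in for your own entry point and custom logic; only the three disabler classes come from Sitecore.

```csharp
using System;
using Sitecore.Data.Events;
using Sitecore.Data.Proxies;
using Sitecore.SecurityModel;

public static class MigrationRunner
{
    // Hypothetical entry point; migrateBatch stands in for the custom logic.
    public static void Run(Action migrateBatch)
    {
        using (new SecurityDisabler())  // skip security checks
        using (new EventDisabler())     // do not raise item events
        using (new ProxyDisabler())     // ignore proxy items
        {
            migrateBatch();
        }
    }
}
```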

Also, see this similar question: Optimisation tips when migrating data into Sitecore CMS

You can hash the media creator path into a unique guid. Then you can likely use guids as lookup values.
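One way to read this tip is sketched below: derive a deterministic GUID from the item path, so the same path always maps to the same value. PathGuid and FromPath are hypothetical names, and both MD5 and the trim/lowercase normalisation are assumptions made for the example. A hash is not strictly unique, although collisions are very unlikely at this scale.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class PathGuid
{
    // Hypothetical helper: maps an item path to a deterministic GUID.
    public static Guid FromPath(string itemPath)
    {
        // Normalise so that casing or stray whitespace does not change the result.
        string normalized = itemPath.Trim().ToLowerInvariant();

        using (MD5 md5 = MD5.Create())
        {
            // An MD5 digest is 16 bytes, exactly the size a Guid needs.
            byte[] hash = md5.ComputeHash(Encoding.UTF8.GetBytes(normalized));
            return new Guid(hash);
        }
    }
}
```

The resulting value can be wrapped as new Sitecore.Data.ID(guid) and used as the lookup key in place of the path, provided the items were created with those IDs in the first place.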

Also, don't forget to run the DB jobs that "defragment" your database indexes (a SQL job for index maintenance, i.e. reorganizing or rebuilding indexes; I forgot the exact name, but it is hugely important).

Licensed under: CC-BY-SA with attribution