Question

I'm facing a performance issue using C# with EF5 and a SQL Server 2012 database (with 4 GB RAM) while trying to insert thousands of items.

For example, this code took 12 s to execute in an MVC app and 5 s in a Windows console app, with 99% of that time consumed by the context.SaveChanges() call.

    // Requires: using System; using System.Collections.Generic; using System.Diagnostics;

    // Prepare a list of 1000 items filled with random values
    Random rand = new Random();
    List<MyItem> list = new List<MyItem>();

    for (int i = 0; i < 1000; i++)
    {
        list.Add(new MyItem
        {
            Field1 = i,
            Field2 = rand.Next(1000),
            Field3 = rand.Next(1000),
            Field4 = rand.Next(1000),
            Field5 = rand.Next(1000),
            Field6 = rand.Next(1000),
            Field7 = rand.Next(1000),
            Field8 = rand.Next(1000),
            Field9 = rand.Next(1000),
            Field10 = rand.Next(1000)
        });
    }

    Stopwatch watch = new Stopwatch();   // times the whole operation
    Stopwatch watch2 = new Stopwatch();  // times SaveChanges() only

    watch.Start();

    using (var context = new MyEntities())
    {
        // Skip change detection and validation to reduce per-entity overhead
        context.Configuration.AutoDetectChangesEnabled = false;
        context.Configuration.ValidateOnSaveEnabled = false;

        // Register every item with the context and mark it as Added
        foreach (MyItem item in list)
        {
            context.MyItem.Attach(item);
            context.Entry(item).State = System.Data.EntityState.Added;
        }

        watch2.Start();
        context.SaveChanges();
        watch2.Stop();
    }
    watch.Stop();

I tried turning off AutoDetectChangesEnabled and ValidateOnSaveEnabled, but there seems to be no performance gain. I also tried a single-insert stored procedure, but the performance was similar.

The table MyItem is a simple table with ten integer fields and one clustered primary key.

Any help would be appreciated!

Solution

Can we just start with "Entity Framework has no bulk insert"? It generates tons of insert statements with zero batching. Zero: one round trip per row, and it does not even put multiple rows into one statement. In your example that is 1000 insert statements, and also 1000 separate round trips to the server process.

I long ago wrote extension methods for DbContext that let me call a method with the signature BulkInsert<T>(IEnumerable<T>) or BulkMerge<T>(IEnumerable<T>). It is about 5 pages of code.

The classes (T) are generally handcrafted to avoid overlap with the EF entities. I have no problems inserting batches of 64,000 rows in a second or so. I mainly use my own object data reader with the SqlBulkCopy class to push the data into a temporary table, then insert or merge into the final table. The temp table avoids the exclusive lock on the final table during the upload and the not-too-smart locking mechanisms in SqlBulkCopy.
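To make the temp-table approach concrete, here is a minimal sketch, not the extension methods described above. It makes several assumptions: MyItemRow is a hypothetical handcrafted class, the target table is reduced to three int columns for brevity, #MyItemStaging is an invented staging table name, and a plain DataTable stands in for the custom object data reader.

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Data.SqlClient;

// Hypothetical handcrafted row type, kept separate from the EF entity.
// Assumption: only three of the table's columns are shown.
public class MyItemRow
{
    public int Field1 { get; set; }
    public int Field2 { get; set; }
    public int Field3 { get; set; }
}

public static class BulkInsertSketch
{
    // Copies the rows into an in-memory DataTable whose column names
    // match the staging table. (Stand-in for a custom IDataReader.)
    public static DataTable ToDataTable(IEnumerable<MyItemRow> rows)
    {
        var table = new DataTable();
        table.Columns.Add("Field1", typeof(int));
        table.Columns.Add("Field2", typeof(int));
        table.Columns.Add("Field3", typeof(int));
        foreach (var row in rows)
        {
            table.Rows.Add(row.Field1, row.Field2, row.Field3);
        }
        return table;
    }

    public static void BulkInsert(string connectionString, IEnumerable<MyItemRow> rows)
    {
        using (var connection = new SqlConnection(connectionString))
        {
            connection.Open();

            // 1. Create a session-scoped temp table (name is an assumption).
            using (var create = new SqlCommand(
                "CREATE TABLE #MyItemStaging (Field1 int, Field2 int, Field3 int);",
                connection))
            {
                create.ExecuteNonQuery();
            }

            // 2. Stream all rows into the temp table on the same connection.
            using (var bulk = new SqlBulkCopy(connection))
            {
                bulk.DestinationTableName = "#MyItemStaging";
                bulk.BatchSize = 64000;
                bulk.ColumnMappings.Add("Field1", "Field1");
                bulk.ColumnMappings.Add("Field2", "Field2");
                bulk.ColumnMappings.Add("Field3", "Field3");
                bulk.WriteToServer(ToDataTable(rows));
            }

            // 3. Move the rows into the final table with one set-based statement.
            using (var insert = new SqlCommand(
                "INSERT INTO MyItem (Field1, Field2, Field3) " +
                "SELECT Field1, Field2, Field3 FROM #MyItemStaging; " +
                "DROP TABLE #MyItemStaging;",
                connection))
            {
                insert.ExecuteNonQuery();
            }
        }
    }
}
```

A call would look like BulkInsertSketch.BulkInsert(connectionString, rows). A merge variant would presumably replace the INSERT ... SELECT in step 3 with a MERGE statement against the final table.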

But using pure EF for mass inserts - you can forget performance here. If you search this site you will find tons of complaints about this. EF is an ORM - it is not an ETL tool.

Licensed under: CC-BY-SA with attribution