Pergunta

I have the following method for looping through a table, changint some values in every row and saving the chages back to the database. To make things go faster I am fetching data in sets of 10,000 rows. This is a large table with over 25 million records in it.

The problem is that my application doesn't seems to be releasing any memory. I have tried redeclaring the records variable to nothing or explicitly caling the garbage collector but the memory stays there. Runnning the built-in VS10 profiler I can see that the culprit is the system.linq.enumerable.tolist() method which takes up over 98% of my memory. How do I release that memory after the call to saveChanges?

db = New databaseEntities
Dim size = 25000000
Dim stepSize = 10000
For i = 0 to size Step stepSize
  Dim sql = (From A In db.table).OrderBy(Function(A) A.Column).Skip(i).Take(stepSize)
  Dim records As New List(Of table)
  records = sql.ToList
  For Each record In records
    'do some work
  Next
  db.SaveChanges()
  records = Nothing
  GC.Collect()
Next
Foi útil?

Solução

The repository holds a reference to each entity it is tracking, so you won't be able to dispose an entity whilst the repository is live and tracking it. That means you either need to dispose the repository, or detach each entity after you've finished processing it.

Option 1) if "do some work" doesn't affect the order you'd return records in, you could move the creation of databaseEntities inside the For loop, and declare it with a using block. That should cause each block of entities to be release each time round the for loop

Option 2) If your operation is essentially parallel, and what you do to one "table" entity doesn't have any dependencies on any other, then you could call databaseEntities.Detach(record) after db.SaveChanges, which will enable the garbage collector to reclaim the entity's space.

Looking at your code, I suspect either of thse could be used

Outras dicas

maybe you can try this: (I didn't test it)

db.SaveChanges()
For Each record In records
    record.dispose ''only if class table got a dispose method
Next
records.clear
records = Nothing

I am by no means a Linq to SQL expert, but my guess is that the DataContext caches all rows you have read so you must either empty the cache or lose you reference to the DataContext.

If you don't need to update the entities, use MergeOption.NoTracking. The context will no longer keep a reference to the entity, nor will it do fixup.

Licenciado em: CC-BY-SA com atribuição
Não afiliado a StackOverflow
scroll top