Question

I have 10 threads writing thousands of small buffers (16-30 bytes each) to a huge file in random positions. Some of the threads throw OutOfMemoryException on FileStream.Write() opreation.

What is causing the OutOfMemoryException ? What to look for?

I'm using the FileStream like this (for every written item - this code runs from 10 different threads):

using (FileStream fs = new FileStream(path, FileMode.OpenOrCreate, FileAccess.Write, FileShare.ReadWrite, BigBufferSizeInBytes, FileOptions.SequentialScan))
{
 ...
 fs.Write();
}

I suspect that all the buffers allocated inside the FileStream don't get released in time by the GC. What I don't understand is why the CLR, instead of throwing, doesn't just run a GC cycle and free up all the unused buffers?

No correct solution

OTHER TIPS

If ten threads are opening files as your code shows, then you have a maximum of ten undisposed FileStream objects at any one time. Yes, FileStream does have an internal buffer, the size of which you specify with "BigBufferSizeInBytes" in your code. Could you please disclose the exact value? If this is big enough (e.g. ~100MB) then it could well be the source of the problem.

By default (i.e. when you don't specify a number upon construction), this buffer is 4kB and that is usually fine for most applications. In general, if you really care about disk write performance, then you might increase this one to a couple of 100kB but not more.

However, for your specific application doing so wouldn't make much sense, as said buffer will never contain more than the 16-30 bytes you write into it before you Dispose() the FileStream object.

To answer your question, an OutOfMemoryException is thrown only when the requested memory can't be allocated after a GC has run. Again, if the buffer is really big then the system could have plenty of memory left, just not a contiguous chunk. This is because the large object heap is never compacted.

I've reminded people about this one a few times, but the Large object heap can throw that exception fairly subtially, when seemingly you have pleanty of available memory or the application is running OK.

I've run into this issue fairly frequently when doing almost exactally what your discribing here.

You need to post more of your code to answer this question properly. However, I'm guessing it could also be related to a potential Halloween problem (Spooky Dooky) .

Your buffer to which you are reading from may also be the problem (again large object heap related) also again, you need to put up more details about what's going on there in the loop. I've just nailed out the last bug I had which is virtually identical (I am performing many parallel hash update's which all require independent state to be maintained across read's of the input file)....

OOP! just scrolled over and noticed "BigBufferSizeInBytes", I'm leaning towards Large Object Heap again...

If I were you, (and this is exceedingly difficult due to the lack of context), I would provide a small dispatch "mbuf", where you copied in and out instead of allowing all of your disperate thread's to individually read across your large backing array... (i.e. it's hard to not cause insadential allocations with very subtile code syntax).

Buffers aren't generally allocated inside the FileStream. Perhaps the problem is the line "writing thousands of small buffers" - do you really mean that? Normally you re-use a buffer many, many, many times (i.e. on different calls to Read/Write).

Also - is this a single file? A single FileStream is not guaranteed to be thread safe... so if you aren't doing synchronization, expect chaos.

It's possible that these limitations arise from the underlying OS, and that the .NET Framework is powerless to overcome these kind of limitations.

What I cannot deduce from your code sample is whether you open up a lot of these FileStream objects at the same time, or open them really fast in sequence. Your use of the 'using' keyword will make sure that the files are closed after the fs.Write() call. There's no GC cycle required to close the file.

The FileStream class is really geared towards sequential read/write access to files. If you need to quickly write to random locations in a big file, you might want to take a look at using virtual file mapping.

Update: It seems that virtual file mapping will not be officially supported in .NET until 4.0. You may want to take a look at third party implementations for this functionality.

Dave

I'm experiencing something similar and wondered if you ever pinned down the root of your problem?

My code does quite a lot of copying between files, passing quite a few megs between different byte files. I've noticed that whilst the process memory usage stays within a reasonable range, the system memory allocation shoots up way too high during the copying - much more than is being used by my process.

I've tracked the issue down to the FileStream.Write() call - when this line is taken out, memory usage seems to go as expected. My BigBufferSizeInBytes is the default (4k), and I can't see anywhere where these could be collecting...

Anything you discovered whilst looking at your problem would be gratefully received!

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow