Question

Size: ~5 MB

Size on Disk: ~3 GB

We're using C# and saving data constantly as it changes; all of the file data has to be accessible at any given time. Essentially, whenever a piece of data changes, the file holding it must be saved. That is why there are so many files for this much data. The data is also heavily processed, so lumping it all together is not an option: a minor change would force a large amount of data to be rewritten for no reason. Even at their current granularity, saving a whole file for a small change is mostly redundant.

Surely there is a way to avoid this absurd expansion of the on-disk size while retaining the accessibility and saving efficiency we have achieved. We need a way to package these files into what Windows will treat as a single file, but in such a way that we do not have to rewrite the entire file whenever something changes.

I understand that having thousands of small files is quite strange, but for our purposes it has improved performance greatly. We just don't want to sacrifice one resource for another if it is at all possible to avoid.

Note: The files contain RLE-compressed binary data; they are not text files.

Clarity update: 5 MB of data -> 3 GB on disk today; at 50x the data (50x as many cluster-sized files), 250 MB -> 150 GB = PROBLEM!


Solution

A database does exactly what you need: you can store arbitrary numbers of tiny rows/blobs and they will be packed efficiently. File systems typically allocate at least one disk cluster per file, which is probably why your size on disk balloons so much; databases don't do that. You can also ask the database to compact itself.

There are embedded and standalone databases available.
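As one illustration of the embedded route, a single-file database such as SQLite can hold thousands of small blobs in one file, update individual rows in place, and compact itself on request. The question's code is C# (where a wrapper such as System.Data.SQLite exposes the same operations); the sketch below uses Python's built-in sqlite3 module purely to keep the example self-contained, and the table/column names are made up for illustration:

```python
import os
import sqlite3
import tempfile

# One database file replaces thousands of tiny files.
path = os.path.join(tempfile.mkdtemp(), "chunks.db")
db = sqlite3.connect(path)
db.execute("CREATE TABLE chunks (id INTEGER PRIMARY KEY, data BLOB NOT NULL)")

# Insert thousands of small binary records; they share the file's pages
# instead of each consuming a whole disk cluster.
with db:
    db.executemany(
        "INSERT INTO chunks (id, data) VALUES (?, ?)",
        ((i, os.urandom(64)) for i in range(10_000)),
    )

# A change to one record rewrites only the affected page(s),
# not the whole file.
with db:
    db.execute("UPDATE chunks SET data = ? WHERE id = ?", (os.urandom(64), 42))

# Reclaim free space ("ask the database to compact itself").
db.execute("VACUUM")

row_count = db.execute("SELECT COUNT(*) FROM chunks").fetchone()[0]
print(row_count)
db.close()
```

Note that the 10,000 64-byte records end up in a file far smaller than the roughly 40 MB that one 4 KB cluster per record would cost on disk.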

Licensed under: CC-BY-SA with attribution
Not affiliated with Stack Overflow