Question

I am writing a program which iterates through the file system multiple times using simple loops and recursion.

The problem is that, because I iterate through the file system multiple times, it takes a long time, presumably because the hard drive can only read at a certain pace.

Is there any way to optimize this process? Maybe by iterating through once, saving all the relevant information in a collection, and then referring to the collection when I need to?

I know I can cache my results like this, but I have absolutely no idea how to go about it.

Edit:

There are three main pieces of information I am trying to obtain from a given directory:

  • The size of the directory (the sum of the size of each file within that directory)
  • The number of files within the directory
  • The number of folders within the directory

All of the above include sub-directories too. Currently, I perform a separate iteration of a given directory for each piece of information, i.e. three iterations per directory.
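
For what it's worth, the "iterate once, save in a collection" idea above maps to a single recursive pass that fills a dictionary on the way back up. Below is a minimal C# sketch of that approach; the DirStats type, the Scan method, and the dictionary cache are illustrative names invented here, and error handling for inaccessible directories is omitted:

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Illustrative holder for the three figures the question asks for.
record DirStats(long SizeBytes, int FileCount, int FolderCount);

static class DirectoryScanner
{
    // One recursive pass: each directory's totals include every
    // sub-directory, and each directory visited is cached on the way up.
    public static DirStats Scan(string path, Dictionary<string, DirStats> cache)
    {
        long size = 0;
        int files = 0, folders = 0;
        var dir = new DirectoryInfo(path);

        foreach (FileInfo f in dir.GetFiles())
        {
            size += f.Length;   // file sizes roll up into the directory size
            files++;
        }

        foreach (DirectoryInfo sub in dir.GetDirectories())
        {
            DirStats s = Scan(sub.FullName, cache); // recurse once per sub-tree
            size += s.SizeBytes;
            files += s.FileCount;
            folders += s.FolderCount + 1;           // +1 for the sub-directory itself
        }

        var stats = new DirStats(size, files, folders);
        cache[dir.FullName] = stats;                // later reads hit the cache, not the disk
        return stats;
    }
}
```

A single call such as `DirectoryScanner.Scan(@"C:\Data", cache)` (the path is a placeholder) fills the cache with one entry per directory, so every later query is a dictionary lookup rather than another trip through the file system.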

My output is basically a spreadsheet which looks like this:

[Image: program output spreadsheet]


Solution

To improve performance, you could access the Master File Table (MFT) of the NTFS file system directly. There is an excellent code sample on the MSDN social forum. It seems that accessing the MFT is about 10x faster than enumerating the file system with FindFirstFile/FindNextFile.
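
The linked sample is not reproduced here, but to make the idea concrete, below is a heavily simplified C# sketch of the underlying Win32 technique: asking NTFS to stream out one record per MFT entry via DeviceIoControl with FSCTL_ENUM_USN_DATA. It assumes an NTFS volume opened with administrator rights, and note that these records carry names, parent references, and attributes but not file sizes, which is why the full MSDN sample does considerably more work:

```csharp
using System;
using System.Runtime.InteropServices;
using Microsoft.Win32.SafeHandles;

class MftSketch
{
    // FSCTL_ENUM_USN_DATA streams one record per MFT entry.
    const uint FSCTL_ENUM_USN_DATA = 0x000900B3;

    [StructLayout(LayoutKind.Sequential)]
    struct MFT_ENUM_DATA_V0
    {
        public ulong StartFileReferenceNumber;
        public long LowUsn;
        public long HighUsn;
    }

    [DllImport("kernel32.dll", CharSet = CharSet.Unicode, SetLastError = true)]
    static extern SafeFileHandle CreateFile(string name, uint access, uint share,
        IntPtr security, uint disposition, uint flags, IntPtr template);

    [DllImport("kernel32.dll", SetLastError = true)]
    static extern bool DeviceIoControl(SafeFileHandle device, uint code,
        ref MFT_ENUM_DATA_V0 input, int inputSize, IntPtr output, int outputSize,
        out int bytesReturned, IntPtr overlapped);

    static void Main()
    {
        // Opening the raw volume (here C:) requires administrator rights.
        using var volume = CreateFile(@"\\.\C:",
            0x80000000 /* GENERIC_READ */, 0x3 /* FILE_SHARE_READ | WRITE */,
            IntPtr.Zero, 3 /* OPEN_EXISTING */, 0, IntPtr.Zero);
        if (volume.IsInvalid)
            throw new InvalidOperationException("Could not open volume; run elevated.");

        var med = new MFT_ENUM_DATA_V0 { LowUsn = 0, HighUsn = long.MaxValue };
        int bufSize = 64 * 1024;
        IntPtr buf = Marshal.AllocHGlobal(bufSize);
        try
        {
            // Each call returns a batch of USN_RECORDs; the first 8 bytes of
            // the buffer hold the reference number to resume from next time.
            while (DeviceIoControl(volume, FSCTL_ENUM_USN_DATA, ref med,
                   Marshal.SizeOf<MFT_ENUM_DATA_V0>(), buf, bufSize,
                   out int returned, IntPtr.Zero))
            {
                med.StartFileReferenceNumber = (ulong)Marshal.ReadInt64(buf);
                int offset = 8;
                while (offset < returned)
                {
                    IntPtr rec = IntPtr.Add(buf, offset);
                    int recLen = Marshal.ReadInt32(rec);              // USN_RECORD.RecordLength
                    if (recLen <= 0) break;
                    ulong parent = (ulong)Marshal.ReadInt64(rec, 16); // ParentFileReferenceNumber
                    uint attrs = (uint)Marshal.ReadInt32(rec, 52);    // FileAttributes
                    int nameLen = Marshal.ReadInt16(rec, 56);         // bytes, not chars
                    int nameOff = Marshal.ReadInt16(rec, 58);
                    string name = Marshal.PtrToStringUni(IntPtr.Add(rec, nameOff), nameLen / 2);

                    // A real program would build a parent -> children map here
                    // instead of printing each entry.
                    bool isDir = (attrs & 0x10) != 0; // FILE_ATTRIBUTE_DIRECTORY
                    Console.WriteLine($"{name} (parent {parent}, dir={isDir})");

                    offset += recLen;
                }
            }
        }
        finally { Marshal.FreeHGlobal(buf); }
    }
}
```

The speedup comes from reading large batches of metadata sequentially from the volume instead of opening each directory, which avoids most of the per-directory seeks that make FindFirstFile/FindNextFile enumeration slow.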

Hope this helps.

Other tips

Yes, anything you can do to minimize hard-drive I/O will improve performance. I would also suggest putting in a Stopwatch and measuring the time each approach takes, so you can see how your improvements affect the speed.
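
As a concrete illustration of the Stopwatch suggestion, here is a short snippet that times a full scan against a cache lookup. It reuses the hypothetical DirectoryScanner.Scan and DirStats from the sketch earlier on this page, and C:\Data is a placeholder path:

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

var sw = Stopwatch.StartNew();

var cache = new Dictionary<string, DirStats>(StringComparer.OrdinalIgnoreCase);
DirectoryScanner.Scan(@"C:\Data", cache);

sw.Stop();
Console.WriteLine($"Full scan: {sw.ElapsedMilliseconds} ms for {cache.Count} directories.");

// Once cached, a repeat query never touches the disk.
sw.Restart();
DirStats s = cache[@"C:\Data"];
sw.Stop();
Console.WriteLine($"Cache lookup: {sw.Elapsed.TotalMilliseconds:F4} ms ({s.FileCount} files).");
```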

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow