Question

Would love some help figuring out why a script is running much slower than it used to. The script starts sequential MATLAB simulations and saves each simulation's output to a file in a directory on computer #1. The script runs on computers #2, #3, and #4, which have the C: drive of computer #1 mounted as drive K:, and these computers read and write files on K: during the simulations. Before starting each simulation, the script saves a 'placeholder' version of that simulation's output file, which is later overwritten with the results once the simulation is complete. The output filename is unique to that simulation. The script checks for the output file before starting a simulation; if the file is found, it moves on to the next simulation. The intent is to divide many simulations among the different computers. The directory on computer #1 has many files in it (~4000, 6 GB), and computer #1 is an old Windows XP machine. Computers #2-4 are also Windows machines and are 2+ years old.
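For readers who want to see the placeholder scheme spelled out, here is a minimal sketch in Python (not the original MATLAB script, which is not shown in the question). The names `try_claim`, `run_pending`, and the `.out` extension are hypothetical. It makes one assumption that differs from the description above: the existence check and the placeholder write are merged into a single exclusive create, so two machines cannot both claim the same simulation, provided the network share honours exclusive file creation.

```python
import os


def try_claim(output_path):
    """Return True if this call created the placeholder, False if it already existed."""
    try:
        # Mode "x" raises FileExistsError when the file is already there,
        # so checking and creating happen in one step.
        with open(output_path, "x") as f:
            f.write("placeholder\n")
        return True
    except FileExistsError:
        return False


def run_pending(sim_names, out_dir, simulate):
    """Run every simulation that no other machine has claimed yet."""
    done = []
    for name in sim_names:
        path = os.path.join(out_dir, name + ".out")  # hypothetical naming scheme
        if not try_claim(path):
            continue  # another machine already owns this simulation
        result = simulate(name)
        # Overwrite the placeholder with the real results.
        with open(path, "w") as f:
            f.write(result)
        done.append(name)
    return done
```

Each machine would call `run_pending` with the same list of simulation names and the shared directory; whichever machine creates a placeholder first runs that simulation.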

This scheme used to work fine, saving ~3 files per minute. Now it is taking ~15 minutes per file. What might be the leading cause for the slowdown? Could it be the number of files in the directory or the number of computers accessing computer #1? If that is unlikely, I would like to know so I can redirect my troubleshooting.
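One rough way to separate the two suspects is to time a directory listing and a small write against the shared directory, then repeat the same probe against a local directory for comparison. The following is only a sketch in Python; the function name `probe` and the temporary file name are made up for illustration.

```python
import os
import time


def probe(directory, payload_bytes=1024):
    """Time one directory listing and one small synced write in `directory`."""
    t0 = time.perf_counter()
    entries = len(os.listdir(directory))
    list_seconds = time.perf_counter() - t0

    # Hypothetical scratch file name; removed again before returning.
    path = os.path.join(directory, "probe_%d.tmp" % os.getpid())
    t0 = time.perf_counter()
    with open(path, "wb") as f:
        f.write(b"\0" * payload_bytes)
        f.flush()
        os.fsync(f.fileno())  # ask the OS to push the data out before timing stops
    write_seconds = time.perf_counter() - t0
    os.remove(path)

    return {
        "entries": entries,
        "list_seconds": list_seconds,
        "write_seconds": write_seconds,
    }
```

If the listing is slow but the write is fast, the file count is the more likely culprit; if both are slow even for an almost empty directory on the same share, the network path or the serving machine deserves a closer look.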


Solution

It turns out the problem was an old network switch that the various computers were plugged into. When we tried a newer switch, the script ran like lightning.

However, everyone's suggestions (subdirectories to reduce the number of files; defragmenting computer #1, which turned out to be badly fragmented) were very helpful, and it was great to have some other eyes on the problem, so thanks.
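The subdirectory suggestion could look something like the sketch below, written in Python rather than MATLAB. The helper names `sharded_path` and `save_sharded` are hypothetical, and hashing the filename is just one possible way to pick a subdirectory. With a two-character hex prefix there are up to 256 subdirectories, so ~4000 files would average roughly 16 per directory.

```python
import hashlib
import os


def sharded_path(root, filename, width=2):
    """Map a filename to root/<hash prefix>/filename, always the same for a given name."""
    digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
    return os.path.join(root, digest[:width], filename)


def save_sharded(root, filename, data):
    """Write `data` to the sharded location, creating the subdirectory if needed."""
    path = sharded_path(root, filename)
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as f:
        f.write(data)
    return path
```

Because the subdirectory is derived from the filename alone, every machine computes the same location for a given simulation, so the check for an existing output file still works without listing the whole tree.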

Other answers

The number of items in a single directory absolutely leads to decreased performance. I've read that it depends on OS, filesystem, phase of the moon, local/remote drives ... maybe phase of the moon.

My personal rule of thumb is that at about 5,000 items per directory performance starts to degrade, and at about 10,000 performance has degraded enough that whatever you are doing will not work correctly anymore.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow