Question

As said in the title, I'm wondering if the -k option (strongly) affects the speed of GNU parallel.

In man parallel_tutorial there is a discussion of --ungroup and --line-buffer, which claims that --line-buffer, which keeps output lines from different jobs from mixing, is much slower than --ungroup. So might -k also cause a major slowdown when the job count is large?

(I didn't find this topic in man parallel or man parallel_tutorial, nor did I find anything by Googling. I haven't finished reading man parallel, though, so if I missed something that a search in less would have turned up, please excuse me.)


Solution

-k does not slow anything down, but it needs 4 file handles for each job. If GNU Parallel runs out of file handles, it will wait until one of the running jobs finishes.
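To get a feel for when the file handle limit might start to matter, here is a rough sketch. It assumes the 4-handles-per-job figure above and ignores the descriptors that the shell and GNU Parallel themselves already hold, so treat the result as an upper bound rather than an exact number. The function name max_jobs_for_limit is made up for this illustration.

```shell
# Hypothetical helper: estimate how many jobs can run at once under -k,
# assuming each job needs 4 file handles (figure taken from the answer above).
max_jobs_for_limit() {
  echo $(( $1 / 4 ))
}

# ulimit -n reports the per-process limit on open file descriptors.
fd_limit=$(ulimit -n)
if [ "$fd_limit" = "unlimited" ]; then
  echo "no per-process descriptor limit is set"
else
  echo "approx. max simultaneous jobs with -k: $(max_jobs_for_limit "$fd_limit")"
fi
```

If the estimate is lower than the number of jobs you intend to run at once, raising the limit with ulimit -n (where permitted) is one thing to try.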

Compared to -u, -g adds around 1-2 milliseconds per job (plus the time it takes to write the output to disk and read it back), so the slowdown will only be noticeable if you run very short jobs or jobs with a lot of output.

--line-buffer can be faster or slower than -g. It does not buffer on disk, but it takes more CPU time to run, especially if your jobs output data slowly.

My recommendation would be to use what is easiest for you to use, and only if that proves to be too slow, look into the other options.
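If you do end up needing to compare the options, a crude benchmark along these lines may be enough. This is only a sketch: bench and have_gnu_parallel are names invented here, the workload (many trivial echo jobs) is an assumption chosen to exaggerate per-job overhead, and date +%s only has one-second resolution, so use a large job count to see any difference.

```shell
# Hypothetical check that the parallel on PATH is GNU Parallel
# (some systems ship a different tool under the same name).
have_gnu_parallel() {
  parallel --version 2>/dev/null | grep -q 'GNU parallel'
}

# Hypothetical helper: run N very short jobs with one output mode
# and report the elapsed wall-clock time in whole seconds.
bench() {
  mode=$1
  n=${2:-200}
  start=$(date +%s)
  # $mode is left unquoted on purpose so that an empty mode adds no argument.
  seq 1 "$n" | parallel $mode echo {} > /dev/null
  end=$(date +%s)
  printf '%s: %ss for %s jobs\n' "$mode" "$(( end - start ))" "$n"
}

if have_gnu_parallel; then
  for mode in -u -g --line-buffer -k; do
    bench "$mode" 200
  done
else
  echo "GNU Parallel not found; skipping benchmark"
fi
```

Replacing echo {} with your real command gives numbers that are more relevant to your case, since the answer above notes that the overhead depends on how short the jobs are and how much they print.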

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow