Question

Clearly if performance is critical it makes sense to prototype and profile. But all the same, wisdom and advice can be sought on StackOverflow :)

For the handling of highly parallel tasks where inter-task communication is infrequent or suits message-passing, is there a performance disadvantage to using processes (fork() etc) or threads?

Is the context switch between threads cheaper than that between processes? Some processors have single-instruction context switching, don't they? Do the mainstream operating systems make better use of SMP with multiple threads or with multiple processes? Is the COW overhead of fork() more expensive than threads if the process never writes to those pages?

And so on. Thanks!

Solution

The idea that processes are slow to create is an old one, and was much more true in the past. Google's Chrome team did a little paragraph somewhere about how it's not as big an impact anymore, and here is Scott Hanselman on the subject: http://www.hanselman.com/blog/MicrosoftIE8AndGoogleChromeProcessesAreTheNewThreads.aspx

My take on it is that threads are faster, but only moderately so, and currently it's easier to make mistakes with threads.
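Since the question itself suggests prototyping and profiling, here is a minimal sketch for measuring creation cost on your own machine. It assumes a Unix-like system (os.fork() is not available on Windows), and the helper names time_fork and time_thread are made up for this illustration. Python's interpreter overhead inflates both numbers, so treat the ratio as a rough indication only.

```python
import os
import threading
import time


def time_fork(n):
    """Average seconds to fork a child that exits at once, then reap it."""
    start = time.perf_counter()
    for _ in range(n):
        pid = os.fork()
        if pid == 0:
            # Child: leave immediately, skipping interpreter cleanup.
            os._exit(0)
        # Parent: wait for the child so no zombies pile up.
        os.waitpid(pid, 0)
    return (time.perf_counter() - start) / n


def time_thread(n):
    """Average seconds to start a thread that does nothing, then join it."""
    start = time.perf_counter()
    for _ in range(n):
        t = threading.Thread(target=lambda: None)
        t.start()
        t.join()
    return (time.perf_counter() - start) / n


if __name__ == "__main__":
    n = 200
    print(f"fork + wait:   {time_fork(n) * 1e6:.1f} us per process")
    print(f"thread + join: {time_thread(n) * 1e6:.1f} us per thread")
```

The results depend heavily on the OS, the kernel version and how much memory the parent has mapped, so run it on the target system rather than trusting someone else's figures.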

I have heard that .NET 4.0 is going to extend the threading library... something like Parallel.For? I can think of a few places I'd want to do that: for each item in this thousand-item list, go do something.

http://reedcopsey.com/?p=87
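The "for each item in this thousand-item list, go do something" pattern can be sketched outside .NET as well. This is a rough Python equivalent using the standard concurrent.futures module, not the .NET API itself; process_item and parallel_for are hypothetical names chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def process_item(item):
    """Hypothetical per-item work; stands in for 'go do something'."""
    return item * item


def parallel_for(items, func, max_workers=8):
    """Apply func to every item on a pool of worker threads, keeping input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(func, items))


if __name__ == "__main__":
    results = parallel_for(range(1000), process_item)
    print(len(results), results[:5])  # prints: 1000 [0, 1, 4, 9, 16]
```

In standard CPython builds the global interpreter lock keeps CPU-bound Python code from running on several cores at once, so for that kind of work ProcessPoolExecutor from the same module is the usual substitute. That trade-off is the threads-versus-processes question in miniature.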

OTHER TIPS

At the following URL you will find a real-world benchmark comparing fork vs. pthread_create in an actual application, though it's from 2003 and things may have changed a bit. Reasoning quickly from this benchmark, it looks like fork scales better if you have more than 500 processes or threads.

http://bulk.fefe.de/scalable-networking.pdf - pages 29 to 32

My guess would be that threads are faster, since they are the more lightweight solution. Processes are designed to be isolated from each other. Each process has its own virtual address space, so switching between processes can mean flushing the TLB, whereas threads share one virtual address space (afaik), so this could be an argument. Processes are useful if you want to do some kind of distributed computing.

In general about threading and stuff, I suggest you look into OpenMP or Intel-TBB. These guys really know their stuff with multithreading and high performance computing.

It comes down to the isolation cost: processes are isolated from each other (e.g. separate memory resources, protection, separate file handles etc.) whereas threads can share resources within a process. It takes time & resources to support & enforce this isolation.

As with anything in this universe, you have to "pay" for what you get.
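To make the isolation concrete, here is a small sketch, assuming a Unix-like system with os.fork(). A write made by a thread is visible to the rest of the process, while the same write made in a forked child lands in the child's private copy of the memory. The names bump, bump_in_thread and bump_in_child are invented for this example.

```python
import os
import threading


def bump(state):
    """Increment the counter held in the given dict."""
    state["count"] += 1


def bump_in_thread(state):
    """Run bump() in a thread: it shares the caller's address space."""
    t = threading.Thread(target=bump, args=(state,))
    t.start()
    t.join()


def bump_in_child(state):
    """Run bump() in a forked child: the write stays in the child's copy."""
    pid = os.fork()
    if pid == 0:
        bump(state)
        os._exit(0)
    os.waitpid(pid, 0)


if __name__ == "__main__":
    state = {"count": 0}
    bump_in_child(state)
    print("after child: ", state["count"])  # the parent still sees 0
    bump_in_thread(state)
    print("after thread:", state["count"])  # the parent now sees 1
```

Getting the child's result back to the parent needs an explicit channel such as a pipe or a socket. That is the price of isolation, and also why mistakes are harder to make with processes.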

According to this book: http://reiber.org/nxt/pub/Linux/LinuxKernelDevelopment/Linux.Kernel.Development.3rd.Edition.pdf Linux implements all threads as standard processes. Since you mention COW, I assume you are on Linux. There is more on this on pages 33-34.
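One way to see this from user space, as a sketch rather than anything taken from the book: on Linux, threading.get_native_id() (Python 3.8+) returns the kernel task id of the calling thread. A new thread reports the same process id as the main thread but its own task id. The function name ids_seen_by_thread is made up for the example.

```python
import os
import threading


def ids_seen_by_thread():
    """Return (pid, native thread id) as observed from inside a new thread."""
    seen = {}

    def record():
        seen["pid"] = os.getpid()
        seen["tid"] = threading.get_native_id()

    t = threading.Thread(target=record)
    t.start()
    t.join()
    return seen["pid"], seen["tid"]


if __name__ == "__main__":
    pid, tid = ids_seen_by_thread()
    print("main:   pid", os.getpid(), "tid", threading.get_native_id())
    print("thread: pid", pid, "tid", tid)
```

On Linux the main thread's task id equals the process id, and each additional thread appears as another task under /proc/<pid>/task.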

Licensed under: CC-BY-SA with attribution