Question

What is the difference between threading and parallelism?

Does one have an advantage over the other?


Solution

Daniel Moth (a former coworker of mine) wrote a Threading/Concurrency vs Parallelism article that explains it all.

Quoted:

To take advantage of multiple cores from our software, ultimately threads have to be used. Because of this fact, some developers fall in the trap of equating multithreading to parallelism. That is not accurate...You can have multithreading on a single core machine, but you can only have parallelism on a multi core machine

The quick test: If on a single core machine you are using threads and it makes perfect sense for your scenario, then you are not "doing parallelism", you are just doing multithreading.

OTHER TIPS

Parallelism is a general technique of using more than one flow of instructions to complete a computation. The critical aspect of all parallel techniques is the communication between flows to collaborate on a final answer.

Threading is a specific implementation of parallelism. Each flow of instructions is given its own stack to keep a record of local variables and function calls, and it communicates with the other flows implicitly through shared memory.

One example might be to have one thread simply queue up disk requests and pass them to a worker thread, effectively parallelizing disk and CPU. The traditional UNIX pipes approach is to split these into two complete programs, say cat and grep, in the command:

cat /var/log/Xorg.0.log | grep "EE"

Threading could conceivably reduce the communication costs of copying disk I/O from the cat process to the grep process.
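As a minimal sketch of that idea (in Python; the sample log lines and function name are made up for illustration), the same pipeline can be expressed as two threads sharing an in-memory queue instead of two processes sharing a pipe:

```python
import queue
import threading

def filter_lines(lines, pattern):
    """One thread plays 'cat' (queuing lines), another plays 'grep'
    (filtering them). The queue replaces the pipe's copy between processes."""
    q = queue.Queue()
    matches = []

    def reader():
        for line in lines:
            q.put(line)
        q.put(None)  # sentinel marking end of input

    def grep():
        while True:
            line = q.get()
            if line is None:
                break
            if pattern in line:
                matches.append(line)

    t_reader = threading.Thread(target=reader)
    t_grep = threading.Thread(target=grep)
    t_reader.start()
    t_grep.start()
    t_reader.join()
    t_grep.join()
    return matches

log = ["(II) module loaded", "(EE) failed to load module", "(WW) warning"]
print(filter_lines(log, "EE"))  # → ['(EE) failed to load module']
```

Because both threads live in one address space, the lines are handed over by reference rather than copied through a kernel pipe buffer.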

Threading usually refers to having multiple threads of execution working at the same time on a single CPU (well, not actually at the same time: the scheduler switches between them so fast that they appear simultaneous).

Parallelism is having multiple processes working at the same time on multiple CPUs.

Both have their pros and cons, depending heavily on the scheduler used by your operating system. The computational cost of creating a thread is usually much lower than that of spawning a process on another CPU; on the other hand, having a 'whole' CPU to yourself increases the overall speed of that process. Then again, if that process needs to communicate with another process on another CPU, you have to solve the IPC (inter-process communication) problem, whose overhead can be large enough that it is effectively better to just use a thread on the same CPU.

Most operating systems are aware of multiple CPUs/cores and can use them, but this usually makes the scheduler quite complex.

If you are programming in a language that uses a VM (virtual machine), be aware that it needs to implement its own scheduler (if at all). CPython, for example, uses a GIL (Global Interpreter Lock), which means that only one thread executes Python bytecode at any moment, so threads within one process cannot run Python code in parallel even on a multi-core machine. The OS is still free to migrate a heavy process to another CPU that isn't so busy at the moment, which of course means briefly pausing the process while doing so.
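To make the GIL point concrete: the usual Python workaround for CPU-bound work is multiprocessing, where each worker is a separate process with its own interpreter and its own GIL, so the work can genuinely run on multiple cores. A minimal sketch (the function names here are illustrative, not from the answers above):

```python
from multiprocessing import Pool

def square(x):
    return x * x

def parallel_squares(xs, workers=2):
    # Each pool worker is a separate OS process with its own interpreter
    # and its own GIL, so CPU-bound work can use multiple cores.
    with Pool(workers) as pool:
        return pool.map(square, xs)  # map preserves input order

if __name__ == "__main__":
    print(parallel_squares([1, 2, 3, 4]))  # [1, 4, 9, 16]
```

The trade-off is exactly the IPC overhead mentioned above: arguments and results are pickled and shipped between processes rather than shared in memory.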

Some operating systems, like DragonFlyBSD, take a whole different approach to scheduling than what is currently the 'standard' approach.

I think this answer gives you enough keywords to search for more information :-)

Threading is a technology, parallelism is a paradigm that may be implemented using threading (but could just as easily be done using single threads on multiple processors)

Here is an answer that should clear up any doubts about parallelism and threading.

Threads are a software construct. I can start as many pthreads as I want, even on an old single core processor. So multi-threading is not necessarily parallel: it's only parallel if the hardware can support it. So if you have multiple cores and/or hyperthreading, your multi-threading becomes parallel. And these days that is in fact most of the time.

Concurrency is about activities that have no clear temporal ordering. So again, if the hardware supports it, they can be done in parallel, if not, not.

So, traditionally multi-threading is almost synonymous with concurrency. And both of them only become parallel if the hardware supports it. Even then you can start many more threads than the hardware supports, and you are left with concurrency.

From an answer by Victor Eijkhout on Quora.
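The point that threads are a software construct can be sketched in a few lines of Python (the function name is made up for illustration): you may start far more threads than the hardware has cores, and the OS simply time-slices them.

```python
import os
import threading

def run_many_threads(n):
    """Start n threads -- a pure software decision. How many actually run
    in parallel is up to the hardware: at most the number of cores."""
    results = []
    lock = threading.Lock()

    def work(i):
        with lock:
            results.append(i)

    threads = [threading.Thread(target=work, args=(i,)) for i in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(results)

cores = os.cpu_count() or 1
# Far more threads than cores: perfectly legal, just not all parallel.
print(run_many_threads(4 * cores) == list(range(4 * cores)))  # True
```

On a single-core machine this is pure concurrency; on a multi-core machine some of those threads will genuinely overlap in time.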

How do you define "parallelism"? Multithreading is a concrete implementation of the concept of parallel program execution.

The article RichardOD linked to seems to be mainly concerned with whether threads are actually executed in parallel on a concrete machine.

However, your question seems to see multithreading and parallelism as opposites. Do you perhaps mean programs that use multiple processes rather than multiple threads? If so, the differences are:

  • Threads are much cheaper to create than processes. This is why using threads rather than processes resulted in a huge speedup in web applications - this was called "FastCGI".
  • Multiple threads on the same machine have access to shared memory. This makes communication between threads much easier, but also very dangerous (it's easy to create bugs like race conditions that are very hard to diagnose and fix).
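The shared-memory danger in the second point can be sketched in Python (names are illustrative): several threads update one counter, and the lock is what keeps the read-modify-write from interleaving.

```python
import threading

def count_with_lock(n_threads, n_increments):
    """All threads update one shared counter. The lock serializes the
    read-modify-write; without it, updates can interleave and the final
    count may silently come up short -- a classic race condition."""
    counter = 0
    lock = threading.Lock()

    def increment():
        nonlocal counter
        for _ in range(n_increments):
            with lock:
                counter += 1

    threads = [threading.Thread(target=increment) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter

print(count_with_lock(4, 10_000))  # → 40000
```

Drop the `with lock:` line and the result becomes nondeterministic, which is precisely the "hard to diagnose and fix" property mentioned above.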

Threading is a poor man's parallelism.

EDIT: To be more precise:

Threading has nothing to do with parallelism, and vice versa. Threading is about making it appear that some processes run in parallel. However, it doesn't make those processes complete ALL their actions any faster in total.

If we think of the CPU as a company and its threads as workers, it becomes easier to understand threading and parallelism.

Just as a company has many workers, the CPU also has many threads.

There may also be more than one company, and therefore more than one CPU.

So when workers (threads) work within one company (CPU), it is called threading.

And when two or more companies (CPUs) work independently or together, it is called parallelism.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow