Question

I'm having a hard time wrapping my head around Python threading, especially since the documentation explicitly tells you to RTFS at some points, instead of kindly including the relevant info. I'll admit I don't feel qualified to read the threading module. I've seen lots of dirt-simple examples, but they all use global variables, which is offensive and makes me wonder if anyone really knows when or where it's required to use them as opposed to just convenient.

In particular, I'd like to know:

  • In threading.Thread(target=x), is x shared or private? Does each thread have its own stack, or are all threads using the same context simultaneously?
  • What is the preferred way to pass mutable variables to threads? Immutable ones are obviously through Thread(args=[],kwargs={}) and that's what all the examples cover. If it's global, I'll have to hold my nose and use it, but it seems like there has to be a better way. I suppose I could wrap everything in a class and just pass the instance in, but it'd be nice to point at regular variables, too.
  • When do I need threading.local()? In the x above?
  • Do I have to subclass Thread to update data, as many examples show?

I'm used to Win32 threads and pthreads, where it's explicitly laid out in docs what is and isn't shared with different uses of threads. Those are pretty low-level, and I'd like to avoid _thread if possible to be pythonic.

I'm not sure if it's relevant, but I'm trying to thread OpenMP-style to get the hang of it - make a for loop run concurrently using a queue and some threads. It was easy with globals and locks, but now I'd like to nail down scopes for better lock use.

Solution

In threading.Thread(target=x), is x shared or private?

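It is private in the sense that matters: each thread gets its own invocation of x, with its own stack and its own local variables. The function object x itself is shared, and so is any object the threads reach through globals or through their arguments, so those are the only things that need synchronization.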

This works the same way as recursion, which has nothing to do with multithreading: if x calls itself, each invocation of x gets its own "private" frame, with its own private local variables.
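A minimal sketch of that point, assuming a hypothetical target function x that sums a range: the local variable total lives in each invocation's own frame, while the dict results (a name made up for this example) is one shared object handed to every thread through args.

```python
import threading

def x(n, results):
    # 'total' is a local variable: every invocation of x, and so every
    # thread running x, gets its own copy in its own frame.
    total = 0
    for i in range(n):
        total += i
    # 'results' is the same dict object in every thread. Each thread
    # writes to a distinct key, and a single dict item assignment is
    # thread-safe in CPython, so this sketch uses no lock.
    results[n] = total

results = {}
threads = [threading.Thread(target=x, args=(n, results)) for n in (10, 100, 1000)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(results[10], results[100], results[1000])
```

The three running totals never interfere with each other, even though all three threads execute the same function at the same time. Anything more involved than one write per key (read-modify-write on a shared counter, for instance) would need a lock.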

What is the preferred way to pass mutable variables to threads? Do I have to subclass Thread to update data?

I view the target argument as a quick shortcut, good for quick experiments, but not much else. Using it where it ought not be used leads to all the limitations you describe in your question (and the hacks you describe in the possible solutions you contemplate).

Most of the time, you'd want to subclass threading.Thread. The code that creates and manages the threads passes every shared mutable object to your thread class's __init__, which keeps those objects as attributes; the run method then accesses them through self.
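One way this could look for the queue-driven loop described in the question. This is a sketch, not the only layout: the class name Worker, the squaring "work", and the choice of four threads are all assumptions made for illustration.

```python
import queue
import threading

class Worker(threading.Thread):
    def __init__(self, tasks, results, lock):
        super().__init__()
        # Shared mutable objects are handed in explicitly and kept as
        # attributes; no globals are involved.
        self.tasks = tasks
        self.results = results
        self.lock = lock

    def run(self):
        while True:
            try:
                item = self.tasks.get_nowait()
            except queue.Empty:
                # All work was queued before the threads started, so an
                # empty queue means there is nothing left to do.
                return
            squared = item * item      # local to this thread, no lock needed
            with self.lock:            # the shared list is guarded
                self.results.append(squared)

tasks = queue.Queue()
for i in range(20):
    tasks.put(i)

results = []
lock = threading.Lock()
workers = [Worker(tasks, results, lock) for _ in range(4)]
for w in workers:
    w.start()
for w in workers:
    w.join()

print(sorted(results))
```

The lock only wraps the one statement that touches shared state, which is the kind of narrow lock scope the question is after. queue.Queue does its own locking internally, so get_nowait needs no extra protection.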

When do I need threading.local()?

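You rarely do, so you probably don't. It is meant for data that has to be reachable under one shared name yet hold a separate value in each thread; the local variables inside x are already per-thread without it.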
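For the rare case where it does apply, here is a small sketch. The names context, worker, report, and seen are hypothetical; the point is that report reads context.name without receiving it as an argument, and each thread sees only the value it set itself.

```python
import threading

context = threading.local()   # one shared name, separate attributes per thread
seen = {}                     # shared dict, guarded by seen_lock
seen_lock = threading.Lock()

def report():
    # Reads the value set by the *current* thread only.
    with seen_lock:
        seen[threading.current_thread().name] = context.name

def worker(job):
    context.name = job        # invisible to every other thread
    report()

threads = [
    threading.Thread(target=worker, args=(f"job-{i}",), name=f"t{i}")
    for i in range(3)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

# The main thread never assigned context.name, so it has no such attribute.
main_has_name = hasattr(context, "name")
print(seen, main_has_name)
```

If the value can simply be passed as an argument or stored on the thread object, that is usually the clearer option, which is why threading.local() comes up so seldom.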

I'd like to avoid _thread if possible to be pythonic

Without a doubt, avoid it.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow