Question

I am working on a class which operates in a multithreaded environment, and looks something like this (with excess noise removed):

import multiprocessing

class B:

    @classmethod
    def apply(cls, item):
        cls.do_thing(item)

    @classmethod
    def do_thing(cls, item):
        'do something to item'

    def run(self):
        pool = multiprocessing.Pool()
        for list_of_items in self.data_groups:
            # Pool.map takes the function first, then the iterable
            pool.map(self.apply, list_of_items)

My concern is that two threads might call apply or do_thing at the same time, or that a subclass might try to do something stupid with cls in one of these functions. I could use staticmethod instead of classmethod, but calling do_thing would become a lot more complicated, especially if a subclass reimplements one of these but not the other. So my question is this: Is the above class thread-safe, or is there a potential problem with using classmethods like that?


Solution

Whether a method is thread safe or not depends on what the method does.

Working with local variables only is thread safe. But when different threads change the same non-local variable, it becomes unsafe.

‘do something to item’ seems to modify only the given object, which is independent from any other object in the list, so it should be thread safe.

If the same object is in the list several times, you may have to think about making the object thread safe. That can be done by using with self.object_scope_lock: in every method which modifies the object.
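A minimal sketch of that per-object lock, using threads. The Counter class, its increment method, and the run_threads helper are hypothetical names chosen for illustration; only the object_scope_lock attribute name comes from the text above.

```python
import threading

class Counter:
    """Hypothetical item type that guards its own state with a lock."""

    def __init__(self):
        self.value = 0
        self.object_scope_lock = threading.Lock()

    def increment(self):
        # Every method that modifies the object takes the object's lock first.
        with self.object_scope_lock:
            self.value += 1

def hammer(counter, times):
    # Each thread modifies the same object many times.
    for _ in range(times):
        counter.increment()

def run_threads(n_threads=4, times=10_000):
    counter = Counter()
    threads = [
        threading.Thread(target=hammer, args=(counter, times))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value
```

With the lock in place, run_threads(4, 10_000) should always return 40000, because no two threads can interleave inside increment.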

Anyway, what you are doing here is using processes instead of threads. In this case the objects are pickled and sent through a pipe to the other process, where they are modified and sent back. In contrast to threads, processes do not share memory, so I don’t think using a lock in the classmethod would have any effect.
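To make the copy semantics concrete, here is a rough sketch that imitates the pickle round trip with the pickle module directly instead of starting real worker processes. The simulate_worker_call helper and the dict-based item are assumptions made for illustration, not what multiprocessing does internally line for line.

```python
import pickle

def do_thing(item):
    # Stand-in for 'do something to item': mutate it and hand it back.
    item["value"] += 1
    return item

def simulate_worker_call(item):
    # The argument is pickled in the parent and unpickled in the worker,
    # so the worker only ever sees a copy.
    worker_copy = pickle.loads(pickle.dumps(item))
    result = do_thing(worker_copy)
    # The return value makes the same trip back to the parent.
    return pickle.loads(pickle.dumps(result))

original = {"value": 1}
returned = simulate_worker_call(original)
```

After this runs, original is unchanged and returned is a separate object carrying the modification, which is why changes made in a worker are only visible through the values that pool.map returns.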

http://docs.python.org/3/library/threading.html?highlight=threading#module-threading

OTHER TIPS

There's no difference between classmethods and regular functions (and instance methods) in this regard. Neither is automagically thread-safe.

If one or more classmethods/methods/functions can manipulate data structures simultaneously from different threads, you'd need to add synchronization protection, typically using threading.Locks.

Both other answers are technically correct in that the safety of do_thing() depends on what happens inside the function.

But the more precise answer is that the call itself is safe. In other words, if apply() and do_thing() are pure functions, then your code is safe. Any unsafety would come from them not being pure functions (e.g. relying on or affecting a shared variable during execution).
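As an illustration of the distinction, the sketch below contrasts a pure pair of classmethods with a subclass that reimplements do_thing impurely. PureB, ImpureB, and the total attribute are hypothetical names, and a thread pool is used here only to keep the example small.

```python
from concurrent.futures import ThreadPoolExecutor

class PureB:
    @classmethod
    def apply(cls, item):
        return cls.do_thing(item)

    @classmethod
    def do_thing(cls, item):
        # Pure: the result depends only on the argument.
        return item * 2

class ImpureB(PureB):
    total = 0  # shared class-level state (hypothetical)

    @classmethod
    def do_thing(cls, item):
        # Impure: a read-modify-write on shared state. If several threads
        # ran this at once, updates could be lost without a lock.
        cls.total = cls.total + item
        return item * 2

# Calling the pure classmethods from several threads needs no synchronization.
with ThreadPoolExecutor(max_workers=4) as executor:
    doubled = list(executor.map(PureB.apply, range(5)))
```

Note that ImpureB inherits apply unchanged, yet apply still dispatches to the overridden do_thing through cls. The safety of the whole call chain therefore depends on what each subclass puts in its own methods.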

As shx2 mentioned, classmethods are only "in" a class visually, for grouping. They have no inherent attachment to any instance of the class. Therefore this code is roughly equivalent in functioning:

import multiprocessing

def apply(item):
    do_thing(item)

def do_thing(item):
    'do something to item'

class B:
    def run(self):
        pool = multiprocessing.Pool()
        for list_of_items in self.data_groups:
            # Pool.map takes the function first, then the iterable
            pool.map(apply, list_of_items)

A further note on concurrency given the other answers:

  1. threading.Lock is easy to understand, but should be your last resort. In naive implementations it is often slower than completely linear processing. Your code will usually be faster if you can use things like threading.Event, queue.Queue, or multiprocessing.Pipe to transfer information instead.
  2. asyncio is the new hotness in Python 3. It's a bit more difficult to get right, but for I/O-bound work it is often the fastest approach.
  3. If you want a great walkthrough of modern concurrency techniques in Python, check out core developer Raymond Hettinger's Keynote on Concurrency. The whole thing is great, but the downside of lock is highlighted starting at t=57:59.
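
As a rough sketch of the first point, the example below passes work and results between threads through queue.Queue instead of guarding a shared list with an explicit threading.Lock. The worker and square_all names, the None sentinel, and the squaring task are all assumptions made for illustration.

```python
import queue
import threading

def worker(tasks, results):
    # Pull items until the sentinel arrives; the queue handles the locking.
    while True:
        item = tasks.get()
        if item is None:
            break
        results.put(item * item)

def square_all(items, n_workers=3):
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for item in items:
        tasks.put(item)
    for _ in threads:
        tasks.put(None)  # one sentinel per worker
    for t in threads:
        t.join()
    # All workers have finished, so draining the queue here is safe.
    out = []
    while not results.empty():
        out.append(results.get())
    return sorted(out)
```

The worker code never touches shared mutable state directly. Each thread owns an item from the moment it takes it off the queue until it puts the result on the other one.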
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow