Domanda

Running it on Ubuntu 14 with Python 2.7.6

I simplified script to show my problem:

import time
import multiprocessing

data = range(1, 3)
start_time = time.clock()


def lol():
    for i in data:
        print time.clock() - start_time, "lol seconds"


def worker(n):
    print time.clock() - start_time, "multiprocesor seconds"


def mp_handler():
    p = multiprocessing.Pool(1)
    p.map(worker, data)

if __name__ == '__main__':
    lol()
    mp_handler()

And the output:

8e-06 lol seconds
6.9e-05 lol seconds
-0.030019 multiprocesor seconds
-0.029907 multiprocesor seconds

    Process finished with exit code 0

Using time.time() gives non-negative values (as marked here Timer shows negative time elapsed) but I'm curious what is the problem with time.clock() in python multiprocessing and reading time from CPU.

È stato utile?

Soluzione

multiprocessing spawns new processes and time.clock() on linux has the same meaning of the C's clock():

The value returned is the CPU time used so far as a clock_t;

So the values returned by clock restart from 0 when a process start. However your code uses the parent's process start_time to determine the time spent in the child process, which is obviously incorrect if the child processes CPU time resets.

The clock() function makes sense only when handling one process, because its return value is the CPU time spent by that process. Child processes are not taken into account.

The time() function on the other hand uses a system-wide clock, and thus can be used even between different processes (although it is not monotonic, so it might return wrong results if somebody changes the system time during the events).

Forking a running python instance is probably faster then starting a new one from scratch, hence start_time is almost always bigger then the value returned by time.clock(). Take into account that the parent process also had to read your file on disk, perform the imports which may require reading other .py files, searching directories etc. The forked child processes don't have to do all that.


Example code that shows that the return value of time.clock() resets to 0:

from __future__ import print_function

import time
import multiprocessing

data = range(1, 3)
start_time = time.clock()


def lol():
    for i in data:
        t = time.clock()
        print('t: ', t, end='\t')
        print(t - start_time, "lol seconds")


def worker(n):
    t = time.clock()
    print('t: ', t, end='\t')
    print(t - start_time, "multiprocesor seconds")


def mp_handler():
    p = multiprocessing.Pool(1)
    p.map(worker, data)

if __name__ == '__main__':
    print('start_time', start_time)
    lol()
    mp_handler()

Result:

$python ./testing.py 
start_time 0.020721
t:  0.020779    5.8e-05 lol seconds
t:  0.020804    8.3e-05 lol seconds
t:  0.001036    -0.019685 multiprocesor seconds
t:  0.001166    -0.019555 multiprocesor seconds

Note how t is monotonic for the lol case while goes back to 0.001 in the other case.

Altri suggerimenti

To add a concise Python 3 example to Bakuriu's excellent answer above you can use the following method to get a global timer independent of the subprocesses:

import multiprocessing as mp
import time

# create iterable
iterable = range(4)

# adds three to the given element
def add_3(num):
    a = num + 3
    return a

# multiprocessing attempt
def main():
    pool = mp.Pool(2)
    results = pool.map(add_3, iterable)    
    return results

if __name__ == "__main__":  #Required not to spawn deviant children    
    start=time.time()

    results = main()
    print(list(results))

    elapsed = (time.time() - start)  
    print("\n","time elapsed is :", elapsed)

Note that if we had instead used time.process_time() instead of time.time() we will get an undesired result.

Autorizzato sotto: CC-BY-SA insieme a attribuzione
Non affiliato a StackOverflow
scroll top