Keyboard interrupts with Python's multiprocessing Pool
05-07-2019
Question
How can I handle KeyboardInterrupt events with Python's multiprocessing Pool? Here is a simple example:
from multiprocessing import Pool
from time import sleep
from sys import exit

def slowly_square(i):
    sleep(1)
    return i*i

def go():
    pool = Pool(8)
    try:
        results = pool.map(slowly_square, range(40))
    except KeyboardInterrupt:
        # **** THIS PART NEVER EXECUTES. ****
        pool.terminate()
        print "You cancelled the program!"
        exit(1)  # exit is imported from sys above
    print "\nFinally, here are the results: ", results

if __name__ == "__main__":
    go()
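When I run the code above, a KeyboardInterrupt is raised when I press ^C, but the process simply hangs at that point and I have to kill it externally.
I want to be able to press ^C at any time and have all of the processes exit gracefully.
Solution
This is a Python bug. While waiting on a condition in threading.Condition.wait(), KeyboardInterrupt is never delivered. Repro: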
import threading
cond = threading.Condition(threading.Lock())
cond.acquire()
cond.wait(None)
print "done"
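The KeyboardInterrupt exception is not delivered until wait() returns, and it never returns, so the interrupt never happens. A KeyboardInterrupt should almost certainly interrupt a condition wait.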
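One way to avoid an unbounded wait is to wait in short timed slices, so that control returns to the interpreter regularly. The following is only a minimal sketch, written for Python 3 as an assumption (the repro above is Python 2); the helper name wait_with_polling and its parameters are hypothetical and not part of the threading module.

```python
import threading
import time


def wait_with_polling(cond, predicate, poll_interval=0.1, max_wait=None):
    """Wait on `cond` until predicate() is true, in short timed slices.

    Returns True if the predicate became true, False if max_wait elapsed.
    """
    deadline = None if max_wait is None else time.monotonic() + max_wait
    with cond:
        while not predicate():
            if deadline is not None and time.monotonic() >= deadline:
                return False
            # A bounded wait returns periodically instead of blocking forever.
            cond.wait(poll_interval)
        return True


if __name__ == "__main__":
    cond = threading.Condition(threading.Lock())
    done = []

    def finish():
        # Another thread flips the flag and wakes the waiter.
        with cond:
            done.append(True)
            cond.notify_all()

    threading.Timer(0.3, finish).start()
    print(wait_with_polling(cond, lambda: bool(done)))
```

The poll interval trades responsiveness against wake-ups; 0.1 seconds is an arbitrary choice for illustration.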
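Note that this does not happen if a timeout is specified; cond.wait(1) receives the interrupt immediately. So a workaround is to specify a timeout. To do that, replace
results = pool.map(slowly_square, range(40))
with
results = pool.map_async(slowly_square, range(40)).get(9999999)
or similar.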
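Put together, the workaround might look like the sketch below. It is written for Python 3 as an assumption, the sleep is shortened so the demo finishes quickly, and the function name square_all and the default timeout of 3600 seconds are arbitrary choices for illustration. Catching BaseException is also a choice of this sketch, made so that the pool is terminated on ^C as well as on a timeout or a worker error.

```python
from multiprocessing import Pool
from time import sleep


def slowly_square(i):
    sleep(0.1)
    return i * i


def square_all(values, processes=4, timeout=3600):
    """Map slowly_square over values, waiting for the result with a timeout."""
    pool = Pool(processes)
    try:
        # map_async(...).get(timeout) replaces the plain pool.map(...) call.
        results = pool.map_async(slowly_square, values).get(timeout)
    except BaseException:
        # Covers KeyboardInterrupt, a timeout, and errors raised by workers.
        pool.terminate()
        raise
    else:
        pool.close()
        return results
    finally:
        # Safe after either terminate() or close().
        pool.join()


if __name__ == "__main__":
    try:
        print(square_all(range(8)))
    except KeyboardInterrupt:
        print("You cancelled the program!")
```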
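Other tips
From what I have recently found, the best solution is to set up the worker processes to ignore SIGINT altogether, and to confine all the cleanup code to the parent process. This fixes the problem for both idle and busy worker processes, and requires no error handling code in the child processes.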
import signal

...

def init_worker():
    # Runs in each worker process: ignore Ctrl-C there.
    signal.signal(signal.SIGINT, signal.SIG_IGN)

...

def main():
    pool = multiprocessing.Pool(size, init_worker)

    ...

    except KeyboardInterrupt:
        pool.terminate()
        pool.join()
Explanation and full example code can be found at http://noswap.com/blog/python-multiprocessing-keyboardinterrupt/ and http://github.com/jreese/multiprocessing-keyboardinterrupt respectively.
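For readers who want something runnable without following the links, here is a small self-contained sketch of the same idea. It is not the code from the linked post: the worker function work, the pool size, and the 60-second timeout are placeholders chosen for illustration, and Python 3 is assumed.

```python
import multiprocessing
import signal
import time


def init_worker():
    """Runs once in each worker: ignore SIGINT so only the parent sees Ctrl-C."""
    signal.signal(signal.SIGINT, signal.SIG_IGN)


def work(x):
    # Placeholder task standing in for real work.
    time.sleep(0.1)
    return x * 2


def main(size=4):
    pool = multiprocessing.Pool(size, init_worker)
    try:
        # A timed get() keeps the parent responsive while it waits.
        results = pool.map_async(work, range(10)).get(60)
    except KeyboardInterrupt:
        print("Caught Ctrl-C, terminating workers")
        pool.terminate()
        pool.join()
        return None
    pool.close()
    pool.join()
    return results


if __name__ == "__main__":
    print(main())
```

All cleanup lives in the parent; the workers contain no interrupt handling beyond the initializer.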
For some reason, only exceptions inherited from the base Exception class are handled normally. As a workaround, you can re-raise your KeyboardInterrupt as an Exception instance:
from multiprocessing import Pool
import time

class KeyboardInterruptError(Exception): pass

def f(x):
    try:
        time.sleep(x)
        return x
    except KeyboardInterrupt:
        raise KeyboardInterruptError()

def main():
    p = Pool(processes=4)
    try:
        print 'starting the pool map'
        print p.map(f, range(10))
        p.close()
        print 'pool map complete'
    except KeyboardInterrupt:
        print 'got ^C while pool mapping, terminating the pool'
        p.terminate()
        print 'pool is terminated'
    except Exception, e:
        print 'got exception: %r, terminating the pool' % (e,)
        p.terminate()
        print 'pool is terminated'
    finally:
        print 'joining pool processes'
        p.join()
        print 'join complete'
    print 'the end'

if __name__ == '__main__':
    main()
Normally you get the following output:
starting the pool map
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
pool map complete
joining pool processes
join complete
the end
So if you hit ^C, you will get:
starting the pool map
got ^C while pool mapping, terminating the pool
pool is terminated
joining pool processes
join complete
the end
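If several worker functions need the same treatment, the try/except in f could be factored into a decorator. This is only a sketch under that assumption, written for Python 3; the decorator name reraise_keyboard_interrupt is hypothetical, and the negative-argument branch merely simulates ^C arriving inside a worker.

```python
import functools


class KeyboardInterruptError(Exception):
    """Stand-in for KeyboardInterrupt that inherits from Exception."""


def reraise_keyboard_interrupt(func):
    """Decorator: convert a KeyboardInterrupt raised by func into an Exception."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except KeyboardInterrupt:
            raise KeyboardInterruptError()
    return wrapper


@reraise_keyboard_interrupt
def f(x):
    if x < 0:
        raise KeyboardInterrupt  # simulate ^C arriving inside the worker
    return x


if __name__ == "__main__":
    print(f(3))
    try:
        f(-1)
    except KeyboardInterruptError:
        print("interrupt was converted to an Exception subclass")
```

Because functools.wraps keeps the original function name, a module-level function decorated with the @ syntax should still be picklable for use with a pool, though that is worth verifying in your own setup.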
Usually this simple structure works for Ctrl-C on a Pool:
import signal

def signal_handle(_signal, frame):
    print "Stopping the Jobs."

signal.signal(signal.SIGINT, signal_handle)
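A handler that only prints does not by itself stop any work, so one possible extension is to have it set a flag that the parent checks between jobs. The sketch below assumes Python 3; stop_requested, install_handler, and run_jobs are hypothetical names introduced for illustration, and the jobs here run in the parent process only.

```python
import signal
import threading

# Set by the handler; checked by the parent between jobs.
stop_requested = threading.Event()


def signal_handle(_signal, frame):
    """SIGINT handler: record the request instead of raising KeyboardInterrupt."""
    print("Stopping the Jobs.")
    stop_requested.set()


def install_handler():
    """Install the handler; signal.signal must be called from the main thread."""
    return signal.signal(signal.SIGINT, signal_handle)


def run_jobs(jobs):
    """Run callables one by one until done or until a stop was requested."""
    results = []
    for job in jobs:
        if stop_requested.is_set():
            break
        results.append(job())
    return results


if __name__ == "__main__":
    install_handler()
    print(run_jobs([lambda: 1, lambda: 2]))
```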
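As stated in several similar posts.
It seems there are two issues that make exceptions annoying while multiprocessing. The first (noted by Glenn) is that you need to use map_async instead of map in order to get an immediate response (i.e., without finishing processing the entire list). The second (noted by Andrey) is that multiprocessing does not catch exceptions that do not inherit from Exception (e.g., SystemExit). So here is my solution, which deals with both: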
import sys
import functools
import traceback
import multiprocessing

def _poolFunctionWrapper(function, arg):
    """Run function under the pool

    Wrapper around function to catch exceptions that don't inherit from
    Exception (which aren't caught by multiprocessing, so that you end
    up hitting the timeout).
    """
    try:
        return function(arg)
    except:
        cls, exc, tb = sys.exc_info()
        if issubclass(cls, Exception):
            raise # No worries
        # Need to wrap the exception with something multiprocessing will recognise
        print "Unhandled exception %s (%s):\n%s" % (cls.__name__, exc, traceback.format_exc())
        raise Exception("Unhandled exception: %s (%s)" % (cls.__name__, exc))

def _runPool(pool, timeout, function, iterable):
    """Run the pool

    Wrapper around pool.map_async, to handle timeout. This is required so as to
    trigger an immediate interrupt on the KeyboardInterrupt (Ctrl-C); see
    http://stackoverflow.com/questions/1408356/keyboard-interrupts-with-pythons-multiprocessing-pool

    Further wraps the function in _poolFunctionWrapper to catch exceptions
    that don't inherit from Exception.
    """
    return pool.map_async(functools.partial(_poolFunctionWrapper, function), iterable).get(timeout)

def myMap(function, iterable, numProcesses=1, timeout=9999):
    """Run the function on the iterable, optionally with multiprocessing"""
    if numProcesses > 1:
        pool = multiprocessing.Pool(processes=numProcesses, maxtasksperchild=1)
        mapFunc = functools.partial(_runPool, pool, timeout)
    else:
        pool = None
        mapFunc = map
    results = mapFunc(function, iterable)
    if pool is not None:
        pool.close()
        pool.join()
    return results
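The voted answer does not tackle the core issue but a similar side effect.
Jesse Noller, the author of the multiprocessing library, explains how to deal correctly with CTRL+C when using multiprocessing.Pool in an old blog post (blog/2009/01/08/multiprocessingpool-and-keyboardinterrupt).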
import signal
from multiprocessing import Pool

def initializer():
    """Ignore CTRL+C in the worker process."""
    signal.signal(signal.SIGINT, signal.SIG_IGN)

pool = Pool(initializer=initializer)
try:
    pool.map(perform_download, downloads)
except KeyboardInterrupt:
    pool.terminate()
    pool.join()
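I found that, for the time being, the best solution is not to use the multiprocessing.pool feature but rather to roll your own pool functionality. I provided an example demonstrating the error with apply_async, as well as an example showing how to avoid using the pool functionality altogether.
http://www.bryceboe.com/2010/08/26/python-multiprocessing-and-keyboardinterrupt/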
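To give a rough idea of what a hand-rolled pool can look like, here is a minimal sketch built from multiprocessing.Process and two queues. It is not the code from the linked post: the names _worker and run_jobs, the squaring task, the None sentinel, and the 0.2-second poll interval are all assumptions made for illustration, and Python 3 is assumed.

```python
import multiprocessing
import queue
import signal


def _worker(task_queue, result_queue):
    """Worker loop: ignore SIGINT, process tasks until a None sentinel arrives."""
    signal.signal(signal.SIGINT, signal.SIG_IGN)
    for item in iter(task_queue.get, None):
        result_queue.put((item, item * item))


def run_jobs(items, workers=2):
    """Square each item in hand-managed processes; returns {item: result}."""
    items = list(items)
    task_queue = multiprocessing.Queue()
    result_queue = multiprocessing.Queue()
    procs = [
        multiprocessing.Process(target=_worker, args=(task_queue, result_queue))
        for _ in range(workers)
    ]
    for p in procs:
        p.daemon = True
        p.start()
    for item in items:
        task_queue.put(item)
    for _ in procs:
        task_queue.put(None)  # one sentinel per worker

    results = {}
    received = 0
    try:
        while received < len(items):
            try:
                # Timed get keeps the parent responsive to Ctrl-C.
                item, value = result_queue.get(timeout=0.2)
            except queue.Empty:
                if not any(p.is_alive() for p in procs):
                    break  # all workers are gone; nothing more will arrive
                continue
            results[item] = value
            received += 1
    except KeyboardInterrupt:
        print("Interrupted by user, terminating workers")
        for p in procs:
            p.terminate()
    finally:
        for p in procs:
            p.join()
    return results


if __name__ == "__main__":
    print(run_jobs(range(5)))
```

On interruption the function returns whatever results had already arrived, since only the parent reacts to ^C.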
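I am a newbie in Python. I was looking everywhere for an answer and stumbled upon this and a few other blogs and YouTube videos. I tried to copy and paste the author's code above and reproduce it on my Python 2.7.13 on Windows 7 64-bit. It is close to what I want to achieve.
I made my child processes ignore Ctrl-C and made the parent process terminate. It looks like bypassing the child processes does avoid this issue.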
#!/usr/bin/python
from multiprocessing import Pool
from time import sleep
from sys import exit

def slowly_square(i):
    try:
        print "<slowly_square> Sleeping and later running a square calculation..."
        sleep(1)
        return i * i
    except KeyboardInterrupt:
        print "<child processor> Don't care if you say CtrlC"
        pass

def go():
    pool = Pool(8)
    try:
        results = pool.map(slowly_square, range(40))
    except KeyboardInterrupt:
        pool.terminate()
        pool.close()
        print "You cancelled the program!"
        exit(1)
    print "Finally, here are the results", results

if __name__ == '__main__':
    go()
The part starting at pool.terminate() never seems to execute.
You can try using the apply_async method of a Pool object, like this:
import multiprocessing
import time
from datetime import datetime

def test_func(x):
    time.sleep(2)
    return x**2

def apply_multiprocessing(input_list, input_function):
    pool_size = 5
    pool = multiprocessing.Pool(processes=pool_size, maxtasksperchild=10)
    try:
        jobs = {}
        for value in input_list:
            jobs[value] = pool.apply_async(input_function, [value])
        results = {}
        for value, result in jobs.items():
            try:
                results[value] = result.get()
            except KeyboardInterrupt:
                print "Interrupted by user"
                pool.terminate()
                break
            except Exception as e:
                results[value] = e
        return results
    except Exception:
        raise
    finally:
        pool.close()
        pool.join()

if __name__ == "__main__":
    iterations = range(100)
    t0 = datetime.now()
    results1 = apply_multiprocessing(iterations, test_func)
    t1 = datetime.now()
    print results1
    print "Multi: {}".format(t1 - t0)
    t2 = datetime.now()
    results2 = {i: test_func(i) for i in iterations}
    t3 = datetime.now()
    print results2
    print "Non-multi: {}".format(t3 - t2)
Output:
100
Multiprocessing run time: 0:00:41.131000
100
Non-multiprocessing run time: 0:03:20.688000
One advantage of this method is that results processed before the interruption are returned in the results dictionary:
>>> apply_multiprocessing(range(100), test_func)
Interrupted by user
{0: 0, 1: 1, 2: 4, 3: 9, 4: 16, 5: 25}
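Strangely enough, it looks like you have to handle the KeyboardInterrupt in the children as well. I would have expected this to work as written... try changing slowly_square to: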
def slowly_square(i):
    try:
        sleep(1)
        return i * i
    except KeyboardInterrupt:
        print 'You EVIL bastard!'
        return 0
This should work as you expected.