Question
I have a simple REST API written using Python 2.7.2, Bottle 0.10.9 and the "Swiss army knife" Scrapy 0.14.1.
In brief, there is just one method (myserver:8081/doparse?address="url") that starts scraping the URL with Scrapy and returns the response as JSON.
When I deploy the script using Bottle's built-in server, I get the following output:
Shutdown...
Traceback (most recent call last):
  File "parser/main.py", line 67, in <module>
    run(host='ks205512.kimsufi.com', port=8081)
  File "/usr/local/lib/python2.6/dist-packages/bottle.py", line 2391, in run
    server.run(app)
  File "/usr/local/lib/python2.6/dist-packages/bottle.py", line 2089, in run
    srv.serve_forever()
  File "/usr/lib/python2.6/SocketServer.py", line 224, in serve_forever
    r, w, e = select.select([self], [], [], poll_interval)
select.error: (4, 'Interrupted system call')
Using Bottle with other servers such as CherryPy doesn't help; it just produces different errors, like:
Traceback (most recent call last):
  File "/usr/local/lib/python2.6/dist-packages/bottle.py", line 737, in _handle
    return route.call(**args)
  File "/usr/local/lib/python2.6/dist-packages/bottle.py", line 1456, in wrapper
    rv = callback(*a, **ka)
  File "parser/main.py", line 20, in parse
    return parse_url(url)
  File "parser/main.py", line 35, in parse_url
    items = crawler.start(url) # launching crawler
  File "/home/projects/linkedinparser/parser/crawler.py", line 140, in start
    crawler = CrawlerWorker(LinkedinSpider(url), results)
  File "/home/projects/linkedinparser/parser/crawler.py", line 85, in __init__
    self.crawler = CrawlerProcess(settings)
  File "/usr/local/lib/python2.6/dist-packages/scrapy/crawler.py", line 69, in __init__
    install_shutdown_handlers(self._signal_shutdown)
  File "/usr/local/lib/python2.6/dist-packages/scrapy/utils/ossignal.py", line 21, in install_shutdown_handlers
    reactor._handleSignals()
  File "/usr/local/lib/python2.6/dist-packages/twisted/internet/posixbase.py", line 292, in _handleSignals
    _SignalReactorMixin._handleSignals(self)
  File "/usr/local/lib/python2.6/dist-packages/twisted/internet/base.py", line 1129, in _handleSignals
    signal.signal(signal.SIGINT, self.sigInt)
ValueError: signal only works in main thread
I would appreciate any kind of help. Thanks
Solution
By default, the Twisted reactor installs signal handlers to catch events like Ctrl-C, SIGTERM, and so on. However, Python only allows signal handlers to be installed from the main thread, so calling reactor.run() from a non-main thread raises an error. Pass the installSignalHandlers=0 keyword argument to reactor.run to work around this.
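As a rough illustration, the sketch below first reproduces the underlying restriction using only the standard library, then shows one way the workaround might be applied. The names install_sigint_handler and run_reactor_in_thread are hypothetical helpers, not part of Bottle, Scrapy or Twisted, and the second helper assumes Twisted is installed. It covers only reactor.run itself; it does not change what Scrapy's CrawlerProcess does in its own constructor.

```python
import signal
import threading


def install_sigint_handler(results):
    """Try to install a SIGINT handler and record what happened."""
    try:
        signal.signal(signal.SIGINT, signal.default_int_handler)
        results.append("installed")
    except ValueError as exc:
        # CPython refuses to set signal handlers outside the main thread.
        results.append("ValueError: %s" % exc)


# Reproduce the restriction: call signal.signal() from a worker thread.
worker_results = []
worker = threading.Thread(target=install_sigint_handler, args=(worker_results,))
worker.start()
worker.join()
print(worker_results[0])  # a message starting with "ValueError"


def run_reactor_in_thread():
    """Hypothetical helper: run the Twisted reactor in a worker thread."""
    # Imported lazily so the demonstration above runs without Twisted.
    from twisted.internet import reactor

    thread = threading.Thread(
        target=reactor.run,
        # Skip signal handler installation, as suggested in the answer.
        kwargs={"installSignalHandlers": 0},
    )
    thread.daemon = True
    thread.start()
    return thread
```

Keep in mind that a Twisted reactor cannot be restarted once it has stopped, so a helper like this would normally be called once per process rather than once per request.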