Python Scrapy: how to define a pipeline for an item?
20-08-2019
Question
I am using Scrapy to crawl several sites, and each site has its own Item because different information is extracted from each.
So far I have had one generic pipeline, since most of the information is the same, but now I am crawling Google search results and the pipeline for those must be different.
For example, GenericItem uses GenericPipeline, but GoogleItem should use GoogleItemPipeline. When the spider is crawling, however, it tries to use GenericPipeline instead of GoogleItemPipeline. How can I specify which pipeline the Google spider must use?
Solution
At the moment there is only one way: check the item type inside the pipeline, then either process the item or return it as is.
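pipelines.py:

    from grabbers.items import FeedItem

    class StoreFeedPost(object):
        # Current Scrapy calls process_item(item, spider);
        # very old releases used (domain, item).
        def process_item(self, item, spider):
            # Handle only FeedItem; other item types pass through unchanged.
            if isinstance(item, FeedItem):
                pass  # process it...
            # Always return the item so later pipelines still receive it.
            return item

items.py:

    # Older Scrapy releases named the base class ScrapedItem.
    from scrapy import Item

    class FeedItem(Item):
        pass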
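Applied to the two item types from the question, the same type check might look like the sketch below. It is only an illustration under stated assumptions: the item fields (url, query, processed_by) and the run_pipelines helper are hypothetical, and the items are plain dataclasses so the example runs without Scrapy installed. In a real project the helper is not needed, because Scrapy itself passes each item through every enabled pipeline.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the question's two item types. In a real project
# these would live in items.py; the fields are invented for illustration.
@dataclass
class GenericItem:
    url: str
    processed_by: str = ""


@dataclass
class GoogleItem:
    query: str
    processed_by: str = ""


class GenericPipeline:
    def process_item(self, item, spider):
        # Only handle GenericItem; pass everything else through untouched.
        if isinstance(item, GenericItem):
            item.processed_by = "GenericPipeline"
        return item


class GoogleItemPipeline:
    def process_item(self, item, spider):
        # Only handle GoogleItem; pass everything else through untouched.
        if isinstance(item, GoogleItem):
            item.processed_by = "GoogleItemPipeline"
        return item


def run_pipelines(item, pipelines, spider=None):
    """Hypothetical helper: hand the item to each pipeline in order."""
    for pipeline in pipelines:
        item = pipeline.process_item(item, spider)
    return item


if __name__ == "__main__":
    enabled = [GenericPipeline(), GoogleItemPipeline()]
    print(run_pipelines(GoogleItem(query="scrapy"), enabled))
    print(run_pipelines(GenericItem(url="http://example.com"), enabled))
```

With this approach both pipelines stay enabled in ITEM_PIPELINES in settings.py, and each one simply ignores the item types it does not own.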
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow