Scrapy seems to be pulling the data out correctly, but every value in my JSON output is wrapped in an array:

[{"price": ["$34"], "link": ["/product/product..."], "name": ["productname"]},
{"price": ["$37"], "link": ["/product/product"]...

The parse method of my spider class looks like this:

def parse(self, response):
    sel = Selector(response)
    items = sel.xpath('//div/ul[@class="product"]')
    skateboards = []
    for item in items:
        skateboard = SkateboardItem()
        skateboard['name'] = item.xpath('li[@class="desc"]//text()').extract()
        skateboard['price'] = item.xpath('li[@class="price"]//text()[1]').extract()
        skateboard['link'] = item.xpath('li[@class="image"]').extract()
        skateboards.append(skateboard)
    return skateboards

How would I go about ensuring that Scrapy is only outputting a single value for each key, rather than the array it's currently producing?


Solution

.extract() always returns a list, even when only one node matches. To get a single string, join the list:

''.join(item.xpath('li[@class="desc"]//text()').extract())
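As a rough sketch of the joining step, the helper below collapses a list of text fragments into one string. The name join_text is hypothetical, and plain Python lists stand in for what .extract() would return, so the snippet runs without Scrapy. Unlike the bare ''.join above, it also strips whitespace and separates fragments with a single space, which is an assumption about the desired output rather than part of the original answer.

```python
def join_text(fragments):
    """Collapse a list of extracted text nodes into one clean string.

    `fragments` is assumed to be a list of str, like the return value
    of .extract(). Whitespace-only fragments are dropped.
    """
    cleaned = (part.strip() for part in fragments)
    return " ".join(part for part in cleaned if part)


# Plain lists stand in for .extract() results (hypothetical sample data).
print(join_text(["productname"]))         # a single text node
print(join_text(["\n  Pro ", "Deck\n"]))  # several text nodes
print(join_text([]))                      # no match gives an empty string
```

In the spider this would be called as join_text(item.xpath('li[@class="desc"]//text()').extract()).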

Other tips

Use either of the following to get the data as a string:
1. .extract_first()
2. .extract()[0]

PS: using Scrapy 1.2
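
The two options above behave differently when the XPath matches nothing: .extract_first() returns None (or a default you pass in), while .extract()[0] raises an IndexError. The sketch below imitates that behavior on plain lists so it runs without Scrapy. The helper name first_or_default and the sample values are assumptions made for illustration, not part of Scrapy's API.

```python
def first_or_default(values, default=None):
    """Return the first extracted value, or `default` if the list is empty.

    Mirrors what .extract_first() does for a selector result; `values`
    is assumed to be a list like the one .extract() returns.
    """
    return values[0] if values else default


matched = ["$34"]  # stand-in for .extract() on a node that exists
missing = []       # stand-in for .extract() when nothing matched

print(first_or_default(matched))
print(first_or_default(missing))
print(first_or_default(missing, "n/a"))

# Indexing the empty result directly fails instead of returning a default.
try:
    missing[0]
except IndexError:
    print("indexing an empty result raises IndexError")
```

For items where a field may be absent on some pages, the first form is usually the safer choice, since one missing node will not abort the whole parse callback.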

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow