Question

I'm scraping a page that contains multiple h2 tags, and I need each of those titles stored on a separate row in my CSV file. I'm using Scrapy, and my current code is:

    item["title"] = titles.select("//h2/text()").extract()

As expected, this stores the text of every h2 tag on the page in a single cell of my CSV.

Is there a way to start a new row after each h2 tag is scraped?

Thanks


Answer

You can loop over the h2 elements and create one Item per h2, setting the "title" field on each:

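    items = []
    # "//h2" matches every h2 on the page, like the expression in the question
    for title in titles.select("//h2"):
        # one item per heading, so each heading ends up on its own CSV row
        item = MyItem()
        # note the relative XPath expression (starting with "./")
        item["title"] = title.select("./text()").extract()
        items.append(item)
    return items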
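
To see why one item per heading gives one row per heading, here is a minimal sketch that uses only the standard library `csv` module instead of Scrapy. The `titles` list stands in for what `extract()` returns, the heading strings are made up, and both function names are hypothetical. It also assumes the exporter joins a multi-valued field with commas, which I believe is the default for Scrapy's CSV exporter but have not verified here.

```python
import csv
import io


def write_single_cell(titles):
    """Mimic one item holding every title: the list lands in a single cell."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["title"])
    # Assumption: a multi-valued field is joined with commas into one cell.
    writer.writerow([",".join(titles)])
    return buffer.getvalue()


def write_row_per_title(titles):
    """Mimic one item per title: every title gets its own row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["title"])
    for title in titles:
        writer.writerow([title])
    return buffer.getvalue()


# Made-up headings standing in for the extracted h2 texts.
titles = ["First heading", "Second heading", "Third heading"]

print(write_single_cell(titles))
print(write_row_per_title(titles))
```

The first call prints a header and one quoted cell holding all three headings. The second prints a header followed by three separate rows, which is the layout the question asks for.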