Question

I'm trying to write a parser for m-ati.su using Scrapy. As a first step, I need to get the values and text fields from the "From" and "To" comboboxes for different cities. I inspected the request in Firebug and wrote:

class spider(BaseSpider):
    name = 'ati_su'
    start_urls = ['http://m-ati.su/Tables/Default.aspx?EntityType=Load']
    allowed_domains = ["m-ati.su"]

    def parse(self, response):
        yield FormRequest('http://m-ati.su/Services/ATIGeoService.asmx/GetGeoCompletionList', 
                        callback=self.ati_from, 
                        formdata={'prefixText': 'moscow', 'count': '10','contextKey':'All_0$Rus'})
    def ati_from(self, response):
        json = response.body
        open('results.txt', 'wb').write(json)

This request returns "500 Internal Server Error". What did I do wrong? Thanks.

Solution

I think you may have to add an X-Requested-With: XMLHttpRequest header to your POST request. You can try this:

    def parse(self, response):
        yield FormRequest('http://m-ati.su/Services/ATIGeoService.asmx/GetGeoCompletionList', 
                          callback=self.ati_from, 
                          formdata={'prefixText': 'moscow', 'count': '10','contextKey':'All_0$Rus'},
                          headers={"X-Requested-With": "XMLHttpRequest"})

Edit: I tried running the spider and came up with the version below.

The request body is JSON-encoded when I inspect it in Firefox, so I used Request with the method forced to "POST". The response I got was encoded in "windows-1251".

from scrapy.spider import BaseSpider
from scrapy.http import Request
import json

class spider(BaseSpider):
    name = 'ati_su'
    start_urls = ['http://m-ati.su/Tables/Default.aspx?EntityType=Load']
    allowed_domains = ["m-ati.su"]

    def parse(self, response):
        yield Request('http://m-ati.su/Services/ATIGeoService.asmx/GetGeoCompletionList',
                      callback=self.ati_from,
                      method="POST",
                      body=json.dumps({
                            'prefixText': 'moscow',
                            'count': '10',
                            'contextKey':'All_0$Rus'
                      }),
                      headers={
                            "X-Requested-With": "XMLHttpRequest",
                            "Accept": "application/json, text/javascript, */*; q=0.01",
                            "Content-Type": "application/json; charset=utf-8",
                            "Pragma": "no-cache",
                            "Cache-Control": "no-cache",
                      })
    def ati_from(self, response):
        # The body arrives as windows-1251 bytes; decode it before parsing the JSON.
        jsondata = response.body.decode("windows-1251")
        print(json.loads(jsondata))
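
The two points from the edit above can be tried without any network access. This is only a sketch, assuming Python 3 and the standard library. The sample response bytes and the decode_json_body helper are made up for illustration and are not real output from m-ati.su. It shows how the form-encoded body that FormRequest would send differs from the JSON body the browser sends, and how a windows-1251 response can be decoded before parsing.

```python
import json
from urllib.parse import urlencode

# Same parameters as in the spider above.
params = {"prefixText": "moscow", "count": "10", "contextKey": "All_0$Rus"}

# Roughly what a form-encoded POST body looks like
# (Content-Type: application/x-www-form-urlencoded).
form_body = urlencode(params)

# What the browser sends according to the inspection above: a JSON document
# (Content-Type: application/json).
json_body = json.dumps(params)

# Hypothetical response bytes, invented for this sketch only.
sample_response = json.dumps({"d": ["Москва"]}, ensure_ascii=False).encode("windows-1251")


def decode_json_body(raw, encoding="windows-1251"):
    """Decode raw response bytes with the given encoding, then parse them as JSON."""
    return json.loads(raw.decode(encoding))


if __name__ == "__main__":
    print(form_body)
    print(json_body)
    print(decode_json_body(sample_response))
```

If the service only accepts JSON, sending the form-encoded variant is a plausible cause of the 500 error, although the original post does not confirm which of the changes actually fixed it.
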
Licensed under: CC-BY-SA with attribution