Question

I am using Typhoeus with Hydra to make parallel requests. My end goal is to parse the Typhoeus response into a Mechanize object.

require 'typhoeus'
require 'mechanize'

url = "http://example.com/"
hydra = Typhoeus::Hydra.new
agent = Mechanize.new
request = Typhoeus::Request.new(url, :method => :get, :proxy => "#{proxy_host}:#{proxy_port}")
request.on_complete do |response|  # response is a Typhoeus::Response object
  body = response.body
  uri = request.parsed_uri
  page = agent.parse(uri, response, body)
end
hydra.queue(request)
hydra.run

The agent.parse method raises an error because it cannot handle the Typhoeus response object:

/usr/local/rvm/gems/ruby-1.9.3-p194/gems/mechanize-2.5.1/lib/mechanize.rb:1165:in `parse': undefined method `[]' for #<Typhoeus::Response:0x00000012cd9da0> (NoMethodError)

Is there any way I can convert a Typhoeus response into a Net::HTTPResponse object? Or is there another way to combine Mechanize and Typhoeus, so that I can make parallel requests with Typhoeus and scrape the data with Mechanize?


Solution

  1. I tried to create a Net::HTTPResponse (https://github.com/ruby/ruby/blob/trunk/lib/net/http/response.rb) from a Typhoeus::Response, but it didn't work out. Calling the initializer is easy, but setting the response body or headers is not.

  2. I looked into Mechanize to see whether it could be changed to use Typhoeus for making requests, but I don't think that's possible right now: net/http is hard-wired into Mechanize. I thought of a mechanize-typhoeus adapter, which would be nice.
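
To illustrate point 1, here is a minimal sketch of building a Net::HTTPResponse by hand, using only the standard library. The helper name build_net_http_response is hypothetical, and the sketch relies on an internal instance variable (@read) of Net::HTTPResponse, which may change between Ruby versions. The status code, headers and body are passed in as plain values; with Typhoeus they would presumably come from response.code, the response headers and response.body. Whether Mechanize's agent.parse accepts the resulting object has not been verified here.

```ruby
require 'net/http'

# Hypothetical helper: builds a Net::HTTPResponse from plain values.
# Assumption: setting the internal @read flag makes #body return the
# assigned string instead of trying to read from a (nil) socket.
def build_net_http_response(code, message, headers, body)
  # Pick the matching subclass (e.g. Net::HTTPOK for "200").
  klass = Net::HTTPResponse::CODE_TO_OBJ[code.to_s] || Net::HTTPUnknownResponse
  response = klass.new('1.1', code.to_s, message)

  # Header names are case-insensitive in Net::HTTPHeader.
  headers.each { |name, value| response[name] = value }

  # Mark the body as already read, then assign it.
  response.instance_variable_set(:@read, true)
  response.body = body
  response
end

response = build_net_http_response(
  200, 'OK',
  { 'Content-Type' => 'text/html' },
  '<html><body><h1>Hello</h1></body></html>'
)
puts response.class
puts response['content-type']
puts response.body
```

Looking the class up in CODE_TO_OBJ keeps checks such as is_a?(Net::HTTPSuccess) working on the hand-built object. The @read trick is the fragile part, which is consistent with the body being awkward to set.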

Licensed under: CC-BY-SA with attribution