Question

I was trying to do something interesting like:

require 'net/http'

http = Net::HTTP.new("t66y.com", 80)
request = Net::HTTP::Get.new("http://t66y.com/")
response = http.request(request)
puts response.inspect

It works fine and gives me #<Net::HTTPOK 200 OK readbody=true>. However, after I changed the URL to something like http://t66y.com/thread0806.php?fid=16, it keeps throwing an EOFError exception. The full backtrace was:

/Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/protocol.rb:141:in `read_nonblock': end of file reached (EOFError)
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/protocol.rb:141:in `rbuf_fill'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/protocol.rb:92:in `read'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:2779:in `ensure in read_chunked'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:2779:in `read_chunked'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:2750:in `read_body_0'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:2710:in `read_body'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:2735:in `body'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:2672:in `reading_body'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:1321:in `block in transport_request'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:1316:in `catch'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:1316:in `transport_request'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:1293:in `request'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:1286:in `block in request'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:745:in `start'
    from /Users/lei/.rvm/rubies/ruby-1.9.3-p362/lib/ruby/1.9.1/net/http.rb:1284:in `request'
    from /Users/lei/workspace/Dadiaosi/scraper.rb:18:in `<top (required)>'
    from -e:1:in `load'
    from -e:1:in `<main>'

Do you guys have any clue about that?


Solution

These work:

In the terminal:

$ curl -v "http://t66y.com/thread0806.php?fid=16"

In ruby:

require 'open-uri'
# Kernel#open accepts URLs on older Rubies; on Ruby 3.0+ call URI.open instead.
response = open("http://t66y.com/thread0806.php?fid=16")
html = response.read

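From the curl response I can see in the headers that Content-Length is missing and that the declared charset is a Chinese encoding. Your backtrace fails inside read_chunked, so the body is being sent with chunked transfer encoding, and the connection seems to close before Net::HTTP has finished reading it. This might be tripping up the Net::HTTP library, especially if you're on an older version of Ruby.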
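If you would rather check this from Ruby than from curl, a rough sketch like the one below reports how a response body is framed. The helper name body_framing is made up for illustration, and the sample response is built by hand only so that the example runs without touching the network; with a real request you would pass in the object returned by Net::HTTP.

```ruby
require 'net/http'

# Hypothetical helper: reports how the response body is delimited.
def body_framing(response)
  if response.chunked?
    :chunked          # Transfer-Encoding: chunked
  elsif response.content_length
    :content_length   # an explicit Content-Length header is present
  else
    :until_close      # neither header: the body runs until the socket closes
  end
end

# Hand-built response for demonstration only; no request is sent.
sample = Net::HTTPOK.new('1.1', '200', 'OK')
sample['Transfer-Encoding'] = 'chunked'
puts body_framing(sample)
```

A chunked response with no Content-Length matches what the backtrace in the question suggests.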

You can easily swap in open-uri to get the html as shown above.
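
If you need to stay on Net::HTTP, one thing you could try is rescuing the EOFError and retrying a couple of times. This is only a sketch: with_eof_retry and fetch_body are hypothetical names, the default of three attempts is arbitrary, and whether a retry actually helps with this server is an assumption I have not verified.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: runs the block and retries it when the server
# closes the connection early (EOFError), up to `attempts` tries in total.
def with_eof_retry(attempts = 3)
  tries = 0
  begin
    tries += 1
    yield
  rescue EOFError
    retry if tries < attempts
    raise # out of attempts: re-raise the last EOFError
  end
end

# Hypothetical helper: fetches the body of `url`, retrying on EOFError.
def fetch_body(url)
  uri = URI.parse(url)
  with_eof_retry { Net::HTTP.get_response(uri).body }
end

# Usage (this performs a real network request):
# html = fetch_body("http://t66y.com/thread0806.php?fid=16")
```

If every attempt fails, the EOFError still propagates, so you can fall back to open-uri at that point.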

Other tips

It should be

require 'net/http'

uri = URI('http://t66y.com/thread0806.php?fid=16')
# Net::HTTP.get returns the response body as a String, not a response object.
response = Net::HTTP.get(uri)
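
To see which pieces of the URL Net::HTTP ends up working with, here is a small sketch that parses the URL and builds the connection and request objects without sending anything. build_get is a hypothetical helper name; the point is that uri.request_uri carries the path together with the query string.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper: builds the connection and GET request for `url`.
# Nothing is sent; Net::HTTP.new does not open a socket until a request is made.
def build_get(url)
  uri = URI.parse(url)
  http = Net::HTTP.new(uri.host, uri.port)
  request = Net::HTTP::Get.new(uri.request_uri) # path plus "?query"
  [http, request]
end

http, request = build_get("http://t66y.com/thread0806.php?fid=16")
puts http.address  # t66y.com
puts http.port     # 80
puts request.path  # /thread0806.php?fid=16
```

Calling http.request(request) would then send it.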
License: CC-BY-SA with attribution
Not affiliated with StackOverflow