Question

I have a mystery to solve while upgrading our Rails 3.2 app from Ruby 1.9 to Ruby 2.1.2. Nokogiri seems to break: it behaves differently when fed the result of open-uri's open. No gem versions have changed, only the Ruby version (this is all on OS X Mavericks, using brew, gcc4, etc.).

Steps to reproduce:

$ ruby -v
ruby 1.9.3p484 (2013-11-22 revision 43786) [x86_64-darwin13.1.0]

$ rails console
Connecting to database specified by database.yml
Loading development environment (Rails 3.2.18)

> feed = Nokogiri::XML(open(URI.encode("http://anyblog.wordpress.org/feed/")))
=> #(Document:0x3fcb82f08448 {
  name = "document",
  children = [
  ..

> feed.xpath("//item").count
=> 10

So all good! Next, after an rvm switch to Ruby 2.1.2 and a bundle install:

$ ruby -v
ruby 2.1.2p95 (2014-05-08 revision 45877) [x86_64-darwin13.0]

$ rails console
Connecting to database specified by database.yml
Loading development environment (Rails 3.2.18)

> feed = Nokogiri::XML(open(URI.encode("http://anyblog.wordpress.org/feed/")))
=> 

> feed.inspect
=> "#<Nokogiri::XML::Document:0x86a1f21c name=\"document\">"

> feed.xpath("//item").count
=> 0

So it looks like the behavior of 'open' has changed, in that a gzipped HTTP stream isn't being decompressed before it reaches Nokogiri? I checked with nokogiri -v, and it is using the packaged XML libs rather than the system ones. Is this an open-uri issue in Ruby 2.1.2?

Another theory is that one of the gems has monkey patched open-uri to fix something in 1.9, and that patch is breaking 2.1. Help or ideas please!

EDIT: Here's more info without Nokogiri, which makes me think this is more of an open-uri issue on Ruby 2.1.2:

> open(url) {|f|
*   f.each_line {|line| p line}
*   p f.content_type
*   p f.charset
*   p f.content_encoding
* }  
"\u001F\x8B\b\u0000\u0000\u0000\u0000\u0000\u0000\u0003\xED\x9D\xDBr\eW\xB2\xA6\xAF\xED\xA7\xA8\xCD\u001E\xB7/$\u0010..
(snip)
3\xF3\xA79\xA7\xFAɗ\xFF\u000F\xEAo\x9C\u0014k\xE8\u0000\u0000"
"text/xml"
"utf-8"
["gzip"]
=> ["gzip"]

..the 1.9 version was readable, i.e. the gzip had already been decoded.

If I go into a clean Ruby irb session it works OK, so it must be something in my Rails gems that is changing the behavior of open-uri's open so that it no longer decompresses gzip/deflate responses. I have a lot of gems referenced.. :(
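
As a stopgap while hunting for the offending gem, the body can be decompressed by hand whenever it still arrives gzipped. This is only a minimal sketch: decode_body is a hypothetical helper name, not part of open-uri, and it assumes the only encoding in play is gzip, as in the output above.

```ruby
require 'zlib'
require 'stringio'

# Hypothetical helper (not part of open-uri): decompress a body that is
# still gzip-encoded, and pass anything else through untouched.
def decode_body(body, content_encoding)
  encodings = Array(content_encoding).map { |e| e.to_s.downcase }
  return body unless encodings.include?('gzip')

  # Gzip streams start with the magic bytes 0x1F 0x8B; if they are absent,
  # assume something upstream already decompressed the body.
  return body unless body.getbyte(0) == 0x1F && body.getbyte(1) == 0x8B

  Zlib::GzipReader.new(StringIO.new(body)).read
end
```

Inside the open(url) block this would be called as decode_body(f.read, f.content_encoding), with the result passed to Nokogiri::XML. It hides the symptom rather than fixing the cause, so it is best treated as a diagnostic aid.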

Solution

OK, here's an answer, and maybe the answer. Ruby 2 changed how Net::HTTP handles request headers and gzip/deflate decompression, but at some point the maintainers changed their minds and went back to how 1.9 worked. In the interim, some gem maintainers monkey patched Net::HTTP to make their gems work on both 1.9 and 2.0. Those monkey patches still linger in older versions of gems and cause issues like the one I saw upgrading from 1.9 to 2.1.
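
One way to hunt for such lingering patches from a rails console is to ask Ruby where each method on a class was defined and flag anything that lives outside the standard library directory. The following is a rough sketch under assumptions: methods_defined_outside is a hypothetical helper name, and it only catches methods (re)defined directly on the class, so patches applied through aliasing or on other classes would need the same check run against those classes.

```ruby
require 'net/http'
require 'rbconfig'

# Hypothetical helper: map each method defined directly on klass to its
# source file, keeping only those whose file is outside stdlib_dir.
def methods_defined_outside(klass, stdlib_dir)
  names = klass.instance_methods(false) + klass.private_instance_methods(false)
  names.each_with_object({}) do |name, found|
    file, _line = klass.instance_method(name).source_location
    # source_location is nil for methods implemented in C; skip those.
    next if file.nil? || file.start_with?(stdlib_dir)
    found[name] = file
  end
end

stdlib_dir = RbConfig::CONFIG['rubylibdir']
suspects = methods_defined_outside(Net::HTTP, stdlib_dir)
suspects.each { |name, file| puts "Net::HTTP##{name} defined in #{file}" }
```

Any path printed that points into a gem directory is a candidate for the monkey patch. If a separately installed copy of the net-http gem is in use, its own files will also show up and can be ignored.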

A summary of the issue and solution is here:

http://avi.io/blog/2013/12/17/do-not-upgrade-your-rails-project-to-ruby-2-before-you-read-this/

We use the right_aws gem, and the details of that issue across Ruby versions are here:

https://github.com/sferik/twitter/issues/473

The solution was to undo the monkey patch by using this gem reference in our Gemfile:

gem 'right_http_connection', git: 'git://github.com/rightscale/right_http_connection.git', ref: '3359524d81'

Background reading and more info:

https://github.com/rightscale/right_aws/issues/167

Licensed under: CC-BY-SA with attribution