Question

Does anyone else see a lot of errors like this when using threads with Mechanize?

Exception `Net::HTTPBadResponse' at /usr/lib/ruby/1.8/net/http.rb:2022
- wrong status line: _SOME HTML CODE HERE_

I'm fairly certain this comes from a bad interaction between threads and the net/http library. Does anyone have advice on the upper limit of threads to run at once with Mechanize and net/http? And how can I catch this kind of exception, given that rescue Net::HTTPBadResponse doesn't work?

Solution

This could be something non-thread-safe in Mechanize, but I can think of other bugs that might cause the same problem. I'd start by disabling persistent connections, if you're using them. The next thing to do is to look at your code, and make sure that you're being careful with the objects you handle. If your application has multiple threads mucking about with common objects, that can break a library that would be otherwise thread-safe.
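As a rough sketch of the first suggestion, persistent connections can be switched off through the agent's keep_alive setting. The helper name below is hypothetical, and the agent class is passed in as a parameter only so the sketch can be run without the gem installed; in real use you would pass Mechanize.

```ruby
# Hypothetical helper: build an agent with persistent (keep-alive)
# connections turned off, so connections are not reused across requests.
# agent_class is assumed to respond to .new and to expose keep_alive=,
# as Mechanize does.
def new_agent_without_keep_alive(agent_class)
  agent = agent_class.new
  agent.keep_alive = false
  agent
end

# Usage with Mechanize (needs the mechanize gem):
#
#   require 'mechanize'
#   agent = new_agent_without_keep_alive(Mechanize)
```

If the errors stop with keep-alive disabled, that points at connection reuse rather than at your own code.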

If there is a threading problem somewhere, the upper limit of threads you can use safely is 1. Any more, and you're just making a trade-off about how often you want the problem to occur, rather than whether it occurs or not.

OTHER TIPS

Based on my grueling experience this evening trying to get two Mechanize-based tasks to run in tandem in EventMachine, and on this somewhat ancient exchange: no, Mechanize does not seem to be thread-safe.

According to this email by Aaron Patterson himself, if you don't share an agent between threads, you should be OK.
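A minimal sketch of that advice: each thread builds its own agent and never hands it to another thread. The method name and the job/factory lambdas are made up for illustration, and the URLs in the usage comment are placeholders. Errors are rescued inside the thread that raised them and returned as that thread's result.

```ruby
# Run each job in its own thread, giving every thread a freshly built agent
# so that no agent object is ever shared between threads.
#   jobs        - array of callables, each taking one agent argument
#   build_agent - callable returning a new agent (e.g. lambda { Mechanize.new })
def run_with_private_agents(jobs, build_agent)
  threads = jobs.map do |job|
    Thread.new do
      agent = build_agent.call  # created inside the thread, never shared
      begin
        job.call(agent)
      rescue StandardError => e
        e                       # hand the error back as this thread's result
      end
    end
  end
  threads.map(&:value)          # wait for every thread; results keep job order
end

# Usage with Mechanize (needs the mechanize gem; URLs are placeholders):
#
#   require 'mechanize'
#   urls = ['http://example.com/a', 'http://example.com/b']
#   jobs = urls.map { |url| lambda { |agent| agent.get(url).title } }
#   results = run_with_private_agents(jobs, lambda { Mechanize.new })
```

Since Net::HTTPBadResponse is a StandardError, a failed fetch shows up in the results array as an exception object instead of killing the thread silently.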

IMHO, this means Mechanize is not thread-safe.

Licensed under: CC-BY-SA with attribution