Question

I'm trying to download the HTML content from a URL without success.

Here is the URL:

http://example.com/some_string[value]

When I use RestClient, I get this error:

URI::InvalidURIError: bad URI(is not URI?)

I got some help from the Ruby on Rails IRC channel. The idea was to escape the end of the URL:

$ "http://example.com/" + CGI::escape("some_string[value]")
=> "http://example.com/some_string%5Bvalue%5D"

The generated URL does not work: I get a 404. The original URL works in browsers, though.

Does anyone know how to get it to work?

Solution

According to the URI RFC (RFC 2396, section 2.4.3):

Other characters are excluded because gateways and other transport agents are known to sometimes modify such characters, or they are used as delimiters.

unwise = "{" | "}" | "|" | "\" | "^" | "[" | "]" | "`"

Data corresponding to excluded characters must be escaped in order to be properly represented within a URI.

Trusting a browser's response or its ability to handle a link is risky. Browsers do everything they can to return a page instead of enforcing the standards, so they are not authoritative sources on whether a page or URL is correctly defined.

RestClient's error probably comes from Ruby's URI library, which raised the same error when I tested parsing the URL with URI directly.
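
To illustrate, here is a minimal sketch of that test using only Ruby's standard URI library. The helper name parseable_uri? is made up for this example, and URI.encode_www_form_component stands in for the CGI::escape call from the question (both produce the same result for this particular string). It only shows that the escaped form is syntactically valid; whether the server answers it with something other than a 404 is a separate matter.

```ruby
require "uri"

# Hypothetical helper (not part of RestClient or URI):
# returns true if Ruby's URI library accepts the string, false otherwise.
def parseable_uri?(url)
  URI.parse(url)
  true
rescue URI::InvalidURIError
  false
end

raw_url = "http://example.com/some_string[value]"

# Percent-encode only the last path segment, as in the question.
escaped_url = "http://example.com/" + URI.encode_www_form_component("some_string[value]")

puts escaped_url                  # http://example.com/some_string%5Bvalue%5D
puts parseable_uri?(raw_url)      # the unescaped brackets are rejected
puts parseable_uri?(escaped_url)  # the escaped form is accepted
```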

I haven't ever seen a URL using unencoded "[" and "]" characters.

Licensed under: CC-BY-SA with attribution