Question

I have UTF-8 encoded post titles which I'd rather show using the appropriate characters in slugs. An example is Amazon Japan's URL here.

How can any arbitrary string be converted to a safe URL slug such as this, with Ruby (or Rails)?

(There are some related PHP posts, but nothing I could find for Ruby.)


Solution

From reading here it seems like a solution is this:

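require 'open-uri' # also loads 'uri', where URI.encode is defined
# Arbitrary bytes, tagged as binary data
str = "\x12\x34\x56\x78\x9a\xbc\xde\xf1\x23\x45\x67\x89\xab\xcd\xef\x12\x34\x56\x78\x9a".force_encoding('ASCII-8BIT')
# URI.encode was removed in Ruby 3.0, so this line needs Ruby 2.7 or older
puts URI.encode(str)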

Here is the documentation for open-uri, and here is some info on UTF-8 encoded URL schemes.

EDIT: Having looked into this more, I noticed that encode is just an alias for URI.escape, which is documented here. An example taken from the docs is below:

require 'uri'

enc_uri = URI.escape("http://example.com/?a=\11\15")
p enc_uri
# => "http://example.com/?a=%09%0D"

p URI.unescape(enc_uri)
# => "http://example.com/?a=\t\r"

p URI.escape("@?@!", "!?")
# => "@%3F@%21"
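
On Ruby 3.0 and later, URI.escape and URI.encode no longer exist, so the calls above raise NoMethodError there. Below is a minimal sketch of the same percent-encoding using standard-library methods that are still available. The helper names encode_path_segment and encode_query_value are made up for illustration and are not part of any library.

```ruby
require 'erb'
require 'uri'

# Hypothetical helper: percent-encode text for use as a single URL path segment.
# ERB::Util.url_encode escapes every byte that is not an unreserved character.
def encode_path_segment(text)
  ERB::Util.url_encode(text)
end

# Hypothetical helper: encode text for a query-string value.
# URI.encode_www_form_component uses form encoding, so a space becomes "+".
def encode_query_value(text)
  URI.encode_www_form_component(text)
end

puts encode_path_segment("a b/c")  # => a%20b%2Fc
puts encode_query_value("a b/c")   # => a+b%2Fc

# Non-ASCII text is encoded byte by byte from its UTF-8 representation.
puts encode_path_segment("日本")    # => %E6%97%A5%E6%9C%AC
```

Which helper fits depends on where the string ends up: the first is meant for path segments such as slugs, the second for values after the "?".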

Let me know if this is what you were looking for.

EDIT #2: I was interested and kept looking a little more. According to the comments, Ryan Bates' RailsCasts episode on FriendlyId also seems to work with Chinese characters.
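
Without any gem, a slug that keeps the original characters could be sketched roughly as follows. This is only one possible approach, not how FriendlyId or Rails does it: unicode_slug and slug_path_segment are hypothetical helper names, and the rule "keep letters, combining marks and digits, turn everything else into a hyphen" is an assumption about what counts as a safe slug.

```ruby
require 'erb'

# Hypothetical helper: build a slug that keeps Unicode letters and digits.
def unicode_slug(title)
  title.unicode_normalize(:nfc)               # compose accented characters
       .downcase                              # no effect on scripts without case
       .gsub(/[^\p{L}\p{M}\p{N}]+/, '-')      # any other run of characters becomes one hyphen
       .gsub(/\A-+|-+\z/, '')                 # trim leading and trailing hyphens
end

# Hypothetical helper: percent-encode the slug so it is valid inside a URL path.
def slug_path_segment(title)
  ERB::Util.url_encode(unicode_slug(title))
end

puts unicode_slug("Hello, World!")            # => hello-world
puts unicode_slug("  日本語 の タイトル  ")      # => 日本語-の-タイトル
puts slug_path_segment("日本")                 # => %E6%97%A5%E6%9C%AC
```

The percent-encoded form is what goes into the href; browsers generally display such UTF-8 sequences as the original characters in the address bar.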

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow