Question

If there is a long url, i want to generate a short url like those in twitter, is there some way to implement in ruby?

Thank you in advance.

Was it helpful?

Solution

The easiest way is to:

  1. keep a database of all URLs
  2. when you insert a new URL into the database, find out the id of the auto-incrementing integer primary key.
  3. encode that integer into base 36 or 62 (digits + lowercase alpha or digits + mixed-case alpha). Voila! You have a short url!

Encoding to base 36/decoding from base 36 is simple in Ruby:

12341235.to_s(36)
#=> "7cik3"

"7cik3".to_i(36)
#=> 12341235

Encoding to base 62 is a bit tricker. Here's one way to do it:

module AnyBase
  ENCODER = Hash.new do |h,k|
    h[k] = Hash[ k.chars.map.with_index.to_a.map(&:reverse) ]
  end
  DECODER = Hash.new do |h,k|
    h[k] = Hash[ k.chars.map.with_index.to_a ]
  end
  def self.encode( value, keys )
    ring = ENCODER[keys]
    base = keys.length
    result = []
    until value == 0
      result << ring[ value % base ]
      value /= base
    end
    result.reverse.join
  end
  def self.decode( string, keys )
    ring = DECODER[keys]
    base = keys.length
    string.reverse.chars.with_index.inject(0) do |sum,(char,i)|
      sum + ring[char] * base**i
    end
  end
end

...and here it is in action:

base36 = "0123456789abcdefghijklmnopqrstuvwxyz"
db_id = 12341235
p AnyBase.encode( db_id, base36 )
#=> "7cik3"
p AnyBase.decode( "7cik3", base36 )
#=> 12341235

base62 = [ *0..9, *'a'..'z', *'A'..'Z' ].join
p AnyBase.encode( db_id, base62 )
#=> "PMwb"
p AnyBase.decode( "PMwb", base62 )
#=> 12341235

Edit

If you want to avoid URLs that happen to be English words (for example, four-letter swear words) you can use a set of characters that does not include vowels:

base31 = ([*0..9,*'a'..'z'] - %w[a e i o u]).join
base52 = ([*0..9,*'a'..'z',*'A'..'Z'] - %w[a e i o u A E I O U]).join

However, with this you still have problems like AnyBase.encode(328059,base31) or AnyBase.encode(345055,base31) or AnyBase.encode(450324,base31). You may thus want to avoid vowel-like numbers as well:

base28 = ([*'0'..'9',*'a'..'z'] - %w[a e i o u 0 1 3]).join
base49 = ([*'0'..'9',*'a'..'z',*'A'..'Z'] - %w[a e i o u A E I O U 0 1 3]).join

This will also avoid the problem of "Is that a 0 or an O?" and "Is that a 1 or an I?".

OTHER TIPS

I use the bitly gem. It's really simple.

gem install bitly

# Use api version 3 or get a deprecation warning
Bitly.use_api_version_3

# Create a client
bitly = Bitly.new(username, api_key)

# Call method shorten
bitly.shorten('http://www.google.com').short_url

Well you could generate short URLs by using the APIs of the so many url shortening services. Almost all of the services that are out there provide an API for you to be able to invoke and shorten the url's, that is exactly how the twitter clients do as well. You should check out the particular url shortening service's website for further details.

If you want to create such a service on your own, that also can be quite simple, all you need to do is essentially maintain an internal mapping (in a database) between the original long url and a special short url (generated by you). And when you receive a request for a particular short url, you should then be able to get the original long url from the database and redirect the user to the same.

For Ruby 2.0, replace decode method by:

def self.decode( string, keys )
  ring = DECODER[keys]
  base = keys.length
  string.reverse.chars.map.with_index.inject(0) do |sum,(char,i)|
    sum + ring[char] * base**i
  end
end
Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow
scroll top