Why are URLs encoded in Base32?

https://stackoverflow.com/questions/11016772

14-06-2021

Question

This is a really short question, I think, but I'm not sure I understand the point of it.

Why are URLs encoded in Base32? What are the benefits of it and what are the drawbacks of it?

Solution

Sometimes URL data needs to be encoded to encapsulate things that aren't easily typeable, such as "ÓĆ", or even binary data that has no text representation at all. Putting that directly inside a query string is problematic: some servers don't understand Unicode text in a query string, though that situation is certainly getting better.

So the data needs to be encoded in a way that the server can interpret correctly and the application knows how to use. Base32 is commonly used for that. It encodes any binary data into an ASCII text representation of that data. When the original data is needed, it is decoded.
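As a rough illustration of that round trip, here is a minimal Python sketch using the standard library's `base64` module. The helper names `encode_for_url` and `decode_from_url` are made up for this example, and stripping the `=` padding is an assumption about what is convenient in a URL, not something Base32 requires.

```python
import base64


def encode_for_url(data: bytes) -> str:
    """Encode arbitrary bytes as Base32 text (hypothetical helper)."""
    text = base64.b32encode(data).decode("ascii")
    # Assumption: drop the '=' padding so the token is friendlier in a URL.
    return text.rstrip("=")


def decode_from_url(text: str) -> bytes:
    """Reverse encode_for_url by restoring the padding before decoding."""
    # Base32 text is padded to a multiple of 8 characters.
    padding = "=" * (-len(text) % 8)
    return base64.b32decode(text + padding)


if __name__ == "__main__":
    original = "ÓĆ".encode("utf-8")   # non-ASCII text as raw bytes
    token = encode_for_url(original)   # plain ASCII letters and digits
    print(token)
    print(decode_from_url(token).decode("utf-8"))
```

The token contains only the letters A-Z and the digits 2-7, so it can be dropped into a query string without any server needing to understand Unicode.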

So why not Base64? Base64 will almost always have a shorter encoding length. Base64's weakness is that it uses both upper- and lowercase letters for encoding, so there is a distinction between A and a. Base32 uses only one letter case, so it can be treated as case insensitive. Parts of a URL (the scheme and host name) are case insensitive, and many servers treat the rest that way too, though not all do; using Base32 keeps that notion alive. This distinction is also useful when the encoded data is meant to be typed, read aloud, etc.
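To make the case point concrete, here is a small sketch, again with Python's standard `base64` module. The function name `decode_case_insensitive` is just for illustration; the `casefold=True` flag is what tells `b32decode` to accept lowercase input.

```python
import base64


def decode_case_insensitive(text: str) -> bytes:
    """Decode Base32 text regardless of letter case (hypothetical helper)."""
    return base64.b32decode(text, casefold=True)


if __name__ == "__main__":
    # Base32: upper- and lowercase spellings decode to the same bytes.
    print(decode_case_insensitive("NBSWY3DP"))
    print(decode_case_insensitive("nbswy3dp"))

    # Base64: changing the case changes the data.
    print(base64.b64decode("aGVsbG8="))  # the original bytes
    print(base64.b64decode("AGVSBG8="))  # different bytes after uppercasing
```

If something along the way uppercases or lowercases the URL, or a person retypes it, the Base32 token still decodes to the same data, whereas the Base64 token silently decodes to something else.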

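The drawback to Base32 is that the resulting encoding is almost always longer due to its much smaller character set: each Base32 character carries 5 bits, whereas each Base64 character carries 6, so Base32 needs 8 characters for every 5 bytes where Base64 needs 4 characters for every 3 bytes.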
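A quick way to see the size difference is to encode the same bytes both ways and compare lengths. This is only a sketch; `encoded_lengths` is a name invented for the example, and both lengths include the standard `=` padding.

```python
import base64


def encoded_lengths(data: bytes) -> tuple[int, int]:
    """Return (Base32 length, Base64 length) for the same input bytes."""
    return len(base64.b32encode(data)), len(base64.b64encode(data))


if __name__ == "__main__":
    for size in (1, 15, 16, 100):
        b32_len, b64_len = encoded_lengths(bytes(size))
        print(f"{size} bytes -> Base32: {b32_len} chars, Base64: {b64_len} chars")
```

For a 16-byte value (the size of a UUID, say) that works out to 32 characters in Base32 versus 24 in Base64.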

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow