Is there a formally documented encoding scheme (like Base64) that does not include visually similar characters?

https://stackoverflow.com/questions/18543514

26-06-2022

Question

I'm writing up a formal proposal. Part of it requires creating completely random UUIDs (for privacy reasons) and encoding them into a compressed human-readable/writable format, like Base64.

However, Base64 permits visually confusable characters. I want the encoding to permit, for example, only one of [digit 1, lowercase and uppercase letter I, lowercase letter L] and only one of [digit 0, lowercase and uppercase letter O].

Does such an encoding already exist (formally documented)? I know it's more or less trivial to create a new one that does this, but I would prefer to refer to an extant standard if possible.

Solution

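Yes: Base32. The keys come out somewhat longer than with Base64, though not twice as long. Base32 carries 5 bits per character against 6 for Base64, so a 128-bit UUID needs 26 characters instead of 22 (padding omitted). The standard alphabet in RFC 4648 is A-Z plus 2-7, a single letter case with the digits 0 and 1 left out, so there is nothing for O or I to be confused with.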

Base32 is a notation for encoding arbitrary byte data using a restricted set of symbols which can be conveniently used by humans and processed by old computer systems which only recognize restricted character sets.

http://en.wikipedia.org/wiki/Base32
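To make the length concrete, here is a minimal sketch using Python's standard `base64` and `uuid` modules. The helper names `encode_uuid` and `decode_uuid` are made up for illustration, and stripping the `=` padding is my own choice for readability, not something the answer prescribes.

```python
import base64
import uuid


def encode_uuid(u: uuid.UUID) -> str:
    """Encode a UUID as Base32 (RFC 4648 alphabet) with the '=' padding stripped."""
    return base64.b32encode(u.bytes).decode("ascii").rstrip("=")


def decode_uuid(text: str) -> uuid.UUID:
    """Inverse of encode_uuid: restore the padding that b32decode expects."""
    padded = text.upper() + "=" * (-len(text) % 8)
    return uuid.UUID(bytes=base64.b32decode(padded))


key = uuid.uuid4()            # completely random UUID, as in the question
token = encode_uuid(key)      # 26 characters drawn from A-Z and 2-7
assert decode_uuid(token) == key
print(token, len(token))
```

Upper-casing the input in `decode_uuid` lets a person type the key in either case, which is safe because the alphabet only contains one letter case.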

OTHER TIPS

Another option is NewBase60 (via Cory Schmunsler on G+): http://tantek.pbworks.com/w/page/19402946/NewBase60

(This isn't exactly an official RFC type encoding, though.)
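For comparison, a rough sketch of that kind of base-60 conversion applied to a UUID's integer value. The alphabet string below is how I understand the linked page (digits, uppercase without I and O, underscore, lowercase without l), so treat it as an assumption and check it against the page. The function names are hypothetical.

```python
import uuid

# Assumed NewBase60 alphabet; verify against the linked page before relying on it.
NEWBASE60 = "0123456789ABCDEFGHJKLMNPQRSTUVWXYZ_abcdefghijkmnopqrstuvwxyz"


def int_to_base60(n: int) -> str:
    """Convert a non-negative integer to a base-60 string, most significant digit first."""
    if n < 0:
        raise ValueError("n must be non-negative")
    if n == 0:
        return NEWBASE60[0]
    digits = []
    while n:
        n, rem = divmod(n, 60)
        digits.append(NEWBASE60[rem])
    return "".join(reversed(digits))


def base60_to_int(text: str) -> int:
    """Inverse of int_to_base60; raises ValueError on characters outside the alphabet."""
    n = 0
    for ch in text:
        n = n * 60 + NEWBASE60.index(ch)
    return n


key = uuid.uuid4()
token = int_to_base60(key.int)
assert uuid.UUID(int=base60_to_int(token)) == key
print(token)
```

Unlike the Base32 version, this one is case-sensitive, and the output length varies with the value because leading zero digits are not emitted.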

Licensed under: CC-BY-SA with attribution

Not affiliated with StackOverflow