Question

Perl 5 has a module on CPAN named Text::Unidecode that transliterates Unicode into ASCII. So, for instance, if you hand it the string "“北亰 — it’s the best”" it hands back the string "\"Bei Jing -- it's the best\"". A quick search for Java libraries to do the same thing only turned up code that would strip Unicode characters or turn accented characters into non-accented characters.
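
For reference, the accent-stripping approach mentioned above usually looks something like the sketch below. It uses only the JDK's java.text.Normalizer; the class and method names (AccentStripDemo, stripAccents, stripNonAscii) are made up for illustration. It shows why this falls short of Text::Unidecode: accented Latin letters survive as their base letters, but CJK characters have nothing to decompose into and are simply dropped.

```java
import java.text.Normalizer;

// Hypothetical demo class: JDK-only accent stripping, NOT transliteration.
class AccentStripDemo {

    // Decompose to NFD, then remove combining marks (Unicode category M),
    // so "e with acute" becomes a plain "e".
    static String stripAccents(String input) {
        String decomposed = Normalizer.normalize(input, Normalizer.Form.NFD);
        return decomposed.replaceAll("\\p{M}+", "");
    }

    // Additionally delete every remaining non-ASCII character.
    // This loses information: CJK text disappears instead of being romanized.
    static String stripNonAscii(String input) {
        return stripAccents(input).replaceAll("[^\\p{ASCII}]", "");
    }

    public static void main(String[] args) {
        // "caf\u00e9" is "cafe" with an acute accent on the final e.
        System.out.println(stripAccents("caf\u00e9"));
        // "\u5317\u4eb0" are the two CJK characters from the question.
        System.out.println("[" + stripNonAscii("\u5317\u4eb0 caf\u00e9") + "]");
    }
}
```

A transliteration library, by contrast, maps each character to an ASCII approximation through lookup tables, which is what produces "Bei Jing" in the Perl example.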

Does anyone know of a Java library that produces similar output to Text::Unidecode?

Solution

A quick Google search turns up JUnidecode (http://junidecode.sourceforge.net/), but it looks like it hasn't been updated for a while.

OTHER TIPS

There is another library for Java: unidecode.

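Use with Gradle (on Gradle 7 or later, replace compile with implementation, since the compile configuration was removed):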

compile 'cz.jirutka.unidecode:unidecode:1.0.1'

Use with Maven:

<dependency>
    <groupId>cz.jirutka.unidecode</groupId>
    <artifactId>unidecode</artifactId>
    <version>1.0.1</version>
</dependency>
Licensed under: CC-BY-SA with attribution