Unicode characters not showing properly on a French JSP page
Question
I am using the UTF-16 code "\u2013" in my Java properties file to display a dash on my page. The page is in French. At this link: http://www.fileformat.info/info/unicode/char/2013/index.htm I see that they call this an 'en-dash'. What is an en-dash? A dash should be the same in English and French, I think.
On the screen, it shows up as a question mark.
What am I missing here?
Solution
The en dash is unrelated to the English language. It is named after its width: a dash that is 1en wide.
There is also the em dash, whose width is 1em.
Traditionally, 1en is the width of the letter n and 1em is the width of the letter m. The former is half the latter. Their names in French typography are, respectively, tiret demi-cadratin and tiret cadratin. Demi means half and tiret means dash, so it is quite clear that one is half the other.
Uses:
- "09:00 - 17:00" is a range and the dash should be an en dash
- "and Paris - the legendary figure of the Trojan War, not the capital of France - said to Hector" should use em dashes where parentheses could have been used
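The two uses above can be sketched in Java, the language of the question. This is only an illustration: the class name Dashes and its helper methods are made up for this example, and the spacing around the dashes simply follows the examples above rather than any typographic rule. Code points are printed as U+XXXX so the output does not depend on the console encoding.

```java
final class Dashes {
    // Unicode code points of the dashes discussed above.
    static final char EN_DASH = '\u2013'; // tiret demi-cadratin
    static final char EM_DASH = '\u2014'; // tiret cadratin

    // A range such as opening hours takes an en dash.
    static String range(String from, String to) {
        return from + " " + EN_DASH + " " + to;
    }

    // A parenthetical aside is set off by a pair of em dashes.
    static String aside(String before, String inserted, String after) {
        return before + " " + EM_DASH + " " + inserted + " " + EM_DASH + " " + after;
    }

    // Formats a character as U+XXXX, independent of the console encoding.
    static String codePoint(char c) {
        return String.format("U+%04X", (int) c);
    }

    public static void main(String[] args) {
        System.out.println(codePoint(EN_DASH)); // U+2013
        System.out.println(codePoint(EM_DASH)); // U+2014
        // The range string holds the en dash as a single char.
        System.out.println(codePoint(range("09:00", "17:00").charAt(6))); // U+2013
    }
}
```

Neither character has anything language-specific about it: the same code points are used on an English or a French page.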
Other facts:
- em is also a relative unit in CSS
- the respective HTML entities are &ndash; and &mdash;
As for the question mark displayed: is the font used able to display those glyphs? The plain hyphen-minus (-) is an acceptable fallback if the font lacks any other dash.
OTHER TIPS
This problem is caused by not using UTF-8, an encoding that supports the characters of all languages. Convert text to UTF-8 before displaying it, whatever the language. For example:
For the string (Votre compte à été activé) to display exactly as written, it must be converted to UTF-8. Otherwise the accented characters may come out garbled.
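// Sample French text containing accented characters
$text = 'Comment utiliser du texte français en php ex: Prénom';
// Detect the source encoding in strict mode, trying UTF-8 before ISO-8859-1
$enc = mb_detect_encoding($text, "UTF-8,ISO-8859-1", true);
// Convert the text from the detected encoding to UTF-8
$changewords = iconv($enc, "UTF-8", $text);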
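Since the question is about Java, here is a rough Java sketch of the same point. It assumes, without knowing the asker's actual setup, that the page is being written out as ISO-8859-1. The class name EncodingDemo and the property key label.separator are invented for the illustration. The sketch loads "\u2013" the way a properties file would supply it, then round-trips it through ISO-8859-1 and through UTF-8.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Properties;

final class EncodingDemo {
    // Loads a one-line, in-memory properties file (hypothetical key name).
    // Properties.load turns the six-character escape into the en dash itself.
    static String loadDash() {
        Properties props = new Properties();
        try {
            props.load(new StringReader("label.separator=\\u2013"));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props.getProperty("label.separator");
    }

    // Encodes the text to bytes and decodes it again with the same charset.
    // Characters the charset cannot represent are replaced during encoding.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        String dash = loadDash();
        // The en dash has no ISO-8859-1 representation and becomes '?'.
        System.out.println(roundTrip(dash, StandardCharsets.ISO_8859_1));
        // UTF-8 can encode it, so the code point survives.
        String kept = roundTrip(dash, StandardCharsets.UTF_8);
        System.out.println(String.format("U+%04X", (int) kept.charAt(0)));
    }
}
```

This would also explain why French accented letters such as à and é can look fine on the same page: they exist in ISO-8859-1, whereas the en dash does not. If this is indeed the cause, the value in the properties file is correct and the fix belongs in the encoding the page is served with.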