Question

I can't find any information about this in the documentation. The function that is used all the time is FT_Get_Char_Index(ftFace, i), and it takes only one Unicode code point. But what about combined characters, the ones that take more than one code point?

I understand that some of these characters can simply be Unicode-normalized into another (single) code point, but there are some characters that can't, right? How do we deal with those?

I am making general-purpose text editing software and would like it to support all of Unicode. But maybe this problem with multi-code-point characters is so small that it is not worth the trouble? Aren't such characters pretty common in some major Asian languages?

If it cannot be done properly with FreeType, how should one do it?


Solution

If I understand correctly, you won't be able to do what you want with FreeType alone; you also need some kind of layout library. A layout library uses supplemental information in OpenType fonts to position combining marks (among many other things). For example, let's say you have the sequence U+0041, U+0301 (LATIN CAPITAL LETTER A, COMBINING ACUTE ACCENT). In general you can't just slap these two glyphs together with their default positioning within the rendering space, because the acute accent will collide with the shape of the 'A'.
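To make the sequence above concrete, here is a minimal sketch in plain C that decodes UTF-8 text into code points. The helper names `utf8_decode` and `utf8_to_codepoints` are made up for this illustration and are not part of FreeType. The decoder is deliberately simplified: it checks continuation bytes but does not reject overlong encodings or surrogates.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: decode one UTF-8 sequence starting at s, with at
 * most n bytes available. Stores the code point in *cp and returns the
 * number of bytes consumed, or 0 on malformed or truncated input.
 * Simplified: overlong forms and surrogates are NOT rejected. */
static size_t utf8_decode(const unsigned char *s, size_t n, uint32_t *cp)
{
    assert(s != NULL && cp != NULL);
    if (n == 0) return 0;

    size_t len;
    uint32_t c = s[0];
    if (c < 0x80)                { len = 1; }
    else if ((c & 0xE0) == 0xC0) { len = 2; c &= 0x1F; }
    else if ((c & 0xF0) == 0xE0) { len = 3; c &= 0x0F; }
    else if ((c & 0xF8) == 0xF0) { len = 4; c &= 0x07; }
    else return 0;               /* stray continuation or invalid lead byte */

    if (len > n) return 0;       /* sequence runs past the end of the buffer */
    for (size_t i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80) return 0;
        c = (c << 6) | (uint32_t)(s[i] & 0x3F);
    }
    *cp = c;
    return len;
}

/* Hypothetical helper: decode a whole buffer into out[] (capacity cap).
 * Returns the number of code points, or (size_t)-1 on malformed input
 * or if out[] is too small. */
static size_t utf8_to_codepoints(const unsigned char *s, size_t n,
                                 uint32_t *out, size_t cap)
{
    size_t count = 0;
    while (n > 0) {
        uint32_t cp;
        size_t used = utf8_decode(s, n, &cp);
        if (used == 0 || count == cap) return (size_t)-1;
        out[count++] = cp;
        s += used;
        n -= used;
    }
    return count;
}
```

Fed the three bytes 0x41 0xCC 0x81, this yields two code points, U+0041 and U+0301. Each one would be looked up separately with FT_Get_Char_Index, so you end up with two glyphs for what the user sees as a single character, and nothing at this stage says where the second glyph belongs relative to the first.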

A layout library will analyze the font's OpenType layout tables for this sequence and return positioning information that can be used to place each item of the sequence properly. For example, given the sequence above, it might say, "leave the 'A' where it is; shift the acute 50 units rightward and 90 units upward" (it's considerably more complicated than this, but hopefully this gives you the idea).
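As a rough sketch of how such positioning information might be consumed, the C code below walks a run of already-shaped glyphs and computes where to draw each one. The `ShapedGlyph` and `Point` types and the `place_glyphs` function are hypothetical and exist only for this illustration. The layout of advance plus offset per glyph is modelled loosely on what shaping engines such as HarfBuzz report, and the convention assumed here is that an offset is relative to the current pen position and does not move the pen. Coordinates are in font units with y pointing up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record for one shaped glyph (illustration only). */
typedef struct {
    unsigned glyph_index; /* glyph ID inside the font, not a code point */
    int x_advance;        /* how far the pen moves after this glyph */
    int y_advance;
    int x_offset;         /* shift applied when drawing; pen is unaffected */
    int y_offset;
} ShapedGlyph;

typedef struct {
    int x;
    int y;
} Point;

/* Compute the drawing origin of each glyph in the run.
 * out[] must have room for count entries. */
static void place_glyphs(const ShapedGlyph *glyphs, size_t count,
                         Point origin, Point *out)
{
    assert(glyphs != NULL && out != NULL);
    int pen_x = origin.x;
    int pen_y = origin.y;
    for (size_t i = 0; i < count; i++) {
        /* Draw at the pen position plus this glyph's offset... */
        out[i].x = pen_x + glyphs[i].x_offset;
        out[i].y = pen_y + glyphs[i].y_offset;
        /* ...then advance the pen for the next glyph. */
        pen_x += glyphs[i].x_advance;
        pen_y += glyphs[i].y_advance;
    }
}
```

With made-up numbers, suppose the 'A' has an advance of 600 units and no offset, and the acute has zero advance and an offset of (-550, 90). Starting from the origin, the 'A' is drawn at (0, 0) and the acute at (50, 90), which is the "50 units rightward and 90 units upward" from the example above. Because the mark has zero advance, the text that follows starts at x = 600, as if the mark were not there.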

Licensed under: CC-BY-SA with attribution