Question

I have a PostScript file with some Arabic text, and Ghostscript does not render the text correctly. Even converting the PostScript to PDF gives the same result.

The text in the PS file is readable and correct. The font used is Andalus; Ghostscript finds the font with no problem but renders the text wrongly. We get meaningless characters (squares or symbols) instead of the Arabic ones, and the same characters appear when we convert the PostScript file to PDF.

PostScript snippet:

    /Andalus findfont 20 scalefont setfont
    100.00 xx  320.00 xx  moveto
    (WELCOME Mr. رانيا) show
    %%EndPage

I do not know what is wrong with it. I have tried many different Arabic fonts, and none of them worked. Is the problem with the way we are writing the Arabic text in PostScript?

Any help is appreciated.

Solution

You normally can't just put Arabic text into a PostScript string and expect sensible output; the font has to be re-encoded first.

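PostScript strings are really just sequences of bytes. For Latin fonts and text it so happens that the standard encoding matches ASCII, so you can simply write ASCII text. For almost any non-Latin script this won't work. If the file is saved as UTF-8, for instance, each Arabic letter occupies two bytes, and the interpreter treats each byte as a separate character code.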

PostScript renders glyphs by a somewhat arcane but very flexible method. We'll ignore CIDFonts and Type 0 fonts for now, as they complicate matters.

When told to show a string, the interpreter takes each byte from the string individually and uses that byte value as an index into the font's Encoding array. The entry it finds there is a name object representing a particular glyph. It then looks up that name in the font's CharStrings dictionary; the result is a procedure, which the interpreter runs to draw the glyph.
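
The two lookups can be tried by hand at the interpreter prompt. This is only a sketch: it assumes a font called Helvetica (or whatever the interpreter substitutes for it) is available and that the font has a CharStrings dictionary, as Type 1 and Type 42 fonts do.

    % Assumption: /Helvetica resolves to a simple Type 1 or Type 42 font.
    /Helvetica findfont                % the font dictionary
    dup /Encoding get 65 get           % byte value 65 -> glyph name
    dup ==                             % should print /A if the font uses StandardEncoding
    exch /CharStrings get exch known   % does CharStrings have an entry of that name?
    ==                                 % prints true when the glyph exists

A byte whose Encoding entry is /.notdef, or whose name is missing from CharStrings, is what typically shows up as an empty box or a wrong symbol.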

Now for a simple piece of Arabic you can probably get away with a simple Encoding. Encoding arrays are limited to 256 entries (one per byte value, 0 to 255), so you can't have more glyphs than that from a single instance of a font. If you need more than 256 characters then you need a more complex structure, a CIDFont.

It's up to you to re-encode the font so that the glyphs you want to use sit at the Encoding positions you want to reference them by. I don't speak Arabic, so I can't help you with the details of that.
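
As a rough sketch of the usual re-encoding idiom: copy the font dictionary (leaving out FID), give the copy its own Encoding array, put the glyph names you need at the byte codes you choose, and register the result under a new name. The new font name Andalus-Re, the byte codes 16#80 and 16#81, and the glyph names /uni0631 and /uni0627 are assumptions made up for illustration; the names must be ones that actually exist in the font's CharStrings dictionary.

    % Hypothetical names and codes throughout; adjust to the real font.
    /Andalus findfont
    dup length dict begin
      % copy every entry of the original font except FID
      { 1 index /FID ne { def } { pop pop } ifelse } forall
      % private copy of the Encoding so the original font is untouched
      /Encoding Encoding 256 array copy def
      Encoding 16#80 /uni0631 put      % assumed glyph name for reh
      Encoding 16#81 /uni0627 put      % assumed glyph name for alef
      currentdict
    end
    /Andalus-Re exch definefont pop

    /Andalus-Re findfont 20 scalefont setfont
    100 320 moveto
    <8180> show                        % hex string: byte 16#81, then byte 16#80

Note that show paints glyphs from left to right in the order the bytes appear, so right-to-left text has to be written in the string in visual order.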

However, I do know that Arabic letters can have up to four contextual forms (isolated, initial, medial and final), so you probably need several times more glyphs than a non-Arabic speaker might expect in order to cover the full range.
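
To find out which glyphs, and which of those forms, a particular font provides, you can print the keys of its CharStrings dictionary. A minimal sketch, assuming the font is installed under the name Andalus and has a CharStrings dictionary:

    % Print every glyph name the font defines, one per line.
    /Andalus findfont /CharStrings get
    { pop == } forall                  % discard each value, print each key

The names printed are the ones that can be placed in an Encoding array; how the forms are named varies from font to font.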

Rather than try to present a complete tutorial here, I recommend you read the articles by John Deubert available at the Acumen Training site http://www.acumentraining.com/acumenjournal.html, in particular the articles from November and December 2001 dealing with re-encoding fonts.

Licensed under: CC-BY-SA with attribution