Adding encoding in postscript, ghostscript renders text correctly, but converting to PDF does not show the characters

StackOverflow https://stackoverflow.com/questions/22557423

  •  18-06-2023

Question

We have to construct a PostScript file that contains Arabic text as well as English text.

Ghostscript displays the Arabic text correctly, but after converting the file to PDF the Arabic letters do not appear.

The PS file contains the following:

    /TraditionalArabic findfont dup
    length dict
    copy begin

    /Encoding Encoding 256 array copy def
    Encoding 1 /kafinitialarabic put
    Encoding 2 /behinitialarabic put
    Encoding 3 /yehmedialarabic put
    Encoding 4 /seenfinalarabic put
    Encoding 5 /eacute put
    Encoding 6 /a put

    /ArabicTradDict currentdict definefont pop

    end
    %%Page: 1 1
    %%BeginPageSetup
    %%PageMedia: Color Weight Type
    /xx {2.83464567 mul} def  % millimetres to points (72 / 25.4)
    << /MediaColor (Blue) /MediaWeight 75 /MediaType () /PageSize [240 xx 345 xx] >> setpagedevice
    %%EndPageSetup
    /ArabicTradDict  18 selectfont

    72 xx 300 xx moveto
    (\004\003\002\001) show
    showpage

To run Ghostscript from the command line so that it picks up all the Windows fonts:

    gswin64.exe -sFONTPATH=%windir%/fonts -dEmbedAllFonts=true

To convert the PS file to a PDF file, I run the following command:

    gswin64.exe -dBATCH -dNOPAUSE -sOutputFile=c:/Users/mob/Desktop/TimesNewRomanPSMT.pdf -sDEVICE=pdfwrite -dPDFSETTINGS=/prepress -dCompressFonts=false -dSubsetFonts=false -sFONTPATH=%windir%/fonts -dEmbedAllFonts=true -f c:/Users/mob/Desktop/TimesNewRomanPSMT.ps

After conversion to PDF, the Arabic characters are not displayed correctly; they appear as meaningless squares.

If I use the Adobe tool to convert to PDF, the resulting PDF is the same, except that eacute (\005), if included in the PS file, does show after conversion. With the Ghostscript command above, none of the characters added through the Encoding are shown correctly.

Any help with that?

Solution

Thanks to KenS's hints I was able to solve my problem. The encoding used glyph names such as kafinitialarabic that were wrong, in the sense that the font does not define them and so the PDF conversion could not resolve them. Every name ending in "arabic" was wrong, because the Traditional Arabic font does not use those names for its characters. To find out which names it does use, I converted the TTF font to AFM and PFA with the following command, which converts the TrueType font to a Type 42 font that is understood once embedded in the PostScript file at conversion to PDF:

    C:\Program Files\gs\gs9.10\bin>gswin64c.exe -dNODISPLAY -q -- ttf2pf.ps times timesPS timesAFM

where times is the TTF font name. I then checked the generated PFA file for the characters I wanted to add: instead of kafinitialarabic there was kafinitial, instead of kafmedialarabic there was kafmedial, and so on.
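The same name check can be done mechanically rather than by eye. The following is only a sketch in Python, which is not part of the original workflow. It assumes the AFM file written by the conversion lists each glyph in the standard `N name ;` field of its character-metric lines. The sample AFM text, its metrics, and the function names are made up for illustration.

```python
import re

def afm_glyph_names(afm_text):
    """Return the set of glyph names found in AFM character-metric lines."""
    # Metric lines are ';'-separated fields; the 'N' field carries the glyph name.
    return set(re.findall(r";\s*N\s+(\S+)\s*;", afm_text))

def missing_glyphs(wanted, afm_text):
    """Return the names from `wanted` that the AFM text does not define."""
    available = afm_glyph_names(afm_text)
    return [name for name in wanted if name not in available]

# Hypothetical excerpt; a real file would come from the conversion step.
SAMPLE_AFM = """StartCharMetrics 3
C -1 ; WX 500 ; N kafinitial ; B 0 0 500 700 ;
C -1 ; WX 480 ; N kafmedial ; B 0 0 480 700 ;
C 97 ; WX 444 ; N a ; B 0 0 444 460 ;
EndCharMetrics
"""

print(missing_glyphs(["kafinitialarabic", "kafinitial", "a"], SAMPLE_AFM))
# prints ['kafinitialarabic']
```

Any name reported as missing would need to be replaced in the Encoding by a name the font actually defines.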

Adding those names to the encoding works fine now. However, I would like to avoid adding all those characters to the dictionary and instead use the font the way one normally does with setfont in PostScript, if that is possible.
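This does not give the transparent font use asked about here, but the hand-written Encoding entries can at least be generated. The sketch below, in Python, assigns each distinct glyph name a character code and emits the matching `Encoding ... put` lines plus the octal-escaped `show` string. The function name and the choice to start at code 1 are assumptions made for illustration, and apart from kafinitial the glyph names in the example are guesses that would have to be checked against the font.

```python
def encoding_lines(glyph_names, first_code=1):
    """Build PostScript Encoding entries and a show string for glyph names.

    glyph_names is the sequence of glyphs in the order they should be shown.
    """
    codes = {}
    for name in glyph_names:
        if name not in codes:
            codes[name] = first_code + len(codes)
    if codes and max(codes.values()) > 255:
        raise ValueError("an Encoding array only has 256 slots")
    lines = [f"Encoding {code} /{name} put" for name, code in codes.items()]
    # Octal escapes keep every code safe inside a PostScript string literal.
    text = "".join(f"\\{codes[name]:03o}" for name in glyph_names)
    return lines, f"({text}) show"

lines, show = encoding_lines(["seenfinal", "yehmedial", "behinitial", "kafinitial"])
print("\n".join(lines))
print(show)
```

The generated lines go between the `/Encoding Encoding 256 array copy def` line and the `definefont` call, in place of the hand-written entries.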

OTHER TIPS

As already suggested, you need to ensure the glyph names you use are in the font you use, or create a new font.

I haven't found anything that will choose the correct glyph from the set of initial, medial, final, isolated, depending on context, though.

I resorted to writing a program which takes Unicode Arabic, reverses the Arabic characters, and then decides which form of each character to use based on its position in a word and on whether the previous or next characters are forced into isolated or final forms. Unfortunately I had to embed quite a lot of intrinsic knowledge about the font in use and the glyph names it has, including the typos in them, into the program.

If that's of interest, I've put it on GitHub. It is very raw and at an early stage, but it does work.
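To make the idea concrete, here is a minimal Python sketch of that kind of form selection. It is not the code from the linked repository. It covers only six letters, treats alef and dal as letters that never join to the following letter, and builds glyph names as base name plus form (for example kafinitial), which is an assumption; a real font may use different or misspelled names.

```python
# Hypothetical base names for a handful of letters; extend as needed.
LETTERS = {
    "\u0643": "kaf",
    "\u0628": "beh",
    "\u064A": "yeh",
    "\u0633": "seen",
    "\u0627": "alef",
    "\u062F": "dal",
}
# Letters that connect to the previous letter but never to the next one.
RIGHT_JOINING_ONLY = {"alef", "dal"}

def shape_word(word):
    """Return glyph names for one Arabic word, reversed for left-to-right show."""
    names = [LETTERS[ch] for ch in word]
    glyphs = []
    for i, name in enumerate(names):
        joins_prev = i > 0 and names[i - 1] not in RIGHT_JOINING_ONLY
        joins_next = i < len(names) - 1 and name not in RIGHT_JOINING_ONLY
        if joins_prev and joins_next:
            form = "medial"
        elif joins_prev:
            form = "final"
        elif joins_next:
            form = "initial"
        else:
            form = "isolated"
        glyphs.append(name + form)
    # PostScript show draws left to right, so emit the glyphs in reverse order.
    return glyphs[::-1]

print(shape_word("\u0643\u0628\u064A\u0633"))
# prints ['seenfinal', 'yehmedial', 'behmedial', 'kafinitial']
```

A fuller version would also need ligatures such as lam-alef, diacritics, and mixed Arabic and Latin runs, none of which this sketch attempts.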

https://github.com/gbjk/arabic2ps

The font I used was a Traditional Arabic font with quite a few idiosyncrasies.

Licensed under: CC-BY-SA with attribution
Not affiliated with StackOverflow