I have a base PDF file and want to set its title to Chinese text (UTF-8) using Ghostscript and pdfmark, with a command like the one below:
gs -dBATCH -dNOPAUSE -sDEVICE=pdfwrite -sOutputFile=result.pdf base.pdf pdfmarks
The pdfmarks file (encoded as UTF-8 without a BOM) is below:
[ /Title (敏捷开发)
/Author (Larry Cai)
/Producer (xdvipdfmx (0.7.8))
/DOCINFO pdfmark
The command runs successfully, but when I check the properties of result.pdf, the title shows up as æŁ‘æ“·å¼•å‘
How can I solve this? Are there any parameters for the gs command or for pdfmark that would help?
The PDF Reference states that the Title entry in the document info dictionary is of type 'text string'. Text strings are defined as using either PDFDocEncoding or UTF-16BE with a Byte Order Mark (see page 158 of the 1.7 PDF Reference Manual).
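So you cannot specify a Title using UTF-8 without a BOM. Without a UTF-16BE Byte Order Mark at the start, the bytes are interpreted as PDFDocEncoding, which is why each Chinese character shows up as several unrelated characters.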
I would imagine that if you replace the Title string with a string defining the content using UTF-16BE with a BOM then it will work properly. I would suggest you use a hex string rather than a regular PostScript string to specify the data, simply for ease of use.
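As a rough sketch of that suggestion, the hex string can be generated with a few lines of Python rather than by hand. The helper name pdf_text_hex is made up for this illustration and is not part of Ghostscript or pdfmark; it encodes the text as UTF-16BE, prepends the BOM (FE FF), and wraps the hex digits in the angle brackets PostScript uses for hex strings.

```python
import codecs


def pdf_text_hex(text: str) -> str:
    """Return `text` as a PostScript hex string: BOM + UTF-16BE bytes.

    Hypothetical helper for illustration only.
    """
    # "utf-16-be" does not emit a BOM on its own, so prepend it explicitly.
    data = codecs.BOM_UTF16_BE + text.encode("utf-16-be")
    return "<" + data.hex().upper() + ">"


if __name__ == "__main__":
    # Prints the value to paste after /Title in the pdfmarks file.
    print(pdf_text_hex("敏捷开发"))  # <FEFF654F63775F0053D1>
```

Assuming the rest of the pdfmarks file stays as in the question, the Title line would then read /Title <FEFF654F63775F0053D1> in place of /Title (敏捷开发). I have not run this against the original base.pdf, so treat it as a starting point.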