 

Changing the font in a PDF

Tags: python, pdf, fonts

I have a repository of PDF documents, and most of the text in these documents is formatted in Comic Sans. I would like to change this to something similar to Arial. The original font is embedded in the documents. I haven't found any existing tool that does this (I'm on Linux), and I wonder whether it's possible to do it programmatically. A Python library would be perfect, but a library in any programming language would do.

In which library will I be able to substitute fonts with the least effort? And which parts of the API would I use?

user1390113 asked Nov 13 '22 16:11



1 Answer

There are commercial tools that can do this, one of which is pdfToolbox from callas software (warning: I'm affiliated with this company).

However, even though this functionality exists and is sometimes used, the results are often completely undesirable. I have rarely seen it used on anything beyond very specific files, and usually with limited success. That is why, in the tool I mentioned, this replacement is only available as a manual operation and not in automatic mode.

Depending on how complex these files are, you would probably have more success extracting all the text from the documents into something like RTF, doing whatever manipulation you need there, and regenerating the PDF afterwards. It sounds like a roundabout way, but I'm guessing the result will be better in most cases.
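To make the extract-and-regenerate route a bit more concrete, here is a minimal Python sketch. It is only an illustration under several assumptions, not a tested workflow: the function names are hypothetical, text extraction is assumed to go through the third-party pypdf package, and all layout, images and per-run formatting from the original PDF are lost. Only the RTF-writing part relies on nothing but the standard library.

```python
def rtf_escape(text):
    """Escape plain text for use inside an RTF document body."""
    out = []
    for ch in text:
        if ch in "\\{}":
            out.append("\\" + ch)          # RTF control characters
        elif ch == "\n":
            out.append("\\line ")          # line break inside a paragraph
        elif ord(ch) < 128:
            out.append(ch)
        else:
            # Non-ASCII: emit \uN per UTF-16 code unit (N is a signed
            # 16-bit value), followed by "?" as the ASCII fallback.
            units = ch.encode("utf-16-le")
            for i in range(0, len(units), 2):
                code = int.from_bytes(units[i:i + 2], "little", signed=True)
                out.append("\\u%d?" % code)
    return "".join(out)


def paragraphs_to_rtf(paragraphs, font="Arial", size_pt=11):
    """Build a minimal RTF document that sets every paragraph in one font."""
    header = (
        "{\\rtf1\\ansi\\deff0"
        "{\\fonttbl{\\f0\\fswiss " + font + ";}}"
        "\\f0\\fs%d " % (size_pt * 2)      # \fs is given in half-points
    )
    body = "\\par\n".join(rtf_escape(p) for p in paragraphs)
    return header + body + "}"


def extract_paragraphs(pdf_path):
    """Pull plain text out of a PDF (assumes pypdf is installed)."""
    from pypdf import PdfReader            # third-party, imported lazily
    paragraphs = []
    for page in PdfReader(pdf_path).pages:
        text = page.extract_text() or ""
        # Assumption: blank lines separate paragraphs in the extracted text.
        paragraphs.extend(b.strip() for b in text.split("\n\n") if b.strip())
    return paragraphs


if __name__ == "__main__":
    import sys
    # Usage: python pdf_to_rtf.py input.pdf output.rtf
    rtf = paragraphs_to_rtf(extract_paragraphs(sys.argv[1]))
    with open(sys.argv[2], "w", encoding="ascii") as fh:
        fh.write(rtf)
```

The resulting RTF file could then be converted back to PDF with a word processor; on Linux, one option is LibreOffice in headless mode (`soffice --headless --convert-to pdf output.rtf`). How well this works depends heavily on how much of the original layout matters.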

David van Driessche answered Nov 15 '22 06:11
