Fix PDF encoding [closed]

Question

I have Arabic PDF files, and it seems that there is something wrong with their encoding.

When I search the PDF for a word that appears in it, no results are found.

When I export the PDF contents to Excel using other programs, the data is exported in a strange encoding.

When I copy the data from the PDF into Notepad, Notepad displays the same strange encoding.

I am developing a solution that will use these PDFs (about 950 files), so I must find a way to fix the encoding.

Thanks in advance.

Joao Figueiredo · Accepted Answer

Disclaimer: I've never edited an Arabic file.

How did you export the .pdf contents to Excel?

You cannot directly open a .pdf file with Word, Excel, WordPad, or Notepad. The strange encoding you're seeing is most probably the specific encoding of a selected font resource.
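One rough way to check whether text copied or extracted from a PDF is real Arabic or font-specific gibberish is to measure how many of its letters fall in the Arabic Unicode blocks. The sketch below assumes the text has already been obtained somehow (copy/paste or any extraction tool); `arabic_ratio` is a hypothetical helper name, not part of any library, and what counts as "too low" is left to you.

```python
def arabic_ratio(text: str) -> float:
    """Return the fraction of alphabetic characters that lie in Arabic Unicode blocks."""
    # Arabic, Arabic Supplement, Arabic Presentation Forms-A and Forms-B
    arabic_ranges = (
        (0x0600, 0x06FF),
        (0x0750, 0x077F),
        (0xFB50, 0xFDFF),
        (0xFE70, 0xFEFF),
    )
    # Ignore digits, punctuation and whitespace: only letters are counted.
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        return 0.0
    arabic = sum(
        1
        for ch in letters
        if any(lo <= ord(ch) <= hi for lo, hi in arabic_ranges)
    )
    return arabic / len(letters)


if __name__ == "__main__":
    print(arabic_ratio("مرحبا"))    # text that really is Arabic
    print(arabic_ratio("ÇáÓáÇã"))   # Latin-looking garbage of the kind Notepad might show
```

A ratio near zero for a document you know to be Arabic suggests the extracted characters are font-specific codes rather than Unicode text, which would also explain why search fails.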

You can use this tool to detect the encoding, but I really advise you to read the bare minimum about Unicode and character sets.

From then on, considering the number of files involved, a good solution seems to be PyODConverter.
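With about 950 files, whichever converter you pick will have to be driven in a loop. The sketch below shows only that loop and is an assumption on my part: `plan_batch`, `run_batch`, and the `convert` callable are hypothetical names, and `convert` stands in for a wrapper you would write yourself around the converter you actually use. It makes no claim about PyODConverter's own API.

```python
from pathlib import Path
from typing import Callable, List, Tuple


def plan_batch(src_dir: str, dst_dir: str, suffix: str = ".txt") -> List[Tuple[Path, Path]]:
    """Pair every PDF under src_dir with an output path under dst_dir."""
    src = Path(src_dir)
    dst = Path(dst_dir)
    pairs = []
    for pdf in sorted(src.rglob("*")):
        # Match the extension case-insensitively (.pdf, .PDF, ...).
        if pdf.is_file() and pdf.suffix.lower() == ".pdf":
            # Mirror the source directory layout and swap the extension.
            out = dst / pdf.relative_to(src).with_suffix(suffix)
            pairs.append((pdf, out))
    return pairs


def run_batch(
    pairs: List[Tuple[Path, Path]],
    convert: Callable[[Path, Path], None],
) -> List[Tuple[Path, str]]:
    """Call convert(input, output) for each pair; return the failures."""
    failures = []
    for pdf, out in pairs:
        out.parent.mkdir(parents=True, exist_ok=True)
        try:
            convert(pdf, out)  # hypothetical wrapper around your chosen tool
        except Exception as exc:
            # Record the failure and keep going so one bad file does not stop the run.
            failures.append((pdf, str(exc)))
    return failures
```

Collecting failures instead of stopping at the first one means a single damaged PDF does not block the rest, and the returned list tells you which files need a second look.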

For a smaller number of files, Free PDF to Word Converter will take care of your needs.

Tags:

character-encoding

pdf

M_1100
