I have Arabic PDF files, and something seems to be wrong with their encoding.
When I search a PDF for a word that it contains, no results are found.
When I export the PDF contents to Excel using other programs, the data comes out in a strange encoding.
When I copy text from the PDF into Notepad, Notepad displays it in a strange encoding.
I am developing a solution that will use these PDFs (about 950 files), so I must find a way to fix the encoding.
Thanks in advance.
Disclaimer: I've never edited an Arabic file.
How did you export the .pdf contents to Excel?
You cannot open a .pdf file directly with Word, Excel, WordPad, or Notepad. The strange encoding you're seeing is most probably the specific encoding of a selected font resource.
You can use this tool to detect the encoding,
but I really advise you to read the bare minimum about Unicode and character sets.
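To get a feel for what "strange encoding" means in Unicode terms, here is a minimal Python sketch that looks at text you have already copied or extracted from one of the PDFs. It only counts which Unicode ranges the characters fall into; it does not read PDFs itself. The function names `classify_text` and `normalize_presentation_forms` are my own, not part of any library, and the assumption is that healthy Arabic text sits mostly in the basic Arabic block (U+0600 to U+06FF).

```python
import unicodedata

# Basic Arabic block, where ordinary Arabic letters live.
ARABIC_BLOCK = (0x0600, 0x06FF)
# Arabic Presentation Forms-A and -B: positional glyph variants that some
# PDF producers store instead of the base letters.
PRESENTATION_FORMS = ((0xFB50, 0xFDFF), (0xFE70, 0xFEFF))


def classify_text(text):
    """Count non-whitespace characters by Unicode range (hypothetical helper)."""
    counts = {"arabic": 0, "presentation": 0, "other": 0}
    for ch in text:
        if ch.isspace():
            continue
        cp = ord(ch)
        if ARABIC_BLOCK[0] <= cp <= ARABIC_BLOCK[1]:
            counts["arabic"] += 1
        elif any(lo <= cp <= hi for lo, hi in PRESENTATION_FORMS):
            counts["presentation"] += 1
        else:
            counts["other"] += 1
    return counts


def normalize_presentation_forms(text):
    """Map presentation-form glyphs back to base letters using NFKC."""
    return unicodedata.normalize("NFKC", text)
```

If most characters land in "presentation", the text is Arabic but stored as shaped glyphs, which would explain why searching for normally typed words fails; NFKC normalization maps those back to base letters, although it will not fix text stored in reversed (visual) order. If most land in "other", the font is probably using its own private mapping and the text needs a real conversion step.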
From then on, considering the number of files involved, a good solution seems to be PyODConverter.
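With about 950 files you will want to script the run rather than convert by hand. The following is only a rough sketch of a batch driver in Python: `convert_all` and its `convert` argument are hypothetical names of mine, and `convert` stands for whatever function you end up wrapping around your converter (PyODConverter or anything else). I am assuming it takes an input path and an output path and raises an exception on failure.

```python
from pathlib import Path


def convert_all(source_dir, target_dir, convert, suffix=".txt"):
    """Run `convert(input_path, output_path)` on every .pdf in source_dir.

    `convert` is a caller-supplied function (an assumption of this sketch).
    Returns a list of (file name, error message) pairs for files that failed,
    so one bad PDF does not stop the whole batch.
    """
    source_dir = Path(source_dir)
    target_dir = Path(target_dir)
    target_dir.mkdir(parents=True, exist_ok=True)

    failures = []
    for pdf in sorted(source_dir.glob("*.pdf")):
        out = target_dir / (pdf.stem + suffix)
        try:
            convert(str(pdf), str(out))
        except Exception as exc:
            # Record the failure and keep going with the remaining files.
            failures.append((pdf.name, str(exc)))
    return failures
```

Collecting the failures instead of stopping at the first one means you can rerun only the problem files afterwards.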
For a smaller number of files, Free PDF to Word Converter will take care of your needs: