
Converting rtf to pdf using python

Tags: python, pdf, rtf

I am new to Python and have been given the task of converting RTF files to PDF with it. I googled and found some code (not exactly for RTF to PDF), tried working with it, and adapted it to my requirement, but I have not been able to get it working.

I used the code below:

import sys
import os
import comtypes.client
#import win32com.client
rtfFormatPDF = 17

in_file = os.path.abspath(sys.argv[1])
out_file = os.path.abspath(sys.argv[2])

rtf= comtypes.client.CreateObject('Rtf.Application')

rtf.Visible = True
doc = rtf.Documents.Open(in_file)
doc.SaveAs(out_file, FileFormat=rtfFormatPDF)
doc.Close()
rtf.Quit()

But it throws the following error:

Traceback (most recent call last):
  File "C:/Python34/Lib/idlelib/rtf_to_pdf.py", line 12, in <module>
    word = comtypes.client.CreateObject('Rtf.Application')
  File "C:\Python34\lib\site-packages\comtypes\client\__init__.py", line 227, in CreateObject
    clsid = comtypes.GUID.from_progid(progid)
  File "C:\Python34\lib\site-packages\comtypes\GUID.py", line 78, in from_progid
    _CLSIDFromProgID(str(progid), byref(inst))
  File "_ctypes/callproc.c", line 920, in GetResult
OSError: [WinError -2147221005] Invalid class string

Can anyone help me with this? I would really appreciate it if someone could suggest a better and faster way of doing it. I have around 200,000 files to convert.

Anisha

ani asked Apr 14 '15 21:04


2 Answers

I followed Marks's advice and changed it back to Word.Application, with my source pointing to the RTF files. It works perfectly! The process was slow, but still faster than the Java application my team was using.

Here is the final code, which works with the Word application:

import os
import comtypes.client

wdFormatPDF = 17  # Word's file-format code for PDF

input_dir = 'input directory'
output_dir = 'output directory'

for subdir, dirs, files in os.walk(input_dir):
    for file in files:
        # absolute paths avoid Word resolving them against its own working directory
        in_file = os.path.abspath(os.path.join(subdir, file))
        # splitext keeps dots inside the file name; join adds the path separator
        output_file = os.path.splitext(file)[0]
        out_file = os.path.abspath(os.path.join(output_dir, output_file + '.pdf'))
        word = comtypes.client.CreateObject('Word.Application')

        doc = word.Documents.Open(in_file)
        doc.SaveAs(out_file, FileFormat=wdFormatPDF)
        doc.Close()
        word.Quit()
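
The script above starts and quits Word for every file, which adds up over roughly 200,000 documents. One possible way to reduce that cost is to create the Word instance once and reuse it. The following is only a minimal sketch, assuming Windows with Word and comtypes installed; the helper names `pdf_path_for` and `convert_tree` are hypothetical and not part of the original answer.

```python
import os


def pdf_path_for(in_file, output_dir):
    """Return the PDF path inside output_dir for an input file (hypothetical helper)."""
    base = os.path.splitext(os.path.basename(in_file))[0]
    return os.path.join(output_dir, base + '.pdf')


def convert_tree(input_dir, output_dir, wd_format_pdf=17):
    """Convert every .rtf file under input_dir using one shared Word instance."""
    # Imported here so the path helper above works without comtypes installed.
    import comtypes.client

    word = comtypes.client.CreateObject('Word.Application')
    try:
        for subdir, dirs, files in os.walk(input_dir):
            for name in files:
                if not name.lower().endswith('.rtf'):
                    continue  # skip anything that is not an RTF file
                in_file = os.path.abspath(os.path.join(subdir, name))
                out_file = os.path.abspath(pdf_path_for(in_file, output_dir))
                doc = word.Documents.Open(in_file)
                try:
                    doc.SaveAs(out_file, FileFormat=wd_format_pdf)
                finally:
                    doc.Close()  # close the document even if saving fails
    finally:
        word.Quit()  # quit Word once, after all files are processed
```

Note that, as in the original script, files with the same base name in different subdirectories would write to the same PDF path.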

ani answered Sep 24 '22 08:09


If you have LibreOffice installed on your system, this is the simplest solution:

import os
os.system('soffice --headless --convert-to pdf filename.rtf')
# os.system('libreoffice --headless -convert-to pdf filename.rtf')
# os.system('libreoffice6.3 --headless -convert-to pdf filename.rtf')

The command may vary across versions and platforms, but this is the best solution I have found.
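
With around 200,000 files, it may help to pass several files to each soffice call and to use `subprocess.run` instead of `os.system`, so that file names containing spaces need no shell quoting and a failed run raises an error. This is a minimal sketch, assuming `soffice` is on the PATH; `build_command` and `convert_batch` are hypothetical names, and `--outdir` is the LibreOffice option that selects the output folder.

```python
import subprocess


def build_command(rtf_paths, out_dir, binary='soffice'):
    """Build the LibreOffice command as an argument list (no shell quoting needed)."""
    return [binary, '--headless', '--convert-to', 'pdf', '--outdir', out_dir, *rtf_paths]


def convert_batch(rtf_paths, out_dir, binary='soffice'):
    """Convert a batch of RTF files in one LibreOffice run; raises on a non-zero exit code."""
    subprocess.run(build_command(rtf_paths, out_dir, binary), check=True)
```

For example, `convert_batch(['a.rtf', 'b.rtf'], 'pdfs')` would run a single headless conversion that writes both PDFs into the `pdfs` folder. The binary name is a parameter because, as noted above, it differs between versions and platforms.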

Kuppusamy answered Sep 22 '22 08:09