I have been trying to find an efficient way to convert documents (e.g. doc, docx, ppt, pptx) to PDF. So far I have tried docsplit and oowriter, but both took more than 10 seconds on a 1.7 MB pptx file. Can anyone suggest a better tool, or ways to improve my approach?
What I have tried:

    from subprocess import Popen, PIPE
    import time


    def convert(src, dst):
        d = {'src': src, 'dst': dst}
        commands = [
            '/usr/bin/docsplit pdf --output %(dst)s %(src)s' % d,
            'oowriter --headless -convert-to pdf:writer_pdf_Export %(dst)s %(src)s' % d,
        ]

        for i, command in enumerate(commands, start=1):
            st = time.time()
            # I am aware of the consequences of using `shell=True`
            process = Popen(command, stdout=PIPE, stderr=PIPE, shell=True)
            out, err = process.communicate()  # wait for the tool to finish
            if process.returncode != 0:
                raise Exception(err)
            en = time.time() - st
            print('Command %s: Completed in %s seconds' % (i, round(en, 2)))


    if __name__ == '__main__':
        src = '/path/to/source/file/'
        dst = '/path/to/destination/folder/'
        convert(src, dst)

Output:

    Command 1: Completed in 11.91 seconds
    Command 2: Completed in 11.55 seconds

Environment:
More tools result:
Versions of MS Office after 2007 can save a document directly as a PDF, which avoids formatting errors. Go to File > Save As, select PDF (.pdf) from the Save as type list, and click Save.
Batch convert Word to PDF with Adobe Acrobat:
Step 1: Save all the Word documents that you wish to convert in one folder.
Step 2: Open Adobe Acrobat and select 'Create PDF' to begin the batch conversion process.
Step 3: Choose 'Multiple Files' > 'Create Multiple PDF Files'.
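The same folder-at-a-time idea can also be scripted without Acrobat. The following is only a minimal sketch that swaps in LibreOffice's command-line converter: it assumes the `soffice` binary is installed and on PATH, and the helper names (`find_office_files`, `build_batch_command`, `batch_convert`) are hypothetical. Passing all files to one process means the office suite starts once rather than once per file; whether that helps in your case is something to measure.

```python
import subprocess
from pathlib import Path

# Extensions taken from the question; adjust as needed.
OFFICE_SUFFIXES = {".doc", ".docx", ".ppt", ".pptx"}


def find_office_files(folder):
    """Return the office documents directly inside `folder`, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in OFFICE_SUFFIXES
    )


def build_batch_command(files, out_dir, binary="soffice"):
    """Build one LibreOffice command line that converts every file to PDF."""
    return [binary, "--headless", "--convert-to", "pdf",
            "--outdir", str(out_dir)] + [str(f) for f in files]


def batch_convert(folder, out_dir):
    """Convert every office document in `folder` using a single process."""
    files = find_office_files(folder)
    if not files:
        return []
    # Assumption: `soffice` is installed and on PATH.
    subprocess.run(build_batch_command(files, out_dir), check=True)
    # LibreOffice names each output after the input file's stem.
    return [Path(out_dir) / (f.stem + ".pdf") for f in files]
```

A call such as `batch_convert('/path/to/source/folder/', '/path/to/destination/folder/')` would return the expected PDF paths; files that share a stem (for example `a.doc` and `a.docx`) would map to the same output name.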
Try calling unoconv from your Python code. It took about 8 seconds on my local machine; I don't know if that's fast enough for you:

    time unoconv 15.\ Text-Files.pptx
    real    0m8.604s

Pandoc is a wonderful tool capable of doing what you'd like quickly. Since you're using Popen to shell out to the tool, it doesn't matter what language the tool is written in (Pandoc is written in Haskell). Note that Pandoc reads docx but not the legacy binary doc and ppt formats, and its PDF output needs a PDF engine such as LaTeX installed.
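Both suggestions above come down to shelling out to a command-line tool. Here is a minimal sketch of how that might look from Python, assuming `unoconv` and `pandoc` are installed and on PATH. The names `build_command` and `timed_convert` are hypothetical, and the Pandoc branch assumes a docx input plus a PDF engine that Pandoc can find.

```python
import subprocess
import time


def build_command(tool, src, dst):
    """Return the argument list for converting `src` to the PDF `dst`."""
    if tool == "unoconv":
        # -f picks the output format, -o the output path.
        return ["unoconv", "-f", "pdf", "-o", dst, src]
    if tool == "pandoc":
        # Pandoc infers PDF output from the .pdf extension given to -o.
        return ["pandoc", src, "-o", dst]
    raise ValueError("unknown tool: %s" % tool)


def timed_convert(tool, src, dst):
    """Run the conversion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    # An argument list avoids shell=True, so spaces in paths need no escaping.
    subprocess.run(build_command(tool, src, dst), check=True,
                   stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    return time.perf_counter() - start
```

For example, `timed_convert("unoconv", "15. Text-Files.pptx", "out.pdf")` would time a single conversion in the same way as the `time` command above, raising `subprocess.CalledProcessError` if the tool exits with a non-zero status.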