I am converting hundreds of ODT files to PDF, and it takes a long time when they are processed one after the other. My CPU has multiple cores. Is it possible to write a Bash or Python script that runs the conversions in parallel? In other words, is there a way to parallelize (not sure if I'm using the right word) batch document conversion with LibreOffice from the command line? So far I have been calling the following commands from Python/Bash:
libreoffice --headless --convert-to pdf *appsmergeme.odt
OR
subprocess.call(str('cd $HOME; libreoffice --headless --convert-to pdf *appsmergeme.odt'), shell=True);
Thank you!
Tim
You can run LibreOffice as a daemon/service. The following link may help: Daemonize the LibreOffice service
Another possibility is to use unoconv. "unoconv is a command line utility that can convert any file format that OpenOffice can import, to any file format that OpenOffice is capable of exporting."
Since the author already introduced Python as a valid answer:
import glob
import os
import subprocess
from multiprocessing.dummy import Pool  # thread-based Pool with the multiprocessing API

def worker(fname, dstdir=os.path.expanduser("~")):
    # Run one conversion; with cwd set, the PDF is written into dstdir.
    subprocess.call(["libreoffice", "--headless", "--convert-to", "pdf", fname],
                    cwd=dstdir)

pool = Pool()  # defaults to as many worker threads as there are CPUs
pool.map(worker, glob.iglob(
    os.path.join(os.path.expanduser("~"), "*appsmergeme.odt")
))
Using a thread pool from multiprocessing.dummy instead of a process pool is sufficient, because subprocess.call() spawns new processes anyway, and those provide the real parallelism.
We can pass the command as a list and set the current working directory cwd directly, so there is no need to start a shell for each file just for that. Furthermore, os.path keeps the script portable across platforms.
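For what it's worth, here is a variation on the same idea using concurrent.futures, as a rough sketch rather than a tested recipe. The function names (build_command, convert_one, convert_all) and the worker count are made up for illustration. It also assumes that giving every job its own throwaway profile via -env:UserInstallation helps when several LibreOffice instances run at once; whether you need that may depend on your version.

```python
import subprocess
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def build_command(src, outdir, profile_dir):
    """Build the soffice argument list for one file (no shell involved)."""
    return [
        "soffice",
        # Assumption: a separate profile per job avoids clashes between instances.
        "-env:UserInstallation=" + Path(profile_dir).resolve().as_uri(),
        "--headless",
        "--convert-to", "pdf",
        "--outdir", str(outdir),
        str(src),
    ]


def convert_one(src, outdir, runner=subprocess.run):
    """Convert one file and return (src, exit code)."""
    # The temporary profile directory is deleted once the conversion returns.
    with tempfile.TemporaryDirectory(prefix="lo_profile_") as profile_dir:
        result = runner(build_command(src, outdir, profile_dir))
        return src, result.returncode


def convert_all(files, outdir, workers=4, runner=subprocess.run):
    """Convert files with at most `workers` conversions running at a time."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map keeps the results in the same order as the input files.
        return list(pool.map(lambda f: convert_one(f, outdir, runner), files))


if __name__ == "__main__":
    home = Path.home()
    for src, code in convert_all(sorted(home.glob("*appsmergeme.odt")), home):
        print(src, "ok" if code == 0 else "failed (%d)" % code)
```

The runner parameter is only there so the functions can be tried with a stand-in instead of a real LibreOffice call. Returning the exit codes makes it easier to spot which files failed.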
This thread (or answer) is old. I tested LibreOffice 4.4 and can confirm that it can run concurrently. See my script:
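for odt in test*odt; do
    echo "$odt"
    # Start the conversion in the background and move on to the next file
    soffice --headless --convert-to pdf "$odt" &
    ps -ef | grep ffice
done
# Block until all background conversions have finished
wait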
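Since the question also mentions Python: the trailing & in the loop above corresponds roughly to subprocess.Popen, which returns without waiting for the process to finish. Below is a minimal sketch of that idea that starts the commands in batches, so that hundreds of files do not all start at once. The helper name run_in_batches and the batch size are my own choices, and the demo uses stand-in commands rather than soffice, so it runs without LibreOffice installed.

```python
import subprocess
import sys


def run_in_batches(commands, batch_size=4):
    """Start up to batch_size commands at once, wait for them, then continue."""
    exit_codes = []
    for start in range(0, len(commands), batch_size):
        # Popen returns immediately, like a trailing "&" in the shell.
        procs = [subprocess.Popen(cmd) for cmd in commands[start:start + batch_size]]
        # wait() blocks until the process has ended and returns its exit code.
        exit_codes.extend(p.wait() for p in procs)
    return exit_codes


if __name__ == "__main__":
    # Stand-in commands that just exit with the given code.
    demo = [[sys.executable, "-c", "import sys; sys.exit(%d)" % n] for n in (0, 0, 1)]
    print(run_in_batches(demo, batch_size=2))
```

For real use, each entry in commands would be an argument list such as ["soffice", "--headless", "--convert-to", "pdf", name]. Note that a batch only finishes when its slowest file is done, so a pool of workers usually keeps the cores busier.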