
subprocess.call does not wait for the process to complete

Per the Python documentation, subprocess.call should block and wait for the subprocess to complete. In this code I am trying to convert a few xls files to a new format by calling LibreOffice on the command line. I assumed that the call to subprocess.call is blocking, but it seems I need to add an artificial delay after each call, otherwise a few files are missing from the out directory.

What am I doing wrong, and why do I need the delay?

from subprocess import call

for entry in sorted_files:
    name = entry['filename']
    # Convert <name>/<name>.xls and write the result into the out directory.
    args = ['libreoffice', '-headless', '-convert-to', 'xls',
            "%s/%s.xls" % (name, name), '-outdir', 'out']
    call(args)
    # If I comment out the next line, I don't get all the files in the out directory.
    var = raw_input("Enter something: ")

EDIT: It might be hard to find the answer in the comments below. I ended up using unoconv for document conversion, which is blocking and easy to work with from a script.

Kamyar Souri asked Apr 29 '13 18:04



2 Answers

It's likely that libreoffice is implemented as some sort of daemon/intermediary process. The "daemon" will (effectively [1]) parse the command line and then farm the work off to some other process, possibly detaching it so that the "daemon" can exit immediately. (Based on the -invisible option in the documentation, I strongly suspect that this is the case here.)

If this is the case, then your subprocess.call does do what it is advertised to do -- it waits for the daemon to complete before moving on. However, it doesn't do what you want, which is to wait for all of the work to be completed. The only option you have in that scenario is to check whether the daemon has a -wait option or similar.


[1] It is likely that we don't have an actual daemon here, only something which behaves similarly. See the comments by abarnert.

mgilson answered Nov 09 '22 19:11



The problem is that the soffice command-line tool (which libreoffice is either just a link to, or a further wrapper around) is just a "controller" for the real program, soffice.bin. It finds a running copy of soffice.bin and/or creates one, tells it to do some work, and then quits.

So, call is doing exactly the right thing: it waits for libreoffice to quit.

But you don't want to wait for libreoffice to quit, you want to wait for soffice.bin to finish doing the work that libreoffice asked it to do.
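The launcher-versus-worker behavior described above can be reproduced without LibreOffice. The following is only a minimal sketch of the general pattern, not of how soffice is actually implemented: the names WORKER, LAUNCHER, and run_demo are made up for illustration, and two small Python one-liners stand in for the launcher and the worker.

```python
import os
import subprocess
import sys
import tempfile
import time

# Hypothetical "worker": pretends to do slow work, then writes the output file.
WORKER = "import sys, time; time.sleep(2); open(sys.argv[1], 'w').write('done')"

# Hypothetical "launcher": starts the worker in the background and exits at once,
# without waiting for it.
LAUNCHER = (
    "import subprocess, sys; "
    "subprocess.Popen([sys.executable, '-c', sys.argv[1], sys.argv[2]])"
)


def run_demo():
    out_path = os.path.join(tempfile.mkdtemp(), "result.txt")
    # call() blocks only until the launcher exits, not until the worker is done.
    rc = subprocess.call([sys.executable, "-c", LAUNCHER, WORKER, out_path])
    exists_right_after_call = os.path.exists(out_path)
    # Poll (for at most 15 seconds) until the worker has written its file.
    deadline = time.time() + 15
    while not os.path.exists(out_path) and time.time() < deadline:
        time.sleep(0.1)
    return rc, exists_right_after_call, os.path.exists(out_path)
```

Calling run_demo() should normally show that the output file is still missing at the moment call returns and only appears a couple of seconds later, which matches the symptom in the question.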

It looks like what you're trying to do isn't possible to do directly. But it's possible to do indirectly.

The docs say that headless mode:

… allows using the application without user interface.

This special mode can be used when the application is controlled by external clients via the API.

In other words, the app doesn't quit after running some UNO strings/doing some conversions/whatever else you specify on the command line. It sits around waiting for more UNO commands from outside, while the launcher just exits as soon as it sends the appropriate commands to the app.


You probably have to use that above-mentioned external control API (UNO) directly.

See Scripting LibreOffice for the basics (although there's more info there about internal scripting than external), and the API documentation for details and examples.

But there may be an even simpler answer: unoconv is a simple command-line tool written using the UNO API that does exactly what you want. It starts up LibreOffice if necessary, sends it some commands, waits for the results, and then quits. So if you just use unoconv instead of libreoffice, call is all you need.
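As a rough sketch of what that could look like for the loop in the question: the helper names build_unoconv_args and convert_all are hypothetical, and the code assumes unoconv is installed and on the PATH, that -f selects the output format, and that -o accepts an existing output directory. Check unoconv --help for your version before relying on this.

```python
import subprocess


def build_unoconv_args(src_path, out_dir, fmt="xls"):
    """Build an unoconv command line (assumed flags: -f format, -o output)."""
    return ["unoconv", "-f", fmt, "-o", out_dir, src_path]


def convert_all(src_paths, out_dir="out"):
    """Convert each file in turn and return the inputs whose conversion failed.

    call() returns once unoconv itself has exited, so no artificial delay
    is added between conversions.
    """
    failed = []
    for src in src_paths:
        if subprocess.call(build_unoconv_args(src, out_dir)) != 0:
            failed.append(src)
    return failed
```

With the question's data this might be invoked as convert_all(["%s/%s.xls" % (f['filename'], f['filename']) for f in sorted_files]). Checking the return code is a design choice, so that a failed conversion is reported rather than silently skipped.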

Also notice that unoconv is written in Python, and is designed to be used as a module. If you just import it, you can write your own (simpler, and use-case-specific) code to replace the "Main entrance" code, and not use subprocess at all. (Or, of course, you can tear apart the module and use the relevant code yourself, or just use it as a very nice piece of sample code for using UNO from Python.)

Also, the unoconv page linked above lists a variety of other similar tools, some that work via UNO and some that don't, so if it doesn't work for you, try the others.


If nothing else works, you could consider, e.g., creating a sentinel file and using a filesystem watch, so at least you'll be able to detect exactly when it's finished its work, instead of having to guess at a timeout. But that's a real last-ditch workaround that you shouldn't even consider until eliminating all of the other options.
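For completeness, here is a simplified sketch of that last-ditch idea. It swaps the filesystem watch for plain polling of the expected output file, and treats "size unchanged for a few polls" as a heuristic for "finished", which is an assumption and not a guarantee. The function name wait_for_file and its parameters are made up for illustration.

```python
import os
import time


def wait_for_file(path, timeout=60.0, poll_interval=0.2, settle_checks=3):
    """Return True once `path` exists and its size has stayed the same for
    `settle_checks` consecutive polls; return False if `timeout` expires."""
    deadline = time.time() + timeout
    last_size = None
    stable = 0
    while time.time() < deadline:
        try:
            size = os.path.getsize(path)
        except OSError:
            # File is not there (yet); start over.
            size = None
        if size is not None and size == last_size:
            stable += 1
            if stable >= settle_checks:
                return True
        else:
            stable = 0
            last_size = size
        time.sleep(poll_interval)
    return False
```

In the question's loop, this would go right after call(args), pointed at the expected file in the out directory, in place of the raw_input pause.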

abarnert answered Nov 09 '22 21:11
