
GNU Parallel and Bash functions: How to run the simple example from the manual


I'm trying to learn GNU Parallel because I have a case where I think I could easily parallelize a bash function. So in trying to learn, I went to the GNU Parallel manual where there is an example...but I can't even get it working! To wit:

    (232) $ bash --version
    GNU bash, version 4.1.2(1)-release (x86_64-redhat-linux-gnu)
    Copyright (C) 2009 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>

    This is free software; you are free to change and redistribute it.
    There is NO WARRANTY, to the extent permitted by law.
    (233) $ cat tpar.bash
    #!/bin/bash

    echo `which parallel`
    doit() {
      echo Doing it for $1
      sleep 2
      echo Done with $1
    }
    export -f doit
    parallel doit ::: 1 2 3
    doubleit() {
      echo Doing it for $1 $2
      sleep 2
      echo Done with $1 $2
    }
    export -f doubleit
    parallel doubleit ::: 1 2 3 ::: a b

    (234) $ bash tpar.bash
    /home/mathomp4/bin/parallel
    doit: Command not found.
    doit: Command not found.
    doit: Command not found.
    doubleit: Command not found.
    doubleit: Command not found.
    doubleit: Command not found.
    doubleit: Command not found.
    doubleit: Command not found.
    doubleit: Command not found.

As you can see, I can't even get the simple example to run. Thus, I'm probably doing something amazingly stupid and basic...but I'm at a loss.

ETA: As suggested by commenters (chmod +x, set -vx):

    (27) $ ./tpar.bash

    echo `which parallel`
    which parallel
    ++ which parallel
    + echo /home/mathomp4/bin/parallel
    /home/mathomp4/bin/parallel

    doit() {
      echo Doing it for $1
      sleep 2
      echo Done with $1
    }
    export -f doit
    + export -f doit
    parallel doit ::: 1 2 3
    + parallel doit ::: 1 2 3
    doit: Command not found.
    doit: Command not found.
    doit: Command not found.
    doubleit() {
      echo Doing it for $1 $2
      sleep 2
      echo Done with $1 $2
    }
    export -f doubleit
    + export -f doubleit
    parallel doubleit ::: 1 2 3 ::: a b
    + parallel doubleit ::: 1 2 3 ::: a b
    doubleit: Command not found.
    doubleit: Command not found.
    doubleit: Command not found.
    doubleit: Command not found.
    doubleit: Command not found.
    doubleit: Command not found.

ETA2: Note, I can, in the script, just call 'doit 1', say, and it will do that. So the function is valid, it just isn't...exported?

Fortran asked May 22 '14 18:05




2 Answers

You cannot call a shell function from outside the shell where it was defined. A shell function is a concept inside the shell. The parallel command itself has no way to access it.

Calling export -f doit in bash exports the function via the environment so that it is picked up by child processes. But only bash understands bash functions. A (grand)*child bash process can call it, but not other programs, for example not other shells.
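To make the scope of export -f concrete, here is a minimal sketch. The function names exported_fn and local_fn are made up for illustration, and it assumes that the bash found on PATH is the same bash that runs the script.

```shell
#!/bin/bash
# Hypothetical demo functions; the names are illustrative only.
exported_fn() { echo "exported: $1"; }
local_fn()    { echo "local: $1"; }

export -f exported_fn    # only this one is placed in the environment

# A child bash imports the exported function and can call it.
bash -c 'exported_fn child'

# So can a grandchild, because the environment is inherited again.
bash -c 'bash -c "exported_fn grandchild"'

# The function that was not exported is unknown to the child.
bash -c 'local_fn child' 2>/dev/null || echo "local_fn is not visible in the child"
```

Run with bash, this should print the two "exported:" lines followed by the "not visible" message. A shell other than bash started in place of the child would not recognize exported_fn either, which is the situation in the question.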

Going by the message “Command not found”, it appears that your preferred shell is (t)csh. You need to tell parallel to invoke bash instead. parallel invokes the shell indicated by the SHELL environment variable¹, so set it to point to bash.

    export SHELL=$(type -p bash)
    doit () { … }
    export -f doit
    parallel doit ::: 1 2 3

If you only want to set SHELL for the execution of the parallel command and not for the rest of the script:

    doit () { … }
    export -f doit
    SHELL=$(type -p bash) parallel doit ::: 1 2 3

I'm not sure how to deal with remote jobs. You may need to pass --env=SHELL in addition to --env=doit (note that this assumes that the path to bash is the same everywhere).

And yes, this oddity should be mentioned more prominently in the manual. There's a brief note in the description of the command argument, but it isn't very explicit (it should explain that the command words are concatenated with a space as a separator and then passed to $SHELL -c), and SHELL isn't even listed in the environment variables section. (I encourage you to report this as a bug; I'm not doing it because I hardly ever use this program.)
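The mechanism described above (join the command words with spaces, then pass the string to a shell with -c) can be imitated without parallel. This is only a rough sketch of that behavior, not parallel's actual code; run_via_shell is a hypothetical helper invented for the illustration.

```shell
#!/bin/bash
doit() {
  echo "Doing it for $1"
}
export -f doit

# Hypothetical helper, not part of GNU Parallel: join the command words
# with spaces and hand the resulting string to the given shell with -c.
run_via_shell() {
  local shell=$1
  shift
  "$shell" -c "$*"
}

# With bash as the shell, the exported function is found and runs.
run_via_shell "$(type -p bash)" doit 1
```

Passing the path of a shell other than bash as the first argument would be expected to fail with a "command not found" style error, because that shell does not import bash functions from the environment.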

¹ which is bad design, since SHELL is supposed to indicate a user interface preference for an interactive command line shell, and not to change the behavior of programs.

Gilles 'SO- stop being evil' answered Sep 18 '22 03:09


Since version 20160722 you can instead use env_parallel:

    doit() { echo "$@"; }
    echo world | env_parallel doit Hello

You first need to activate env_parallel by adding it to .bashrc, which you can do by running this once:

    env_parallel --install

Ole Tange answered Sep 22 '22 03:09