I'm trying to use GNU parallel with some basic bioinformatics tools, e.g. lastz. Say I have 10 sequences and I want to run lastz on all of them. I use:
parallel --dryrun lastz 'pathToFile/seq{}.fa query.fasta --format=text > LASTZ_results_seq{}' ::: {1..10}
Which works fine and returns:
lastz pathToFile/seq1.fa query.fasta --format=text > LASTZ_results_seq1
lastz pathToFile/seq2.fa query.fasta --format=text > LASTZ_results_seq2
lastz pathToFile/seq3.fa query.fasta --format=text > LASTZ_results_seq3
...
lastz pathToFile/seq10.fa query.fasta --format=text > LASTZ_results_seq10
But ideally I'd like this step to be part of a bash script that takes three command-line arguments, so that the range of sequences (e.g. 1 to 10) is given on the command line (with $2 = startValue, $3 = endValue). I thought that changing it to this would work:
parallel --dryrun lastz 'pathToFile/seq{}.fa query.fasta --format=text > LASTZ_results_seq{}' ::: {"$2".."$3"}
but instead, that returns
lastz pathToFile//seq\{\1..\10\} query.fasta --format=text > LASTZ_results_seq\{\1..\10\}
Can anyone please tell me what I'm doing wrong here? It looks like it is interpreting $2 as 1, and $3 as 10, but then fails to treat it as a range of numbers...
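Bash brace ranges don't accept variables, because brace expansion is performed before variable expansion, so {"$2".."$3"} is never treated as a sequence. See this post: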
How do I iterate over a range of numbers defined by variables in Bash?
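Thus, I suggest you replace the brace range with a call to seq: in your script, change {"$2".."$3"} to $(seq "$2" "$3").
For example, see this test script, which takes the range bounds as $1 and $2: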
$ cat foo
parallel echo ::: {1..3}
parallel echo ::: {$1..$2}
parallel echo ::: $(seq $1 $2)
When called as ./foo 1 3, it produces the following output:
1
2
3
{1..3}
1
2
3
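The difference can also be seen without parallel at all. The following is a minimal sketch, assuming bash and the standard seq utility; the function names number_range and literal_range are hypothetical helpers made up for illustration, not part of any tool.

```shell
#!/usr/bin/env bash

# Hypothetical helper: prints the integers from $1 to $2, one per line.
# seq is an ordinary command run at execution time, so variable bounds work.
number_range() {
    seq "$1" "$2"
}

# Hypothetical helper: shows what a brace range does with variable bounds.
# Brace expansion is performed before parameter expansion, so the braces
# are not recognised as a sequence and are left as literal text.
literal_range() {
    local start=$1 end=$2
    echo {$start..$end}
}

number_range 1 3     # prints 1, 2 and 3 on separate lines
literal_range 1 3    # prints {1..3}
```

Applied to the script in the question, the argument list after ::: would become $(seq "$2" "$3"). It should also be possible to pipe the numbers in instead, since GNU parallel reads its arguments from standard input when no ::: is given, e.g. seq "$2" "$3" | parallel ...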