 

Using variables in awk within echo statement that prints into a file

Tags: bash, csv, awk

We use a script that prints bash commands into a file that is then run on an HPC system. It is supposed to run through a large text file containing geographic coordinates separated by whitespace and extract a specific region from that file (e.g. all lines with an x coordinate between xmin and xmax and a y coordinate between ymin and ymax).

Ideally, I'd like to use awk for that like so (from memory since I don't have my computer available at the moment):

awk -v xmin=-13000 -v xmax=13000 -v ymin=-500 -v ymax=500 -F ' ' {if ($1 > xmin && $1 < xmin && $2 > ymin && $2 < ymin) print $1 $2} $infile > $outfile

That would probably execute fine. However, as the title suggests, we write this line to a file indirectly for each of 25 regions, each with its own xmin, xmax, etc. More operations follow after that (GMT calls and so on). Here's a little snippet:

xmin=-13000
xmax=13000
ymin=-500
ymax=500
infile=./full_file.txt
outfile=./filtered_file.yxy
srcfile=./region_1.txt

echo """awk -v xmin=$xmin -v xmax=$xmax -v ymin=$ymin -v ymax=$ymax -F ' ' {if ($1 > $xmin && $1 < $xmin && $2 > $ymin && $2 < $ymin) print $1 $2} $infile > $outfile""" >> $srcfile

Obviously, this raises errors when run, because of variable expansion. I've tried escaping the awk column identifiers, but to no avail; perhaps I didn't understand the pattern correctly. Could someone point me to a solution that allows us to keep the indirect approach?

Sacha Viquerat asked Aug 31 '25 03:08


2 Answers

IIUC, you have to either escape each dollar sign like this:

{if (\$1 > xmin && \$1 < xmin

or temporarily close the double quotes and put the dollar sign in single quotes:

"{if ("'$1'" > xmin && "'$1'" < xmin"

or use the Bash-specific %q printf specifier:

$ read
awk -v xmin=-13000 -v xmax=13000 -v ymin=-500 -v ymax=500 -F ' ' {if ($1 > xmin && $1 < xmin && $2 > ymin && $2 < ymin) print $1 $2} $infile > $outfile
$ printf "%q\n" "$REPLY"
awk\ -v\ xmin=-13000\ -v\ xmax=13000\ -v\ ymin=-500\ -v\ ymax=500\ -F\ \'\ \'\ \{if\ \(\$1\ \>\ xmin\ \&\&\ \$1\ \<\ xmin\ \&\&\ \$2\ \>\ ymin\ \&\&\ \$2\ \<\ ymin\)\ print\ \$1\ \$2\}\ \$infile\ \>\ \$outfile
$ echo awk\ -v\ xmin=-13000\ -v\ xmax=13000\ -v\ ymin=-500\ -v\ ymax=500\ -F\ \'\ \'\ \{if\ \(\$1\ \>\ xmin\ \&\&\ \$1\ \<\ xmin\ \&\&\ \$2\ \>\ ymin\ \&\&\ \$2\ \<\ ymin\)\ print\ \$1\ \$2\}\ \$infile\ \>\ \$outfile
awk -v xmin=-13000 -v xmax=13000 -v ymin=-500 -v ymax=500 -F ' ' {if ($1 > xmin && $1 < xmin && $2 > ymin && $2 < ymin) print $1 $2} $infile > $outfile

I also think it would be good to enclose the awk code in single quotes (') if you don't want the shell to expand variables.
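Putting the first option together with single quotes around the awk program, a minimal sketch could look like the following. The temporary directory and the sample coordinates are made up for illustration, and the sketch assumes that the second comparison in each pair was meant to use xmax and ymax, and that the two output fields should be separated by a space (hence the comma in print).

```shell
#!/usr/bin/env bash
# Sketch: write an awk command line into a file with echo, escaping awk's
# field references. workdir and the sample data are hypothetical.
workdir=$(mktemp -d)
xmin=-13000 xmax=13000 ymin=-500 ymax=500
infile=$workdir/full_file.txt
outfile=$workdir/filtered_file.yxy
srcfile=$workdir/region_1.txt

# Four sample points; only the first and the last lie inside the region.
printf '%s\n' '0 0' '20000 0' '100 600' '-5 -5' > "$infile"

# \$1 and \$2 end up in the file as literal $1 and $2, while $xmin and
# friends are expanded right now. The single quotes are ordinary characters
# here because they sit inside double quotes.
echo "awk -v xmin=$xmin -v xmax=$xmax -v ymin=$ymin -v ymax=$ymax '\$1 > xmin && \$1 < xmax && \$2 > ymin && \$2 < ymax { print \$1, \$2 }' \"$infile\" > \"$outfile\"" >> "$srcfile"

cat "$srcfile"     # show the generated command
bash "$srcfile"    # run it
cat "$outfile"     # show the filtered coordinates
```

In the generated file the awk program is single-quoted, so nothing in it is expanded a second time when the file is eventually run.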

Arkadiusz Drabczyk answered Sep 02 '25 17:09


Creating a separate temporary script seems superfluous. Just loop over the parameters.

while read -r xmin xmax ymin ymax\
              infile outfile
do
    awk -v xmin="$xmin" -v xmax="$xmax" -v ymin="$ymin" -v ymax="$ymax" \
     '$1 > xmin && $1 < xmax && $2 > ymin && $2 < ymax { print $1 $2 }' "$infile" > "$outfile"
done <<____
-13000 13000 -500 500 full_file.txt  filtered_file.yxy
    17    42  19   21 littlefile.txt other.yxy
-27350 27350 -123 123 another.txt    moar.yxy
____

The ____ is just a cute alternative to the more conventional EOF heredoc delimiter. The lines in the here document should each be one set of values for the variables in the read.
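To check the loop on throwaway data before pointing it at the real files, something like the sketch below may help. The temporary directory and the sample coordinates are invented for the test. Note also that `print $1 $2` concatenates the two fields with nothing in between; the sketch uses `print $1, $2` on the assumption that space-separated output is what is wanted.

```shell
#!/usr/bin/env bash
# Sketch: run the loop against small made-up inputs to check the filter.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' '0 0' '20000 0' '100 600' > full_file.txt
printf '%s\n' '20 20' '50 20' > littlefile.txt

while read -r xmin xmax ymin ymax infile outfile
do
    # The comma in print separates the fields with awk's default OFS (a space).
    awk -v xmin="$xmin" -v xmax="$xmax" -v ymin="$ymin" -v ymax="$ymax" \
        '$1 > xmin && $1 < xmax && $2 > ymin && $2 < ymax { print $1, $2 }' \
        "$infile" > "$outfile"
done <<____
-13000 13000 -500 500 full_file.txt  filtered_file.yxy
    17    42  19   21 littlefile.txt other.yxy
____

cat filtered_file.yxy   # prints: 0 0
cat other.yxy           # prints: 20 20
```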

If you really want to print each snippet to a separate file (perhaps to submit each to run on a different cluster node, for example), maybe learn to use printf instead of echo.

while read -r xmin xmax ymin ymax\
              infile outfile srcfile
do
    printf 'awk -v xmin="%i" -v xmax="%i" -v ymin="%i" -v ymax="%i" \
     '"'"'$1 > xmin && $1 < xmax && $2 > ymin && $2 < ymax { print $1 $2 }'"'"' "./%s" > "./%s"\n' \
        "$xmin" "$xmax" "$ymin" "$ymax" "$infile" "$outfile" >>"./$srcfile"
done <<____
-13000 13000 -500 500 full_file.txt  filtered_file.yxy region1.txt
    17    42  19   21 littlefile.txt other.yxy         region2.txt 
-27350 27350 -123 123 another.txt    moar.yxy          region3.txt
____

(though printing commands to .txt files is still really weird).

For what it's worth, the triple quotes in your attempt do nothing useful. Python (for example) has this syntax, but in the shell, """ simply parses into an empty string inside a pair of quotes "" followed by an opening double quote ".
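A quick way to convince yourself of this, as a small sketch (the variable names a and b are only for the demonstration):

```shell
#!/usr/bin/env bash
# Sketch: """text""" is an empty "" followed by "text" followed by another "".
a="""hello"""
b="hello"
[ "$a" = "$b" ] && echo "identical"    # prints: identical

# The text is still inside ordinary double quotes, so the shell expands
# $1 and $xmin before echo ever sees them.
xmin=-13000
set --                                 # clear the positional parameters so $1 is empty
echo """if ($1 > $xmin)"""             # prints: if ( > -13000)
```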

Similarly, the printf example above demonstrates one way to produce a literal single quote inside a single-quoted string. 'foo'"'"'bar' is (single-quoted) foo next to double-quoted ' next to single-quoted bar, which when pasted together produces foo'bar.
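The quote splicing can be tried in isolation. The second line shows another common spelling of the same idea, `'\''`, which is not used above; the variable s is just a placeholder for this sketch.

```shell
#!/usr/bin/env bash
# Sketch: two ways to get a literal single quote next to single-quoted text.
printf '%s\n' 'foo'"'"'bar'        # prints: foo'bar
printf '%s\n' 'foo'\''bar'         # prints: foo'bar (backslash-escaped quote outside the quotes)

# The same splice used to wrap a value in single quotes in printf's format string.
s='$1 > xmin'
printf 'awk '"'"'%s'"'"'\n' "$s"   # prints: awk '$1 > xmin'
```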

I also slightly refactored your Awk script to make it more idiomatic, and fixed the missing quoting.
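The refactoring relies on Awk's pattern { action } form: a condition placed in front of the braces replaces an explicit if inside them. A small sketch on made-up data, filtering on x only to keep it short:

```shell
#!/usr/bin/env bash
# Sketch: an if inside the action versus the same test used as a pattern.
data=$(printf '%s\n' '0 0' '20000 0' '-5 -5')

# Explicit if inside an action block.
long=$(awk -v xmin=-13000 -v xmax=13000 \
    '{ if ($1 > xmin && $1 < xmax) print $1, $2 }' <<<"$data")

# The same condition as a pattern; the action runs only for matching lines.
short=$(awk -v xmin=-13000 -v xmax=13000 \
    '$1 > xmin && $1 < xmax { print $1, $2 }' <<<"$data")

[ "$long" = "$short" ] && echo "same output"   # prints: same output
```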

tripleee answered Sep 02 '25 16:09