awk printing nothing when used in loop [duplicate]

I have a bunch of files using the format file.1.a.1.txt that look like this:

A 1
B 2
C 3
D 4

and was using the following command to add a new column containing the name of each file:

awk '{print FILENAME (NF?"\t":"") $0}' file.1.a.1.txt > file.1.a.1.txt

which ended up making them look how I want:

file.1.a.1.txt A 1
file.1.a.1.txt B 2
file.1.a.1.txt C 3
file.1.a.1.txt D 4

However, I need to do this for multiple files as a job on an HPC cluster, submitted with sbatch. But when I run the following job script:

#!/bin/bash
#<other SBATCH info>
#SBATCH --array=1-10

N=$SLURM_ARRAY_TASK_ID

for j in {a,b,c};
do
    for i in {1,2,3}
    do awk '{print FILENAME (NF?"\t":"") $0}' file.${N}."$j"."$i".txt > file.${N}."$j"."$i".txt
    done
done

awk is generating empty files. I have also tried reading the file with cat and piping it to awk, but that hasn't worked either.

Geode asked May 04 '26 16:05



1 Answer

You don't need a loop, and you cannot redirect output to the same file that awk is reading. The shell opens and truncates the redirection target before awk even starts, so awk finds an empty input file and you end up with blank files.
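
The blank-file behaviour can be reproduced in isolation. A minimal sketch, assuming a scratch directory created with mktemp and a throwaway file called demo.txt (both are illustrative and not part of the original job):

```shell
#!/bin/bash
# Work in a scratch directory so nothing real is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'A 1\nB 2\n' > demo.txt

# The shell opens demo.txt for writing (truncating it) before awk starts,
# so awk has nothing to read and writes nothing back.
awk '{print FILENAME "\t" $0}' demo.txt > demo.txt

size_same_file=$(wc -c < demo.txt | tr -d ' ')
echo "bytes after redirecting onto the input file: $size_same_file"

# Writing to a different file and renaming it afterwards keeps the data.
printf 'A 1\nB 2\n' > demo.txt
awk '{print FILENAME "\t" $0}' demo.txt > demo.txt.tmp && mv demo.txt.tmp demo.txt
cat demo.txt
```

The second half is the same write-then-rename idea that the script in this answer applies to all files at once.
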

Try this:

#!/bin/bash

N=$SLURM_ARRAY_TASK_ID

awk '
   NF{
      # parenthesize the target: bare concatenation after > is ambiguous in some awks
      print FILENAME "\t" $0 > (FILENAME ".tmp")
   }
   ENDFILE{ # requires gawk
      close(FILENAME ".tmp")
   }' file."$N".{a,b,c}.{1,2,3}.txt

# only touch this array task's files; other tasks may still be writing theirs
for file in file."$N".*.tmp; do
   mv "$file" "${file%.tmp}"
done

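Note that if you don't have GNU awk and therefore can't use ENDFILE{}, you can remove that stanza and get away with either:

  1. Putting the close() call just after the print statement and changing > to >>, because reopening a closed file with > truncates it. This reopens the file for every line, so it comes with lots of overhead, and any stale .tmp files must be deleted first.
  2. Not calling close() at all. Every output file then stays open until awk exits, but as long as you don't have a lot of files, you should be fine.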
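
Option 1 can be sketched for a non-GNU awk as follows. This is an illustration under assumptions, not a drop-in job script: N is set by hand in place of $SLURM_ARRAY_TASK_ID, the sample files are generated in a scratch directory so the snippet runs on its own, and the variable name out is made up here. It appends with >> because a file that has been closed is truncated again if it is reopened with >.

```shell
#!/bin/bash
# Scratch setup so the sketch is self-contained (not needed in the real job).
workdir=$(mktemp -d)
cd "$workdir" || exit 1
N=1   # stand-in for $SLURM_ARRAY_TASK_ID
for j in a b c; do
   for i in 1 2 3; do
      printf 'A 1\nB 2\n' > "file.$N.$j.$i.txt"
   done
done

# Remove leftovers from earlier runs; >> would otherwise append to them.
rm -f file."$N".*.tmp

awk '
   NF {
      out = FILENAME ".tmp"
      print FILENAME "\t" $0 >> out   # append, since the file is reopened for every line
      close(out)                      # keeps at most one output file open at a time
   }' file."$N".[abc].[123].txt

# Replace each original with its tagged copy.
for file in file."$N".*.tmp; do
   mv "$file" "${file%.tmp}"
done

cat "file.$N.a.1.txt"
```

The open-and-close cycle per line is what makes this variant slow on large inputs, which is why ENDFILE is preferable when gawk is available.
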
SiegeX answered May 06 '26 10:05
