I have a file that I am attempting to strip customer names from using AWK. The file is a fixed width file, and every column has meaning.
The file consists of many lines, all the same format, very similar to the below:
1234-123 123456 12345678901234CUSTOMER NAME TO REMOVE12345-1234 TRN 123-123 12345678901-1234 TRN 12345678
1234-123 123456 12345678901234CUSTOMER NAME TO REMOVE12345-1234 TRN 123-123 12345678901-1234 TRN 12345678
1234-123 123456 12345678901234CUSTOMER NAME TO REMOVE12345-1234 TRN 123-123 12345678901-1234 TRN 12345678
1234-123 123456 12345678901234CUSTOMER NAME TO REMOVE12345-1234 TRN 123-123 12345678901-1234 TRN 12345678
It is the customer name that I need to swap with an imaginary name so that the desired output is:
1234-123 123456 12345678901234SENTINEL PRIME 12345-1234 TRN 123-123 12345678901-1234 TRN 12345678
1234-123 123456 12345678901234OPTIMUS PRIME 12345-1234 TRN 123-123 12345678901-1234 TRN 12345678
1234-123 123456 12345678901234BUMBLE BEE 12345-1234 TRN 123-123 12345678901-1234 TRN 12345678
1234-123 123456 12345678901234IRON HIDE 12345-1234 TRN 123-123 12345678901-1234 TRN 12345678
I have a list of transformer names that I would like to use for this, stored in a file called transformer.names:
SENTINEL PRIME
OPTIMUS PRIME
BUMBLEBEE
IRONHIDE
However, to keep each line of the original file the same width, I need to right pad the transformer names with spaces as the transformer names I have are all different lengths.
It seems to be possible to right pad these names to a certain length using AWK, but I have not managed to figure out how, nor have I found an answer clear enough for me to understand.
Below is my current AWK script.
#!/usr/bin/awk -f
BEGIN {
}
{
getline line < "transformer.names"
print substr($0, 0, 30) line substr($0, 62, 120)
}
I run it with this command:
my_program.awk my-file.txt
I think I can include a line something like this in place of the print line above, however I have not managed to get it working yet.
printf "-%32s|", substr($0, 0, 30) line substr($0, 62, 120)
Any tips would be fantastic!
You need to apply the %Ns conversion to the specific field you want to pad, not to the whole line, and the minus (which left-justifies the value, i.e. pads it on the right) must be part of that specifier. Also, printf does not automatically append a line/record separator the way print does, so you need to add it yourself:
printf "%s%-32s%s\n", substr($0, 1, 30), newname, substr($0, 62, 120)
# note the commas: this is a format string containing three specifiers,
# followed by three separate data values, one for each specifier
Alternatively you could pad the field and then concatenate:
print substr($0,1,30) sprintf("%-32s", newname) substr($0,62,120)
# no commas except within the sprintf (and the substr's)
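To see what the width and the minus sign do in isolation, here is a minimal sketch run from the shell. The width 12 and the name BUMBLEBEE are chosen only for illustration; the brackets are there to make the padding visible. The last line uses a precision (.4), which additionally truncates a value that is longer than the field, something you may want if a replacement name could be longer than the original column.

```sh
# Demonstrate %-Ns (pad on the right), %Ns (pad on the left) and %-N.Ns (pad and truncate).
out=$(awk 'BEGIN {
    name = "BUMBLEBEE"
    printf "[%-12s]\n", name    # left-justified: spaces are added on the right
    printf "[%12s]\n", name     # right-justified: spaces are added on the left
    printf "[%-4.4s]\n", name   # precision .4 cuts the value down to 4 characters
}')
printf '%s\n' "$out"
```

For the fixed-width file in the question, the first form is the one that matters.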
If your data file has more lines than your transformer.names file, then you need to buffer the names and cycle through them repeatedly, as Ravinder shows.
Also, substr positions in awk start at 1; if you specify 0 or negative, it is treated as 1, but I think it's clearer to actually say what you mean, so I fixed that. 62 is not the correct starting position for the part after the customer name in the example data you posted, but you said that data is only 'very similar' to the real data, so I don't know whether 56 or 62 or something else is correct.
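If you are unsure of the offsets, one way to check them is to let awk report where the name sits in a representative line. This is only a sketch: it uses the sample line exactly as it appears in the question (where the whitespace may not match the real file), and it assumes you know the literal text of one name to search for. The variable names are made up for the example.

```sh
# Report the 1-based start of the name, its length, and where the next field begins.
line='1234-123 123456 12345678901234CUSTOMER NAME TO REMOVE12345-1234 TRN 123-123 12345678901-1234 TRN 12345678'
out=$(printf '%s\n' "$line" | awk '{
    name  = "CUSTOMER NAME TO REMOVE"   # placeholder text taken from the sample line
    start = index($0, name)             # index() returns a 1-based position, or 0 if absent
    print start, length(name), start + length(name)
}')
printf '%s\n' "$out"
```

Run against a line from the real file, the three numbers give the start of the name, the width to pad to, and the position to use in the final substr.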
Could you please try the following and let me know if it helps. It reads all the transformer names first; if there are fewer names than lines in Input_file, it starts again from the first name once the list is exhausted.
awk '
FNR==NR{
a[FNR]=$0;
count=FNR;
next}
{
val=val==count?1:++val;
print substr($0,1,32) a[val]"\t\t"substr($0,56)
}' transformer.names Input_file
Explanation: here is the same code again with comments added.
awk '
FNR==NR{ ##FNR==NR is TRUE only while the first file (transformer.names) is being read.
a[FNR]=$0; ##Store the current line in array a, indexed by its line number FNR.
count=FNR; ##Keep count equal to the number of names read so far.
next} ##next skips the remaining statements for this line.
{ ##This block runs for each line of the second file (Input_file).
val=val==count?1:++val; ##Increment val for each line, wrapping back to 1 after it reaches count.
print substr($0,1,32) a[val]"\t\t"substr($0,56) ##Print characters 1 to 32, then the name a[val], two TABs, and everything from character 56 to the end of the line.
}' transformer.names Input_file ##The names file is given first, then the data file.
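Note that joining the name to the rest of the line with TABs will not, in general, keep every line at its original width. A possible variation that combines the cycling above with printf padding is sketched here. The values start=31 and width=23 are assumptions taken from the sample line as shown in the question, so adjust them to the real layout; the temporary files exist only to make the example self-contained.

```sh
# Build small sample files so the example can be run as-is.
dir=$(mktemp -d)
printf '%s\n' 'SENTINEL PRIME' 'OPTIMUS PRIME' 'BUMBLEBEE' > "$dir/transformer.names"
line='1234-123 123456 12345678901234CUSTOMER NAME TO REMOVE12345-1234 TRN 123-123 12345678901-1234 TRN 12345678'
printf '%s\n' "$line" "$line" "$line" "$line" > "$dir/Input_file"

# start = first column of the name, width = width of the name field (both assumed).
out=$(awk -v start=31 -v width=23 '
FNR == NR { names[++count] = $0; next }   # first file: buffer every name
{
    name = names[(FNR - 1) % count + 1]   # wrap around when the names run out
    fmt  = "%s%-" width "." width "s%s\n" # e.g. "%s%-23.23s%s\n": pad and truncate
    printf fmt, substr($0, 1, start - 1), name, substr($0, start + width)
}' "$dir/transformer.names" "$dir/Input_file")
printf '%s\n' "$out"
rm -r "$dir"
```

With three names and four data lines, the fourth line reuses the first name, and every output line has the same length as the input line.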