I have a file with 4 columns:
ifile.txt
3 5 2 2
1 4 2 1
4 5 7 2
5 5 7 1
0 0 1 1
3 5 7 3
5 4 2 2
I would like to delete the rows whose values in columns 2 and 3 match those of an earlier row. For instance, rows 2 and 7 have the same values in columns 2 and 3, as do rows 3, 4, and 6. So I want to keep the 2nd row and delete the 7th, and likewise keep the 3rd row and delete the 4th and 6th. My desired output is:
ofile.txt
3 5 2 2
1 4 2 1
4 5 7 2
0 0 1 1
I tried this command:
awk '{a[NR]=$2""$3} a[NR]!=a[NR-1]{print}' ifile.txt > ofile.txt
But it does not give my desired output.
$ awk '!(($2,$3) in a); {a[$2,$3]}' ifile.txt
3 5 2 2
1 4 2 1
4 5 7 2
0 0 1 1
awk reads the input file one line at a time. Each input line is divided into fields. In this case, the important fields are the second, denoted $2, and the third, denoted $3.
!(($2,$3) in a)
This condition is true if $2,$3 is not a key in associative array a. Since no action is specified, when this condition is true, the default action is performed, which is to print the line.
In more detail, ($2,$3) in a is true when $2,$3 is a key of a. We, however, want the condition to be true in the opposite case. Consequently, we apply awk's negation operator, !, to it.
a[$2,$3]
This adds $2,$3 as a key of a. Merely referencing an array element is enough to create it in awk, so no assignment is needed.
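The same first-occurrence filter is often written more compactly with a counter instead of a separate membership test. The following is a minimal sketch of that variant, not the command from the answer above; the array name seen is an arbitrary choice, and the sample data is recreated with printf so the snippet runs on its own.

```shell
# Recreate the sample input from the question.
printf '%s\n' '3 5 2 2' '1 4 2 1' '4 5 7 2' '5 5 7 1' '0 0 1 1' '3 5 7 3' '5 4 2 2' > ifile.txt

# seen[$2,$3]++ evaluates to 0 (false) the first time a (column 2, column 3)
# pair appears and to a positive count on every later occurrence, so the
# negated expression is true only for the first line with each pair.
awk '!seen[$2,$3]++' ifile.txt > ofile.txt

cat ofile.txt
```

For the sample input this should keep the same four lines as the command above. The post-increment does the bookkeeping that the separate {a[$2,$3]} action does in the longer form.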