How to replace duplicated rows with "." in awk?

Question

I need to replace duplicated values in my first column with just "."

For example:

name1
name1
name1
name2
name2
name3
name3

And I need this output:

name1
.
.
name2
.
name3
.

I have a solution like this:

awk '{c=$1} c==p{gsub(/./,".",$1)} {p=c} 1' in.file

But the output is:

name1
.....
.....
name2
.....
name3
.....

Is there a solution that doesn't require piping through another command?

fedorqui 'SO stop harming' · Accepted Answer

Use an array to check if a line has already been seen!

$ awk 'seen[$0]++ {$0="."}1' file
name1
.
.
name2
.
name3
.

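The typical way to skip repeated lines is awk '!seen[$0]++' file. Here we use the same logic with a small twist: the array seen[] counts how many times each line has appeared so far. The post-increment seen[$0]++ evaluates to the count before the current line, so it is 0 (false) the first time a line appears and greater than 0 (true) on every later occurrence, in which case {$0="."} runs. Finally, the pattern 1 is always true and triggers the default action, which prints either the "." or the original line.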
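To make the counter logic concrete, here is a minimal sketch that prints each line next to the value seen[$0]++ evaluates to. The function name show_counts and the sample input are made up for illustration; they are not part of the original answer.

```shell
# Hypothetical helper: show what seen[$0]++ evaluates to on each input line.
show_counts() {
  # Post-increment returns the old count, then adds one to the stored value.
  awk '{ printf "%s %d\n", $0, seen[$0]++ }'
}

printf 'name1\nname1\nname2\nname1\n' | show_counts
# name1 0
# name1 1
# name2 0
# name1 2
```

Every line whose count is non-zero is the one that gets replaced by ".". Note that the last name1 is flagged even though it does not directly follow another name1.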

If you need to check a specific column rather than the full line, replace $0 (the full record) with $n, where n is the number of the field, both in the array index and in the assignment.
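As a sketch of the per-column variant, assuming whitespace-separated input where only the first field should be checked (the function name mark_first_column and the two-column sample data are hypothetical):

```shell
# Hypothetical helper: replace repeated values in column 1, keep the other columns.
mark_first_column() {
  # seen[$1]++ is non-zero once the first field has appeared before.
  awk 'seen[$1]++ { $1 = "." } 1'
}

printf 'name1 10\nname1 20\nname2 30\nname1 40\n' | mark_first_column
# name1 10
# . 20
# name2 30
# . 40
```

Assigning to $1 makes awk rebuild the record using the output field separator (a single space by default), so runs of spaces or tabs in modified lines are normalized. Set OFS explicitly if the original delimiter matters.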

Tags: bash, duplicates, awk

Geroge
