
How to print lines with duplicated fields?

Tags: sed, awk

I need to print only the lines that contain a duplicated field. I tried sed, but it is not working.
The input file has two lines:

s1/s2/s3/s4/s5/u0 a1_b2_c3_d4_e5_f6_g7 s1/s2/s3/s4/s5/u1
s1/s2/s3/s4/s5/u0 a1_b2_c3_d4_e5_f6_g7 s1/s2/s3/s4/s5/u0

The output should be only the second line, because it contains the exact same string (field) twice.
However, the command below prints both lines:

sed -rn '/(\b\w+\b).*\b\1\b/ p' input_file

Thanks
RKP

Raj KP asked Dec 17 '22 18:12


1 Answer

With grep, if -P is available, or with perl:

$ cat ip.txt
s1/s2/s3/s4/s5/u0 a1_b2_c3_d4_e5_f6_g7 s1/s2/s3/s4/s5/u1
s1/s2/s3/s4/s5/u0 a1_b2_c3_d4_e5_f6_g7 s1/s2/s3/s4/s5/u0
2.5 42 32.5 abc
3.14 3.14 123
part cop par

$ grep -P '(?<!\S)(\S++).*(?<!\S)\1(?!\S)' ip.txt
s1/s2/s3/s4/s5/u0 a1_b2_c3_d4_e5_f6_g7 s1/s2/s3/s4/s5/u0
3.14 3.14 123

$ perl -ne 'print if /(?<!\S)(\S++).*(?<!\S)\1(?!\S)/' ip.txt
s1/s2/s3/s4/s5/u0 a1_b2_c3_d4_e5_f6_g7 s1/s2/s3/s4/s5/u0
3.14 3.14 123
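  • (?<!\S) asserts that the previous character is not a non-whitespace character, so the match can only begin at the start of a field
  • (\S++) captures a whole run of non-whitespace characters; the possessive quantifier prevents backtracking, so a partial field cannot be captured (otherwise par, taken from part, would make the line part cop par match)
  • .* matches any number of characters in between
  • (?<!\S)\1(?!\S) matches the captured text only when it forms an entire field, thanks to the lookaround assertions on both sides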
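
Since the question is also tagged awk, a regex-free alternative may be easier to read: count each field per line and print the line once any field is seen a second time. This is only a sketch and not part of the original answer. The function name dup_fields is hypothetical, and it assumes fields are separated by whitespace (awk's default splitting).

```shell
# Sample input copied from the answer above.
cat > ip.txt <<'EOF'
s1/s2/s3/s4/s5/u0 a1_b2_c3_d4_e5_f6_g7 s1/s2/s3/s4/s5/u1
s1/s2/s3/s4/s5/u0 a1_b2_c3_d4_e5_f6_g7 s1/s2/s3/s4/s5/u0
2.5 42 32.5 abc
3.14 3.14 123
part cop par
EOF

# dup_fields FILE: print every line in which some field occurs more than once.
# (Hypothetical helper name, introduced here for illustration.)
dup_fields() {
    awk '{
        split("", seen)              # portably empty the array for each line
        for (i = 1; i <= NF; i++)
            if (seen[$i]++) {        # non-zero means this field was seen before
                print
                break                # print each matching line only once
            }
    }' "$1"
}

dup_fields ip.txt
```

Because whole fields are compared as array keys, partial overlaps such as part/par or 2.5/32.5 cannot match, so no lookarounds or possessive quantifiers are needed.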
Sundeep answered Dec 20 '22 06:12