
Regex doesn't work with sed command as expected

I have a text file containing :

A 25 27 50

B 35 75

C 75 78

D 99 88 76

I want to delete the lines that do not have a fourth field (that is, a third two-digit number after the letter). Expected output:

A 25 27 50

D 99 88 76

I know that awk would be the best option for such a task, but I'm wondering what the problem is with my sed command, since I expected it to work:

sed -E '/^[ABCD] ([0-9][0-9]) \1$/d' text.txt

It uses POSIX ERE with a back-reference (\1) to refer to the preceding pattern enclosed in parentheses.

I have also tried this command instead:

sed -E '/^[ABCD] ([0-9][0-9]) [0-9][0-9]$/d' text.txt

But it seems to delete only the first occurrence of what I want. I would appreciate an explanation of:

  • why the back-reference doesn't work as expected.
  • what is going on with the first occurrence in the second attempt. Should I include the global option? If so, how? I already tried adding it at the end alongside /d (for delete), but it didn't work.
Ayoub_Prog asked Dec 31 '22 12:12


1 Answer

Much much easier with awk:

awk 'NF == 4' file

A 25 27 50
D 99 88 76

This awk command uses the default field separator (runs of spaces or tabs) and tests the condition NF == 4, so only lines with exactly 4 fields are printed.
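
To see the field counting in action, here is a minimal sketch that runs the same condition over the sample data from the question. The inline printf input and the variable names are assumptions made for illustration; they stand in for the real file.

```shell
# Sample data from the question, supplied inline instead of a file (assumption).
input=$(printf 'A 25 27 50\nB 35 75\nC 75 78\nD 99 88 76\n')

# NF holds the number of fields on the current line. A bare condition
# with no action block prints the line whenever the condition is true.
result=$(printf '%s\n' "$input" | awk 'NF == 4')
printf '%s\n' "$result"

# Print the field count awk sees next to each line, to check the filter.
printf '%s\n' "$input" | awk '{ print NF ": " $0 }'
```

Because NF counts whitespace-separated fields, leading or trailing blanks and empty lines do not affect the result.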


With sed it would be (assuming no leading or trailing spaces on any line):

sed -nE '/^[^[:blank:]]+([[:blank:]]+[^[:blank:]]+){3}$/p' file

A 25 27 50
D 99 88 76
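
As for the first attempt in the question: a back-reference such as \1 matches the exact text the group captured, not the group's pattern again, so ([0-9][0-9]) \1 only matches a repeated number such as 35 35. The sketch below illustrates this, assuming a sed that accepts back-references with -E (GNU sed does) and using inline sample data rather than the real file. The trailing-blank handling in the second command is only a guess at why the second attempt missed lines; the question does not confirm it.

```shell
# \1 repeats the captured text, so only "B 35 35" is deleted here.
# "B 35 75" survives because 75 is not the same text as 35.
backref=$(printf 'B 35 35\nB 35 75\n' | sed -E '/^[ABCD] ([0-9][0-9]) \1$/d')
printf '%s\n' "$backref"

# Repeating the pattern itself deletes every three-field line.
# [[:blank:]]*$ also tolerates trailing spaces or tabs (hypothetical cause:
# the sample line "B 35 75 " below carries a trailing space on purpose).
deleted=$(printf 'A 25 27 50\nB 35 75 \nC 75 78\nD 99 88 76\n' |
  sed -E '/^[ABCD]( [0-9]{2}){2}[[:blank:]]*$/d')
printf '%s\n' "$deleted"
```

No global option is needed for the delete form: the g flag belongs to the s command, while an address followed by d is already tested against every input line.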
anubhava answered Jan 05 '23 03:01