I want to remove all lines except the line(s) containing a matching pattern.
This is how I did it:
sed -n 's/matchingpattern/matchingpattern/p' file.txt
But I'm curious, because I'm replacing the matching pattern with itself, which looks like a waste.
Is there a better way to do this?
To begin with, if you want to delete every line containing a keyword, run sed with the d command: sed '/keyword/d' file.txt. Similarly, you could run sed with the -n option and the negated p command (!p): sed -n '/keyword/!p' file.txt. To delete lines containing any of several keywords, for example lines with the keyword green or lines with the keyword violet, chain the expressions: sed -e '/green/d' -e '/violet/d' file.txt.
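As a rough sketch of the deletion commands described above, the snippet below runs them against a small throwaway file. The file contents and the variable names (tmp, deleted, negated, multi) are made up for illustration.

```shell
# Build a small sample file (contents are invented for this example).
tmp=$(mktemp)
printf '%s\n' 'red apple' 'green pear' 'violet plum' 'blue berry' > "$tmp"

# Delete every line containing the keyword "green".
deleted=$(sed '/green/d' "$tmp")

# Same result with -n and the negated print command.
negated=$(sed -n '/green/!p' "$tmp")

# Delete lines containing either "green" or "violet".
multi=$(sed -e '/green/d' -e '/violet/d' "$tmp")

printf '%s\n' "$deleted" '---' "$negated" '---' "$multi"
rm -f "$tmp"
```

The first two commands should print the same three lines, and the last one should leave only the lines that mention neither keyword.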
The N command appends the next line to the pattern space. Following it with d deletes the entire pattern space, which now contains both the current and the next line. Using the substitution command s instead, we delete from the newline character to the end, which effectively deletes only the line after the one containing the pattern Unix.
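The two uses of N described above might look like this in practice. This is only a sketch: the sample lines and the variable names (tmp, both, next_only) are assumptions, and the exact commands the original text referred to are not shown there.

```shell
# Sample input (invented): the pattern "Unix" is on the second line.
tmp=$(mktemp)
printf '%s\n' 'Linux' 'Unix' 'Solaris' 'AIX' > "$tmp"

# N appends the next line to the pattern space; d then deletes both lines.
both=$(sed '/Unix/{N;d;}' "$tmp")

# N appends the next line; s removes everything from the newline onward,
# so the matching line stays and only the following line is dropped.
next_only=$(sed '/Unix/{N;s/\n.*//;}' "$tmp")

printf '%s\n' "$both" '---' "$next_only"
rm -f "$tmp"
```

Behaviour can differ between sed implementations when the matching line is the last line of the file and N has no next line to read, so it is worth testing that case on your own system.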
It means that sed will read the next line and start processing it. Your test script doesn't do what you think. It matches the empty lines and applies the delete command to them.
First, move your cursor to the line you want to delete. Press the “Esc” key to leave insert mode. Now type “:d” and press “Enter” to delete the line, or simply press “dd”.
sed '/pattern/!d' file.txt
But you're reinventing grep here.
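To check that these approaches really select the same lines, here is a small sketch comparing the substitution from the question, a plain sed print, the !d form, and grep. The sample data and variable names are invented for the example.

```shell
# Sample input (invented): two of the four lines contain "chr6".
tmp=$(mktemp)
printf '%s\n' 'chr1 ACGT' 'chr6 TTGA' 'chr2 GGCC' 'chr6 CATG' > "$tmp"

via_subst=$(sed -n 's/chr6/chr6/p' "$tmp")  # approach from the question
via_print=$(sed -n '/chr6/p' "$tmp")        # print matching lines, no substitution
via_delete=$(sed '/chr6/!d' "$tmp")         # delete every non-matching line
via_grep=$(grep 'chr6' "$tmp")              # what grep does by default

printf '%s\n' "$via_grep"
rm -f "$tmp"
```

All four variables should hold the same two lines. Keep the sed script in single quotes: in an interactive bash session, an unquoted or double-quoted ! can trigger history expansion.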
grep is certainly better, because it's much faster: nearly twice as fast in the test below.
e.g. using grep to extract all genome sequence data for chromosome 6 in a data set I'm working with:
$ time grep chr6 seq_file.in > temp.out
real 0m11.902s
user 0m9.564s
sys 0m1.912s
compared to sed:
$ time sed '/chr6/!d' seq_file.in > temp.out
real 0m21.217s
user 0m18.920s
sys 0m1.860s
I repeated it three times and got about the same values each time.