I have a somewhat large output text file where I need to delete all lines between two patterns but keep the lines that match the patterns themselves.
The files look roughly like the following output.
TEST #1
coef1 | 48.36895 3.32013 14.57 0.000 41.86141 54.87649
coef2 | -50.08894 10.47335 -4.78 0.000 -70.61697 -29.56092
|
indicator |
0 | .6647992 2.646627 0.25 0.802 -4.55925 5.888849
1 | 2.118701 5.225777 0.41 0.686 -8.19621 12.43361
|
year |
2 | -.4324005 2.231387 -0.19 0.847 -4.836829 3.972028
3 | -.362762 1.97184 -0.18 0.854 -4.254882 3.529358
|
_cons | 16.95753 6.342342 2.67 0.008 4.526383 29.38869
TEST #2
coef2 | 48.36895 3.32013 14.57 0.000 41.86141 54.87649
coef3 | -50.08894 10.47335 -4.78 0.000 -70.61697 -29.56092
|
year |
4 | .6647992 2.646627 0.25 0.802 -4.55925 5.888849
5 | 2.118701 5.225777 0.41 0.686 -8.19621 12.43361
|
idnumber |
6 | -.4324005 2.231387 -0.19 0.847 -4.836829 3.972028
7 | -.362762 1.97184 -0.18 0.854 -4.254882 3.529358
|
_cons | 16.95753 6.342342 2.67 0.008 4.526383 29.38869
I need to take the output above and delete all the lines between "year" and "_cons", keeping both the "year" line and the line starting with "_cons". The desired output is like so:
TEST #1
coef1 | 48.36895 3.32013 14.57 0.000 41.86141 54.87649
coef2 | -50.08894 10.47335 -4.78 0.000 -70.61697 -29.56092
|
indicator |
0 | .6647992 2.646627 0.25 0.802 -4.55925 5.888849
1 | 2.118701 5.225777 0.41 0.686 -8.19621 12.43361
|
year |
_cons | 16.95753 6.342342 2.67 0.008 4.526383 29.38869
TEST #2
coef2 | 48.36895 3.32013 14.57 0.000 41.86141 54.87649
coef3 | -50.08894 10.47335 -4.78 0.000 -70.61697 -29.56092
|
year |
_cons | 16.95753 6.342342 2.67 0.008 4.526383 29.38869
I wrote the following script (under OS X):
sed '/^ +year/,/^ +_cons/{/^ +year/!{/^ +_cons/!d}}' input.txt >output.txt
but I got the following error:
sed: 1: "/^ +year/,/^ +_cons/{/^ ...": extra characters at the end of d command
I'm not sure if this approach is even correct because I can't seem to get sed to execute. Is sed even appropriate here or should I use awk?
One last note, I need this script to work on a relatively generic Unix install. I have to send this to someone who must execute it under a very basic AIX (I think) install. No perl, no python, and I can't do much troubleshooting on their install over email.
This should work:
awk '/year/{print; getline; while($0!~/_cons/) {getline}}1' INPUT_FILE
or
awk '/_cons/{print;f=0;next}/year/{f=1;print;next}f{next}1' INPUT_FILE
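For readability, here is a rough sketch of the same flag idea as the second one-liner, spelled out with comments and wrapped in a shell function. The function name `strip_year_block` and the variable name `skip` are my own placeholders, not anything from the question. It should run on any POSIX awk, though I have not tried it on AIX.

```shell
# Hypothetical helper: print everything except the lines strictly
# between a line matching "year" and the next line matching "_cons".
# Reads the named files, or standard input if none are given.
# Example call: strip_year_block input.txt > output.txt
strip_year_block() {
    awk '
        /_cons/ { skip = 0 }   # closing line: stop skipping so it gets printed
        !skip   { print }      # print every line outside the skipped region
        /year/  { skip = 1 }   # opening line was just printed: skip what follows
    ' "$@"
}
```

Compared with the getline loop, the flag approach has one practical advantage: if a "year" line is never followed by "_cons", it simply drops the rest of the input, whereas the getline loop would never terminate because getline stops changing $0 at end of file.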
[jaypal:~/Temp] awk '/year/{print; getline; while($0!~/_cons/) {getline}}1' file
TEST #1
coef1 | 48.36895 3.32013 14.57 0.000 41.86141 54.87649
coef2 | -50.08894 10.47335 -4.78 0.000 -70.61697 -29.56092
|
indicator |
0 | .6647992 2.646627 0.25 0.802 -4.55925 5.888849
1 | 2.118701 5.225777 0.41 0.686 -8.19621 12.43361
|
year |
_cons | 16.95753 6.342342 2.67 0.008 4.526383 29.38869
TEST #2
coef2 | 48.36895 3.32013 14.57 0.000 41.86141 54.87649
coef3 | -50.08894 10.47335 -4.78 0.000 -70.61697 -29.56092
|
year |
_cons | 16.95753 6.342342 2.67 0.008 4.526383 29.38869
[jaypal:~/Temp] awk '/_cons/{print;f=0;next}/year/{f=1;print;next}f{next}1' file
TEST #1
coef1 | 48.36895 3.32013 14.57 0.000 41.86141 54.87649
coef2 | -50.08894 10.47335 -4.78 0.000 -70.61697 -29.56092
|
indicator |
0 | .6647992 2.646627 0.25 0.802 -4.55925 5.888849
1 | 2.118701 5.225777 0.41 0.686 -8.19621 12.43361
|
year |
_cons | 16.95753 6.342342 2.67 0.008 4.526383 29.38869
TEST #2
coef2 | 48.36895 3.32013 14.57 0.000 41.86141 54.87649
coef3 | -50.08894 10.47335 -4.78 0.000 -70.61697 -29.56092
|
year |
_cons | 16.95753 6.342342 2.67 0.008 4.526383 29.38869
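On the sed error from the question: as far as I can tell, the sed shipped with OS X wants a `;` or a newline between `d` and the closing `}`, and in basic regular expressions `+` is a literal plus sign rather than "one or more". A sketch that sidesteps both by putting each command on its own line and using ` *` (zero or more spaces) is below. The function name `strip_year_block_sed` is made up for illustration, and the `^ *` anchors assume the keywords are preceded only by optional spaces.

```shell
# Hypothetical helper: same result as the awk commands, using only
# POSIX sed features (no -E/-r, no one-line brace grouping).
# Example call: strip_year_block_sed input.txt > output.txt
strip_year_block_sed() {
    sed '/^ *year/,/^ *_cons/{
/^ *year/!{
/^ *_cons/!d
}
}' "$@"
}
```

Inside the range, a line is deleted only if it is neither the opening "year" line nor the closing "_cons" line, so both boundary lines survive.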