I am new to scripting and was trying to learn how to extract any text that exists between two different patterns. However, I am still not able to figure out how to extract text between two patterns in the following scenario:
If I have my input file reading:
Hi I would like
to print text
between these
patterns
and my expected output is like:
I would like
to print text
between these
i.e. my first search pattern is "Hi"; I want to skip the pattern itself but print everything that follows it on the same line. My second search pattern is "patterns", and I would like to avoid printing that line or any lines after it.
I tried the following:
sed -n '/Hi/,/patterns/p' test.txt
[output]
Hi I would like
to print text
between these
patterns
Next, I tried:
awk ' /'"Hi"'/ {flag=1;next} /'"pattern"'/{flag=0} flag { print }' test.txt
[output]
to print text
between these
Can someone help me out in identifying how to achieve this? Thanks in advance
You have the right idea, a mini-state-machine in awk, but you need some slight mods as per the following transcript:
pax> echo 'Hi I would like
to print text
between these
patterns ' | awk '
/patterns/ { echo = 0 }
/Hi / { gsub("^.*Hi ", "", $0); echo = 1 }
{ if (echo == 1) { print } }'
Or, in compressed form:
awk '/patterns/{e=0}/Hi /{gsub("^.*Hi ","",$0);e=1}{if(e==1){print}}'
The output of that is:
I would like
to print text
between these
as requested.
The way this works is as follows. The echo variable is initially 0, meaning that no echoing will take place. Each line is checked in turn. If it contains patterns, echoing is disabled. If it contains Hi followed by a space, echoing is turned on and gsub is used to modify the line to get rid of everything up to the Hi. Then, regardless, the line (possibly modified) is echoed when the echo flag is on.
Now, there are going to be edge cases, such as those involving "Hi" or "patterns". You haven't specified how they should be handled so I didn't bother, but the basic concept should be the same.
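As one possible way to deal with a line that contains both markers, here is a minimal sketch. It assumes (the question does not say) that only the text between the two markers on that line should be kept, and that reading can stop as soon as the end marker is seen. The function name between is made up for illustration.

```shell
# Hypothetical helper: print the text after the first "Hi " up to,
# but not including, "patterns". Reads stdin or the files given as arguments.
between() {
  awk '
    !on && sub(/^.*Hi /, "") { on = 1 }   # start marker: drop it and everything before it
    on && /patterns/ {                    # end marker seen while echoing
      sub(/ *patterns.*$/, "")            # keep only what precedes it on this line
      if ($0 != "") print                 # skip the line if nothing is left
      exit                                # nothing after the end marker is wanted
    }
    on                                    # default action: print the line
  ' "$@"
}

# Sample input from the question.
printf 'Hi I would like\nto print text\nbetween these\npatterns\n' | between

# Both markers on one line (assumed handling: keep only the middle part).
printf 'x Hi one two patterns three\n' | between
```

Because the start regex is greedy, a line with several occurrences of "Hi " is cut after the last one; that matches the gsub call above but may or may not be what you want.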
Updated the solution to remove the line "patterns":
$ sed -n '/^Hi/,/patterns/{s/^Hi //;/^patterns/d;p;}' file
I would like
to print text
between these
This might work for you (GNU sed):
sed '/Hi /!d;s//\n/;s/.*\n//;ta;:a;s/patterns.*$//;tb;$!{n;ba};:b;/^$/d' file
Just set a flag (f) when you find+replace Hi at the start of a line, clear it when you find patterns, then invoke the default print when the flag is set:
$ awk 'sub(/^Hi /,""){f=1} /patterns/{f=0} f' file
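If the two markers need to vary, the same flag idea can be wrapped in a small shell function. This is only a sketch under a few assumptions: the markers are treated as literal strings (via index) rather than regular expressions, the whole end-marker line is dropped as in the question, and the name extract and its argument order are invented for this example. Note that awk -v interprets backslash escapes in the values passed to it.

```shell
# Hypothetical wrapper: extract START END [FILE...]
extract() {
  start=$1 end=$2
  shift 2
  awk -v s="$start" -v e="$end" '
    !f && (i = index($0, s)) {          # literal start marker found
      $0 = substr($0, i + length(s))    # keep only what follows it
      f = 1
    }
    f && index($0, e) { exit }          # literal end marker: stop without printing
    f                                   # print while the flag is set
  ' "$@"
}

# Sample input from the question.
printf 'Hi I would like\nto print text\nbetween these\npatterns\n' | extract 'Hi ' 'patterns'
```

Exiting at the end marker also avoids reading the rest of a large file once the wanted block has been printed.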
I would like
to print text
between these