
How to use sed/grep to extract text between two words?


GNU grep also supports positive and negative lookahead and lookbehind when run with -P (Perl-compatible regular expressions). For your case, the command would be:

echo "Here is a string" | grep -o -P '(?<=Here).*(?=string)'

If there are multiple occurrences of Here and string, you can choose whether to match from the first Here to the last string or to match each pair individually. In regex terms, these are called a greedy match (first case) and a non-greedy match (second case):

$ echo 'Here is a string, and Here is another string.' | grep -oP '(?<=Here).*(?=string)' # Greedy match
 is a string, and Here is another 
$ echo 'Here is a string, and Here is another string.' | grep -oP '(?<=Here).*?(?=string)' # Non-greedy match (Notice the '?' after '*' in .*)
 is a 
 is another 

sed -e 's/Here\(.*\)String/\1/'

The accepted answer does not remove text that could be before Here or after String. This will:

sed -e 's/.*Here\(.*\)String.*/\1/'

The main difference is the addition of .* immediately before Here and after String.
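
Note that sed still prints lines that do not contain both markers, unchanged. A minimal sketch of one way to print only the captured text, assuming the markers are the literal words Here and String (the function name extract_between is made up for illustration):

```shell
# Hypothetical helper: read lines on stdin and print only the text captured
# between "Here" and "String"; lines without both markers are skipped.
extract_between() {
    # -n turns off automatic printing; the trailing p prints a line
    # only when the substitution actually matched.
    sed -n 's/.*Here\(.*\)String.*/\1/p'
}

printf '%s\n' 'no markers on this line' 'xx Here is a String yy' | extract_between
```

Because both .* parts are greedy, a line with several Here or String words yields the text after the last Here, so this is only a sketch for lines with one pair of markers.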


You can strip strings in Bash alone:

$ foo="Here is a String"
$ foo=${foo##*Here }
$ echo "$foo"
is a String
$ foo=${foo%% String*}
$ echo "$foo"
is a
$
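
The two expansions above can be wrapped in a small function. This is only a sketch: the name between and its argument order are made up for illustration, and it uses # (shortest prefix) instead of ## so that it cuts after the first start marker rather than the last.

```shell
# Hypothetical helper: print the text between the first occurrence of the
# start marker ($2) and the next occurrence of the end marker ($3) in $1.
# Returns 1 if the markers are not found in that order.
# Note: plain sh has no local variables, so text/start/end are globals.
between() {
    text=$1 start=$2 end=$3
    case $text in
        *"$start"*"$end"*) ;;   # both markers present, in order
        *) return 1 ;;
    esac
    text=${text#*"$start"}      # drop everything through the first start marker
    text=${text%%"$end"*}       # drop everything from the first end marker on
    printf '%s\n' "$text"
}

between "Here is a String" "Here " " String"
```

The markers are quoted inside the expansions, so they are treated as literal text rather than as glob patterns.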

And if you have a GNU grep that includes PCRE, you can use a zero-width assertion:

$ echo "Here is a String" | grep -Po '(?<=(Here )).*(?= String)'
is a

With GNU awk, using a regular expression as the field separator:

$ echo "Here is a string" | awk -v FS="(Here|string)" '{print $2}'
 is a 
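
If the line contains several Here ... string pairs, the same field-separator idea can be extended by printing every second field. A rough sketch, assuming the two markers strictly alternate on each line (the function name between_all is made up for illustration):

```shell
# Hypothetical helper: print every segment that sits between "Here" and
# "string" on each input line. Splitting on either word puts those
# segments in the even-numbered fields ($2, $4, ...).
between_all() {
    awk -F'Here|string' '{ for (i = 2; i <= NF; i += 2) print $i }'
}

echo 'Here is a string, and Here is another string.' | between_all
```

A Here that is never followed by string would still be printed as a segment, so this only suits input where every pair is complete.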

grep with the -P (--perl-regexp) option supports \K, which discards everything matched so far from the reported match. In our case, the previously matched string was Here, so it is dropped from the final output.

$ echo "Here is a string" | grep -oP 'Here\K.*(?=string)'
 is a 
$ echo "Here is a string" | grep -oP 'Here\K(?:(?!string).)*'
 is a 

If you want the output to be "is a" without the surrounding spaces, you could try the following:

$ echo "Here is a string" | grep -oP 'Here\s*\K.*(?=\s+string)'
is a
$ echo "Here is a string" | grep -oP 'Here\s*\K(?:(?!\s+string).)*'
is a