I have a string like first url, second url, third url and would like to extract only the url after the word second in the OS X Terminal (only the first occurrence). How can I do it?
In my favorite editor I used the regex /second (url)/ and used $1 to extract it; I just don't know how to do it in the Terminal.
Keep in mind that url is an actual URL; I'll be using one of these expressions to match it: Regex to match URL
echo 'first url, second url, third url' | sed 's/.*second//'
Edit: I misunderstood. Better:
echo 'first url, second url, third url' | sed 's/.*second \([^ ,]*\).*/\1/'
or:
echo 'first url, second url, third url' | perl -nle 'print $1 if /second ([^ ,]*)/'
Piping to another process (such as the sed and perl commands suggested above) can be expensive, especially when you need to run the operation many times, because every call spawns a new process. Bash supports regular expression matching natively:
[[ "string" =~ regex ]]
Just as your favourite editor exposes matches as $1, $2, etc., Bash fills in the BASH_REMATCH array: element 0 holds the entire match, and elements 1, 2, etc. hold the capture groups.
In your particular example:
str="first url1, second url2, third url3"
if [[ $str =~ (second )([^,]*) ]]; then
    echo "match: '${BASH_REMATCH[2]}'"
else
    echo "no match found"
fi
Output:
match: 'url2'
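If you need this in several places, the same idea can be wrapped in a function. The following is only a sketch: the name url_after is made up for illustration, and the keyword is inserted into the pattern unescaped, so it is assumed to contain no regex metacharacters. The pattern is stored in a variable first, which tends to behave more consistently across Bash versions (including the older Bash shipped with OS X) than writing it inline.

```shell
#!/usr/bin/env bash

# url_after KEYWORD STRING (hypothetical helper)
# Prints the token that follows the first "KEYWORD " in STRING.
# Returns 1 and prints nothing if there is no match.
url_after() {
    local keyword=$1 str=$2
    # One capture group: everything up to the next comma or space.
    local re="${keyword} ([^, ]+)"
    if [[ $str =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

url_after second 'first url1, second url2, third url3'   # prints url2
```

No external process is started here, so calling it in a loop stays cheap.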
Specifically, =~ supports extended regular expressions as defined by POSIX, but with platform-specific extensions (which vary in extent and can be incompatible). On Linux platforms (GNU userland), see man grep; on macOS/BSD platforms, see man re_format.
With the first command in the other answer you are still left with everything after the desired URL, so I propose the following solution instead (url here stands for whatever pattern matches your URL):
echo 'first url, second url, third url' | sed 's/.*second \(url\).*/\1/'
In sed you group an expression by escaping the parentheses around it (POSIX basic regular expression syntax).
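As a side note, both GNU sed and the BSD sed on OS X accept the -E option, which switches to extended regular expressions so the grouping parentheses need no backslashes. A minimal sketch, assuming the URL contains no spaces or commas:

```shell
# -E selects extended regular expressions: ( ) group without escaping.
result=$(echo 'first url1, second url2, third url3' | sed -E 's/.*second ([^ ,]*).*/\1/')
echo "$result"   # url2
```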