Non greedy (reluctant) regex matching in sed?

I'm trying to use sed to clean up lines of URLs to extract just the domain.

So from:

http://www.suepearson.co.uk/product/174/71/3816/

I want:

http://www.suepearson.co.uk/

(either with or without the trailing slash, it doesn't matter)

I have tried:

sed 's|\(http:\/\/.*?\/\).*|\1|'

and (escaping the non-greedy quantifier)

sed 's|\(http:\/\/.*\?\/\).*|\1|'

but I cannot get the non-greedy quantifier (?) to work, so the pattern always ends up matching the whole string.

785

asked Jul 09 '09 10:07

Joel

1 Answer

Neither basic nor extended POSIX/GNU regular expressions support the non-greedy quantifier; you need a regex flavor that does, such as Perl's. Fortunately, a Perl regex is easy to use in this context:

perl -pe 's|(http://.*?/).*|$1|'
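If you would rather stay with sed, a common workaround (not part of the answer above) is to replace the non-greedy `.*?` with a negated character class. The sketch below assumes every input URL starts with `http://` and has at least one slash after the host; the function name `extract_domain` is purely illustrative.

```shell
# Hypothetical helper (the name is illustrative, not from the original post):
# trim a URL down to "http://host/".
# [^/]* matches only non-slash characters, so even though it is greedy
# it has to stop at the first "/" that follows "http://".
extract_domain() {
  printf '%s\n' "$1" | sed 's|\(http://[^/]*/\).*|\1|'
}

extract_domain 'http://www.suepearson.co.uk/product/174/71/3816/'
# prints: http://www.suepearson.co.uk/
```

This works in basic regular expressions, so it should behave the same in any POSIX sed; lines that do not match the pattern are passed through unchanged.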

130

answered Sep 23 '22 23:09

chaos



Tags: regex, sed, regex-greedy, pcre, greedy
