Extract string with Python re.match


    import re
    str = "x8f8dL:s://www.qqq.zzz/iziv8ds8f8.dafidsao.dsfsi"
    str2 = re.match("[a-zA-Z]*//([a-zA-Z]*)", str)
    print(str2.group())

Current result: error
Expected result: wwwqqqzzz

I want to extract the string wwwqqqzzz. How do I do that?

Maybe there are a lot of dots, such as:

"whatever..s#[email protected].:af//wwww.xxx.yn.zsdfsd.asfds.f.ds.fsd.whatever/123.dfiid"

In this case, I basically want the stuff bounded by // and /. How do I achieve that?

One additional question:

    import re
    str = "xxx.yyy.xxx:80"
    m = re.search(r"([^:]*)", str)
    str2 = m.group(0)
    print(str2)
    str2 = m.group(1)
    print(str2)

It seems that m.group(0) and m.group(1) are the same.


asked Nov 16 '12 20:11

runcode

1 Answer

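match only matches at the beginning of the string. Your pattern cannot match there, so match returns None, and calling group() on None raises the error. Use search instead, which scans the string for the first position where the pattern matches. The following pattern would then match your requirements: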

    m = re.search(r"//([^/]*)", str)
    print(m.group(1))

Basically, we look for //, then consume as many non-slash characters as possible. Those non-slash characters are captured in group number 1.
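To make the difference between the two calls concrete, here is a minimal sketch using the first string from the question. The variable names (`text`, `anchored`, `found`) are my own, chosen to avoid shadowing the built-in `str`.

```python
import re

text = "x8f8dL:s://www.qqq.zzz/iziv8ds8f8.dafidsao.dsfsi"

# re.match only tries at position 0. The digit in "x8f8dL" stops [a-zA-Z]*
# before it can reach "//", so nothing matches and the result is None.
anchored = re.match(r"[a-zA-Z]*//([a-zA-Z]*)", text)
print(anchored)  # None

# re.search scans forward until the pattern fits somewhere in the string.
found = re.search(r"//([^/]*)", text)
print(found.group(0))  # //www.qqq.zzz  (the whole match, slashes included)
print(found.group(1))  # www.qqq.zzz    (only the captured part)
```

Checking the result against None before calling group() avoids the AttributeError when the input has no // at all.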

In fact, there is a slightly more advanced technique that does the same but does not require a capturing group (capturing adds a little overhead, although it rarely matters in practice). It uses a so-called lookbehind:

    m = re.search(r"(?<=//)[^/]*", str)
    print(m.group())

Lookarounds are not included in the actual match, hence the desired result.
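As a quick check that the lookbehind version yields the same text as the capturing version, the following sketch runs both patterns on the first string from the question. The variable names are illustrative only.

```python
import re

text = "x8f8dL:s://www.qqq.zzz/iziv8ds8f8.dafidsao.dsfsi"

with_group = re.search(r"//([^/]*)", text)
with_lookbehind = re.search(r"(?<=//)[^/]*", text)

# Capturing version: the slashes are part of the match, so we need group(1).
print(with_group.group(1))      # www.qqq.zzz

# Lookbehind version: the slashes are only asserted, so the whole match
# is already the host part and group() is enough.
print(with_lookbehind.group())  # www.qqq.zzz

# The lookbehind match begins immediately after the "//".
start = with_lookbehind.start()
print(text[start - 2:start])    # //
```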

This (or any other reasonable regex solution) will not remove the dots in the same step. But that can easily be done in a second step:

    m = re.search(r"(?<=//)[^/]*", str)
    host = m.group()
    cleanedHost = host.replace(".", "")

That does not even require regular expressions.

Of course, if you want to remove everything except for letters and digits (e.g. to turn www.regular-expressions.info into wwwregularexpressionsinfo) then you are better off using the regex version of replace:

cleanedHost = re.sub(r"[^a-zA-Z0-9]+", "", host)
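Putting the two steps together, one possible shape is a small helper like the one below. The function name `extract_host`, its `letters_and_digits_only` flag, and the `http://` prefix in the second call are assumptions for illustration, not part of the question.

```python
import re

def extract_host(text, letters_and_digits_only=False):
    """Return the text between '//' and the next '/', with separators removed.

    Returns None when the input contains no '//'.
    """
    m = re.search(r"(?<=//)[^/]*", text)
    if m is None:
        return None
    host = m.group()
    if letters_and_digits_only:
        # Drop every run of characters that are not ASCII letters or digits.
        return re.sub(r"[^a-zA-Z0-9]+", "", host)
    # Plain string replacement is enough when only dots must go.
    return host.replace(".", "")

print(extract_host("x8f8dL:s://www.qqq.zzz/iziv8ds8f8.dafidsao.dsfsi"))
# wwwqqqzzz
print(extract_host("http://www.regular-expressions.info/", letters_and_digits_only=True))
# wwwregularexpressionsinfo
print(extract_host("no slashes here"))
# None
```

Returning None for input without // mirrors what re.search itself does, so the caller decides how to handle a missing host.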


answered Sep 30 '22 18:09

Martin Ender
