When I use a regular expression with the re module, I get the result I expect:
import re
string = r'http://www.example.com/abc.html'
result = re.search('^.*com', string).group()
In pandas, I write:
import pandas as pd
df = pd.DataFrame(columns = ['index', 'url'])
df.loc[len(df), :] = [1, 'http://www.example.com/abc.html']
df.loc[len(df), :] = [2, 'http://www.hello.com/def.html']
df.str.extract('^.*com')
ValueError: pattern contains no capture groups
How can I solve this problem?
Thanks.
According to the docs, you need to specify a capture group (i.e., parentheses) for str.extract to, well, extract.
Series.str.extract(pat, flags=0, expand=True)
For each subject string in the Series, extract groups from the first match of regular expression pat.
Each capture group constitutes its own column in the output.
df.url.str.extract(r'(.*\.com)')  # escape the dot so it matches a literal "."
0
0 http://www.example.com
1 http://www.hello.com
# If you need named capture groups,
df.url.str.extract(r'(?P<URL>.*\.com)')
URL
0 http://www.example.com
1 http://www.hello.com
Or, if you need a Series,
df.url.str.extract(r'(.*\.com)', expand=False)
0 http://www.example.com
1 http://www.hello.com
Name: url, dtype: object
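To illustrate the "each capture group constitutes its own column" point from the docs, here is a minimal sketch. The group names scheme and host and the pattern itself are assumptions made up for this illustration; they are not part of the original question.

```python
import pandas as pd

urls = pd.Series(
    ["http://www.example.com/abc.html", "http://www.hello.com/def.html"],
    name="url",
)

# Two capture groups -> two output columns.
# The names "scheme" and "host" are illustrative, set via (?P<name>...).
parts = urls.str.extract(r"^(?P<scheme>\w+)://(?P<host>[^/]+)")
print(parts)
```

With a single group you get a one-column DataFrame by default, which is why expand=False is handy when you want a Series instead.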
You need to select the url column and wrap the pattern in parentheses () to create a capture group:
df['new'] = df['url'].str.extract(r'(^.*com)')
print(df)
index url new
0 1 http://www.example.com/abc.html http://www.example.com
1 2 http://www.hello.com/def.html http://www.hello.com
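For anyone who wants to reproduce this end to end, the following is a rough, self-contained sketch. It assumes a third, made-up row (an .org URL) to show what happens when the pattern does not match, and it escapes the dot before com, which the pattern in the question does not do.

```python
import pandas as pd

df = pd.DataFrame({
    "index": [1, 2, 3],
    "url": [
        "http://www.example.com/abc.html",
        "http://www.hello.com/def.html",
        "http://www.example.org/ghi.html",  # hypothetical row with no ".com"
    ],
})

# Without parentheses there is no capture group, so str.extract raises.
try:
    df["url"].str.extract(r"^.*\.com")
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# With a capture group; expand=False returns a Series for a single group.
df["new"] = df["url"].str.extract(r"^(.*\.com)", expand=False)
print(df)
```

Rows where the pattern does not match come back as NaN rather than raising, so it is worth checking for missing values afterwards.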