Python beginner here. I am struggling to use regex with pandas. I have rows like the one below, and I need to split each of them into columns containing only the numbers.
rando45m text78 here 123 $ 1 0% text here 5 . 6&
I need it to be displayed as
     0  1  2  3
0  123  1  0  5
I have tried the following two approaches:
df2 = df.Keep.str.extractall('(\d+)((\s+)|(\%))')
df3 = df.Keep.str.extractall(r'(?<=\s)(\d+)(?=\s+|\%)')
df2 includes the whitespace in the cell. df3 fails with an AssertionError. Is there a way to capture only one group (\1) for my DataFrame?
Thanks
Try this:
In [39]: df
Out[39]:
                                               Keep
0  rando45m text78 here 123 $ 1 0% text here 5 . 6&
1         aaa 101.5% here 123 $ 1 0% text here 55 .

In [40]: df.Keep.str.extractall(r'\b(\d+(?:\.\d+)?)(?:\s|%|$)').unstack()
Out[40]:
           0
match      0    1  2  3     4
0        123    1  0  5  None
1      101.5  123  1  0    55
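In case it helps, here is a self-contained sketch of the same idea that you can paste and run. extractall returns one column per capturing group, which is why the first attempt in the question ends up with extra whitespace columns. Writing the trailing alternatives as a non-capturing group, (?:...), leaves a single capturing group. The variable names (pattern, attempt, matches, wide) are only illustrative, and the sample frame is rebuilt from the two rows shown above.

```python
import pandas as pd

# Sample data rebuilt from the two rows shown in the session above.
df = pd.DataFrame({
    "Keep": [
        "rando45m text78 here 123 $ 1 0% text here 5 . 6&",
        "aaa 101.5% here 123 $ 1 0% text here 55 .",
    ]
})

# The first attempt from the question has four capturing groups,
# so extractall returns four columns, including the whitespace ones.
attempt = df["Keep"].str.extractall(r"(\d+)((\s+)|(\%))")

# \b              the number must start at a word boundary
#                 (this skips the digits inside "rando45m" and "text78")
# (\d+(?:\.\d+)?) the only capturing group: an integer or a decimal
# (?:\s|%|$)      must be followed by whitespace, '%' or end of string;
#                 non-capturing, so it adds no extra column
pattern = r"\b(\d+(?:\.\d+)?)(?:\s|%|$)"

# Long format: one row per match, indexed by (original row, match number).
matches = df["Keep"].str.extractall(pattern)

# Wide format: one column per match number, one row per original row.
wide = matches[0].unstack()
print(wide)
```

Rows with fewer matches than the longest row are padded with missing values, so the first row has an empty last column. The values are still strings at this point. If you need numbers, something like wide.apply(pd.to_numeric) should convert them.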