Given the following data frame:
    import pandas as pd
    import numpy as np

    df = pd.DataFrame({'A': ['1a', np.nan, '10a', '100b', '0b']})
    df

          A
    0    1a
    1   NaN
    2   10a
    3  100b
    4    0b
I'd like to extract the numbers from each cell (where they exist). The desired result is:
         A
    0    1
    1  NaN
    2   10
    3  100
    4    0

I know it can be done with str.extract, but I'm not sure how.
To test whether a string consists only of digits in Python, use the isdigit() method. It returns True if the string is non-empty and every character in it is a digit, and False otherwise. It does not extract anything by itself; to pull the digits out of a mixed string such as '100b', filter its characters with isdigit() or use a regular expression.
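As a minimal sketch of the character-filtering approach in plain Python (the helper name `digits_only` is made up for illustration and is not part of any library):

```python
def digits_only(text):
    """Return the digit characters of text joined together, or None if there are none."""
    digits = "".join(ch for ch in text if ch.isdigit())
    return digits or None


print("100b".isdigit())      # False, because 'b' is not a digit
print("100".isdigit())       # True
print(digits_only("100b"))   # 100
```

This works on one string at a time, so on a pandas column it would have to be applied element by element while skipping NaN; the vectorized str.extract route avoids that.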
To extract the digits from a string with a regular expression, replace every non-digit character with an empty string. In JavaScript this is written str.replace(/\D/g, ''); the Python equivalent is re.sub(r'\D', '', s). Either call returns a new string containing only the digits of the original, so separate runs of digits are joined together.
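In Python the same idea can be sketched with the standard `re` module (the function name `strip_non_digits` is hypothetical, chosen only for this example):

```python
import re


def strip_non_digits(text):
    """Remove every character that is not a digit (\\D) from text."""
    return re.sub(r"\D", "", text)


print(strip_non_digits("10a"))   # 10
print(strip_non_digits("a1b2"))  # 12, since both digit runs are joined
```

Note the difference from a capture group such as `(\d+)`, which keeps only the first run of digits instead of joining all of them.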
The get_value() function retrieved a single value from the data frame given a row label and a column label. It was deprecated in pandas 0.21 and removed in pandas 1.0; use the .at accessor instead, e.g. df.at[row_label, column_label].
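In current pandas, a single-value lookup by row and column label is usually written with the `.at` accessor. A minimal sketch using the frame from the question:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['1a', np.nan, '10a', '100b', '0b']})

# Look up one cell by row label (2) and column label ('A')
value = df.at[2, 'A']
print(value)  # 10a
```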
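Give it a regex capture group. Use a raw string for the pattern, and pass expand=False so that a single group returns a Series rather than a one-column DataFrame (the default since pandas 0.23):

    df.A.str.extract(r'(\d+)', expand=False)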
Gives you:
    0      1
    1    NaN
    2     10
    3    100
    4      0
    Name: A, dtype: object
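The extracted values are still strings. If numeric values are wanted, one possible follow-up is `pd.to_numeric`; the sketch below assumes a recent pandas version, and the variable names `digits` and `numbers` are only illustrative:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': ['1a', np.nan, '10a', '100b', '0b']})

# expand=False returns a Series when the pattern has a single capture group
digits = df['A'].str.extract(r'(\d+)', expand=False)

# Convert the digit strings to numbers; the missing row stays missing,
# so the result is typically a float column rather than an integer one
numbers = pd.to_numeric(digits)
print(numbers)
```

Rows with no digits at all would also come back as missing values, since the capture group does not match them.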