I have a pandas DataFrame with two columns, type and text. The text column contains string values. How can I delete the rows that contain numeric values in the text column? For example:
`ABC 1.3.2`, `ABC12`, `2.2.3`, `ABC 12 1`
I tried the following, but it raises an error. Any idea why?
df.drop(df[bool(re.match('^(?=.*[0-9]$)', df['text'].str))].index)
In your case, I think it's better to use simple indexing rather than drop. For example:
>>> df
text type
0 abc b
1 abc123 a
2 cde a
3 abc1.2.3 b
4 1.2.3 a
5 xyz a
6 abc123 a
7 9999 a
8 5text a
9 text a
>>> df[~df.text.str.contains(r'[0-9]')]
text type
0 abc b
2 cde a
5 xyz a
9 text a
That selects only the rows whose text contains no digits.
To explain:
df.text.str.contains(r'[0-9]')
returns a boolean series of where there are any digits:
0 False
1 True
2 False
3 True
4 True
5 False
6 True
7 True
8 True
9 False
and you can invert it with the ~ operator to index your DataFrame wherever it is False.
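For completeness, here is a self-contained sketch of the same idea that rebuilds the sample frame above. The variable names (has_digit, kept, dropped) are my own, and na=False is an extra I added so that missing values count as "no digit" instead of propagating NaN into the mask. It also shows the drop form from the question; as far as I can tell, the original attempt most likely fails because re.match expects a single string, while df['text'].str is an accessor object, not a string.

```python
import pandas as pd

# Sample data copied from the frame shown above.
df = pd.DataFrame({
    "text": ["abc", "abc123", "cde", "abc1.2.3", "1.2.3",
             "xyz", "abc123", "9999", "5text", "text"],
    "type": ["b", "a", "a", "b", "a", "a", "a", "a", "a", "a"],
})

# Vectorised regex search: True where the text contains at least one digit.
# na=False (my addition) treats missing values as "no digit".
has_digit = df["text"].str.contains(r"\d", na=False)

# Boolean indexing: keep the rows where the mask is False.
kept = df[~has_digit]

# Equivalent drop-based form, closer to the attempt in the question.
dropped = df.drop(df[has_digit].index)

print(kept)
```

Both forms return the same rows here; the indexing version just skips the intermediate index lookup.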
Using the data from jpp:
s[s.str.isalpha()]
Out[261]:
0 ABC
2 DEF
6 GHI
dtype: object
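One caveat worth knowing: str.isalpha() is stricter than "contains no digit", because it is only True when every character is a letter, so strings with spaces or punctuation are filtered out too. The sketch below uses hypothetical data of my own (not jpp's original series) to show the difference against the str.contains approach.

```python
import pandas as pd

# Hypothetical sample data, made up for illustration.
s = pd.Series(["ABC", "ABC 1.3.2", "DEF", "ABC12",
               "2.2.3", "ABC 12 1", "GHI", "no digits here"])

# True only when every character in the string is alphabetic.
only_letters = s[s.str.isalpha()]

# True when the string contains no digit at all.
no_digits = s[~s.str.contains(r"\d")]

print(only_letters.index.tolist())
print(no_digits.index.tolist())
```

The last entry has no digits but does contain spaces, so it survives the str.contains filter and is removed by isalpha(). Which behaviour you want depends on whether your text column can hold multi-word values.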