
pandas: how to eliminate rows with value ending with a specific character?

I have a pandas DataFrame as follows:

mail = DataFrame({'mail' : ['[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]', '[email protected]']})

that looks like:

                    mail
0          [email protected]
1        [email protected]
2       [email protected]
3   [email protected]
4  [email protected]
5  [email protected]
6       [email protected]

What I want to do is filter out (eliminate) all rows in which the value in the column mail ends with '@gmail.com'.

Blue Moon asked Dec 15 '22 12:12


2 Answers

You can use str.endswith and negate the result of the boolean Series with ~:

mail[~mail['mail'].str.endswith('@gmail.com')]

Which produces:

                    mail
2       [email protected]
3   [email protected]
4  [email protected]
5  [email protected]
6       [email protected]

Pandas has many other vectorised string operations, all accessible through the .str accessor. Many of these are instantly familiar from Python's own string methods, but they come with built-in handling of NaN values.
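To illustrate the NaN handling, here is a minimal sketch. The frame and its addresses are made up for the example (they are not the data from the question). Passing na=False tells str.endswith to treat a missing value as "no match", so the mask stays purely boolean and can be negated with ~:

```python
import numpy as np
import pandas as pd

# Hypothetical data: one address is missing.
df = pd.DataFrame(
    {"mail": ["a@gmail.com", np.nan, "b@example.org", "c@gmail.com.au"]}
)

# na=False: a missing value counts as "does not end with the suffix".
mask = df["mail"].str.endswith("@gmail.com", na=False)

# Keep every row that does NOT end with '@gmail.com'.
kept = df[~mask]
print(kept)
```

With this choice the row holding the missing address is kept. If missing addresses should be dropped as well, na=True would be the option to try instead.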

Alex Riley answered Feb 01 '23 22:02


A column that holds strings has a .str accessor, through which you can apply the standard str methods to every element:

In [6]: mail['mail'].str.endswith('gmail.com')
Out[6]:
0     True
1     True
2    False
3    False
4    False
5    False
6    False
Name: mail, dtype: bool

Then you can filter using this Series:

[7]: mail[~mail['mail'].str.endswith('gmail.com')]
      Out[7]:
                    mail
2       [email protected]
3   [email protected]
4  [email protected]
5  [email protected]
6       [email protected]
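One detail worth checking in your own data: the suffix 'gmail.com' without the leading '@' also matches any domain that merely ends in those characters. A small sketch with invented addresses (assumptions for illustration only) shows the difference between the two suffixes:

```python
import pandas as pd

# Hypothetical addresses, chosen to expose the difference.
addresses = pd.Series(
    ["alice@gmail.com", "bob@notgmail.com", "carol@example.org"]
)

# Without '@', the second address matches as well.
loose = addresses.str.endswith("gmail.com")

# With '@', only the genuine gmail.com address matches.
strict = addresses.str.endswith("@gmail.com")

print(pd.DataFrame({"address": addresses, "loose": loose, "strict": strict}))
```

For the frame in the question both suffixes give the same result, so this only matters when such look-alike domains can occur.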

A similar accessor, .dt, gives access to date/time properties of a column that holds datetime values.
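As a rough sketch of the .dt accessor, filtering works the same way as with .str: build a boolean Series and index with it. The frame and the column name when are hypothetical:

```python
import pandas as pd

# Hypothetical frame with a datetime column.
events = pd.DataFrame(
    {"when": pd.to_datetime(["2022-12-15", "2023-02-01", "2023-02-02"])}
)

# .dt exposes datetime components element-wise, e.g. the year.
years = events["when"].dt.year

# Keep only the rows that fall in 2023.
recent = events[years == 2023]
print(recent)
```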

musically_ut answered Feb 02 '23 00:02