How do I delete rows not starting with 'x' in Pandas or keep rows starting with 'x'

Tags: pandas

I have been at this all morning and have slowly pieced things together, but for the life of me I cannot figure out how to use the .str.startswith() function in Pandas.

My XLSX spreadsheet is as follows

1 Name, Registration Date, Phone number
2 John Doe, 2015-11-20T19:54:45Z, 1.1112223333
3 Jane Doe, 2015-11-20T20:44:26Z, 65.1112223333
etc...

So I am importing it as a data frame, cleaning the header so that there are no spaces and such, then I want to delete any rows not starting with '1.' (or keep rows that start with '1.') and delete all others. So in this short example, delete the entire 'Jane Doe' entry since her phone number starts with '65.'

import pandas as pd

# Note: older pandas versions spell this keyword 'sheetname'
df = pd.read_excel('testingpanda.xlsx', sheet_name='Export 1')

def colHeaderCleaner():
    # Replace spaces with underscores and lowercase every string header
    df.columns = [x.replace(' ', '_').lower() if isinstance(x, str) else x
                  for x in df.columns]

colHeaderCleaner()

# By default the values in 'registrant_phone' are read as float64, so this changes the dtype
df['registrant_phone'] = df['registrant_phone'].astype('object')

The closest I have gotten, and by that I mean the only line I have been able to execute without annoying tracebacks and other errors is:

df['registrant_phone'] = df['registrant_phone'].str.startswith('1')

But all that does is convert every phone value to 'NaN'. It keeps all of the rows, as shown below:

print(df)
[output] name, registration_date, phone_number
[output] John Doe, 2015-11-20T19:54:45Z, NaN
[output] Jane Doe, 2015-11-20T20:44:26Z, NaN

I have searched far too many places to list, and I have tried different versions of df.drop, but I just can't seem to figure anything out. Where do I go from here?

asked Feb 03 '16 19:02

Mxracer888

1 Answer

I am a bit confused by your question. In any case, if you have a DataFrame df with a column 'c', and you would like to remove the items starting with 1, then the safest way would be to use something like:

df = df[~df['c'].astype(str).str.startswith('1')]
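
For the case in the question, where the goal is to keep rows whose phone number starts with '1.', the same mask can be used without the ~. Below is a minimal sketch under some assumptions: the sample frame is made up to mirror the question's two rows, and the column name 'registrant_phone' is taken from the question's code. Note that the mask is used to index the DataFrame; assigning it back to the column, as in the question, only overwrites the phone values and never removes rows.

```python
import pandas as pd

# Hypothetical sample data mirroring the question; the phone values are
# floats, which is how the question says pandas read them.
df = pd.DataFrame({
    "name": ["John Doe", "Jane Doe"],
    "registrant_phone": [1.1112223333, 65.1112223333],
})

# Convert to strings first so .str.startswith sees text, then build a
# boolean mask. startswith matches literally, so '.' is a plain dot.
mask = df["registrant_phone"].astype(str).str.startswith("1.")

kept = df[mask]       # rows whose phone starts with '1.'
dropped = df[~mask]   # all other rows

print(kept)
```

Using '1.' rather than '1' as the prefix avoids also matching numbers such as 10.x or 1xx.x.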

answered Nov 01 '22 09:11

Ami Tavory
