I have a dataframe, customers, that contains some "bad" rows; the key in this dataframe is CustomerID. I know I should drop these rows. I have a list called badcu that contains [23770, 24572, 28773, ...],
where each value corresponds to a different "bad" customer.
I also have another dataframe, let's call it sales, and I want to drop all of its records for the bad customers, i.e. the ones in the badcu list.
If I do the following
sales[sales.CustomerID.isin(badcu)]
I get a dataframe with precisely the records I want to drop, but if I do
sales.drop(sales.CustomerID.isin(badcu))
it returns a dataframe with the first row dropped (which is a legitimate order) and the rest of the rows intact, so it doesn't delete the bad ones. I think I know why this happens, but I still don't know how to drop the rows with the incorrect customer IDs.
Use the pandas DataFrame.drop() method to delete rows that match a condition. Note that drop() expects index labels, so pass it the index of the matching rows rather than the boolean mask itself.
With drop() you can remove rows or columns from a DataFrame. The axis parameter specifies which axis to remove from. By default axis=0, meaning rows; use axis=1 or the columns parameter to remove columns.
To delete a column, or multiple columns, pass the name of the column(s) and specify axis as 1. Alternatively, the columns parameter cuts out the need for axis.
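As a rough illustration of the drop() behaviour described above, here is a minimal sketch. The dataframe df and its Amount and Note columns are made up for the example and are not part of the question.

```python
import pandas as pd

# Hypothetical toy data, only for illustration.
df = pd.DataFrame({
    "CustomerID": [1, 2, 3],
    "Amount": [10.0, 20.0, 30.0],
    "Note": ["a", "b", "c"],
})

# axis=0 is the default: the values passed are treated as index labels of rows.
rows_dropped = df.drop([0, 2])

# Two equivalent ways to drop a column.
cols_axis = df.drop("Note", axis=1)
cols_kw = df.drop(columns=["Note"])
```

In each case drop() returns a new dataframe and leaves df itself unchanged.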
You need
new_df = sales[~sales.CustomerID.isin(badcu)]
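To see this in context, here is a small self-contained sketch. The sales data and the Amount column are invented for the example, and badcu is shortened to the three IDs visible in the question. It also shows the drop() equivalent, which needs the index labels of the bad rows rather than the boolean mask.

```python
import pandas as pd

# Shortened, hypothetical version of the list from the question.
badcu = [23770, 24572, 28773]

# Made-up sales data; only CustomerID comes from the question.
sales = pd.DataFrame({
    "CustomerID": [10001, 23770, 10002, 24572, 28773, 10001],
    "Amount": [5.0, 7.5, 3.0, 9.0, 1.0, 4.0],
})

# Boolean mask: True for rows that belong to a bad customer.
bad_mask = sales.CustomerID.isin(badcu)

# Option 1: keep every row where the mask is False.
new_df = sales[~bad_mask]

# Option 2: drop() by index label, using the index of the bad rows.
same_df = sales.drop(sales[bad_mask].index)
```

Both options leave sales untouched and return the same result here. With a non-unique index, drop() would remove every row sharing a dropped label, so the boolean filter is usually the safer choice.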