I have the following Pandas DataFrame object df. It is a train schedule listing the date of departure, scheduled time of departure, and train company.
import pandas as pd
df =
Year Month DayofMonth DayOfWeek DepartureTime Train Origin
Datetime
1988-01-01 1988 1 1 5 1457 BritishRail Leeds
1988-01-02 1988 1 2 6 1458 DeutscheBahn Berlin
1988-01-03 1988 1 3 7 1459 SNCF Lyons
1988-01-02 1988 1 2 6 1501 BritishRail Ipswich
1988-01-02 1988 1 2 6 1503 NMBS Brussels
....
Now, let's say I wanted to select all items "DeutscheBahn" in the column "Train".
I would use
DB = df[df['Train'] == 'DeutscheBahn']
Now, how can I select all trains except DeutscheBahn, BritishRail, and SNCF in a single step? I know I can exclude them one at a time:
notDB = df[df['Train'] != 'DeutscheBahn']
and
notSNCF = df[df['Train'] != 'SNCF']
but I am not sure how to combine these into one command.
df[df['Train'] != 'DeutscheBahn', 'SNCF']
doesn't work.
To exclude several values at once, negate an isin test:
df[~df['Train'].isin(['DeutscheBahn', 'SNCF'])]
isin returns a boolean Series that is True for each row whose df['Train'] value is in the given list, and the ~ at the beginning inverts that Series, acting as an element-wise not operator.
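To make the mask idea concrete, here is a minimal sketch. The small DataFrame is a stand-in built from the sample rows in the question (only three of the columns), and the names excluded, mask, and remaining are chosen here for illustration.

```python
import pandas as pd

# Stand-in for the schedule in the question, using values from the sample rows.
df = pd.DataFrame(
    {
        "DepartureTime": [1457, 1458, 1459, 1501, 1503],
        "Train": ["BritishRail", "DeutscheBahn", "SNCF", "BritishRail", "NMBS"],
        "Origin": ["Leeds", "Berlin", "Lyons", "Ipswich", "Brussels"],
    }
)

excluded = ["DeutscheBahn", "SNCF"]

# isin yields one boolean per row: True where Train is in the list.
mask = df["Train"].isin(excluded)
print(mask.tolist())  # [False, True, True, False, False]

# ~ flips the mask, so indexing keeps only the rows NOT in the list.
remaining = df[~mask]
print(remaining["Train"].tolist())  # ['BritishRail', 'BritishRail', 'NMBS']
```

Adding another company to excluded is enough to filter it out as well, so the expression does not grow with the number of values.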
Another working but longer syntax would be:
df[(df['Train'] != 'DeutscheBahn') & (df['Train'] != 'SNCF')]
I like using the query method, as it's a bit clearer:
df = df.query("Train not in ['DeutscheBahn', 'BritishRail', 'SNCF']")
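If the list of companies lives in a Python variable, query can refer to it with the @ prefix. The sketch below assumes a small stand-in DataFrame built from the sample rows in the question; the names excluded and others are illustrative.

```python
import pandas as pd

# Stand-in for the schedule in the question, using values from the sample rows.
df = pd.DataFrame(
    {
        "Train": ["BritishRail", "DeutscheBahn", "SNCF", "BritishRail", "NMBS"],
        "Origin": ["Leeds", "Berlin", "Lyons", "Ipswich", "Brussels"],
    }
)

excluded = ["DeutscheBahn", "BritishRail", "SNCF"]

# @excluded refers to the local Python variable inside the query string.
others = df.query("Train not in @excluded")
print(others["Train"].tolist())  # ['NMBS']

# The result should match the ~isin form of the same filter.
same = others.equals(df[~df["Train"].isin(excluded)])
print(same)  # True
```

Note that the strings must match the column values exactly, so 'BritishRail' and 'British Rails' are treated as different companies.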