I have a pandas dataframe df with columns city1, city2, city3, city4, city5. I have a list my_cities = ["city1","city3","city10"]. I want to subset df according to the columns in my_cities. When I do,
my_cities = ["city1","city3","city10"]
df_my_cities = df[my_cities]
I get the error KeyError: "['city10'] not in index"
How can I tell the code to keep proceeding if an element from my_cities is not in df?
You can use intersection between all the columns and the list:
df_my_cities = df[df.columns.intersection(my_cities)]
Sample:
import pandas as pd
df = pd.DataFrame({'city1':['s', 'e'],
                   'city2':['e','f'],
                   'city3':['f','g'],
                   'city4':['r','g'],
                   'city5':['t','m']})
print (df)
  city1 city2 city3 city4 city5
0     s     e     f     r     t
1     e     f     g     g     m
my_cities = ["city1","city3","city10"]
df_my_cities = df[df.columns.intersection(my_cities)]
print (df_my_cities)
  city1 city3
0     s     f
1     e     g
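If you also want to know which requested columns were skipped, one possible sketch is below. It assumes a smaller three-column frame for brevity; the `missing` variable and the printed message are illustrative additions, not part of the original answer.

```python
import pandas as pd

df = pd.DataFrame({
    "city1": ["s", "e"],
    "city2": ["e", "f"],
    "city3": ["f", "g"],
})
my_cities = ["city1", "city3", "city10"]

# Labels that were requested but do not exist in df (illustrative reporting step).
missing = pd.Index(my_cities).difference(df.columns)
if len(missing) > 0:
    print("Skipping columns not in df:", list(missing))

# Keep only the requested labels that are actually present.
df_my_cities = df[df.columns.intersection(my_cities)]
print(df_my_cities)
```

This way the code keeps going without a KeyError, while still telling you what was ignored.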
Alternatively, use numpy.intersect1d, which returns the common labels in sorted order:
import numpy as np
df_my_cities = df[np.intersect1d(df.columns, my_cities)]
print (df_my_cities)
  city1 city3
0     s     f
1     e     g
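If the order of my_cities matters, a plain list comprehension may be a better fit, since np.intersect1d sorts its result. The following is a minimal sketch under that assumption; the variable names `present`, `by_list_order`, `by_filter` and `by_numpy` are made up for illustration, and the list is deliberately shuffled to make the difference visible.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"city1": ["s", "e"], "city2": ["e", "f"], "city3": ["f", "g"]})
my_cities = ["city3", "city10", "city1"]  # deliberately not in df's column order

# List comprehension: keeps the order of my_cities, skips absent labels.
present = [c for c in my_cities if c in df.columns]
by_list_order = df[present]

# DataFrame.filter(items=...) also skips labels that are not in df.
by_filter = df.filter(items=my_cities)

# np.intersect1d returns sorted unique values, so the list order is lost.
by_numpy = df[np.intersect1d(df.columns, my_cities)]

print(list(by_list_order.columns))
print(list(by_numpy.columns))
```

Which variant to pick mostly depends on whether you care about column order in the result.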