I have a pandas DataFrame, df. I want to select all rows of df whose index labels are not in a list, blacklist.
Right now I use a list comprehension to build the labels to select:

    ix = [i for i in df.index if i not in blacklist]
    df_select = df.loc[ix]

Works fine, but may be clumsy if I need to do this often.
Is there a better way to do this?
To slice columns, use df.loc[:, start:stop:step], where start is the name of the first column to take, stop is the name of the last column to take (label slices include the stop label), and step is the number of positions to advance between selections; for example, a step of 2 selects alternate columns.
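As a small illustration of the column-slicing syntax, the sketch below uses a made-up frame with columns a through e (the data and names are assumptions, not from the question):

```python
import pandas as pd

# Hypothetical frame with five labelled columns
df = pd.DataFrame(
    [[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]],
    columns=["a", "b", "c", "d", "e"],
)

# Label slices include the stop label, unlike positional slices
middle = df.loc[:, "b":"d"]          # columns b, c, d
every_other = df.loc[:, "a":"e":2]   # every second column from a to e

print(list(middle.columns))
print(list(every_other.columns))
```

Here the step counts positions in the column order, so a step of 2 picks alternate columns.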
Indexing in pandas means simply selecting particular rows and columns of data from a DataFrame. Indexing could mean selecting all the rows and some of the columns, some of the rows and all of the columns, or some of each of the rows and columns. Indexing can also be known as Subset Selection.
Use isin on the index and invert the resulting boolean mask with ~ to perform label selection:

    In [239]: df = pd.DataFrame({'a': np.random.randn(5)})
              df
    Out[239]:
              a
    0 -0.548275
    1 -0.411741
    2 -1.187369
    3  1.028967
    4 -2.755030

    In [240]: t = [2, 4]
              df.loc[~df.index.isin(t)]
    Out[240]:
              a
    0 -0.548275
    1 -0.411741
    3  1.028967

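Since the question mentions doing this often, one option is to wrap the isin approach in a small function. This is only a sketch: the helper name drop_labels and the sample data are made up for illustration.

```python
import pandas as pd

def drop_labels(df, blacklist):
    """Hypothetical helper: keep rows whose index label is not in blacklist."""
    # isin builds a boolean mask over the index; ~ inverts it
    return df.loc[~df.index.isin(blacklist)]

df = pd.DataFrame({"a": [10, 20, 30, 40, 50]})
blacklist = [2, 4, 99]  # 99 is not in the index and is simply ignored

df_select = drop_labels(df, blacklist)
print(df_select)
```

The mask keeps the original row order and does not fail when blacklist contains labels that are missing from df.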
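You could use set() to compute the difference between the original index labels and those you want to remove. Wrap the result in list(), since recent pandas versions no longer accept a set as an indexer, and note that sets are unordered, so the row order of df is not guaranteed to be preserved:

    df.loc[list(set(df.index) - set(blacklist))]
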
It has the advantage of being parsimonious, as well as being easier to read than a list comprehension.
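If you like the set-difference idea but want to stay inside pandas, two built-in alternatives are Index.difference and DataFrame.drop. Neither appears in the answers above; the sketch below is just one way to compare them, with made-up data.

```python
import pandas as pd

# Hypothetical frame whose index is deliberately not sorted
df = pd.DataFrame({"a": [1, 2, 3, 4]}, index=["z", "y", "x", "w"])
blacklist = ["y", "q"]  # "q" is not present in the index

# Index.difference is a set-style difference; by default it sorts the result
by_difference = df.loc[df.index.difference(blacklist)]

# drop removes the listed labels and keeps the original row order;
# errors="ignore" skips labels that are missing from the index
by_drop = df.drop(index=blacklist, errors="ignore")

print(by_difference.index.tolist())
print(by_drop.index.tolist())
```

Which one fits depends on whether a sorted result or the original row order is what you want.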