Dropping multiple Pandas columns by Index

I have a large pandas DataFrame (>100 columns). I need to drop various sets of columns, and I'm hoping there is a way of using the old

df.drop(df.columns['slices'],axis=1)

I've built selections such as:

a = df.columns[3:23]
b = df.columns[-6:]

where a and b represent the column sets I want to drop.

The following

list(df)[3:23]+list(df)[-6:]

yields the correct selection, but I can't get it to work with a drop:

df.drop(df.columns[list(df)[3:23]+list(df)[-6:]],axis=1)

ValueError: operands could not be broadcast together with shapes (20,) (6,)

I looked around but couldn't find an answer.

Selecting last n columns and excluding last n columns in dataframe

(Below pertains to the error I receive):

python numpy ValueError: operands could not be broadcast together with shapes

This one feels like they're having a similar issue, but the 'slices' aren't separate: Deleting multiple columns based on column names in Pandas

Cheers

BAC83 asked Aug 09 '18 11:08


3 Answers

This returns the DataFrame with the columns removed:

df.drop(list(df)[2:5], axis=1)
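
To cover both ranges from the question, the same idea can be extended by concatenating two list slices. This is only a sketch: the 30-column frame and the names c0..c29 are made up for illustration. The point is that list(df) returns a plain Python list, so + concatenates, and the resulting labels go straight into drop rather than back through df.columns[...].

```python
import pandas as pd

# Hypothetical frame with 30 columns named c0..c29 (illustration only).
df = pd.DataFrame([list(range(30))], columns=[f"c{i}" for i in range(30)])

# list(df) is a plain list of column labels, so `+` concatenates the slices.
to_drop = list(df)[3:23] + list(df)[-6:]

# Pass the labels directly to drop; df itself is left unchanged.
trimmed = df.drop(to_drop, axis=1)

print(list(trimmed.columns))
```

Adding two pandas Index objects with + is treated as element-wise arithmetic rather than concatenation, which is the likely source of the broadcasting error in the question.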
Chabu answered Sep 27 '22 18:09


You can use np.r_ to seamlessly combine multiple ranges / slices:

from string import ascii_uppercase

import numpy as np
import pandas as pd

df = pd.DataFrame(columns=list(ascii_uppercase))

idx = np.r_[3:10, -5:0]

print(idx)

[ 3  4  5  6  7  8  9 -5 -4 -3 -2 -1]

You can then use idx to index your columns and feed to pd.DataFrame.drop:

df.drop(df.columns[idx], axis=1, inplace=True)

print(df.columns)

Index(['A', 'B', 'C', 'K', 'L', 'M', 'N',
       'O', 'P', 'Q', 'R', 'S', 'T', 'U'], dtype='object')
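
If you would rather stay within pandas, a possible alternative is to join the two column slices with Index.append and drop the result without modifying the original frame. This is a sketch on the same A..Z example frame; the variable names to_drop and result are only illustrative.

```python
from string import ascii_uppercase

import pandas as pd

df = pd.DataFrame(columns=list(ascii_uppercase))

# Index.append concatenates the labels of the two slices.
to_drop = df.columns[3:10].append(df.columns[-5:])

# Without inplace=True, drop returns a new frame and leaves df untouched.
result = df.drop(columns=to_drop)

print(list(result.columns))
```

Both approaches drop by label in the end, so if the frame has duplicate column names, every column sharing a dropped label is removed.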
jpp answered Sep 27 '22 19:09


You can use this simple solution:

cols = [3,7,10,12,14,16,18,20,22]
df.drop(df.columns[cols],axis=1,inplace=True)

The result:

    0   1   2   4   5   6   8   9    11  13      15     17      19       21
0   3   12  10  3   2   1   7   512  64  1024.0  -1.0   -1.0    -1.0    -1.0
1   5   12  10  3   2   1   7   16   2   32.0    32.0   1024.0  -1.0    -1.0
2   5   12  10  3   2   1   7   512  2   32.0    32.0   32.0    -1.0    -1.0
3   5   12  10  3   2   1   7   16   1   32.0    64.0   1024.0  -1.0    -1.0

As you can see, all the columns at the given positions have been deleted.

You can also drop by column name instead of position. If your columns are named A, B, C, and so on, put the names in cols and pass that list directly to drop (df.columns[cols] only works with integer positions), for example:

cols = ['A','B','C','F']
df.drop(cols,axis=1,inplace=True)
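
A minimal sketch of the name-based variant, assuming a small made-up frame with columns A to G. It uses the columns= keyword of drop, which is equivalent to passing the labels with axis=1.

```python
import pandas as pd

# Hypothetical one-row frame with columns A..G (illustration only).
df = pd.DataFrame([[1, 2, 3, 4, 5, 6, 7]], columns=list("ABCDEFG"))

cols = ['A', 'B', 'C', 'F']

# Labels are passed directly; no df.columns[...] lookup is needed.
trimmed = df.drop(columns=cols)

print(list(trimmed.columns))
```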
DINA TAKLIT answered Sep 27 '22 18:09