Deleting multiple columns based on column names in Pandas

Tags: python, pandas

I have some data and when I import it, I get the following unneeded columns. I'm looking for an easy way to delete all of these.

'Unnamed: 24', 'Unnamed: 25', 'Unnamed: 26', 'Unnamed: 27',
'Unnamed: 28', 'Unnamed: 29', 'Unnamed: 30', 'Unnamed: 31',
'Unnamed: 32', 'Unnamed: 33', 'Unnamed: 34', 'Unnamed: 35',
'Unnamed: 36', 'Unnamed: 37', 'Unnamed: 38', 'Unnamed: 39',
'Unnamed: 40', 'Unnamed: 41', 'Unnamed: 42', 'Unnamed: 43',
'Unnamed: 44', 'Unnamed: 45', 'Unnamed: 46', 'Unnamed: 47',
'Unnamed: 48', 'Unnamed: 49', 'Unnamed: 50', 'Unnamed: 51',
'Unnamed: 52', 'Unnamed: 53', 'Unnamed: 54', 'Unnamed: 55',
'Unnamed: 56', 'Unnamed: 57', 'Unnamed: 58', 'Unnamed: 59',
'Unnamed: 60'

The columns are 0-indexed, so I tried something like

df.drop(df.columns[[22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 55]], axis=1, inplace=True)

But this isn't very efficient. I tried writing some for loops, but this struck me as bad Pandas practice. Hence I ask the question here.

I've seen some similar examples (Drop multiple columns in pandas), but they don't answer my question.

381

asked Feb 16 '15 09:02

Peadar Coyle

4 Answers

By far the simplest approach is:

yourdf.drop(['columnheading1', 'columnheading2'], axis=1, inplace=True)
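As a minimal sketch of the same call on a toy frame (the column names below are made up for illustration): the `columns=` keyword is equivalent to passing the labels with `axis=1`, and `errors="ignore"` skips labels that are not present.

```python
import pandas as pd

# Toy frame; the column names are hypothetical.
df = pd.DataFrame({
    "a": [1, 2],
    "b": [3, 4],
    "Unnamed: 2": [None, None],
    "Unnamed: 3": [None, None],
})

# columns=[...] is equivalent to passing the labels with axis=1.
# errors="ignore" skips labels that do not exist instead of raising KeyError.
trimmed = df.drop(columns=["Unnamed: 2", "Unnamed: 3", "not_there"], errors="ignore")

print(list(trimmed.columns))
```

Without `inplace=True`, `drop` returns a new DataFrame and leaves `df` untouched.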

176

answered Oct 19 '22 17:10

Philipp Schwarz

I don't know what you mean by inefficient, but if you mean in terms of typing, it could be easier to just select the columns of interest and assign the result back to the df:

df = df[cols_of_interest]

Where cols_of_interest is a list of the columns you care about.
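A small sketch of this selection approach, assuming a toy frame with made-up column names. The `.copy()` is optional; it makes the result independent of the original frame.

```python
import pandas as pd

# Toy frame; the column names are hypothetical.
df = pd.DataFrame({"a": [1, 2], "b": [3, 4], "Unnamed: 2": [None, None]})

# Keep only the columns we care about, in the order listed.
cols_of_interest = ["a", "b"]
df = df[cols_of_interest].copy()

print(list(df.columns))
```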

Or you can slice the columns by label and pass the result to drop (this uses .loc, since the .ix indexer has been removed from recent pandas versions):

df.drop(df.loc[:, 'Unnamed: 24':'Unnamed: 60'].head(0).columns, axis=1)

The call to head just selects 0 rows, as we're only interested in the column names rather than the data.
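A hedged sketch of the label-slice idea on a toy frame (column names are illustrative). Label slices with `.loc` include both endpoints, and they need both labels to exist and be unique.

```python
import pandas as pd

# Toy frame; the column names are hypothetical.
df = pd.DataFrame(
    [[1, 2, 3, 4, 5]],
    columns=["a", "Unnamed: 24", "Unnamed: 25", "Unnamed: 26", "b"],
)

# .loc label slices are inclusive on both ends.
to_drop = df.loc[:, "Unnamed: 24":"Unnamed: 26"].columns

# drop returns a new frame, so assign the result.
result = df.drop(columns=to_drop)

print(list(result.columns))
```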

update

Another method: It would be simpler to use the boolean mask from str.contains and invert it to mask the columns:

In [2]:
df = pd.DataFrame(columns=['a','Unnamed: 1', 'Unnamed: 1','foo'])
df

Out[2]:
Empty DataFrame
Columns: [a, Unnamed: 1, Unnamed: 1, foo]
Index: []

In [4]:
~df.columns.str.contains('Unnamed:')

Out[4]:
array([ True, False, False,  True], dtype=bool)

In [5]:
df[df.columns[~df.columns.str.contains('Unnamed:')]]

Out[5]:
Empty DataFrame
Columns: [a, foo]
Index: []
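A variation on the mask approach, sketched on a toy frame with made-up data: `str.startswith` matches only names that begin with the prefix, and passing the inverted mask to `.loc` selects by position, so duplicate column names are not a problem.

```python
import pandas as pd

# Toy frame with a duplicated 'Unnamed: 1' column, as in the example above.
df = pd.DataFrame(
    [[1, 2, 3, 4]],
    columns=["a", "Unnamed: 1", "Unnamed: 1", "foo"],
)

# Boolean array: True where the column name starts with 'Unnamed:'.
mask = df.columns.str.startswith("Unnamed:")

# Keep every column where the mask is False.
result = df.loc[:, ~mask]

print(list(result.columns))
```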

answered Oct 19 '22 17:10

EdChum

My personal favorite, and easier than the answers I have seen here (for multiple columns):

df.drop(df.columns[22:56], axis=1, inplace=True)
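A small sketch of positional slicing on a toy frame (the names are made up). As with any Python slice, the end position is excluded, so `[2:5]` covers positions 2, 3 and 4. This assumes the column names are unique, because `drop` works on labels.

```python
import pandas as pd

# Toy frame with six columns at positions 0..5.
df = pd.DataFrame([[0, 1, 2, 3, 4, 5]], columns=list("abcdef"))

# df.columns[2:5] is the labels at positions 2, 3, 4 (the end is exclusive).
result = df.drop(columns=df.columns[2:5])

print(list(result.columns))
```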

answered Oct 19 '22 18:10

sheldonzy

This is probably a good way to do what you want. It will delete all columns that contain 'Unnamed' in their header.

for col in df.columns:
    if 'Unnamed' in col:
        del df[col]
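An alternative sketch of the same idea that collects the matching names first and drops them in a single call. The `isinstance` check is an added assumption to guard against non-string column labels, for which `'Unnamed' in col` would raise a TypeError.

```python
import pandas as pd

# Toy frame; one column label is an integer to show the guard.
df = pd.DataFrame([[1, 2, 3, 4]], columns=["a", "Unnamed: 1", 7, "foo"])

# Collect the labels to remove, skipping non-string labels.
unnamed = [col for col in df.columns if isinstance(col, str) and "Unnamed" in col]

# One drop call instead of deleting inside the loop.
result = df.drop(columns=unnamed)

print(list(result.columns))
```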

answered Oct 19 '22 18:10

knightofni
