Pandas set_index doesn't drop the column

I run the following code on my dataframe:

del dfname["Unnamed: 0"]
dfname["date"] = pd.to_datetime(dfname["date"])
dfname.set_index(dfname["date"], drop=True, inplace=True)

But the date column is not dropped (I know that drop=True is the default).

The output dataframe looks like this. I'm using Python 3.6

asked Feb 20 '18 16:02 by Michael



1 Answer

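Pass the column name to set_index instead of the column itself. When it is given a Series such as dfname["date"], pandas uses the values as the index, but drop only applies to column labels, so the original column stays. Also, drop=True is the default, so it can be omitted. Change: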

dfname.set_index(dfname["date"], drop = True, inplace = True)

to:

dfname.set_index("date", inplace = True)

Sample:

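import pandas as pd

# Build a small frame with a datetime column named "date"
rng = pd.date_range('2017-04-03', periods=10)
dfname = pd.DataFrame({'date': rng, 'a': range(10)})

# Passing the column name moves the column into the index and drops it
dfname.set_index("date", inplace=True)
print(dfname)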
            a
date         
2017-04-03  0
2017-04-04  1
2017-04-05  2
2017-04-06  3
2017-04-07  4
2017-04-08  5
2017-04-09  6
2017-04-10  7
2017-04-11  8
2017-04-12  9
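
To see why the original call keeps the column, here is a minimal sketch that runs both calls on the same sample data. The names by_series and by_name are made up for illustration and are not part of the question:

```python
import pandas as pd

rng = pd.date_range('2017-04-03', periods=10)
dfname = pd.DataFrame({'date': rng, 'a': range(10)})

# Passing the Series itself: its values become the index, but no column
# label was given, so there is nothing for drop=True to remove.
by_series = dfname.set_index(dfname["date"])

# Passing the column name: the column is moved into the index and dropped.
by_name = dfname.set_index("date")

print(list(by_series.columns))
print(list(by_name.columns))
```

In the first result the date values appear twice, once as the index and once as a regular column; in the second, only column a is left.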

EDIT:

If the input is a file, use read_csv with the index_col and parse_dates parameters to get a DatetimeIndex directly:

df = pd.read_csv(file, index_col=['date'], parse_dates=['date'])
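
A self-contained way to try this without a file on disk is to read from an in-memory buffer. This is only a sketch: the CSV text below is invented sample data, and io.StringIO stands in for the real file path:

```python
import io
import pandas as pd

# Invented sample data standing in for the real CSV file
csv_text = "date,a\n2017-04-03,0\n2017-04-04,1\n2017-04-05,2\n"

# index_col makes "date" the index; parse_dates converts it to datetimes
df = pd.read_csv(io.StringIO(csv_text), index_col=['date'], parse_dates=['date'])

print(type(df.index))
print(list(df.columns))
```

With this approach the date column never exists as a regular column, so there is nothing left to drop afterwards.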
answered Sep 25 '22 01:09 by jezrael