
Pandas reset index on series to remove multiindex

Tags: python, pandas

I created a Series from a DataFrame by resampling some data with a count, like so (H2 is a DataFrame):

    H3 = H2[['SOLD_PRICE']]
    H5 = H3.resample('Q', how='count')
    H6 = pd.rolling_mean(H5, 4)

This yielded a series that looks like this:

    1999-03-31  SOLD_PRICE     NaN
    1999-06-30  SOLD_PRICE     NaN
    1999-09-30  SOLD_PRICE     NaN
    1999-12-31  SOLD_PRICE    3.00
    2000-03-31  SOLD_PRICE    3.00

with an index that looks like:

MultiIndex [(1999-03-31 00:00:00, u'SOLD_PRICE'), (1999-06-30 00:00:00, u'SOLD_PRICE'), (1999-09-30 00:00:00, u'SOLD_PRICE'), (1999-12-31 00:00:00, u'SOLD_PRICE'),..... 

I don't want the second column as an index. Ideally I'd have a DataFrame with column 1 as "Date" and column 2 as "Sales" (dropping the second level of the index). I don't quite see how to reconfigure the index.

dartdog asked Sep 04 '13 21:09



People also ask

How do I reset index without creating new column?

By default, DataFrame.reset_index() inserts the current row index into the DataFrame as a new column (named 'index' when the index has no name). If you do not want that column, pass drop=True: the old index is then discarded instead of being added as a column.
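As a minimal sketch of the difference, assuming a small made-up DataFrame (the column name and labels here are illustrative only):

```python
import pandas as pd

# Hypothetical data: three prices labelled "a", "b", "c".
df = pd.DataFrame({"price": [10, 20, 30]}, index=["a", "b", "c"])

# Default: the old index is kept as a new column called "index".
kept = df.reset_index()

# drop=True: the old index is thrown away and replaced by 0, 1, 2.
dropped = df.reset_index(drop=True)

print(kept.columns.tolist())
print(dropped.columns.tolist())
```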

What does resetting the index do?

reset_index() returns a new DataFrame or Series whose index has been reset to the default integer index. For a Series with a MultiIndex, the level parameter removes only the specified levels from the index; by default all levels are removed. Passing drop=True resets the index without inserting it as a column in the result.
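A short sketch of how level and drop interact on a Series with a two-level index. The level names "Date" and "Field" and the Series name "Sales" are assumptions chosen for readability, not names from the question:

```python
import numpy as np
import pandas as pd

# Hypothetical two-level index: (quarter end, field name).
index = pd.MultiIndex.from_product(
    [pd.to_datetime(["1999-03-31", "1999-06-30"]), ["SOLD_PRICE"]],
    names=["Date", "Field"],
)
s = pd.Series([np.nan, 3.0], index=index, name="Sales")

# All levels become columns; the result is a DataFrame.
all_levels = s.reset_index()

# Only "Field" becomes a column; "Date" stays as the index.
one_level = s.reset_index(level="Field")

# "Field" is discarded entirely; the result is still a Series indexed by "Date".
dropped = s.reset_index(level="Field", drop=True)
```

The last form is probably the closest match to "dropping the second level of the index" while keeping a Series.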

How can you change the index of a pandas Series?

The index labels of a pandas Series can be specified when it is created (through the index argument of the Series constructor) or changed afterwards by assigning new labels to its index attribute.
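For illustration, a tiny sketch with made-up values and labels:

```python
import pandas as pd

# Labels given at construction time via the index argument.
s = pd.Series([1.5, 2.5, 3.5], index=["x", "y", "z"])

# Labels replaced afterwards by assigning to the index attribute.
# The new list must have the same length as the Series.
s.index = ["a", "b", "c"]

print(s.index.tolist())
```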


1 Answer

Just call reset_index():

    In [130]: s
    Out[130]:
    0           1
    1999-03-31  SOLD_PRICE   NaN
    1999-06-30  SOLD_PRICE   NaN
    1999-09-30  SOLD_PRICE   NaN
    1999-12-31  SOLD_PRICE     3
    2000-03-31  SOLD_PRICE     3
    Name: 2, dtype: float64

    In [131]: s.reset_index()
    Out[131]:
                0           1   2
    0  1999-03-31  SOLD_PRICE NaN
    1  1999-06-30  SOLD_PRICE NaN
    2  1999-09-30  SOLD_PRICE NaN
    3  1999-12-31  SOLD_PRICE   3
    4  2000-03-31  SOLD_PRICE   3

There are several ways to drop the unwanted column:

Call reset_index() twice: first with the index level you want to keep as a column, then with drop=True to discard the remaining level:

    In [136]: s.reset_index(0).reset_index(drop=True)
    Out[136]:
                0   2
    0  1999-03-31 NaN
    1  1999-06-30 NaN
    2  1999-09-30 NaN
    3  1999-12-31   3
    4  2000-03-31   3

Delete the column after resetting the index:

    In [137]: df = s.reset_index()

    In [138]: df
    Out[138]:
                0           1   2
    0  1999-03-31  SOLD_PRICE NaN
    1  1999-06-30  SOLD_PRICE NaN
    2  1999-09-30  SOLD_PRICE NaN
    3  1999-12-31  SOLD_PRICE   3
    4  2000-03-31  SOLD_PRICE   3

    In [139]: del df[1]

    In [140]: df
    Out[140]:
                0   2
    0  1999-03-31 NaN
    1  1999-06-30 NaN
    2  1999-09-30 NaN
    3  1999-12-31   3
    4  2000-03-31   3

Call drop() after resetting:

    In [144]: s.reset_index().drop(1, axis=1)
    Out[144]:
                0   2
    0  1999-03-31 NaN
    1  1999-06-30 NaN
    2  1999-09-30 NaN
    3  1999-12-31   3
    4  2000-03-31   3
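To try the three approaches outside an IPython session, here is a self-contained sketch. It rebuilds a Series shaped like the `s` shown above by hand (index levels named 0 and 1, Series named 2); that construction is an assumption made for the example, not how the original `s` was produced:

```python
import numpy as np
import pandas as pd

# Hand-built stand-in for `s`: (date, 'SOLD_PRICE') index, values as in the output above.
dates = pd.to_datetime(
    ["1999-03-31", "1999-06-30", "1999-09-30", "1999-12-31", "2000-03-31"]
)
index = pd.MultiIndex.from_arrays([dates, ["SOLD_PRICE"] * 5], names=[0, 1])
s = pd.Series([np.nan, np.nan, np.nan, 3.0, 3.0], index=index, name=2)

# 1. Move level 0 into a column, then discard what is left of the index.
a = s.reset_index(0).reset_index(drop=True)

# 2. Reset every level, then delete the unwanted column in place.
b = s.reset_index()
del b[1]

# 3. Reset every level, then drop the unwanted column.
c = s.reset_index().drop(1, axis=1)

print(a)
```

All three variants should leave a two-column DataFrame (columns 0 and 2) with a default integer index.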

Then, after you've reset your index, just rename the columns:

    In [146]: df.columns = ['Date', 'Sales']

    In [147]: df
    Out[147]:
             Date  Sales
    0  1999-03-31    NaN
    1  1999-06-30    NaN
    2  1999-09-30    NaN
    3  1999-12-31      3
    4  2000-03-31      3
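Note that resample(..., how='count') and pd.rolling_mean, as used in the question, were removed in later pandas releases. The sketch below shows one way the same pipeline might look with the method-chaining API. The sample H2 data is invented for the example, and pd.offsets.QuarterEnd(startingMonth=12) is used in place of the 'Q' alias only because the alias spelling differs between pandas versions:

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for H2: one row per sale, indexed by sale date,
# with three sales in each of five consecutive quarters.
dates = pd.to_datetime([
    "1999-01-15", "1999-02-20", "1999-03-05",
    "1999-04-10", "1999-05-11", "1999-06-21",
    "1999-07-04", "1999-08-19", "1999-09-30",
    "1999-10-02", "1999-11-23", "1999-12-31",
    "2000-01-09", "2000-02-14", "2000-03-01",
])
H2 = pd.DataFrame({"SOLD_PRICE": np.arange(100.0, 115.0)}, index=dates)

# Calendar quarter ends (March, June, September, December).
quarter_end = pd.offsets.QuarterEnd(startingMonth=12)

# Count sales per quarter, then take a 4-quarter rolling mean of the counts.
H5 = H2[["SOLD_PRICE"]].resample(quarter_end).count()
H6 = H5.rolling(4).mean()

# Turn the date index into a column and rename, as in the answer above.
sales = H6.reset_index()
sales.columns = ["Date", "Sales"]

print(sales)
```

With this API the intermediate result keeps a plain date index rather than a MultiIndex, so a single reset_index() followed by the rename is enough.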
Phillip Cloud answered Sep 28 '22 05:09
