I created a Series
from a DataFrame
, when I resampled some data with a count like so: where H2
is a DataFrame
:
H3=H2[['SOLD_PRICE']] H5=H3.resample('Q',how='count') H6=pd.rolling_mean(H5,4)
This yielded a series that looks like this:
1999-03-31 SOLD_PRICE NaN 1999-06-30 SOLD_PRICE NaN 1999-09-30 SOLD_PRICE NaN 1999-12-31 SOLD_PRICE 3.00 2000-03-31 SOLD_PRICE 3.00
with an index that looks like:
MultiIndex [(1999-03-31 00:00:00, u'SOLD_PRICE'), (1999-06-30 00:00:00, u'SOLD_PRICE'), (1999-09-30 00:00:00, u'SOLD_PRICE'), (1999-12-31 00:00:00, u'SOLD_PRICE'),.....
I don't want the second column as an index. Ideally I'd have a DataFrame
with column 1 as "Date" and column 2 as "Sales" (dropping the second level of the index). I don't quite see how to reconfigure the index.
Reset index without new column By default, DataFrame. reset_index() adds the current row index as a new 'index' column in DataFrame. If we do not want to add the new column, we can use the drop parameter. If drop=True then it does not add the new column of the current row index in the DataFrame.
The reset_index() function is used to generate a new DataFrame or Series with the index reset. For a Series with a MultiIndex, only remove the specified levels from the index. Removes all levels by default. Just reset the index, without inserting it as a column in the new DataFrame.
Pandas with Python It is possible to specify or change the index labels of a pandas Series object after creation also. It can be done by using the index attribute of the pandas series constructor.
Just call reset_index()
:
In [130]: s Out[130]: 0 1 1999-03-31 SOLD_PRICE NaN 1999-06-30 SOLD_PRICE NaN 1999-09-30 SOLD_PRICE NaN 1999-12-31 SOLD_PRICE 3 2000-03-31 SOLD_PRICE 3 Name: 2, dtype: float64 In [131]: s.reset_index() Out[131]: 0 1 2 0 1999-03-31 SOLD_PRICE NaN 1 1999-06-30 SOLD_PRICE NaN 2 1999-09-30 SOLD_PRICE NaN 3 1999-12-31 SOLD_PRICE 3 4 2000-03-31 SOLD_PRICE 3
There are many ways to drop columns:
Call reset_index()
twice and specify a column:
In [136]: s.reset_index(0).reset_index(drop=True) Out[136]: 0 2 0 1999-03-31 NaN 1 1999-06-30 NaN 2 1999-09-30 NaN 3 1999-12-31 3 4 2000-03-31 3
Delete the column after resetting the index:
In [137]: df = s.reset_index() In [138]: df Out[138]: 0 1 2 0 1999-03-31 SOLD_PRICE NaN 1 1999-06-30 SOLD_PRICE NaN 2 1999-09-30 SOLD_PRICE NaN 3 1999-12-31 SOLD_PRICE 3 4 2000-03-31 SOLD_PRICE 3 In [139]: del df[1] In [140]: df Out[140]: 0 2 0 1999-03-31 NaN 1 1999-06-30 NaN 2 1999-09-30 NaN 3 1999-12-31 3 4 2000-03-31 3
Call drop()
after resetting:
In [144]: s.reset_index().drop(1, axis=1) Out[144]: 0 2 0 1999-03-31 NaN 1 1999-06-30 NaN 2 1999-09-30 NaN 3 1999-12-31 3 4 2000-03-31 3
Then, after you've reset your index, just rename the columns
In [146]: df.columns = ['Date', 'Sales'] In [147]: df Out[147]: Date Sales 0 1999-03-31 NaN 1 1999-06-30 NaN 2 1999-09-30 NaN 3 1999-12-31 3 4 2000-03-31 3
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With