I have a large dataframe from which I get the data I need with groupby. I need to get several separate columns from the index of the new dataframe.
Part of the original dataframe looks like this:
code place vl year week
0 111.0002.0056 region1 1 2017 29
1 112.6500.2285 region2 1 2017 31
2 112.5600.6325 region2 1 2017 30
3 112.5600.6325 region2 1 2017 30
4 112.5600.8159 region2 1 2017 30
5 111.0002.0056 region2 1 2017 29
6 111.0002.0056 region2 1 2017 30
7 111.0002.0056 region2 1 2017 28
8 112.5600.8159 region3 1 2017 31
9 112.5600.8159 region3 1 2017 28
10 111.0002.0114 region3 1 2017 31
....
After applying groupby, it looks like this (code: df_test1 = df_test.groupby(['code', 'year', 'week', 'place'])['vl'].sum().unstack(fill_value=0)):
place region1 region2 region3 region4 index1
code year week
111.0002.0006 2017 29 0 3 0 0 (111.0002.0006, 2017, 29)
30 0 7 0 0 (111.0002.0006, 2017, 30)
111.0002.0018 2017 29 0 0 0 0 (111.0002.0018, 2017, 29)
111.0002.0029 2017 30 0 0 0 0 (111.0002.0029, 2017, 30)
111.0002.0055 2017 28 0 33 0 8 (111.0002.0055, 2017, 28)
29 1 155 2 41 (111.0002.0055, 2017, 29)
30 0 142 1 39 (111.0002.0055, 2017, 30)
31 0 31 0 13 (111.0002.0055, 2017, 31)
111.0002.0056 2017 28 9 36 0 4 (111.0002.0056, 2017, 28)
29 20 75 2 37 (111.0002.0056, 2017, 29)
30 17 81 2 33 (111.0002.0056, 2017, 30)
....
I save the index in a separate column index1 (code: df_test1['index1'] = df_test1.index).
From the column index1 I need to get three separate columns: code, year and week.
The result should look like this:
region1 region2 region3 region4 code year week
0 3 0 0 111.0002.0006 2017 29
0 7 0 0 111.0002.0006 2017 30
0 0 0 0 111.0002.0018 2017 29
0 0 0 0 111.0002.0029 2017 30
0 33 0 8 111.0002.0055 2017 28
1 155 2 41 111.0002.0055 2017 29
0 142 1 39 111.0002.0055 2017 30
0 31 0 13 111.0002.0055 2017 31
....
I would be grateful for any advice!
Use reset_index instead of df_test1['index1'] = df_test1.index, and for a clean df add rename_axis(None, axis=1) - it removes place, the name of the columns axis:
df_test1 = df_test.groupby(['code' , 'year', 'week', 'place'])['vl'].sum() \
.unstack(fill_value=0) \
.reset_index() \
.rename_axis(None, axis=1)
print (df_test1)
code year week region1 region2 region3
0 111.0002.0056 2017 28 0 1 0
1 111.0002.0056 2017 29 1 1 0
2 111.0002.0056 2017 30 0 1 0
3 111.0002.0114 2017 31 0 0 1
4 112.5600.6325 2017 30 0 2 0
5 112.5600.8159 2017 28 0 0 1
6 112.5600.8159 2017 30 0 1 0
7 112.5600.8159 2017 31 0 0 1
8 112.6500.2285 2017 31 0 1 0
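If df_test1 with the MultiIndex already exists (and the index1 column was not added), the same two calls also work on it directly - a minimal sketch of that variant:
df_test1 = df_test1.reset_index().rename_axis(None, axis=1)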
Last, if necessary, change the ordering of the columns:
#all cols are columns in df_test1
cols = ['code' , 'year', 'week']
df_test1 = df_test1[[x for x in df_test1.columns if x not in cols] + cols]
print (df_test1)
region1 region2 region3 code year week
0 0 1 0 111.0002.0056 2017 28
1 1 1 0 111.0002.0056 2017 29
2 0 1 0 111.0002.0056 2017 30
3 0 0 1 111.0002.0114 2017 31
4 0 2 0 112.5600.6325 2017 30
5 0 0 1 112.5600.8159 2017 28
6 0 1 0 112.5600.8159 2017 30
7 0 0 1 112.5600.8159 2017 31
8 0 1 0 112.6500.2285 2017 31
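An equivalent reordering, if you prefer a single reindex call - just an alternative sketch, not required:
cols = ['code', 'year', 'week']
other = [x for x in df_test1.columns if x not in cols]
df_test1 = df_test1.reindex(columns=other + cols)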
Or you can try pd.crosstab:
df = df.set_index(['code', 'year', 'week', 'vl'])
df = pd.crosstab(df.index, df.place).reset_index()
# row_0 holds the index tuples - split them into columns, then drop it
df[['code', 'year', 'week', 'vl']] = df['row_0'].apply(pd.Series)
df = df.drop('row_0', axis=1)
Out[32]:
place region1 region2 region3 code year week vl
0 0 1 0 111.0002.0056 2017 28 1
1 1 1 0 111.0002.0056 2017 29 1
2 0 1 0 111.0002.0056 2017 30 1
3 0 0 1 111.0002.0114 2017 31 1
4 0 2 0 112.5600.6325 2017 30 1
5 0 0 1 112.5600.8159 2017 28 1
6 0 1 0 112.5600.8159 2017 30 1
7 0 0 1 112.5600.8159 2017 31 1
8 0 1 0 112.6500.2285 2017 31 1
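A side note: apply(pd.Series) can be slow on large frames. A sketch of a faster equivalent for the splitting step above, building the key columns straight from the row_0 tuples before dropping that column:
keys = pd.DataFrame(df['row_0'].tolist(), columns=['code', 'year', 'week', 'vl'])
df = pd.concat([df.drop('row_0', axis=1), keys], axis=1)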
You can skip creating index1 entirely and use the get_level_values(level) method of df_test1.index. See https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.MultiIndex.get_level_values.html#pandas.MultiIndex.get_level_values. The calls should look something like:
df_test1['code'] = df_test1.index.get_level_values(0)
df_test1['year'] = df_test1.index.get_level_values(1)
df_test1['week'] = df_test1.index.get_level_values(2)
This should work no matter how you generated the MultiIndex - whether by groupby(), pivot_table(), or otherwise.
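If the index levels kept their names (they do when the index comes from groupby, as here), you can also reference them by name instead of position, which reads better:
df_test1['code'] = df_test1.index.get_level_values('code')
df_test1['year'] = df_test1.index.get_level_values('year')
df_test1['week'] = df_test1.index.get_level_values('week')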