I have a DataFrame with three index levels, and I need to reindex the third level without changing the first and second levels.
I have a DataFrame like this:
import numpy as np
import pandas as pd

tuples = [('A', 'a', 1), ('A', 'a', 3), ('A', 'b', 3), ('B', 'c', 1), ('B', 'c', 2), ('B', 'c', 3), ('C', 'd', 2)]
idx = pd.MultiIndex.from_tuples(tuples, names=['first', 'second', 'third'])
df = pd.DataFrame(np.random.randn(7, 2), index=idx, columns=['col1', 'col2'])
                        col1      col2
first second third
A     a      1     -0.999816 -0.599815
             3     -0.277794 -0.453870
      b      3      1.116561  0.760010
B     c      1      1.018475 -0.667625
             2      0.695997  0.641531
             3      0.593724  0.265256
C     d      2      1.133767  0.716083
And I would like a DataFrame like this:
                        col1      col2
first second third
A     a      1     -0.999816 -0.599815
             2      0         0
             3     -0.277794 -0.453870
      b      1      0         0
             2      0         0
             3      1.116561  0.760010
B     c      1      1.018475 -0.667625
             2      0.695997  0.641531
             3      0.593724  0.265256
C     d      1      0         0
             2      1.133767  0.716083
             3      0         0
I want the third index level to contain the same values (1, 2, 3) under every (first, second) pair, with the missing rows filled with 0.
A DataFrame like this has hierarchical indexing (a MultiIndex). To go from a MultiIndex back to a single index, use the pandas built-in reset_index(). It returns a DataFrame with the new index, or None if inplace=True.
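As a small illustration of reset_index() on a MultiIndex, here is a minimal sketch. The frame `small` and its values are made up for the example and are not the data from the question.

```python
import pandas as pd

# Hypothetical three-level frame, smaller than the one in the question.
idx = pd.MultiIndex.from_tuples(
    [('A', 'a', 1), ('A', 'a', 3), ('B', 'c', 2)],
    names=['first', 'second', 'third'],
)
small = pd.DataFrame({'col1': [1.0, 2.0, 3.0]}, index=idx)

# Move every index level into ordinary columns; the index becomes 0..n-1.
flat = small.reset_index()

# Move only the 'third' level into a column; 'first' and 'second' stay in the index.
partial = small.reset_index(level='third')

print(flat)
print(partial)
```

Note that this flattens the index rather than filling in missing values of the third level, so on its own it does not produce the desired result.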
You can reindex rows or columns with the reindex() method by specifying which axis to reindex. By default, labels in the new index that are not present in the DataFrame get NaN values.
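A minimal sketch of that behaviour on a single-level index follows. The frame `frame` is a made-up example, and `fill_value` is the reindex() parameter that replaces the default NaN.

```python
import pandas as pd

# Hypothetical frame whose index is missing the label 2.
frame = pd.DataFrame({'col1': [10.0, 30.0]}, index=[1, 3])

# Rows: the new label 2 is not in the original index, so it gets NaN.
with_nan = frame.reindex([1, 2, 3])

# Rows: same reindex, but missing labels are filled with 0 instead of NaN.
with_zero = frame.reindex([1, 2, 3], fill_value=0)

# Columns: 'col2' does not exist yet, so it is created and filled with NaN.
with_cols = frame.reindex(columns=['col1', 'col2'])

print(with_nan)
print(with_zero)
print(with_cols)
```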
Use DataFrame.unstack, which by default operates on the last level of the MultiIndex, followed by DataFrame.stack:
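# unstack moves 'third' into the columns and fills the missing combinations with 0;
# stack then moves 'third' back into the index
df1 = df.unstack(fill_value=0).stack()
print(df1)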
                        col1      col2
first second third
A     a      1     -1.549363 -1.206828
             2      0.000000  0.000000
             3      0.445008 -0.173086
      b      1      0.000000  0.000000
             2      0.000000  0.000000
             3      1.488947 -0.792520
B     c      1      1.838997 -0.439362
             2      1.160003 -0.577093
             3     -1.031044 -0.838885
C     d      1      0.000000  0.000000
             2      0.316934  0.353254
             3      0.000000  0.000000
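For comparison, the same result can probably also be reached with reindex() by building the target MultiIndex explicitly. This is an alternative sketch, not the method from the answer above. The names `pairs`, `thirds`, `full_idx` and `df2` are introduced here for illustration, and the sketch assumes the existing index has no duplicate entries.

```python
import numpy as np
import pandas as pd

tuples = [('A', 'a', 1), ('A', 'a', 3), ('A', 'b', 3), ('B', 'c', 1),
          ('B', 'c', 2), ('B', 'c', 3), ('C', 'd', 2)]
idx = pd.MultiIndex.from_tuples(tuples, names=['first', 'second', 'third'])
df = pd.DataFrame(np.random.randn(7, 2), index=idx, columns=['col1', 'col2'])

# The (first, second) pairs that already exist, in order of first appearance.
pairs = df.index.droplevel('third').unique()

# Every value that occurs anywhere in the third level, sorted.
thirds = sorted(df.index.get_level_values('third').unique())

# Combine each existing pair with each third-level value.
full_idx = pd.MultiIndex.from_tuples(
    [(first, second, third) for first, second in pairs for third in thirds],
    names=df.index.names,
)

# Rows that did not exist before are created and filled with 0.
df2 = df.reindex(full_idx, fill_value=0)
print(df2)
```

Unlike MultiIndex.from_product over all three levels, this keeps only the (first, second) pairs that are already present, so no rows such as ('A', 'c', ...) are created.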