I don't know if the question has been answered again, but I haven't found anything similar
I have a MultiIndex DataFrame with two level columns, for example:
arrays = [np.array(['bar', 'bar','bar', 'foo', 'foo','foo', 'qux', 'qux', 'qux']),
np.array(['one', 'two', 'three', 'one', 'two', 'three', 'one', 'two','three'])]
df = pd.DataFrame(np.random.randn(3, 9), columns=arrays)
print(df)
bar foo qux \
one two three one two three one
0 1.255724 -0.692387 -1.485324 2.265736 0.494645 1.973369 -0.326260
1 -0.903874 0.695460 -0.950076 0.181590 -2.345611 1.288061 0.980166
2 -0.294882 1.034745 1.423288 -0.895625 -0.847338 0.470444 0.373579
two three
0 0.136427 -0.136479
1 0.702732 -1.894376
2 0.506240 -0.456519
I want to select specific columns from the second level for every first level column independently.
For example, i want to get as result something like this:
bar foo qux
one two two one three
0 1.255724 -0.692387 0.494645 -0.326260 -0.136479
1 -0.903874 0.695460 -2.345611 0.980166 -1.894376
2 -0.294882 1.034745 -0.847338 0.373579 -0.456519
I have seen this questions but it isn't what i want to achieve.
Now I am doing it like this:
level0 = ['bar','foo','qux']
level1 = [['one','two'],['two'],['one','three']]
df_list=[]
for i,value in enumerate(level0):
df_list.append(df.loc[:,(value,level1[i])])
new_df = pd.concat([i for i in df_list],axis=1)
print(new_df)
But it doesn't seem to me as the best solution.
Is there any better (more "pandas") approach to solve this?
Instead of concatenating the data, you can pick out the columns first, and use column extraction;
cols = pd.concat([pd.DataFrame({'level_0':x, 'level_1':y})
for x,y in zip(level0,level1)]
).values
df[cols]
Output:
bar foo qux
one two two one three
0 0.729061 -0.876547 0.312557 0.736568 0.250469
1 0.619194 0.451023 0.803252 -1.636403 -0.854607
2 0.254690 -1.054859 -1.223274 0.398411 -1.448396
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With