Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Setting multilevel index within pandas pivot table

I have the following data

,DISTRICT,STATUS,DEM,REP,CON,GRE,WOR,IND,WEP,REF,OTH,BLANK
3,1,Active,144470,162308,11778,1108,2066,23245,150.0,41.0,394,128547
4,1,Inactive,10086,8290,643,96,130,1545,5.0,3.0,28,8436
12,2,Active,159400,155996,8457,830,1712,18008,137.0,43.0,291,114789
13,2,Inactive,9440,7752,428,61,119,1043,6.0,3.0,13,6832
24,3,Active,188902,148599,5872,832,1193,19370,144.0,115.0,228,127907
25,3,Inactive,12367,8835,305,68,69,1255,3.0,4.0,16,8315
30,4,Active,204544,160861,4911,888,1286,17455,148.0,173.0,162,113486
31,4,Inactive,15085,10302,315,74,119,1269,4.0,2.0,6,8442
39,5,Active,298385,33244,1468,483,1251,8662,175.0,27.0,43,62106
40,5,Inactive,20071,2273,88,30,126,643,9.0,3.0,2,4380
45,6,Active,198024,52511,1951,592,849,8905,161.0,38.0,83,87250
46,6,Inactive,15016,3699,145,35,77,789,4.0,0.0,9,6769
57,7,Active,271911,22736,803,899,1369,6971,203.0,22.0,113,67437
58,7,Inactive,31143,3355,133,155,219,1166,10.0,1.0,20,9357
66,8,Active,324586,32846,1122,713,1666,8649,202.0,25.0,54,62021
67,8,Inactive,36652,3183,141,109,271,1245,11.0,2.0,20,8419
72,9,Active,313949,30234,1034,823,1474,7952,139.0,21.0,90,62772
73,9,Inactive,33517,3184,119,115,257,1160,8.0,1.0,11,8696
81,10,Active,247623,45926,1128,743,570,9287,128.0,24.0,152,79918
82,10,Inactive,38988,9685,177,143,114,2749,9.0,0.0,57,16564
90,11,Active,183564,110062,5299,654,1362,12817,168.0,67.0,137,88130
91,11,Inactive,16846,7921,409,69,187,1259,5.0,2.0,19,8107
102,12,Active,259877,54431,883,1009,622,13703,128.0,32.0,210,91923
103,12,Inactive,44238,12638,200,244,136,3629,10.0,2.0,87,20458
111,13,Active,327895,16808,738,840,1235,7730,238.0,37.0,82,54421
112,13,Inactive,50766,3815,154,167,304,1816,18.0,1.0,27,11201
120,14,Active,214570,33020,1694,650,993,7544,200.0,28.0,73,61960
121,14,Inactive,21175,3395,241,70,145,946,9.0,0.0,17,6736
126,15,Active,282788,13663,1069,426,1509,5539,284.0,37.0,23,46081
127,15,Inactive,44336,2904,203,53,393,1077,28.0,1.0,12,8909
135,16,Active,251841,62717,4039,584,1140,12216,153.0,72.0,92,76433
136,16,Inactive,29544,6369,369,54,181,1430,6.0,0.0,16,9342
144,17,Active,201872,102969,7728,807,1393,18013,131.0,206.0,97,106682
145,17,Inactive,16792,8085,520,110,134,1658,3.0,3.0,13,10190
159,18,Active,154331,135552,8585,1126,1764,22973,157.0,104.0,350,105443
160,18,Inactive,12972,9506,574,133,196,1995,9.0,4.0,39,9418
195,19,Active,141289,138473,10576,1867,2155,26423,167.0,85.0,360,114937
196,19,Inactive,11503,8552,656,209,245,2350,9.0,4.0,37,10392
213,20,Active,178650,107575,10258,1452,2098,25382,186.0,99.0,295,107500
214,20,Inactive,17920,7738,768,189,353,2570,25.0,1.0,75,11487
252,21,Active,117878,166307,5996,1243,1589,24589,96.0,65.0,201,83284
253,21,Inactive,10560,11493,502,175,245,2646,8.0,8.0,25,10898
279,22,Active,127765,158621,6587,1179,1818,23179,139.0,90.0,377,80746
280,22,Inactive,11143,9472,478,144,262,2183,23.0,11.0,49,8998
315,23,Active,128565,153949,7342,1452,1986,21646,136.0,93.0,357,84173
316,23,Inactive,11331,8808,510,196,232,2032,14.0,8.0,44,9399
330,24,Active,143319,138353,8370,1394,1844,21997,147.0,55.0,521,103922
331,24,Inactive,15962,9745,642,211,325,2496,16.0,6.0,74,12229
336,25,Active,178181,120433,7400,1180,1404,19377,163.0,43.0,607,103153
337,25,Inactive,15180,6838,404,158,146,1533,19.0,3.0,61,8252
345,26,Active,234929,90506,7508,1351,2372,19155,192.0,87.0,444,74621
346,26,Inactive,19306,5621,405,155,284,1602,22.0,7.0,47,8109
372,27,Active,142703,183641,12825,1337,2299,25525,102.0,73.0,627,97801
373,27,Inactive,7283,7959,550,102,165,1433,8.0,0.0,42,6377

and can be viewed and downloaded here.

I am trying to pivot this dataframe such that 'DISTRICT' is the first level of the index and the 'BLANK' 'CON', 'DEM', 'GRE', 'IND', 'OTH', 'REF', 'REP', 'WEP', 'WOR' columns are the second level of a Multilevel index. I then wanted to set the 'STATUS' column to be horizontal columns 'Active' and 'Inactive'. I wrote the following code:

active_inactive8.pivot(index=['DISTRICT', 'DEM', 'REP', 'CON', 'GRE',
                              'WOR', 'IND', 'WEP', 'REF', 'OTH', 'BLANK'],
                       columns='STATUS')

but get the error: ValueError: all arrays must be same length.

I have also tried .melt without success.

like image 854
thedatasleuth Avatar asked Dec 06 '25 04:12

thedatasleuth


1 Answers

I believe need set_index with unstack, but then necesary transpose, swaplevel and last sort_index:

df = pd.read_csv('active_inactive8.csv', index_col=0)

df2 = df.set_index(['STATUS','DISTRICT']).unstack().T.swaplevel(1,0).sort_index()

df2 = df.pivot(index='STATUS', columns='DISTRICT').T.swaplevel(1,0).sort_index()


print (df2.head(10))
STATUS            Active  Inactive
DISTRICT                          
1        BLANK  128547.0    8436.0
         CON     11778.0     643.0
         DEM    144470.0   10086.0
         GRE      1108.0      96.0
         IND     23245.0    1545.0
         OTH       394.0      28.0
         REF        41.0       3.0
         REP    162308.0    8290.0
         WEP       150.0       5.0
         WOR      2066.0     130.0
like image 67
jezrael Avatar answered Dec 08 '25 17:12

jezrael



Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!