Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Python pandas creating an uneven multiindex

I have the following code,

IDX_VALS_BANKNOTER_PATRIMONY = [['PATRIMONY'],['GOLD']]
IDX_VALS_BANKNOTER_ASSETS = [['ASSETS'],['DEPOSITS', 'ADVANCES']]
IDX_VALS_BANKNOTER_LIABILITIES = [['LIABILITIES'], ['CLIENTS', 'SUPPLIERS']]

IDX_BANKNOTER_PATRIMONY = pd.MultiIndex.from_product(IDX_VALS_BANKNOTER_PATRIMONY)
IDX_BANKNOTER_ASSETS = pd.MultiIndex.from_product(IDX_VALS_BANKNOTER_ASSETS)
IDX_BANKNOTER_LIABILITIES = pd.MultiIndex.from_product(IDX_VALS_BANKNOTER_LIABILITIES)

IDX_BANKNOTER = IDX_BANKNOTER_PATRIMONY.append(IDX_BANKNOTER_ASSETS).append(IDX_BANKNOTER_LIABILITIES)

print(IDX_BANKNOTER)

which prints the following index:

MultiIndex([(  'PATRIMONY',      'GOLD'),
            (     'ASSETS',  'DEPOSITS'),
            (     'ASSETS',  'ADVANCES'),
            ('LIABILITIES',   'CLIENTS'),
            ('LIABILITIES', 'SUPPLIERS')],
           )

(I used .from_product() because I hope to eventually add more labels) My question is the following: I want to extend this multiindex on a third column, so that I get a multiindex that looks like:

'PATRIMONY', 'GOLD',
'ASSETS', 'DEPOSITS',
'ASSETS', 'ADVANCES',
'LIABILITIES', 'CLIENTS', 'Dr. Foo'
'LIABILITIES', 'CLIENTS', 'Dr. House'
'LIABILITIES', 'CLIENTS', 'Richard'
'LIABILITIES', 'SUPPLIERS', 'PORT1',
'LIABILITIES', 'SUPPLIERS', 'PORT2'

which would mean that the multiindex would be uneven, with a third level only used by 'LIABILITIES', and distinct indexes for CLIENTS and SUPPLIERS, according to the client name or supplier name. I have tried appending the following indexes:

IDX_FIRST_EXTENSION_NAMES = [['LIABILITIES'], ['CLIENTS'], ['Dr. Foo', 'Dr. House', 'Richard']]
IDX_FIRST_EXTENSION = pd.MultiIndex.from_product(IDX_FIRST_EXTENSION_NAMES)
IDX_SECOND_EXTENSION_NAMES = [['LIABILITIES'], ['SUPPLIERS'], ['PORT1', 'PORT2']]
IDX_SECOND_EXTENSION = pd.MultiIndex.from_product(IDX_SECOND_EXTENSION_NAMES)
DESIRED_RESULT = IDX_BANKNOTER.append(IDX_FIRST_EXTENSION).append(IDX_SECOND_EXTENSION)

but all I get in return is:

MultiIndex([(  'PATRIMONY',      'GOLD'),
            (     'ASSETS',  'DEPOSITS'),
            (     'ASSETS',  'ADVANCES'),
            ('LIABILITIES',   'CLIENTS'),
            ('LIABILITIES',   'CLIENTS'),
            ('LIABILITIES',   'CLIENTS'),
            ('LIABILITIES', 'SUPPLIERS'),
            ('LIABILITIES', 'SUPPLIERS')],
           )

I am fairly new to using pandas, and the documentation on multiindexes hasn't been helpful (it has a fairly limited number of examples for initializing multiindexes, and no example of an uneven multiindex). Does anyone have pointers? I am making this multiindex for easy manipulation of the corresponding data, being able for example to access a specific client account with

df['LIABILITIES']['CLIENTS']['(CLIENT NAME)']

or to be able to get the sum of all values under ['CLIENTS']. I would ideally like to keep the columns of the dataframe for time labels.

Any help is appreciated, thank you.

like image 261
shintuku Avatar asked Nov 06 '22 06:11

shintuku


1 Answers

code:

import pandas as pd

IDX_VALS_BANKNOTER_PATRIMONY = [['PATRIMONY'],['GOLD'], ['']]
IDX_VALS_BANKNOTER_ASSETS = [['ASSETS'],['DEPOSITS', 'ADVANCES'], ['']]

IDX_BANKNOTER_PATRIMONY = pd.MultiIndex.from_product(IDX_VALS_BANKNOTER_PATRIMONY)
IDX_BANKNOTER_ASSETS = pd.MultiIndex.from_product(IDX_VALS_BANKNOTER_ASSETS)

IDX_BANKNOTER = IDX_BANKNOTER_PATRIMONY.append(IDX_BANKNOTER_ASSETS)

IDX_FIRST_EXTENSION_NAMES = [['LIABILITIES'], ['CLIENTS'], ['Dr. Foo', 'Dr. House', 'Richard']]
IDX_FIRST_EXTENSION = pd.MultiIndex.from_product(IDX_FIRST_EXTENSION_NAMES)
IDX_SECOND_EXTENSION_NAMES = [['LIABILITIES'], ['SUPPLIERS'], ['PORT1', 'PORT2']]
IDX_SECOND_EXTENSION = pd.MultiIndex.from_product(IDX_SECOND_EXTENSION_NAMES)
WANTED_RESULT = IDX_BANKNOTER.append(IDX_FIRST_EXTENSION).append(IDX_SECOND_EXTENSION)

print(WANTED_RESULT)

output:

MultiIndex([(  'PATRIMONY',      'GOLD',          ''),
            (     'ASSETS',  'DEPOSITS',          ''),
            (     'ASSETS',  'ADVANCES',          ''),
            ('LIABILITIES',   'CLIENTS',   'Dr. Foo'),
            ('LIABILITIES',   'CLIENTS', 'Dr. House'),
            ('LIABILITIES',   'CLIENTS',   'Richard'),
            ('LIABILITIES', 'SUPPLIERS',     'PORT1'),
            ('LIABILITIES', 'SUPPLIERS',     'PORT2')],
           )
like image 146
BeamRazor Avatar answered Nov 15 '22 10:11

BeamRazor