
Creating an empty MultiIndex

I would like to create an empty DataFrame with a MultiIndex before assigning rows to it. I have already found that empty DataFrames don't like to be assigned MultiIndexes on the fly, so I'm setting the MultiIndex names during creation. However, I don't want to assign levels yet, as this will be done later. This is the best code I have so far:

    from pandas import DataFrame, MultiIndex

    def empty_multiindex(names):
        """
        Creates empty MultiIndex from a list of level names.
        """
        return MultiIndex.from_tuples(tuples=[(None,) * len(names)], names=names)

Which gives me

    In [2]: empty_multiindex(['one','two', 'three'])
    Out[2]:
    MultiIndex(levels=[[], [], []],
               labels=[[-1, -1, -1], [-1, -1, -1], [-1, -1, -1]],
               names=[u'one', u'two', u'three'])

and

    In [3]: DataFrame(index=empty_multiindex(['one','two', 'three']))
    Out[3]:
    one two three
    NaN NaN NaN

Well, I have no use for these NaNs. I can easily drop them later, but this is obviously a hackish solution. Does anyone have a better one?

dmvianna asked Feb 03 '15 00:02




1 Answer

The solution is to leave the labels empty, passing an empty list for each level instead of -1 placeholders. This works fine for me:

    >>> import pandas as pd
    >>> my_index = pd.MultiIndex(levels=[[],[],[]],
    ...                          labels=[[],[],[]],
    ...                          names=[u'one', u'two', u'three'])
    >>> my_index
    MultiIndex(levels=[[], [], []],
               labels=[[], [], []],
               names=[u'one', u'two', u'three'])
    >>> my_columns = [u'alpha', u'beta']
    >>> df = pd.DataFrame(index=my_index, columns=my_columns)
    >>> df
    Empty DataFrame
    Columns: [alpha, beta]
    Index: []
    >>> df.loc[('apple','banana','cherry'),:] = [0.1, 0.2]
    >>> df
                        alpha beta
    one   two    three
    apple banana cherry   0.1  0.2

Hope that helps!

For pandas >= 0.25.1: the keyword labels has been replaced with codes (the rename was introduced in pandas 0.24), so pass codes=[[],[],[]] instead of labels=[[],[],[]].
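As a rough sketch of the same idea on a recent pandas, the helper from the question could be rewritten with the `codes` keyword. The helper name `empty_multiindex` is borrowed from the question, and `alt_index` is a name made up here for illustration. The `from_arrays` variant is an alternative I am assuming is acceptable for this use case; it is not part of the original answer.

```python
import pandas as pd


def empty_multiindex(names):
    """Return a zero-length MultiIndex with one empty level per name."""
    return pd.MultiIndex(
        levels=[[] for _ in names],
        codes=[[] for _ in names],  # `codes` is the newer name for `labels`
        names=names,
    )


index = empty_multiindex(["one", "two", "three"])
df = pd.DataFrame(index=index, columns=["alpha", "beta"])

# Alternative (an assumption, not from the answer): build the same kind of
# empty index from one empty array per level.
alt_index = pd.MultiIndex.from_arrays([[], [], []], names=["one", "two", "three"])

print(len(df))                # number of rows in the empty frame
print(list(df.index.names))   # level names carried by the empty index
print(len(alt_index))         # length of the alternative empty index
```

Either index can be passed to `pd.DataFrame(index=...)` in the same way as `my_index` in the answer above.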

RoG answered Oct 06 '22 00:10
