Merge multi-indexed with single-indexed data frames in pandas

People also ask

How do I convert MultiIndex to single index in pandas?

To convert a DataFrame's index from a MultiIndex back to a single index, use the built-in pandas method reset_index(). Returns: a DataFrame with the new index, or None if inplace=True.

You could use get_level_values:

firsts = df1.index.get_level_values('first')
df1['value2'] = df2.loc[firsts].values

Note: this is almost a join (except that df1 has a MultiIndex), so there may be a neater way to express it.

For example, with data similar to yours:

import pandas as pd

df1 = pd.DataFrame([['a', 'x', 0.123], ['a', 'x', 0.234],
                    ['a', 'y', 0.451], ['b', 'x', 0.453]],
                   columns=['first', 'second', 'value1']
                   ).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10], ['b', 20]],
                   columns=['first', 'value']).set_index(['first'])

# Pull the 'first' label of every row in df1, then look each one up in df2
firsts = df1.index.get_level_values('first')
df1['value2'] = df2.loc[firsts].values

In [5]: df1
Out[5]: 
              value1  value2
first second                
a     x        0.123      10
      x        0.234      10
      y        0.451      10
b     x        0.453      20
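
One caveat with the .loc lookup above: in recent pandas versions it raises a KeyError as soon as df1 contains a 'first' label that is missing from df2. A minimal sketch of a more forgiving variant, assuming the same df1/df2 layout plus an extra ('c', 'x') row added purely for illustration, uses Series.reindex, which fills unmatched labels with NaN instead:

```python
import pandas as pd

# Same layout as the example above, plus a hypothetical ('c', 'x') row
# whose 'first' label does not exist in df2.
df1 = pd.DataFrame([['a', 'x', 0.123], ['a', 'x', 0.234],
                    ['a', 'y', 0.451], ['b', 'x', 0.453],
                    ['c', 'x', 0.999]],
                   columns=['first', 'second', 'value1']
                   ).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10], ['b', 20]],
                   columns=['first', 'value']).set_index(['first'])

firsts = df1.index.get_level_values('first')

# reindex accepts repeated and missing labels in the target;
# rows without a match in df2 become NaN rather than raising.
df1['value2'] = df2['value'].reindex(firsts).to_numpy()
print(df1)
```

Because of the NaN, value2 ends up as a float column here rather than an integer one.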

According to the documentation, as of pandas 0.14 you can join a single-index DataFrame directly with a MultiIndex DataFrame; the join matches on the common index name. The how argument works as expected with 'inner' and 'outer', though 'left' and 'right' seemed to be swapped (could this be a bug?).

import pandas as pd

df1 = pd.DataFrame([['a', 'x', 0.471780], ['a', 'y', 0.774908], ['a', 'z', 0.563634],
                    ['b', 'x', -0.353756], ['b', 'y', 0.368062], ['b', 'z', -1.721840],
                    ['c', 'x', 1], ['c', 'y', 2], ['c', 'z', 3]],
                   columns=['first', 'second', 'value1']
                   ).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10], ['b', 20]],
                   columns=['first', 'value2']).set_index(['first'])

print(df1.join(df2, how='inner'))
                value1  value2
first second                  
a     x       0.471780      10
      y       0.774908      10
      z       0.563634      10
b     x      -0.353756      20
      y       0.368062      20
      z      -1.721840      20
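
If the behaviour of how='left' or how='right' is in doubt on your pandas version, one way to sidestep the question is to flatten both indexes and use an ordinary column merge. This is a sketch of that alternative, not the index join described above; it rebuilds the same df1 and df2 and restores the MultiIndex afterwards:

```python
import pandas as pd

df1 = pd.DataFrame([['a', 'x', 0.471780], ['a', 'y', 0.774908], ['a', 'z', 0.563634],
                    ['b', 'x', -0.353756], ['b', 'y', 0.368062], ['b', 'z', -1.721840],
                    ['c', 'x', 1], ['c', 'y', 2], ['c', 'z', 3]],
                   columns=['first', 'second', 'value1']
                   ).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10], ['b', 20]],
                   columns=['first', 'value2']).set_index(['first'])

# Turn the index levels back into columns, merge on the shared 'first'
# column, then rebuild the MultiIndex.
merged = (df1.reset_index()
             .merge(df2.reset_index(), on='first', how='left')
             .set_index(['first', 'second']))
print(merged)
```

With how='left', all nine rows of df1 are kept and the 'c' rows get NaN in value2, since df2 has no 'c' entry.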

The .ix syntax (.loc in current pandas, where .ix has been removed) is a powerful shortcut for reindexing, but in this case you are not actually doing any combined row/column reindexing, so this can be done a bit more elegantly (to my taste) with plain reindex:

Setup, borrowed from hayden's answer:

df1 = pd.DataFrame([['a', 'x', 0.123], ['a','x', 0.234],
                    ['a', 'y', 0.451], ['b', 'x', 0.453]],
                   columns=['first', 'second', 'value1']
                   ).set_index(['first', 'second'])
df2 = pd.DataFrame([['a', 10],['b', 20]],
                   columns=['first', 'value']).set_index(['first'])

In IPython, this looks as follows:

In [4]: df1
Out[4]: 
              value1
first second        
a     x        0.123
      x        0.234
      y        0.451
b     x        0.453

In [5]: df2
Out[5]: 
       value
first       
a         10
b         20

In [7]: df2.reindex(df1.index, level=0)
Out[7]: 
              value
first second       
a     x          10
      x          10
      y          10
b     x          20

In [8]: df1['value2'] = df2.reindex(df1.index, level=0)

In [9]: df1
Out[9]: 
              value1  value2
first second                
a     x        0.123      10
      x        0.234      10
      y        0.451      10
b     x        0.453      20

A mnemonic for which level to pass to reindex: it is the level of the bigger index that the smaller frame already covers. In this case, df2's index already covers level 0 of df1.index.
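
To make the mnemonic concrete, here is a small sketch with a second lookup table keyed on the other level. The frame df3 and its weight column are made up for illustration, and df1 is trimmed to three rows; because df3's index covers the 'second' level of df1.index, the level to pass is 1:

```python
import pandas as pd

df1 = pd.DataFrame([['a', 'x', 0.123], ['a', 'y', 0.451], ['b', 'x', 0.453]],
                   columns=['first', 'second', 'value1']
                   ).set_index(['first', 'second'])

# Hypothetical lookup table keyed on the 'second' level
df3 = pd.DataFrame([['x', 1.5], ['y', 2.5]],
                   columns=['second', 'weight']).set_index(['second'])

# df3 covers level 1 ('second') of df1.index, so broadcast along that level
df1['weight'] = df3['weight'].reindex(df1.index, level=1)
print(df1)
```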
