I'm trying to merge a dataframe (df1) with another dataframe (df2), where df2 can potentially be empty. The merge condition is df1.index == df2.z (df1 is never empty), but I'm getting the following error.
Is there any way to get this working?
In [31]:
import pandas as pd
In [32]:
df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [1, 2, 3]})
df2 = pd.DataFrame({'x':[], 'y':[], 'z':[]})
dfm = pd.merge(df1, df2, how='outer', left_index=True, right_on='z')
---------------------------------------------------------------------------
IndexError Traceback (most recent call last)
<ipython-input-34-4e9943198dae> in <module>()
----> 1 dfmb = pd.merge(df1, df2, how='outer', left_index=True, right_on='z')
/usr/local/lib/python2.7/dist-packages/pandas/tools/merge.pyc in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy)
37 right_index=right_index, sort=sort, suffixes=suffixes,
38 copy=copy)
---> 39 return op.get_result()
40 if __debug__:
41 merge.__doc__ = _merge_doc % '\nleft : DataFrame'
/usr/local/lib/python2.7/dist-packages/pandas/tools/merge.pyc in get_result(self)
185
186 def get_result(self):
--> 187 join_index, left_indexer, right_indexer = self._get_join_info()
188
189 ldata, rdata = self.left._data, self.right._data
/usr/local/lib/python2.7/dist-packages/pandas/tools/merge.pyc in _get_join_info(self)
277 join_index = self.left.index.take(left_indexer)
278 elif self.left_index:
--> 279 join_index = self.right.index.take(right_indexer)
280 else:
281 join_index = Index(np.arange(len(left_indexer)))
/usr/local/lib/python2.7/dist-packages/pandas/core/index.pyc in take(self, indexer, axis)
981
982 indexer = com._ensure_platform_int(indexer)
--> 983 taken = np.array(self).take(indexer)
984
985 # by definition cannot propogate freq
IndexError: cannot do a non-empty take from an empty axes.
try:
    dfm = pd.merge(df1, df2, how='outer', left_index=True, right_on='z')
except IndexError:
    # Fall back to whichever frame is non-empty
    dfm = df1 if not df1.empty else df2
This might be sufficient for your needs.
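The same fallback can also be written as an explicit emptiness check rather than an exception handler, which does not depend on whether a given pandas version raises IndexError here. This is only a sketch: the helper name merge_allowing_empty is hypothetical, and returning a copy of the left frame when the right one is empty is an assumption about what you want in that case.

```python
import pandas as pd


def merge_allowing_empty(left, right, right_key):
    """Outer-merge left's index against right[right_key].

    Hypothetical helper: if `right` has no rows, skip the merge and
    return a copy of `left` unchanged.
    """
    if right.empty:
        return left.copy()
    return pd.merge(left, right, how='outer', left_index=True, right_on=right_key)


df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [1, 2, 3]})
df2 = pd.DataFrame({'x': [], 'y': [], 'z': []})

dfm = merge_allowing_empty(df1, df2, 'z')
```

Note that in the empty case the result has only df1's columns; if later code expects x, y and z to exist, they would need to be added separately.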
Another alternative, similar to Joran's:
try:
    dfm = pd.merge(df1, df2, how='outer', left_index=True, right_on='z')
except IndexError:
    dfm = df1.reindex_axis(df1.columns.union(df2.columns), axis=1)
I'm not sure which is clearer, but both of the following work:
In [11]: df1.reindex_axis(df1.columns.union(df2.columns), axis=1)
Out[11]:
   a  b  c   x   y   z
0  1  4  1 NaN NaN NaN
1  2  5  2 NaN NaN NaN
2  3  6  3 NaN NaN NaN
In [12]: df1.loc[:, df1.columns.union(df2.columns)]
Out[12]:
   a  b  c   x   y   z
0  1  4  1 NaN NaN NaN
1  2  5  2 NaN NaN NaN
2  3  6  3 NaN NaN NaN
(I prefer the former.)
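For anyone on a recent pandas: as far as I know, reindex_axis was deprecated and then removed in pandas 1.0, and .loc with missing column labels now raises a KeyError instead of filling with NaN. A minimal sketch of the same padding using reindex(columns=...), with the question's df1 and df2 recreated so it runs on its own (the name padded is just for illustration):

```python
import pandas as pd

df1 = pd.DataFrame({'a': [1, 2, 3], 'b': [4, 5, 6], 'c': [1, 2, 3]})
df2 = pd.DataFrame({'x': [], 'y': [], 'z': []})

# Union of both frames' column labels; columns missing from df1 are
# created and filled with NaN, existing values are left untouched.
all_columns = df1.columns.union(df2.columns)
padded = df1.reindex(columns=all_columns)

print(padded)
```

This gives df1's rows with the extra x, y and z columns present but empty, which is the shape an outer merge against an empty df2 would be expected to have.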