If I have two dataframes (or series) that are already sorted on compatible keys, I'd like to be able to cheaply merge them together and maintain sortedness. I can't see a way to do that other than via concat() followed by an explicit sort:
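import pandas as pd
a = pd.DataFrame([0,1,2,3], index=[1,2,3,5], columns=['x'])
b = pd.DataFrame([4,5,6,7], index=[0,1,4,6], columns=['x'])
print(pd.concat([a,b]))
# DataFrame.sort() was removed from pandas; sort_index() sorts by the index
print(pd.concat([a,b]).sort_index())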
   x
1  0
2  1
3  2
5  3
0  4
1  5
4  6
6  7
   x
0  4
1  0
1  5
2  1
3  2
4  6
5  3
6  7
It looks like there has been a bit of related discussion with numpy arrays, suggesting an 'interleave' method, but I haven't found a good answer.
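For what it's worth, here is a rough sketch of what such an 'interleave' could look like for the frames above. It is not an existing pandas method: merge_sorted is a hypothetical helper name, and it assumes both indexes are already sorted ascending. It uses np.searchsorted to work out where each row of b lands, so it avoids a full sort but still pays for one concat() and one iloc reorder.

```python
import numpy as np
import pandas as pd

def merge_sorted(a, b):
    """Interleave two frames whose indexes are each already sorted ascending.

    Hypothetical helper for illustration; not part of pandas.
    """
    a_keys = a.index.to_numpy()
    b_keys = b.index.to_numpy()
    # Final position of each row of b: the number of rows of a whose key is
    # <= its own (side='right' puts a's rows first on ties) plus the number
    # of rows of b that precede it.
    pos_b = np.searchsorted(a_keys, b_keys, side='right') + np.arange(len(b))
    n = len(a) + len(b)
    from_b = np.zeros(n, dtype=bool)
    from_b[pos_b] = True
    # order[i] is the row of concat([a, b]) that belongs at output position i.
    order = np.empty(n, dtype=np.intp)
    order[~from_b] = np.arange(len(a))
    order[from_b] = len(a) + np.arange(len(b))
    return pd.concat([a, b]).iloc[order]

a = pd.DataFrame([0, 1, 2, 3], index=[1, 2, 3, 5], columns=['x'])
b = pd.DataFrame([4, 5, 6, 7], index=[0, 1, 4, 6], columns=['x'])
merged = merge_sorted(a, b)
print(merged)
```

On the example frames this should give the same rows, in the same order, as the sorted concat output shown in the question. Whether it is actually cheaper than concat plus sort for real data would need to be measured.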
Merge can be used in cases where both the left and right columns are not unique, and therefore cannot be an index. A merge is also just as efficient as a join, as long as the merge is done on indexes where possible.
The merge is slightly faster than the join. The difference per call is small, but over 4000 iterations it adds up to minutes.
Data frames of different lengths can also be merged using the merge() method.
If we limit the problem to a and b having only one column, then I would go through this path:
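# Outer merge on the index: the two 'x' columns become 'x_x' and 'x_y'
s = a.merge(b, how='outer', left_index=True, right_index=True)
# Stack to long form, drop the NaN fillers, then drop the column-name level
s.stack().dropna().reset_index(level=1, drop=True)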
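To sanity-check this path, here is a small sketch that runs it on the a and b from the question and compares the result with concat() plus a stable sort. The names wide, stacked and expected are just for illustration. Two caveats that I am assuming matter here: the outer merge fills missing keys with NaN, so the values come back as floats and are cast back to the original dtype, and the explicit dropna() is there because whether stack() drops NaN on its own depends on the pandas version.

```python
import pandas as pd

a = pd.DataFrame([0, 1, 2, 3], index=[1, 2, 3, 5], columns=['x'])
b = pd.DataFrame([4, 5, 6, 7], index=[0, 1, 4, 6], columns=['x'])

# Outer merge on the index: both frames have a column 'x', so pandas
# suffixes them to 'x_x' and 'x_y'; keys missing on one side become NaN.
wide = a.merge(b, how='outer', left_index=True, right_index=True)

# Stack to long form, drop the NaN placeholders explicitly, then drop the
# index level that holds the column names.
stacked = wide.stack().dropna().reset_index(level=1, drop=True)

# The NaN placeholders forced a float dtype; cast back to the original.
stacked = stacked.astype(a['x'].dtype)

# Reference result: concat followed by a stable sort on the index.
expected = pd.concat([a, b]).sort_index(kind='mergesort')['x']

print(stacked)
print(list(stacked) == list(expected))
```

Note that this only works as written for a single column per frame, which is the restriction stated above.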