I have one massive pandas dataframe with this structure:

df1:
    A   B
0   0  12
1   0  15
2   0  17
3   0  18
4   1  45
5   1  78
6   1  96
7   1  32
8   2  45
9   2  78
10  2  44
11  2  10

And a second, smaller one like this:

df2:
   G   H
0  0  15
1  1  45
2  2  31

I want to add a column to my first dataframe following this rule: column df1.C = df2.H when df1.A == df2.G

I managed to do it with for loops, but the dataframe is massive and the code runs really slowly, so I am looking for a pandas or numpy way to do it.

Many thanks,
Boris


Compare two pandas dataframes of different sizes

Tags: python, pandas, numpy


280

asked Jun 07 '17 14:06

boris

2 Answers

If you want to attach values from one dataframe to the matching rows of another, merge on the shared key. With how='left', every row of df2 is kept and rows without a match get NaN:

import pandas as pd

df1 = pd.DataFrame({'Name':['Sara'],'Special ability':['Walk on water']})
df1
   Name Special ability
0  Sara   Walk on water

df2 = pd.DataFrame({'Name':['Sara', 'Gustaf', 'Patrik'],'Age':[4,12,11]})
df2
     Name  Age
0    Sara    4
1  Gustaf   12
2  Patrik   11

# df1 is the right-hand frame; 'Name' is the key in both
df = df2.merge(df1, on='Name', how='left')
df
     Name  Age Special ability
0    Sara    4   Walk on water
1  Gustaf   12             NaN
2  Patrik   11             NaN

This can also be done with more than one matching key. In this example, Patrik from df1 has no counterpart in df2 because the ages differ, so his row does not merge:

df1 = pd.DataFrame({'Name':['Sara','Patrik'],'Special ability':['Walk on water','FireBalls'],'Age':[4,83]})

df1
     Name Special ability  Age
0    Sara   Walk on water    4
1  Patrik       FireBalls   83

df2 = pd.DataFrame({'Name':['Sara', 'Gustaf', 'Patrik'],'Age':[4,12,11]})
df2
     Name  Age
0    Sara    4
1  Gustaf   12
2  Patrik   11

# a row matches only when both Name and Age agree
df = df2.merge(df1, on=['Name','Age'], how='left')
df
     Name  Age Special ability
0    Sara    4   Walk on water
1  Gustaf   12             NaN
2  Patrik   11             NaN
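If it helps to see which rows actually found a partner, merge accepts indicator=True, which adds a _merge column. The following is only a sketch: the sample frames are made up for illustration (Sara's age is assumed to agree in both frames), and the names checked and matched are hypothetical.

```python
import pandas as pd

# Illustrative data (assumed values): only Sara agrees on both Name and Age.
df1 = pd.DataFrame({'Name': ['Sara', 'Patrik'],
                    'Special ability': ['Walk on water', 'FireBalls'],
                    'Age': [4, 83]})
df2 = pd.DataFrame({'Name': ['Sara', 'Gustaf', 'Patrik'],
                    'Age': [4, 12, 11]})

# indicator=True adds a '_merge' column with 'both', 'left_only' or 'right_only'.
checked = df2.merge(df1, on=['Name', 'Age'], how='left', indicator=True)

# Boolean mask of the df2 rows that had a match in df1.
matched = checked['_merge'] == 'both'
print(checked)
```

Rows marked left_only are the ones that would end up with NaN in the merged columns.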

103

answered Sep 18 '22 00:09

Frans Sjöström

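You probably want to use a merge:

df = df1.merge(df2, left_on="A", right_on="G", how="left")

This gives you a dataframe with four columns (A, B, G and H), because the two key columns have different names and both are kept. how="left" keeps every row of df1 even if its A value has no match in df2.

df = df.drop(columns="G").rename(columns={"H": "C"})

will then drop the redundant key and give you the column names you want.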
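As an alternative to merging, when the lookup key in df2 is unique (as G is in the question), Series.map can do the same lookup in one step. This is a minimal sketch using the question's data; the name lookup is hypothetical, and the approach assumes every value of G appears only once.

```python
import pandas as pd

# Data copied from the question.
df1 = pd.DataFrame({"A": [0, 0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2],
                    "B": [12, 15, 17, 18, 45, 78, 96, 32, 45, 78, 44, 10]})
df2 = pd.DataFrame({"G": [0, 1, 2], "H": [15, 45, 31]})

# Build a Series indexed by G whose values are H (requires unique G values).
lookup = df2.set_index("G")["H"]

# For each row, look up df1.A in that Series; unmatched keys would become NaN.
df1["C"] = df1["A"].map(lookup)
print(df1)
```

This keeps df1's row order and index untouched, which is convenient when the only goal is to add one column.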

answered Sep 21 '22 00:09

WNG
