I have a panda DataFrame with date_time/voltage data like this (df1):
Date_Time Chan
0 20130401 9:00 AAT
1 20130401 10:00 AAT
2 20130401 11:00 AAT
3 20130401 12:00 AAT
4 20130401 13:00 AAT
5 20130401 14:00 AAT
6 20130401 15:00 AAT
I am using this as a prototype to load in data from a much bigger data file and create one DataFrame . The other DataFrame looks like this (df2):
Chan date_time Sens1 Sens2
AAC 01-Apr-2013 09:00 5.17 1281
AAC 01-Apr-2013 10:00 5.01 500
AAC 01-Apr-2013 12:00 5.17 100
AAC 01-Apr-2013 13:00 5.19 41997
AAC 01-Apr-2013 16:00 5.21 2123
AAT 01-Apr-2013 09:00 28.82 300
AAT 01-Apr-2013 10:00 28.35 4900
AAT 01-Apr-2013 12:00 28.04 250
AAE 01-Apr-2013 11:00 3.36 400
AAE 01-Apr-2013 12:00 3.41 200
AAE 01-Apr-2013 13:00 3.40 2388
AAE 01-Apr-2013 14:00 3.37 300
AAE 01-Apr-2013 15:00 3.35 500
AXN 01-Apr-2013 09:00 23.96 6643
AXN 01-Apr-2013 10:00 24.03 1000
AXW 01-Apr-2013 11:00 46.44 2343
So what I want to do is search df2 for all instances of a match from both columns of df1 (noting the different data formats) and insert the data from df2 into df1. Like this (df1)
Date_Time Chan Sens1 Sens2
0 20130401 9:00 AAT 28.82 300
1 20130401 10:00 AAT 28.35 4900
2 20130401 11:00 AAT NaN NaN
3 20130401 12:00 AAT 28.04 250
4 20130401 13:00 AAT NaN NaN
5 20130401 14:00 AAT NaN NaN
6 20130401 15:00 AAT NaN NaN
Could you give me some suggestions for the python/pandas code to match this psuedocode:
if (df1['date_time'] = df2['date_time']) & (df1['Chan'] = df2['Chan'])):
df1['Sens1'] = df2['Sens1']
df1['Sens2'] = df2['Sens2']
If it effects the answer, it is my intention to bfill and ffill the NaNs and then add this DataFrame to a Panel and then repeat with another channel name in place of AAT.
By using equals() function we can directly check if df1 is equal to df2. This function is used to determine if two dataframe objects in consideration are equal or not. Unlike dataframe. eq() method, the result of the operation is a scalar boolean value indicating if the dataframe objects are equal or not.
To find the positions of two matching columns, we first initialize a pandas dataframe with two columns of city names. Then we use where() of numpy to compare the values of two columns. This returns an array that represents the indices where the two columns have the same value.
You can use a plain ol' merge to do this. But first, you should do a little cleanup of you DataFrames, to make sure your datetime columns are actually datetimes rather than strings (Note: it may be better to do this when reading as csv or whatever):
df1['Date_Time'] = pd.to_datetime(df1['Date_Time'], format='%Y%m%d %H:%M')
df2['date_time'] = pd.to_datetime(df2['date_time'])
Let's also rename the Datetime columns with the same name:
df1.rename(columns={'Date_Time': 'Datetime'}, inplace=True)
df2.rename(columns={'date_time': 'Datetime'}, inplace=True)
Now a simple merge will give you what you're after:
In [11]: df1.merge(df2)
Out[11]:
Datetime Chan Sens1 Sens2
0 2013-04-01 09:00:00 AAT 28.82 300
1 2013-04-01 10:00:00 AAT 28.35 4900
2 2013-04-01 12:00:00 AAT 28.04 250
In [12]: df1.merge(df2, how='left')
Out[12]:
Datetime Chan Sens1 Sens2
0 2013-04-01 09:00:00 AAT 28.82 300
1 2013-04-01 10:00:00 AAT 28.35 4900
2 2013-04-01 11:00:00 AAT NaN NaN
3 2013-04-01 12:00:00 AAT 28.04 250
4 2013-04-01 13:00:00 AAT NaN NaN
5 2013-04-01 14:00:00 AAT NaN NaN
6 2013-04-01 15:00:00 AAT NaN NaN
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With