I have two pandas dataframes both holding irregular timeseries data.
I want merge/join the two frames by time.
I also want to forward fill the other columns of frame2 for any "new" rows that were added through the joining process. How can I do this?
I have tried:
df = pd.merge(df1, df2, on="DateTime")
but this just leave a frame with matching timestamp rows.
I would be grateful for any ideas!
Both join and merge can be used to combines two dataframes but the join method combines two dataframes on the basis of their indexes whereas the merge method is more versatile and allows us to specify columns beside the index to join on for both dataframes.
Joining two DataFrames can be done in multiple ways (left, right, and inner) depending on what data must be in the final DataFrame. to_csv can be used to write out DataFrames in CSV format.
The Fastest Ways As it turns out, join always tends to perform well, and merge will perform almost exactly the same given the syntax is optimal. Writing the query perfectly grants a whopping 20+% increase over our unindexed queries and a surprising 10% bump over our properly indexed but poorly written query.
Try this. The how='left'
will have the merge keep all records of df1, and the fillna
will populate missing values.
df = pd.merge(df1, df2, on='DateTime', how='left').fillna(method='ffill')
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With