I'm looking to merge two pandas DataFrames on date. The issue is that the second DataFrame does not include every date from the first. I need to keep every date from df1 and pair it with the latest available value from df2.
+-------------+---------------+-------------+
| DataFrame 1 |               |             |
+-------------+---------------+-------------+
| Date        |  Sales loc1   |  Sales loc2 |
| 1/1/17      |  100          |  95         |
| 1/2/17      |  125          |  124        |
| 1/3/17      |  115          |  152        |
| ...         |               |             |
| 2/1/17      |  110          |  111        |
+-------------+---------------+-------------+
+-------------+---------+------+
| DataFrame 2 |         |      |
+-------------+---------+------+
| Date        |  exp    |  loc |
| 1/1/17      |  100    |  1   |
| 1/1/17      |  125    |  2   |
| 2/1/17      |  115    |  1   |
| 2/1/17      |  110    |  2   |
+-------------+---------+------+
+---------------+---------------+--------------+------------+-------------+
| New Dataframe |               |              |            |             |
+---------------+---------------+--------------+------------+-------------+
| Date          |  Sales loc1   |  Sales loc2  |  exp loc1  |  exp loc2   |
| 1/1/17        |  100          |  95          |  100       |  125        |
| 1/2/17        |  125          |  124         |  100       |  125        |
| 1/3/17        |  115          |  152         |  100       |  125        |
| ...           |               |              |            |             |
| 2/1/17        |  110          |  111         |  115       |  110        |
+---------------+---------------+--------------+------------+-------------+
The values from df2 should be carried forward across rows until there is a new value in df2.
Thanks a lot for your time.
A generalised solution, where there can be any number of rows for the same date in the Date column of df2, would involve:
1. merging df1 and df2 using merge
2. groupby + apply to flatten the dataframe
3. rename and add_prefix to label the new columns
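import pandas as pd

# Inner-merge on Date, then collect the exp values for each df1 row into a list.
v = (
    df1.merge(df2[['Date', 'exp']])
       .groupby(df1.columns.tolist())
       .exp
       .apply(pd.Series.tolist)
)
# Expand each list into columns 0..k-1, renumber them from 1, and prefix the labels.
df = (
    pd.DataFrame(v.tolist(), index=v.index)
      .rename(columns=lambda x: x + 1)
      .add_prefix('exp loc')
      .reset_index()
)
df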
     Date  Sales loc1  Sales loc2  exp loc1  exp loc2
0  1/1/17         100          95       100       125
1  2/1/17         110         111       115       110
Here's another solution that should work nicely if you have exactly two (or, in general, exactly N) rows per Date in df2, with the rows for each date grouped together and in the same loc order.
n = 2
v = pd.DataFrame(
     df2.exp.values.reshape(-1, n), 
     index=df2.Date.unique(), 
     columns=range(1, n + 1)
).add_prefix('exp loc')\
 .rename_axis('Date')\
 .reset_index()
Now, it's just a simple merge with df1 on Date.
df1.merge(v, on='Date')
     Date  Sales loc1  Sales loc2  exp loc1  exp loc2
0  1/1/17         100          95       100       125
1  2/1/17         110         111       115       110
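The reshape above depends on the row order of df2. If that order cannot be guaranteed, one possible alternative is to pivot on the loc column instead, so each value lands in the column named by its own loc. This is only a sketch: the sample frames are rebuilt from the tables in the question (without the elided "..." rows), and the names shuffled, wide and merged are made up for illustration.

```python
import pandas as pd

# Sample data reconstructed from the question's tables (the "..." rows are omitted).
df1 = pd.DataFrame({
    "Date": ["1/1/17", "1/2/17", "1/3/17", "2/1/17"],
    "Sales loc1": [100, 125, 115, 110],
    "Sales loc2": [95, 124, 152, 111],
})
df2 = pd.DataFrame({
    "Date": ["1/1/17", "1/1/17", "2/1/17", "2/1/17"],
    "exp": [100, 125, 115, 110],
    "loc": [1, 2, 1, 2],
})

# Shuffle df2 on purpose to show that the pivot does not rely on row order.
shuffled = df2.sample(frac=1, random_state=0)

# One row per Date, one column per loc; assumes each (Date, loc) pair is unique.
wide = (
    shuffled.pivot(index="Date", columns="loc", values="exp")
            .add_prefix("exp loc")        # 1 -> "exp loc1", 2 -> "exp loc2"
            .rename_axis(columns=None)    # drop the leftover "loc" axis label
            .reset_index()
)

# Same inner merge as before: only dates present in both frames survive.
merged = df1.merge(wide, on="Date")
print(merged)
```

Note that pivot raises an error if a (Date, loc) pair appears more than once, which can be a useful check on the input.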
Or, as @A. Leistra pointed out, you might want a different sort of result using a left outer merge:
df1.merge(v, how='left', on='Date').ffill()
     Date  Sales loc1  Sales loc2  exp loc1  exp loc2
0  1/1/17         100          95     100.0     125.0
1  1/2/17         125         124     100.0     125.0
2  1/3/17         115         152     100.0     125.0
3  2/1/17         110         111     115.0     110.0
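The left merge plus ffill gives the right answer only when df1 is already in chronological order. A different technique that asks for "the latest df2 value on or before each df1 date" directly is pd.merge_asof. The sketch below is not part of the answers above; it assumes the dates are month/day/two-digit-year strings (the question does not say), and the names fmt, left, wide and result are illustrative.

```python
import pandas as pd

# Sample data reconstructed from the question's tables (the "..." rows are omitted).
df1 = pd.DataFrame({
    "Date": ["1/1/17", "1/2/17", "1/3/17", "2/1/17"],
    "Sales loc1": [100, 125, 115, 110],
    "Sales loc2": [95, 124, 152, 111],
})
df2 = pd.DataFrame({
    "Date": ["1/1/17", "1/1/17", "2/1/17", "2/1/17"],
    "exp": [100, 125, 115, 110],
    "loc": [1, 2, 1, 2],
})

# Assumption: dates are month/day/year, so "2/1/17" means 1 February 2017.
fmt = "%m/%d/%y"

# merge_asof needs real, sorted keys, so parse the strings and sort by Date.
left = df1.assign(Date=pd.to_datetime(df1["Date"], format=fmt)).sort_values("Date")

# Reshape df2 to one row per Date with one exp column per loc (pivot sorts by Date).
wide = (
    df2.assign(Date=pd.to_datetime(df2["Date"], format=fmt))
       .pivot(index="Date", columns="loc", values="exp")
       .add_prefix("exp loc")
       .rename_axis(columns=None)
       .reset_index()
)

# For every row in left, take the most recent wide row whose Date is <= that row's Date
# (direction="backward" is the default).
result = pd.merge_asof(left, wide, on="Date")
print(result)
```

Dates in df1 that fall before the first date in df2 would get NaN in the exp columns, since there is no earlier value to carry forward.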