Logo Questions Linux Laravel Mysql Ubuntu Git Menu
 

Key error on pandas merge (left join)

I have two dataframes below, df_purchase(1) and df_login(2)

+--------+-----+--------+------------+--------------------+-------------+--------------------------+
|        | age | gender |    ttp     |       count        | sum(amount) |          region          |
+--------+-----+--------+------------+--------------------+-------------+--------------------------+
|  49427 | 63  | M      | 824.731412 | 2                  | 25.00       | Omaha, Nebraska          |
|  28433 | 49  | M      | 1.166250   | 2                  | 41.94       | Catasauqua, Pennsylvania |
|   4162 | 29  | M      | 5.620949   | 2                  | 51.78       | Eagle Center, Iowa       |
|  18747 | 43  | M      | 153.502072 | 2                  | 23.84       | Pacific, Washington      |
|  45173 | 59  | M      | 0.027257   | 2                  | 13.98       | De Soto, Missouri        |
+--------+-----+--------+------------+--------------------+-------------+--------------------------+

+--------+-----+--------+------------+--------------------+-------------+--------------------------+
|        | age | gender | count      | region             |             |                          |
| 671766 | 84  | M      | 13900      | New York, New York |             |                          |
| 671166 | 84  | F      | 7619       | New York, New York |             |                          |
| 672209 | 85  | F      | 6483       | New York, New York |             |                          |
| 672671 | 85  | M      | 5808       | New York, New York |             |                          |
| 195201 | 34  | M      | 3817       | New York, New York |             |                          |
+--------+-----+--------+------------+--------------------+-------------+--------------------------+

I am trying to join df_logins to df_purchase on age, gender and region with the following pandas code:

df = pd.merge(df_purchase, df_login[['count']],
                       how='left', on=['age', 'gender', 'region'])

However, I keep getting this error: KeyError: 'age' Any thoughts?

like image 316
metersk Avatar asked Mar 24 '15 20:03

metersk


People also ask

What is a left merge in Pandas?

A left join, or left merge, keeps every row from the left dataframe. Result from left-join or left-merge of two dataframes in Pandas. Rows in the left dataframe that have no corresponding join value in the right dataframe are left with NaN values.

What is difference between merge and join in Pandas?

Both join and merge can be used to combines two dataframes but the join method combines two dataframes on the basis of their indexes whereas the merge method is more versatile and allows us to specify columns beside the index to join on for both dataframes.

How do I merge two Dataframes in Pandas?

The concat() function can be used to concatenate two Dataframes by adding the rows of one to the other. The merge() function is equivalent to the SQL JOIN clause. 'left', 'right' and 'inner' joins are all possible.

How do I fix Pandas key error?

How to Fix the KeyError? We can simply fix the error by correcting the spelling of the key. If we are not sure about the spelling we can simply print the list of all column names and crosscheck.


1 Answers

The KeyError arises from this:

df = pd.merge(df_purchase, df_login[['count']] <- this selects just count column,
                       how='left', on=['age', 'gender', 'region'])

You've specifically selected just a single column from df_login, you need this:

df = pd.merge(df_purchase, df_login,
                       how='left', on=['age', 'gender', 'region'])

I'm assuming that this is not your complete data as you have no common values in the age and region column in df_login.

like image 110
EdChum Avatar answered Oct 16 '22 23:10

EdChum