
Adding a new pandas column with mapped value from a dictionary [duplicate]

Tags: python, pandas

I'm trying to do something that should be really simple in pandas, but it seems anything but. I'm trying to add a column to an existing pandas dataframe whose values are mapped from another (existing) column. Here is a small test case:

import pandas as pd
equiv = {7001:1, 8001:2, 9001:3}
df = pd.DataFrame( {"A": [7001, 8001, 9001]} )
df["B"] = equiv(df["A"])
print(df)

I was hoping the following would result:

      A   B
0  7001   1
1  8001   2
2  9001   3

Instead, I get an error telling me that equiv is not callable. Fair enough, it's a dictionary, but wrapping it in a function doesn't help either. So I tried map, which seems to work with other operations, but it is also defeated by the dictionary:

df["B"] = df["A"].map(lambda x:equiv[x])

In this case I just get KeyError: 8001. I've read through documentation and previous posts, but have yet to come across anything that suggests how to mix dictionaries with pandas dataframes. Any suggestions would be greatly appreciated.

Rick Donnelly asked Jun 14 '14 03:06



1 Answer

The right way to do it is to pass the dictionary itself to map: df["B"] = df["A"].map(equiv).

In [55]:

import pandas as pd
equiv = {7001:1, 8001:2, 9001:3}
df = pd.DataFrame( {"A": [7001, 8001, 9001]} )
df["B"] = df["A"].map(equiv)
print(df)
      A  B
0  7001  1
1  8001  2
2  9001  3

[3 rows x 2 columns]

It also handles values that are missing from the dictionary gracefully: those rows get NaN instead of raising a KeyError. Consider the following example:

In [56]:

import pandas as pd
equiv = {7001:1, 8001:2, 9001:3}
df = pd.DataFrame( {"A": [7001, 8001, 9001, 10000]} )
df["B"] = df["A"].map(equiv)
print(df)
       A   B
0   7001   1
1   8001   2
2   9001   3
3  10000 NaN

[4 rows x 2 columns]
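
If NaN is not what you want for unmatched keys, one possible follow-up is to substitute a fallback value. The sketch below is an illustration, not part of the original answer: the sentinel -1 and the extra column "C" are assumptions chosen for the example. Note that the NaN forces the mapped column to a float dtype, so the first option casts back to int after filling.

```python
import pandas as pd

equiv = {7001: 1, 8001: 2, 9001: 3}
df = pd.DataFrame({"A": [7001, 8001, 9001, 10000]})

# Option 1: map with the dict, then replace the NaN left by unmatched keys.
# -1 is an arbitrary sentinel chosen for this example.
# The mapped column is float because of the NaN, so cast back to int.
df["B"] = df["A"].map(equiv).fillna(-1).astype(int)

# Option 2: look up each value with dict.get and an explicit default.
# This never produces NaN, so the column stays integer throughout.
df["C"] = df["A"].map(lambda x: equiv.get(x, -1))

print(df)
```

Option 1 keeps the vectorized dictionary lookup and is usually the better fit for large columns. Option 2 calls a Python function per element, but makes the default explicit at the lookup itself.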
CT Zhu answered Oct 17 '22 15:10