pandas map makes values NaN

I'm trying to map the values that I want to change. When I apply map like this: df[column].map(dictionary), the values that are not in the dictionary are converted to NaN. I think the reason is that those values have no matching key in the dictionary, right? If so, shouldn't they be left unchanged instead of being converted to NaN? How can I solve this problem using map() instead of replace()?

import pandas as pd
df1 = pd.Series(['a','b','c','d'])
df1
0    a
1    b
2    c
3    d
dtype: object

mapping = {'a' : 0, 'b' : 1, 'c' : 2}
df1.map(mapping)
0    0.0
1    1.0
2    2.0
3    NaN
dtype: float64

or

df1 = pd.Series(['a','b','c','d'])
df1
0    a
1    b
2    c
3    d
dtype: object

mapping = {'k' : 0, 'e' : 1, 'f' : 2}
df1.map(mapping)

0   NaN
1   NaN
2   NaN
3   NaN
dtype: float64
asked Sep 08 '18 15:09 by justin_sakong



3 Answers

If you insist on map, pass a callable instead:

df1.map(lambda x: mapping.get(x, x))
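
As a minimal sketch of why this works, using the series and the first mapping from the question (the names s and result are only for illustration): dict.get(x, x) returns the mapped value when x is a key and x itself otherwise, so unmatched values pass through unchanged.

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'])
mapping = {'a': 0, 'b': 1, 'c': 2}

# dict.get(x, x) returns the mapped value, or x itself when x is not a key,
# so 'd' passes through unchanged instead of becoming NaN.
result = s.map(lambda x: mapping.get(x, x))
print(result.tolist())  # [0, 1, 2, 'd']
```

Note that the result now mixes integers and strings, so it is no longer a float64 series.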
answered Oct 03 '22 20:10 by piRSquared


To change the default value, you could add a function (func, here):

mapping = {'k' : 0, 'e' : 1, 'f' : 2}

def func(x, mapping, default='write whatever you want here'):
    # Look up x; fall back to the default when x is not a key.
    try:
        return mapping[x]
    except KeyError:
        return default

df1.map(lambda x: func(x, mapping))
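
A shorter variant of the same idea, sketched here as an alternative rather than as part of the answer above: Series.map uses a dict subclass's __missing__ method for absent keys, so a collections.defaultdict can supply the default without a wrapper function. The names s and result are illustrative, and the mapping is the first one from the question so that both matched and unmatched values appear.

```python
from collections import defaultdict

import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'])

# A defaultdict defines __missing__, which Series.map calls for absent keys
# instead of filling in NaN.
mapping = defaultdict(lambda: 'default value', {'a': 0, 'b': 1, 'c': 2})

result = s.map(mapping)
print(result.tolist())  # [0, 1, 2, 'default value']
```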
answered Oct 03 '22 20:10 by Nate


This behavior is intended. map looks up every value in the dictionary, and a value with no matching key has no result, so it becomes NaN. To keep using a dictionary with map, you have to pick a neutral value that does not change your data in later computations (1 if you multiply, 0 if you add) and add it to your mapping for the unmatched keys.

Alternatively, you could replace all NaN values after the mapping with a neutral value like 0.0.

Either way is much more work than simply using replace.
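
The fill-after-mapping option can be sketched like this, assuming the series and the first mapping from the question. The sentinel -1 is a hypothetical choice made for this example, because filling with 0.0 would be indistinguishable from the value that 'a' maps to.

```python
import pandas as pd

s = pd.Series(['a', 'b', 'c', 'd'])
mapping = {'a': 0, 'b': 1, 'c': 2}

# Plain dict mapping: float64 result with NaN where no key matched.
mapped = s.map(mapping)

# -1 is a hypothetical sentinel; 0.0 would collide with the value for 'a'.
result = mapped.fillna(-1).astype(int)
print(result.tolist())  # [0, 1, 2, -1]
```

Unlike the callable approach, this keeps a numeric dtype, at the cost of losing the original unmatched values.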

answered Oct 03 '22 20:10 by not_a_bot_no_really_82353