
Python 'map' function inserting NaN, possible to return original values instead?

I am passing a dictionary to the map function to recode values in the column of a Pandas dataframe. However, I noticed that if there is a value in the original series that is not explicitly in the dictionary, it gets recoded to NaN. Here is a simple example:

Typing...

import pandas as pd
s = pd.Series(['one','two','three','four'])

...creates the series

0      one
1      two
2    three
3     four
dtype: object

But applying the map...

recodes = {'one':'A', 'two':'B', 'three':'C'}
s.map(recodes)

...returns the series

0      A
1      B
2      C
3    NaN
dtype: object

I would prefer that if any element in series s is not in the recodes dictionary, it remains unchanged. That is, I would prefer to return the series below (with the original four instead of NaN).

0       A
1       B
2       C
3    four
dtype: object

Is there an easy way to do this, for example an option to pass to the map function? The challenge I am having is that I can't always anticipate all possible values that will be in the series I'm recoding - the data will be updated in the future and new values could appear.

Thanks!

atkat12 asked Feb 23 '16 22:02




2 Answers

Use replace instead of map:

>>> s = pd.Series(['one','two','three','four'])
>>> recodes = {'one':'A', 'two':'B', 'three':'C'}
>>> s.map(recodes)
0      A
1      B
2      C
3    NaN
dtype: object
>>> s.replace(recodes)
0       A
1       B
2       C
3    four
dtype: object
DSM answered Sep 29 '22 09:09
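For comparison, here is a minimal sketch of two other patterns that are sometimes used to keep unmapped values when sticking with map. It reuses the s and recodes names from the question; the variable names filled and fallback are only illustrative. One caveat to keep in mind: the fillna variant cannot tell a value that was missing from the dictionary apart from one that was deliberately mapped to NaN, so it restores the original in both cases.

```python
import pandas as pd

s = pd.Series(['one', 'two', 'three', 'four'])
recodes = {'one': 'A', 'two': 'B', 'three': 'C'}

# Pattern 1: map first, then fill the NaN positions from the original series.
# fillna aligns on the index, so each gap gets the value from the same row.
filled = s.map(recodes).fillna(s)

# Pattern 2: map with a function that falls back to the value itself
# when it is not a key in the dictionary.
fallback = s.map(lambda value: recodes.get(value, value))

print(filled.tolist())
print(fallback.tolist())
```

Both variants leave 'four' in place here, matching what replace returns for this example.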


If you still want to use the map function (it can be faster than replace in some cases), you can subclass dict and define __missing__ so that keys not in the dictionary are returned unchanged:

class MyDict(dict):
    def __missing__(self, key):
        return key

s = pd.Series(['one', 'two', 'three', 'four'])

recodes = MyDict({
    'one':'A',
    'two':'B',
    'three':'C'
})

s.map(recodes)

0       A
1       B
2       C
3    four
dtype: object
gio_geh answered Sep 29 '22 07:09
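To make the __missing__ approach a little more concrete, the sketch below shows when Python actually calls that hook. The class name PassthroughDict is a hypothetical stand-in for MyDict above. Subscript lookup (d[key]) goes through __missing__, while dict.get() does not, and the missing key is never stored in the dictionary. The pandas documentation for Series.map describes using the default from a dict subclass that defines __missing__ instead of NaN.

```python
import pandas as pd


class PassthroughDict(dict):
    """Hypothetical helper: return the key itself when it is not in the dict."""

    def __missing__(self, key):
        return key


recodes = PassthroughDict({'one': 'A', 'two': 'B', 'three': 'C'})

# Subscript lookup falls through to __missing__ for unknown keys.
via_subscript = recodes['four']

# dict.get() bypasses __missing__ and returns its own default (None here).
via_get = recodes.get('four')

# __missing__ as written above does not insert the key into the dict.
stored = 'four' in recodes

# Series.map picks up the __missing__ default instead of producing NaN.
result = pd.Series(['one', 'four']).map(recodes)

print(via_subscript, via_get, stored)
print(result.tolist())
```

Because get() ignores __missing__, code that reads the dictionary with recodes.get(...) elsewhere will still see None for unknown keys.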