Pandas map column in place

I've spent some time googling and couldn't find an answer to a simple question: how can I map a column of a pandas DataFrame in place? Say I have the following DataFrame:

In [67]: frame = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'), index=['Utah', 'Ohio', 'Texas', 'Oregon'])

In [68]: frame
Out[68]: 
               b         d         e
Utah   -1.240032  1.586191 -1.272617
Ohio   -0.161516 -2.169133  0.223268
Texas  -1.921675  0.246167 -0.744242
Oregon  0.371843  2.346133  2.083234

I want to add 1 to each value in column b. I know I can do it like this:

In [69]: frame['b'] = frame['b'].map(lambda x: x + 1)

Or like this. AFAIK there is no difference between map and apply for a Series (except that map can also accept a dict or a Series); correct me if I'm wrong:

In [71]: frame['b'] = frame['b'].apply(lambda x: x + 1)
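To illustrate the parenthetical above, here is a small sketch on a toy Series named s (hypothetical data, not from the question). It shows that map and apply give the same result for a plain function, and that map also accepts a dict as a lookup table.

```python
import pandas as pd

# Hypothetical toy data, not taken from the question.
s = pd.Series([1, 2, 3], index=['a', 'b', 'c'])

# With a plain function, map and apply produce the same Series.
via_map = s.map(lambda x: x + 1)
via_apply = s.apply(lambda x: x + 1)
same = via_map.equals(via_apply)

# map also accepts a dict as a lookup table;
# values with no matching key become NaN.
looked_up = s.map({1: 'one', 2: 'two'})
```

Both calls return a new Series, so either way the result has to be assigned back to the frame.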

But I don't like specifying 'b' twice. Instead, I would like to do something like this:

frame['b'].map(lambda x: x + 1, inplace=True)

Is it possible?

ars asked Mar 16 '16 07:03



2 Answers

frame
Out[6]: 
               b         d         e
Utah   -0.764764  0.663018 -1.806592
Ohio    0.082226 -0.164653 -0.744252
Texas   0.763119  1.492637 -1.434447
Oregon -0.485245 -0.806335 -0.008397

frame['b'] += 1

frame
Out[8]: 
               b         d         e    
Utah    0.235236  0.663018 -1.806592
Ohio    1.082226 -0.164653 -0.744252
Texas   1.763119  1.492637 -1.434447
Oregon  0.514755 -0.806335 -0.008397
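As a rough check, assuming the same kind of random frame as in the question, the sketch below confirms that augmented assignment gives the same values as the element-wise map while naming the column only once. It only covers operations that can be written as arithmetic on the column.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'),
                     index=['Utah', 'Ohio', 'Texas', 'Oregon'])

# map returns a new Series, so `expected` is unaffected by the update below.
expected = frame['b'].map(lambda x: x + 1)

# Augmented assignment names the column once and updates the frame itself.
frame['b'] += 1

matches = np.allclose(frame['b'], expected)
```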

Edit to add:

If this is an arbitrary function and you really need to apply it in place, you can write a thin wrapper around pandas to handle it. Personally, I can't imagine a case where avoiding the standard assignment would be that critical (unless perhaps you write a tonne of code and can't be bothered to type the extra characters).

import numpy as np
from pandas import DataFrame


class MyWrapper(DataFrame):
    """DataFrame subclass with a helper that overwrites a column in place."""

    def myapply(self, label, func):
        # Apply func to every element of column `label` and assign the result back.
        self[label] = self[label].apply(func)


df = MyWrapper(np.random.randn(4, 3), columns=list('bde'),
               index=['Utah', 'Ohio', 'Texas', 'Oregon'])
print(df)
df.myapply('b', lambda x: x + 1)
print(df)

Gives:

               b         d         e
Utah   -0.260549 -0.981025  1.136154
Ohio    0.073732 -0.895937 -0.025134
Texas   0.555507 -1.173679  0.946342
Oregon  1.871728 -0.850992  1.135784
               b         d         e
Utah    0.739451 -0.981025  1.136154
Ohio    1.073732 -0.895937 -0.025134
Texas   1.555507 -1.173679  0.946342
Oregon  2.871728 -0.850992  1.135784

Obviously this is a very minimal example, but hopefully it exposes a few methods of interest to you.
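If subclassing DataFrame feels heavy, the same idea can be sketched as a plain function. This is an alternative to the wrapper above, not part of it. The name map_column is hypothetical, and the helper still assigns the mapped Series back to the column internally.

```python
import numpy as np
import pandas as pd


def map_column(df, label, func):
    """Hypothetical helper: overwrite df[label] with func applied element-wise."""
    df[label] = df[label].map(func)
    return df  # returned only to allow chaining; df itself is modified


frame = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'),
                     index=['Utah', 'Ohio', 'Texas', 'Oregon'])
before = frame['b'].copy()

# The column label is written once at the call site.
map_column(frame, 'b', lambda x: x + 1)
```

A free function like this works on any DataFrame, so existing frames do not need to be rebuilt as a subclass.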

Chris answered Sep 21 '22 08:09


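You can use add. Note that it returns a new Series rather than modifying frame, so assign the result back if you want to keep it: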

In [2]: import pandas as pd

In [3]: import numpy as np

In [4]: frame = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'), index=
   ...: ['Utah', 'Ohio', 'Texas', 'Oregon'])

In [5]: frame.head()
Out[5]:
               b         d         e
Utah   -1.165332 -0.999244 -0.541742
Ohio   -0.319887  0.199094 -0.438669
Texas  -1.242524 -0.385092 -0.389616
Oregon  0.331593  0.505496  1.688962

In [6]: frame.b.add(1)
Out[6]:
Utah     -0.165332
Ohio      0.680113
Texas    -0.242524
Oregon    1.331593
Name: b, dtype: float64
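As the Out[6] result above suggests, add hands back a new Series. A minimal sketch, assuming the same kind of random frame, showing that the frame changes only once the result is assigned back:

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(np.random.randn(4, 3), columns=list('bde'),
                     index=['Utah', 'Ohio', 'Texas', 'Oregon'])
original = frame['b'].copy()

# add returns a new Series; frame is not modified by this call.
shifted = frame['b'].add(1)
unchanged = frame['b'].equals(original)

# Assigning the result back is what actually updates the column.
frame['b'] = shifted
updated = np.allclose(frame['b'], original + 1)
```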

Moondra answered Sep 20 '22 08:09