Replacing a column with a function of itself in Pandas?

I'm currently lost deep inside the pandas documentation. My problem is this:

I have a simple dataframe

col1  col2
 1     A
 4     B 
 5     X   

My aim is to apply something like:

 df['col1'] = df['col1'].apply(square)

where square is a cleanly defined function. But this operation raises a warning (and produces incorrect results):

A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

I can't make sense of this, nor of the documentation it points to. My workflow is linear (in case that makes a wider range of solutions viable).

Pandas 0.17.1 and Python 2.7

All help much appreciated.

draco_alpine asked Jun 30 '16 09:06


1 Answer

It works properly for me (pandas 0.18.1):

In [31]: def square(x):
   ....:     return x ** 2
   ....:

In [33]: df
Out[33]:
   col1 col2
0     1    A
1     4    B
2     5    X

In [35]: df.col1 = df.col1.apply(square)

In [36]: df
Out[36]:
   col1 col2
0     1    A
1    16    B
2    25    X

PS: It might also depend on how your function is implemented...
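The warning itself usually has less to do with apply than with how df was created: it tends to appear when df is a slice or filtered subset of another DataFrame. The question doesn't show that step, so the sketch below assumes a hypothetical parent frame called raw and a hypothetical filter on col1. Taking an explicit .copy() before assigning is one common way to avoid the warning, not necessarily the fix for your exact case:

```python
import pandas as pd


def square(x):
    return x ** 2


# Hypothetical parent frame; the question does not show how df was built.
raw = pd.DataFrame({"col1": [1, 4, 5, 7], "col2": ["A", "B", "X", "Y"]})

# Filtering produces a subset of raw. Assigning into such a subset is what
# typically triggers SettingWithCopyWarning on older pandas versions.
# Calling .copy() makes df an independent DataFrame.
df = raw[raw["col1"] < 6].copy()

# Assign the transformed column back through .loc, as the warning suggests.
df.loc[:, "col1"] = df["col1"].apply(square)

print(df)
print(raw)  # the parent frame keeps its original values
```

For plain arithmetic like this, df["col1"] ** 2 gives the same values without going through apply.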

MaxU - stop WAR against UA answered Oct 11 '22 15:10