How can I add a new computed column in a dataframe? [duplicate]

Tags: python, pandas

I'm trying to compute the age of a person from the data that I have:

Data columns in 'Person' Dataframe:
TodaysDate   non-null datetime64[ns]
YOB          non-null float64

So I want to make a new column inside that dataframe called 'Age' and so far I have the following code:

Person['Age'] = map(sum, (Person.ix[0,'TodaysDate']).year, -(Person['YOB']))

TypeError: 'int' object is not iterable

I've also tried:

Person['Age'] = map((Person.ix[0,'TodaysDate']).year - Person['YOB'])

TypeError: map() must have at least two arguments.

I've tried a few different methods that were posted on other questions, but none seem to work. This seems very simple to do, but I can't get it to work.

Any ideas how I can use the map function to subtract the float column YOB from the year of the datetime column TodaysDate and put the value into the Age column? I'd like to do this for every row in the dataframe.

Thank you!

MB41 asked Mar 14 '17 20:03


2 Answers

This answer is mostly just propaganda for assign. I'm a fan of assign because it returns a new pd.DataFrame that is a copy of the old pd.DataFrame with the additional columns included. In some contexts, returning a new pd.DataFrame is more appropriate. I feel that the syntax is clean and intuitive.

Also, note that I have added zero value to the calculation itself, as I've completely ripped off @MaxU's answer.

df.assign(Age=pd.Timestamp.now().year - df.YOB)

    YOB  Age
0  1955   62
1  1965   52
2  1975   42
3  1985   32
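
To illustrate the point about assign returning a new frame, here is a minimal sketch. The fixed reference year (2017, chosen to match the output above) and the name with_age are assumptions made for the example, so the result is reproducible regardless of when it is run.

```python
import pandas as pd

df = pd.DataFrame({"YOB": [1955, 1965, 1975, 1985]})

# Assumption: a fixed reference year instead of pd.Timestamp.now().year,
# so the numbers match the 2017 output shown in this answer.
ref_year = 2017

# assign() returns a new DataFrame; df itself is not modified.
with_age = df.assign(Age=ref_year - df["YOB"])

print(list(df.columns))        # the original frame still has only YOB
print(list(with_age.columns))  # the returned frame has YOB and Age
```

If you do want the column on the original frame, rebind the name: df = df.assign(...).
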
piRSquared answered Oct 22 '22 05:10


Data:

In [5]: df
Out[5]:
    YOB
0  1955
1  1965
2  1975
3  1985

You don't need an extra TodaysDate column; you can get the current year dynamically:

In [6]: df['Age'] = pd.Timestamp.now().year - df.YOB

In [7]: df
Out[7]:
    YOB  Age
0  1955   62
1  1965   52
2  1975   42
3  1985   32

Alternatively, you can use the DataFrame.eval() method:

In [16]: df
Out[16]:
    YOB
0  1955
1  1965
2  1975
3  1985

In [17]: df.eval("Age = @pd.Timestamp.now().year - YOB", inplace=True)

In [18]: df
Out[18]:
    YOB  Age
0  1955   62
1  1965   52
2  1975   42
3  1985   32
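
If you would rather use the TodaysDate column that already exists, the same subtraction can be done for every row without map(). The built-in map() expects a function plus one or more iterables, which is why passing a single integer or a single Series raised the errors quoted in the question. The sketch below uses made-up sample rows shaped like the question's Person frame; the dates and years are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data with the same dtypes as in the question:
# TodaysDate is datetime64[ns], YOB is float64.
Person = pd.DataFrame({
    "TodaysDate": pd.to_datetime(["2017-03-14", "2017-03-14", "2017-03-14"]),
    "YOB": [1955.0, 1965.0, 1975.0],
})

# .dt.year pulls the year out of every row at once, and the subtraction
# is applied element-wise across the two columns.
Person["Age"] = Person["TodaysDate"].dt.year - Person["YOB"]

# YOB is float64, so Age comes out as float64 too. Casting to int is safe
# here only because the columns contain no missing values.
Person["Age"] = Person["Age"].astype(int)

print(Person)
```
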
MaxU - stop WAR against UA answered Oct 22 '22 03:10