Pandas: Update values of a column

I have a large dataframe with multiple columns (sample shown below). I want to update the values of one particular column (Population) by dividing them by 1000.

City     Population
Paris    23456
Lisbon   123466
Madrid   1254
Pekin    86648

I have tried df['Population'].apply(lambda x: int(str(x))/1000)

and

df['Population'].apply(lambda x: int(x)/1000)

Both give me the error

ValueError: invalid literal for int() with base 10: '...'

asked Feb 07 '17 21:02 by PyAnton


1 Answer

If your DataFrame really does look as presented, then the second example should work just fine (with the int not even being necessary):

In [16]: df
Out[16]: 
     City  Population
0   Paris       23456
1  Lisbon      123466
2  Madrid        1254
3   Pekin       86648

In [17]: df['Population'].apply(lambda x: x/1000)
Out[17]: 
0     23.456
1    123.466
2      1.254
3     86.648
Name: Population, dtype: float64

In [18]: df['Population']/1000
Out[18]: 
0     23.456
1    123.466
2      1.254
3     86.648
Name: Population, dtype: float64
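
Note that neither expression above modifies df; each returns a new Series. To actually update the column, as the question asks, the result has to be assigned back. A minimal sketch, assuming the column is already numeric and using the sample data from the question:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({
    "City": ["Paris", "Lisbon", "Madrid", "Pekin"],
    "Population": [23456, 123466, 1254, 86648],
})

# Vectorized division returns a new Series; assigning it back
# replaces the old values in the Population column.
df["Population"] = df["Population"] / 1000

print(df)
```

The plain division is generally preferable to apply with a lambda here, since it operates on the whole column at once instead of calling a Python function per row.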

However, from the error, it seems like you have the unparsable string '...' somewhere in your Series, and that the data needs to be cleaned further.
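
One possible way to find and handle such values is pd.to_numeric with errors="coerce", which converts anything unparsable to NaN. The sketch below is only an illustration: the dirty data is hypothetical (the real column contents are not shown in the question), and whether NaN is the right replacement depends on your data.

```python
import pandas as pd

# Hypothetical dirty data: the '...' entry stands in for whatever
# unparsable value is present in the real column.
df = pd.DataFrame({
    "City": ["Paris", "Lisbon", "Madrid", "Pekin"],
    "Population": ["23456", "123466", "...", "86648"],
})

# errors="coerce" turns values that cannot be parsed as numbers into NaN.
numeric = pd.to_numeric(df["Population"], errors="coerce")

# Inspect the rows that failed to parse before deciding what to do with them.
# (Values that were already missing would show up here as well.)
bad_rows = df[numeric.isna()]
print(bad_rows)

# Divide and assign back; unparsable entries stay NaN.
df["Population"] = numeric / 1000
print(df)
```

Looking at bad_rows first shows which entries need cleaning, so you can decide whether to drop them, fix them at the source, or leave them as NaN.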

answered Oct 03 '22 20:10 by fuglede