
Set value for particular cell in pandas DataFrame using index

I have created a Pandas DataFrame

df = DataFrame(index=['A','B','C'], columns=['x','y']) 

and have got this

     x    y
A  NaN  NaN
B  NaN  NaN
C  NaN  NaN

Now, I would like to assign a value to particular cell, for example to row C and column x. I would expect to get this result:

     x    y
A  NaN  NaN
B  NaN  NaN
C   10  NaN

with this code:

df.xs('C')['x'] = 10 

However, the contents of df have not changed. The dataframe still contains only NaNs.

Any suggestions?

Mitkp asked Dec 12 '12 14:12



1 Answer

RukTech's answer, df.set_value('C', 'x', 10), is far and away faster than the options I've suggested below. However, set_value was deprecated in pandas 0.21 and removed in pandas 1.0.

Going forward, the recommended accessors are .at (label-based) and .iat (integer-position-based).


Why df.xs('C')['x']=10 does not work:

By default, df.xs('C') returns a new object (a Series for a single row label) holding a copy of the data, so

df.xs('C')['x']=10 

modifies only this new object.

df['x'] returns a view of the df dataframe, so

df['x']['C'] = 10 

modifies df itself.

Warning: It is sometimes difficult to predict whether an operation returns a copy or a view. For this reason the docs recommend avoiding assignments with "chained indexing", such as the two examples above.


So the recommended alternative is

df.at['C', 'x'] = 10 

which does modify df.
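
As a minimal sketch of the single-call alternatives to chained indexing, assuming pandas is imported as pd: the values 20 and 30 and the extra cells they land in are illustrative only and are not part of the question.

```python
import pandas as pd

# Same empty frame as in the question (all cells start as NaN).
df = pd.DataFrame(index=['A', 'B', 'C'], columns=['x', 'y'])

# Label-based scalar assignment: row 'C', column 'x'.
df.at['C', 'x'] = 10

# Position-based equivalent: row 2, column 1 is the cell (C, y).
# The value 20 is an illustrative assumption.
df.iat[2, 1] = 20

# .loc also accepts (row, column) in one indexing call,
# so no intermediate copy or view is involved.
df.loc['A', 'x'] = 30

print(df)
```

Each assignment goes through one indexing call on df itself, which is why the result does not depend on whether an intermediate object is a copy or a view.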


In [18]: %timeit df.set_value('C', 'x', 10)
100000 loops, best of 3: 2.9 µs per loop

In [20]: %timeit df['x']['C'] = 10
100000 loops, best of 3: 6.31 µs per loop

In [81]: %timeit df.at['C', 'x'] = 10
100000 loops, best of 3: 9.2 µs per loop
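
The timings above come from an IPython session and an older pandas. A rough way to repeat the comparison outside IPython might look like the sketch below, which uses the standard-library timeit module. The helper names set_with_at and set_with_loc and the repetition count n are assumptions made for illustration, and the numbers will differ by machine and pandas version.

```python
import timeit

import pandas as pd

df = pd.DataFrame(index=['A', 'B', 'C'], columns=['x', 'y'])


def set_with_at():
    # Scalar, label-based assignment.
    df.at['C', 'x'] = 10


def set_with_loc():
    # General label-based assignment through .loc.
    df.loc['C', 'x'] = 10


n = 1000  # illustrative repetition count, not from the original answer
at_seconds = timeit.timeit(set_with_at, number=n)
loc_seconds = timeit.timeit(set_with_loc, number=n)

print(f".at : {at_seconds / n * 1e6:.2f} µs per call")
print(f".loc: {loc_seconds / n * 1e6:.2f} µs per call")
```

set_value is left out of the sketch because it is no longer available in current pandas releases.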
unutbu answered Sep 19 '22 17:09