
Edit pandas DataFrame using indexes

Tags: python, pandas

Is there a general, efficient way to assign values to a subset of a DataFrame in pandas? I've got hundreds of rows and columns that I can access directly but I haven't managed to figure out how to edit their values without iterating through each row,col pair. For example:

In [1]: import pandas, numpy

In [2]: array = numpy.arange(30).reshape(3,10)

In [3]: df = pandas.DataFrame(array, index=list("ABC"))

In [4]: df
Out[4]: 
    0   1   2   3   4   5   6   7   8   9
A   0   1   2   3   4   5   6   7   8   9
B  10  11  12  13  14  15  16  17  18  19
C  20  21  22  23  24  25  26  27  28  29

In [5]: rows = ['A','C']

In [6]: columns = [1,4,7]

In [7]: df[columns].ix[rows]
Out[7]: 
    1   4   7
A   1   4   7
C  21  24  27

In [8]: df[columns].ix[rows] = 900

In [9]: df
Out[9]: 
    0   1   2   3   4   5   6   7   8   9
A   0   1   2   3   4   5   6   7   8   9
B  10  11  12  13  14  15  16  17  18  19
C  20  21  22  23  24  25  26  27  28  29

I believe what is happening here is that I'm getting a copy rather than a view, so the assignment never reaches the original DataFrame. Is that my problem? What's the most efficient way to edit those rows and columns (preferably in place, as the DataFrame may take up a lot of memory)?

Also, what if I want to replace those values with a correctly shaped DataFrame?

Noah asked Jul 09 '13 20:07


1 Answer

Use .loc in an assignment expression (because you assign with = directly on the indexer, it doesn't matter whether the selection would have been a view or a copy):

In [11]: df.loc[rows, columns] = 99

In [12]: df
Out[12]:
    0   1   2   3   4   5   6   7   8   9
A   0  99   2   3  99   5   6  99   8   9
B  10  11  12  13  14  15  16  17  18  19
C  20  99  22  23  99  25  26  99  28  29

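If you're using a version prior to 0.11 you can use .ix. (Note that .ix was later deprecated in 0.20 and removed in 1.0, so on current versions use .loc or .iloc.)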
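The question also asks about replacing the selection with a correctly shaped DataFrame. Here is a minimal sketch of one way to do that, assuming pandas 0.24 or later (for to_numpy). The names replacement and other are illustrative and not from the original post. When the right-hand side is a DataFrame, .loc matches it to the target by label, so the first assignment relies on replacement carrying the same row and column labels as the selection.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(30).reshape(3, 10), index=list("ABC"))
rows = ["A", "C"]
columns = [1, 4, 7]

# A 2x3 frame whose labels match the selection (illustrative data).
replacement = pd.DataFrame(
    [[-1, -2, -3], [-4, -5, -6]], index=rows, columns=columns
)

# Label-aligned assignment: each value lands on the cell with the same labels.
df.loc[rows, columns] = replacement
aligned_result = df.loc[rows, columns].copy()

# A frame with unrelated labels (default 0..n): assign its raw values
# so that only the shape has to match, not the labels.
other = pd.DataFrame([[7, 8, 9], [10, 11, 12]])
df.loc[rows, columns] = other.to_numpy()
```

Passing a plain NumPy array skips label matching entirely, which is usually what you want when the replacement was built independently of df.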

As @Jeff comments:

This is an assignment expression (see 'advanced indexing with ix' section of the docs) and doesn't return anything (although there are assignment expressions which do return things, e.g. .at and .iat).

df.loc[rows,columns] can return a view, but usually it's a copy. Confusing, but done for efficiency.
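To make the copy behaviour concrete, here is a small sketch that reuses the question's data. The intermediate name subset is an assumption added for illustration. Depending on the pandas version, the write to subset may emit a warning, but it does not raise.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(30).reshape(3, 10), index=list("ABC"))
rows = ["A", "C"]
columns = [1, 4, 7]

# Selecting a list of columns produces a new object, not a window into df.
subset = df[columns]
subset.loc[rows] = 900  # modifies subset only

# The original frame is untouched by the write above.
original_value = int(df.loc["A", 1])

# Assigning through a single .loc call on df itself does modify df.
df.loc[rows, columns] = 900
```

This is the same two-step pattern as df[columns].ix[rows] = 900 in the question, just split over two statements so the intermediate object is visible.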

Bottom line: use ix, loc, iloc to set (as above), and don't modify copies.
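For completeness, a sketch of the positional and single-cell setters mentioned above, again on the question's data. The variable names row_pos and col_pos are illustrative. Index.get_indexer is used here to translate labels into integer positions for .iloc.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(30).reshape(3, 10), index=list("ABC"))

# Translate labels to integer positions, then set by position with .iloc.
row_pos = df.index.get_indexer(["A", "C"])
col_pos = df.columns.get_indexer([1, 4, 7])
df.iloc[row_pos, col_pos] = 99

# Single-cell setters: .at is label-based, .iat is position-based.
df.at["B", 0] = -1
df.iat[1, 9] = -2
```

In this frame the column labels happen to equal their positions, so col_pos is just [1, 4, 7]; with other labels the two would differ.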

See 'view versus copy' section of the docs.

Andy Hayden answered Oct 04 '22 15:10