Pandas: Index updating and changing value accessed by location

Tags:

I have two index-related questions on Python Pandas dataframes.

import pandas as pd
import numpy as np
df = pd.DataFrame({'id' : range(1,9),
                'B' : ['one', 'one', 'two', 'three',
                       'two', 'three', 'one', 'two'],
                'amount' : np.random.randn(8)})

df = df.ix[df.B != 'three'] # remove where B = three
df.index
>>  Int64Index([0, 1, 2, 4, 6, 7], dtype=int64) # the original index is preserved.

1) I do not understand why the indexing is not automatically updated after I modify the dataframe. Is there a way to automatically update the indexing while modifying a dataframe? If not, what is the most efficient manual way to do this?

2) I want to be able to set the B column of the 5th element of df to 'three'. But df.iloc[5]['B'] = 'three' does not do that. I checked on the manual but it does not cover how to change a specific cell value accessed by location.

If I were accessing by row name, I could do: df.loc[5,'B'] = 'three' but I don't know what the index access equivalent is.

P.S. Link1 and link2 are relevant answers to my second question. However, they do not answer my question.

463

asked Jan 08 '14 13:01

Zhubarb

2 Answers

1) I do not understand why the indexing is not automatically updated after I modify the dataframe.

If you want to reset the index after removing/adding rows you can do this:

df = df[df.B != 'three'] # remove where B = three
df.reset_index(drop=True)

       B    amount  id
0    one    -1.176137    1
1    one     0.434470    2
2    two    -0.887526    3
3    two     0.126969    5
4    one     0.090442    7
5    two    -1.511353    8

Indexes are meant to label/tag/id a row... so you might think about making your 'id' column the index, and then you'll appreciate that Pandas doesn't 'automatically update' the index when deleting rows.

df.set_index('id')

       B    amount
id      
1    one    -0.410671
2    one     0.092931
3    two    -0.100324
4    three   0.322580
5    two    -0.546932
6    three  -2.018198
7    one    -0.459551
8    two     1.254597

2) I want to be able to set the B column of the 5th element of df to 'three'. But df.iloc[5]['B'] = 'three' does not do that. I checked on the manual but it does not cover how to change a specific cell value accessed by location.

Jeff already answered this...

181

answered Oct 13 '22 08:10

Brian Wylie

In [5]: df = pd.DataFrame({'id' : range(1,9),
   ...:                 'B' : ['one', 'one', 'two', 'three',
   ...:                        'two', 'three', 'one', 'two'],
   ...:                 'amount' : np.random.randn(8)})

In [6]: df
Out[6]: 
       B    amount  id
0    one -1.236735   1
1    one -0.427070   2
2    two -2.330888   3
3  three -0.654062   4
4    two  0.587660   5
5  three -0.719589   6
6    one  0.860739   7
7    two -2.041390   8

[8 rows x 3 columns]

Your question 1) your code above is correct (see @Briford Wylie for resetting the index, which is what I think you want)

In [7]: df.ix[df.B!='three']
Out[7]: 
     B    amount  id
0  one -1.236735   1
1  one -0.427070   2
2  two -2.330888   3
4  two  0.587660   5
6  one  0.860739   7
7  two -2.041390   8

[6 rows x 3 columns]

In [8]: df = df.ix[df.B!='three']

In [9]: df.index
Out[9]: Int64Index([0, 1, 2, 4, 6, 7], dtype='int64')

In [10]: df.iloc[5]
Out[10]: 
B             two
amount   -2.04139
id              8
Name: 7, dtype: object

Question 2):

You are trying to set a copy; In 0.13 this will raise/warn. see here

In [11]: df.iloc[5]['B'] = 5
/usr/local/bin/ipython:1: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.

In [24]: df.iloc[5,df.columns.get_indexer(['B'])] = 'foo'

In [25]: df
Out[25]: 
     B    amount  id
0  one -1.236735   1
1  one -0.427070   2
2  two -2.330888   3
4  two  0.587660   5
6  one  0.860739   7
7  foo -2.041390   8

[6 rows x 3 columns]

You can also do this. This is NOT setting a copy and since it selects a Series (that is what df['B'] is, then it CAN be set directly

In [30]: df['B'].iloc[5] = 5

In [31]: df
Out[31]: 
     B    amount  id
0  one -1.236735   1
1  one -0.427070   2
2  two -2.330888   3
4  two  0.587660   5
6  one  0.860739   7
7    5 -2.041390   8

[6 rows x 3 columns]

answered Oct 13 '22 06:10

Jeff

Related questions
                            
                                Python - Looping through a multidimensional dictionary [duplicate]
                            
                                How does Python implement the modulo operation?
                            
                                pandas: DataFrame.mean() very slow. How can I calculate means of columns faster?
                            
                                Force python to print entire number
                            
                                Tkinter Not Found
                            
                                Why does scipy.ndimage.io.imread return PngImageFile, not an array of values
                            
                                How to have unique emails with python social auth
                            
                                Python: What's the use case for set.pop()?
                            
                                how to set values to rows of boolean filtered dataframe column
                            
                                How is pandas deciding order in a sort when there is a tie?
                            
                                Removing empty elements from an array in Python
                            
                                amplitude of numpy's fft results is to be multiplied by sampling period?
                            
                                Finding 3d distances using an inbuilt function in python
                            
                                Converting string that looks like a list into a real list - python
                            
                                How to replace values in a numpy array based on another column?
                            
                                how to make a grouped boxplot graph in matplotlib
                            
                                AttributeError: 'module' object has no attribute 'lowercase'
                            
                                Why does pandas use (&, |) instead of the normal, pythonic (and, or)?
                            
                                Python: How to read a text file containing co-ordinates in row-column format into x-y co-ordinate arrays?
                            
                                Convert a IP to Hex by Python

Donate For Us

If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!

Donate Us With

Pandas: Index updating and changing value accessed by location

Tags:

python

indexing

pandas

dataframe

Zhubarb

People also ask

2 Answers

Brian Wylie

Jeff

Recent Activity

Donate For Us