Pandas warning when using map: A value is trying to be set on a copy of a slice from a DataFrame

Tags: python, pandas

I've got the following code and it works. This basically renames values in columns so that they can be later merged.

import pandas as pd

pop = pd.read_csv('population.csv')
pop_recent = pop[pop['Year'] == 2014]

mapping = {
    'Korea, Rep.': 'South Korea',
    'Taiwan, China': 'Taiwan'
}
# Fall back to the original value when a name has no replacement
f = lambda x: mapping.get(x, x)
pop_recent['Country Name'] = pop_recent['Country Name'].map(f)

Warning:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  pop_recent['Country Name'] = pop_recent['Country Name'].map(f)

I did google this! But no examples seem to be using map, so I'm at a loss...

Mike asked Oct 19 '15 13:10



1 Answer

The issue is chained indexing. What you are actually trying to do is set values on pop[pop['Year'] == 2014]['Country Name']. This will not work most of the time (as explained very well in the linked documentation), because it is two separate calls and one of them may return a copy of the dataframe (I believe the boolean indexing is the call that returns the copy).

Hence, when you set values on that copy, the change is not reflected in the original dataframe. Example -

In [6]: df
Out[6]:
   A  B
0  1  2
1  3  4
2  4  5
3  6  7
4  8  9

In [7]: df[df['A']==1]['B'] = 10
/path/to/ipython-script.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  if __name__ == '__main__':

In [8]: df
Out[8]:
   A  B
0  1  2
1  3  4
2  4  5
3  6  7
4  8  9
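
To make the "two different calls" point concrete, here is a minimal sketch that splits the chained statement into its two steps. The variable name tmp is only for illustration, and the warning is silenced just so the output stays readable; the exact warning you see depends on your pandas version.

```python
import warnings

import pandas as pd

df = pd.DataFrame({"A": [1, 3, 4, 6, 8], "B": [2, 4, 5, 7, 9]})

with warnings.catch_warnings():
    # Silence warnings only for this demonstration.
    warnings.simplefilter("ignore")

    # Call 1 (getitem): boolean indexing builds a new DataFrame.
    tmp = df[df["A"] == 1]
    # Call 2 (setitem): the assignment lands on that temporary object.
    tmp["B"] = 10

print(tmp["B"].tolist())  # the temporary frame holds the new value
print(df["B"].tolist())   # the original frame is unchanged
```

In the one-line chained form the temporary object is never bound to a name, so the assigned value is simply lost.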

As the warning notes, instead of chained indexing you should use DataFrame.loc to index the rows and the columns to update in a single call, which avoids this warning. Example -

pop.loc[(pop['Year'] == 2014), 'Country Name'] = pop.loc[(pop['Year'] == 2014), 'Country Name'].map(f)

Or, if this seems too long, you can create a mask (a boolean Series) beforehand, assign it to a variable, and use that in the above statement. Example -

mask = pop['Year'] == 2014
pop.loc[mask, 'Country Name'] = pop.loc[mask, 'Country Name'].map(f)
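
For a version you can run without population.csv, here is a small sketch of the mask approach. The inline rows are a made-up stand-in for the real file (an assumption, only the 'Country Name' and 'Year' columns from the question are used).

```python
import pandas as pd

# Hypothetical stand-in for population.csv; the rows are illustrative only.
pop = pd.DataFrame({
    "Country Name": ["Korea, Rep.", "Taiwan, China", "France", "Korea, Rep."],
    "Year": [2014, 2014, 2014, 2013],
})

mapping = {
    "Korea, Rep.": "South Korea",
    "Taiwan, China": "Taiwan",
}

# Boolean Series selecting the rows to update
mask = pop["Year"] == 2014

# Rows and column are selected in a single .loc call on pop itself,
# so the assignment updates the original frame.
pop.loc[mask, "Country Name"] = pop.loc[mask, "Country Name"].map(
    lambda name: mapping.get(name, name)
)

print(pop)
```

Only the 2014 rows are renamed; the 2013 row keeps 'Korea, Rep.' because the mask excludes it.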

Demo -

In [9]: df
Out[9]:
   A  B
0  1  2
1  3  4
2  4  5
3  6  7
4  8  9

In [10]: mapping = { 1:2 , 3:4}

In [11]: f= lambda x: mapping.get(x, x)

In [12]: df.loc[(df['B']==2),'A'] = df.loc[(df['B']==2),'A'].map(f)

In [13]: df
Out[13]:
   A  B
0  2  2
1  3  4
2  4  5
3  6  7
4  8  9

Demo with the mask method -

In [18]: df
Out[18]:
   A  B
0  1  2
1  3  4
2  4  5
3  6  7
4  8  9

In [19]: mask = df['B']==2

In [20]: df.loc[mask,'A'] = df.loc[mask,'A'].map(f)

In [21]: df
Out[21]:
   A  B
0  2  2
1  3  4
2  4  5
3  6  7
4  8  9
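
Note that the .loc approach updates pop itself. If what you want is a separate pop_recent frame to merge later, as in the question, a commonly used alternative (not covered in the answer above) is to take an explicit copy of the filtered rows and modify that. A minimal sketch, again with made-up rows standing in for population.csv:

```python
import pandas as pd

# Hypothetical stand-in for population.csv; the rows are illustrative only.
pop = pd.DataFrame({
    "Country Name": ["Korea, Rep.", "Taiwan, China", "France", "Korea, Rep."],
    "Year": [2014, 2014, 2014, 2013],
})

mapping = {
    "Korea, Rep.": "South Korea",
    "Taiwan, China": "Taiwan",
}

# .copy() makes pop_recent an independent DataFrame, so assigning to it
# is unambiguous and leaves pop untouched.
pop_recent = pop[pop["Year"] == 2014].copy()
pop_recent["Country Name"] = pop_recent["Country Name"].map(
    lambda name: mapping.get(name, name)
)

print(pop_recent)
```

The trade-off is that pop still holds the old names, so use this only when the renamed values are needed in the 2014 subset alone.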

Anand S Kumar answered Nov 15 '22 14:11