
SettingWithCopyWarning even when using .loc[row_indexer,col_indexer] = value

Tags: python, pandas

This is one of the lines in my code where I get the SettingWithCopyWarning:

value1['Total Population']=value1['Total Population'].replace(to_replace='*', value=4)

I then changed it to:

row_index= value1['Total Population']=='*'
value1.loc[row_index,'Total Population'] = 4

This still gives the same warning. How do I get rid of it?

I also get the same warning from a convert_objects(convert_numeric=True) call. Is there any way to avoid that?

 value1['Total Population'] = value1['Total Population'].astype(str).convert_objects(convert_numeric=True)

This is the warning message that I get:

A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

See the the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy 
Pragnya Srinivasan asked Sep 14 '15 20:09



1 Answer

If you use .loc[row, column] and still get the same warning, the DataFrame you are assigning to was probably created by slicing or filtering another DataFrame. In that case, create it with .copy().

Here is a step-by-step reproduction of the warning:

import pandas as pd

d = {'col1': [1, 2, 3, 4], 'col2': [3, 4, 5, 6]}
df = pd.DataFrame(data=d)
df
#    col1  col2
# 0     1     3
# 1     2     4
# 2     3     5
# 3     4     6

Creating a new column and updating its value:

df['new_column'] = None
df.loc[0, 'new_column'] = 100
df
#    col1  col2 new_column
# 0     1     3        100
# 1     2     4       None
# 2     3     5       None
# 3     4     6       None

This raises no warning. However, let's create another DataFrame from the previous one:

new_df = df.loc[df.col1>2]
new_df
#    col1  col2 new_column
# 2     3     5       None
# 3     4     6       None

Now, using .loc, I will try to replace some values in the same manner:

new_df.loc[2, 'new_column'] = 100

However, this triggers the warning again:

A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

See the caveats in the documentation: https://pandas.pydata.org/pandas-docs/stable/user_guide/indexing.html#returning-a-view-versus-a-copy

SOLUTION

Using .copy() when creating the new DataFrame resolves the warning:

new_df_copy = df.loc[df.col1>2].copy()
new_df_copy.loc[2, 'new_column'] = 100

Now, you won't receive any warnings!
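
As a quick sanity check, the sketch below rebuilds the example above and confirms that the copy is independent: the assignment changes new_df_copy only, and the original df keeps its value. The names original_value and copied_value are introduced here only for illustration.

```python
import pandas as pd

# Rebuild the example DataFrame from the steps above.
df = pd.DataFrame({'col1': [1, 2, 3, 4], 'col2': [3, 4, 5, 6]})
df['new_column'] = None

# Filter and take an explicit copy, then assign with a single .loc call.
new_df_copy = df.loc[df.col1 > 2].copy()
new_df_copy.loc[2, 'new_column'] = 100

# The original DataFrame is untouched; only the copy holds the new value.
original_value = df.loc[2, 'new_column']
copied_value = new_df_copy.loc[2, 'new_column']
print(original_value, copied_value)
```

If the goal is to change df itself rather than a filtered copy, the assignment would instead go through df directly, for example df.loc[df.col1 > 2, 'new_column'] = 100.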

If your DataFrame is created by filtering another DataFrame and you plan to modify it, always use .copy().
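
Applied to the question, the same idea might look like the sketch below. The question does not show how value1 was created, so the raw DataFrame and the State filter are made-up assumptions for illustration. pd.to_numeric is used in place of convert_objects, which was deprecated and later removed from pandas.

```python
import pandas as pd

# Hypothetical source data; the question does not show the real DataFrame.
raw = pd.DataFrame({
    'State': ['A', 'A', 'B'],
    'Total Population': [120, '*', 300],
})

# Assumed filter step: take an explicit copy so value1 is independent of raw.
value1 = raw.loc[raw['State'] == 'A'].copy()

# Replace the '*' placeholder with one .loc assignment, as in the question.
row_index = value1['Total Population'] == '*'
value1.loc[row_index, 'Total Population'] = 4

# Convert the column to a numeric dtype (stands in for convert_objects).
value1['Total Population'] = pd.to_numeric(value1['Total Population'])
print(value1)
```

Because value1 is an explicit copy, neither assignment touches raw, which still holds the '*' placeholder.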

Hadij answered Oct 19 '22 18:10