Pandas DataFrame: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame [duplicate]

Tags: python, pandas

I know there are tons of posts about this warning, but I couldn't find a solution to my situation. Here's my code:

df.loc[:, 'my_col'] = df.loc[:, 'my_col'].astype(int)
#df.loc[:, 'my_col'] = df.loc[:, 'my_col'].astype(int).copy()
#df.loc[:, 'my_col'] = df['my_col'].astype(int)

It produces the warning:

SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame. Try using .loc[row_indexer,col_indexer] = value instead

Even though I changed the code as suggested, I still get this warning. All I need to do is convert the data type of one column.

**Remark:** Originally the column is of type float with one decimal place (example: 4711.0). I therefore convert it to integer (4711) and then to string ('4711'), just to remove the decimal.

Appreciate your help!

Update: The warning was a side effect of a filtering step applied to the original data just before. I was missing the DataFrame.copy(). Using the copy instead solved the problem!

df = df[df['my_col'].notnull()].copy()
df.loc[:, 'my_col'] = df['my_col'].astype(int).astype(str)
#df['my_col'] = df['my_col'].astype(int).astype(str) # works too!

Matthias asked Apr 09 '18 08:04



3 Answers

I think you need copy, and you can omit loc when selecting the column:

df = df[df['my_col'].notnull()].copy()
df['my_col'] = df['my_col'].astype(int).astype(str)

Explanation:

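Without .copy(), the filtered df is derived from the original DataFrame, so pandas cannot tell whether a later assignment is meant to reach the original data. If you modify values in the filtered df later, you will find that the modifications do not propagate back to the original data, and pandas warns about that.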
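
A minimal sketch of the filter-then-copy pattern, in case it helps to see it end to end. The sample values and the names `raw` and `clean` are assumptions made up for illustration; only `my_col` comes from the question.

```python
import pandas as pd

# Hypothetical sample data: one missing value, floats with a ".0" decimal.
raw = pd.DataFrame({"my_col": [4711.0, None, 42.0]})

# Filter out the missing rows, then take an explicit copy so that
# `clean` owns its data and is no longer tied to `raw`.
clean = raw[raw["my_col"].notnull()].copy()

# Replace the whole column: float -> int drops the decimal, int -> str
# gives the final text form.
clean["my_col"] = clean["my_col"].astype(int).astype(str)

print(clean["my_col"].tolist())  # ['4711', '42']
print(raw["my_col"].dtype)       # float64 (the original is untouched)
```

Assigning with `clean["my_col"] = ...` replaces the column outright, which is why the dtype is free to change from float to string here.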


jezrael answered Sep 27 '22 20:09


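Another way is to disable the chained-assignment check, which silences the warning for your code without the need to create a copy (note that it hides the warning rather than changing what the assignment does):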

# disable the chained-assignment warning
pd.options.mode.chained_assignment = None

sudonym answered Sep 27 '22 20:09


If you need to change the data type of a single column, it's easier to address that column directly:

df['my_col'] = df['my_col'].astype(int)

Or using .assign:

df = df.assign(my_col=lambda d: d['my_col'].astype(int))

.assign returns a new DataFrame and leaves the one it was called on untouched, so it is useful if you need the conversion only once and don't want to alter your df outside of that scope.
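
To illustrate that point, here is a small sketch showing that `.assign` leaves the source frame as it was. The sample values and the name `converted` are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical sample data with no missing values.
df = pd.DataFrame({"my_col": [4711.0, 42.0]})

# .assign builds a new DataFrame; the lambda receives the frame
# that .assign was called on.
converted = df.assign(my_col=lambda d: d["my_col"].astype(int))

print(converted["my_col"].tolist())  # [4711, 42]
print(df["my_col"].tolist())         # [4711.0, 42.0]
```

Binding the result to a new name, as here, keeps both versions around; writing `df = df.assign(...)` simply rebinds `df` to the converted frame.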


Guybrush answered Sep 27 '22 20:09