
How to deal with SettingWithCopyWarning in Pandas

Background

I just upgraded my Pandas from 0.11 to 0.13.0rc1. Now the application emits many new warnings. One of them looks like this:

E:\FinReporter\FM_EXT.py:449: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TVol']   = quote_df['TVol']/TVOL_SCALE

What exactly does it mean? Do I need to change something?

How should I suppress the warning if I insist on using quote_df['TVol'] = quote_df['TVol']/TVOL_SCALE?

The function that triggers the warnings

def _decode_stock_quote(list_of_150_stk_str):
    """decode the webpage and return dataframe"""

    from cStringIO import StringIO

    str_of_all = "".join(list_of_150_stk_str)

    quote_df = pd.read_csv(StringIO(str_of_all), sep=',', names=list('ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefg')) #dtype={'A': object, 'B': object, 'C': np.float64}
    quote_df.rename(columns={'A':'STK', 'B':'TOpen', 'C':'TPCLOSE', 'D':'TPrice', 'E':'THigh', 'F':'TLow', 'I':'TVol', 'J':'TAmt', 'e':'TDate', 'f':'TTime'}, inplace=True)
    quote_df = quote_df.ix[:,[0,3,2,1,4,5,8,9,30,31]]
    quote_df['TClose'] = quote_df['TPrice']
    quote_df['RT']     = 100 * (quote_df['TPrice']/quote_df['TPCLOSE'] - 1)
    quote_df['TVol']   = quote_df['TVol']/TVOL_SCALE
    quote_df['TAmt']   = quote_df['TAmt']/TAMT_SCALE
    quote_df['STK_ID'] = quote_df['STK'].str.slice(13,19)
    quote_df['STK_Name'] = quote_df['STK'].str.slice(21,30)#.decode('gb2312')
    quote_df['TDate']  = quote_df.TDate.map(lambda x: x[0:4]+x[5:7]+x[8:10])

    return quote_df

More error messages

E:\FinReporter\FM_EXT.py:449: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TVol']   = quote_df['TVol']/TVOL_SCALE
E:\FinReporter\FM_EXT.py:450: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TAmt']   = quote_df['TAmt']/TAMT_SCALE
E:\FinReporter\FM_EXT.py:453: SettingWithCopyWarning: A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_index,col_indexer] = value instead
  quote_df['TDate']  = quote_df.TDate.map(lambda x: x[0:4]+x[5:7]+x[8:10])

asked Dec 17 '13 03:12 by bigbug



1 Answer

The SettingWithCopyWarning was created to flag potentially confusing "chained" assignments, such as the following, which does not always work as expected, particularly when the first selection returns a copy. [see GH5390 and GH5597 for background discussion.]

df[df['A'] > 2]['B'] = new_val  # new_val not set in df 

The warning offers a suggestion to rewrite as follows:

df.loc[df['A'] > 2, 'B'] = new_val 
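To make the difference between the two forms concrete, here is a minimal sketch on a small made-up frame (the column values and `new_val = 0` are assumptions for illustration, not data from the question). The chained form writes to a temporary object produced by the boolean selection, while the single `.loc` call writes to `df` itself. The warning is silenced with the standard-library `warnings` module only so the demo runs quietly.

```python
import warnings

import pandas as pd

# Hypothetical data, chosen only to illustrate the two assignment forms.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [10, 20, 30, 40]})
new_val = 0

# Chained form: df[mask] builds a new object, and ['B'] = ... assigns into
# that temporary object, so df itself is left unchanged.
with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the warning for this demo only
    df[df["A"] > 2]["B"] = new_val
after_chained = df["B"].tolist()

# Single .loc call: the row mask and column label go through one indexer
# applied directly to df, so the write lands in df.
df.loc[df["A"] > 2, "B"] = new_val
after_loc = df["B"].tolist()

print(after_chained)
print(after_loc)
```

With a boolean mask the first selection is always a copy, which is why the chained write is lost here; with other kinds of selection the outcome can be less predictable, and that is the case the warning is meant to catch.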

However, this doesn't fit your usage, which is equivalent to:

df = df[df['A'] > 2]
df['B'] = new_val

While it's clear that you don't care about writes making it back to the original frame (since you are overwriting the reference to it), unfortunately this pattern cannot be differentiated from the first chained assignment example. Hence the (false positive) warning. The potential for false positives is addressed in the docs on indexing, if you'd like to read further. You can safely disable this new warning with the following assignment.

import pandas as pd
pd.options.mode.chained_assignment = None  # default='warn'
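A commonly used alternative to switching the warning off globally, not part of the suggestion above, is to take an explicit `.copy()` of the selection before modifying it. The sketch below assumes a small made-up frame and a hypothetical scale factor of 10; it records any warnings raised during the assignment so you can check that no SettingWithCopyWarning appears.

```python
import warnings

import pandas as pd

# Hypothetical data standing in for the question's quote_df.
df = pd.DataFrame({"A": [1, 2, 3, 4], "B": [10, 20, 30, 40]})

# .copy() states the intent explicitly: subset is an independent frame,
# and writes to it are not expected to reach df.
subset = df[df["A"] > 2].copy()

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    subset["B"] = subset["B"] / 10  # same shape as quote_df['TVol'] / TVOL_SCALE

# Collect only warnings named SettingWithCopyWarning, matched by class name
# so the check does not depend on where pandas defines that class.
copy_warnings = [w for w in caught if w.category.__name__ == "SettingWithCopyWarning"]

print(subset["B"].tolist())
print(df["B"].tolist())
print(len(copy_warnings))
```

Applied to the function in the question, this would mean appending `.copy()` to the column-selection line before the subsequent column assignments, which keeps the warning enabled for the rest of the application.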

answered Sep 26 '22 06:09 by Garrett