Correct way to set new column in pandas DataFrame to avoid SettingWithCopyWarning

Tags: python, pandas

I'm trying to create a new column in the netc DataFrame, but I get this warning:

netc["DeltaAMPP"] = netc.LOAD_AM - netc.VPP12_AM

C:\Anaconda\lib\site-packages\ipykernel\__main__.py:1: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame.
Try using .loc[row_indexer,col_indexer] = value instead

What's the proper way to create a column in newer versions of pandas without getting the warning?

pd.__version__
Out[45]: u'0.19.2+0.g825876c.dirty'

Daniel asked Feb 21 '17 23:02


1 Answer

As the warning message suggests, try using .loc[row_indexer, col_indexer] to create the new column.

netc.loc[:, "DeltaAMPP"] = netc.LOAD_AM - netc.VPP12_AM
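As a minimal sketch with made-up numbers (only the column names come from the question), the .loc form and the plain bracket form give the same new column when netc is a standalone DataFrame:

```python
import pandas as pd

# Hypothetical values; only the column names are taken from the question.
netc = pd.DataFrame({
    "LOAD_AM": [10.0, 12.5, 9.0],
    "VPP12_AM": [8.0, 12.0, 9.5],
})

# .loc with an explicit row indexer (":" selects all rows) and a column label.
netc.loc[:, "DeltaAMPP"] = netc["LOAD_AM"] - netc["VPP12_AM"]

# Plain bracket assignment, as written in the question.
netc["DeltaAMPP_plain"] = netc["LOAD_AM"] - netc["VPP12_AM"]

# Compare the two new columns element by element.
same = netc["DeltaAMPP"].equals(netc["DeltaAMPP_plain"])
print(same)
```

Since both forms write the same values here, the warning in the question probably depends on how netc was created rather than on which of the two forms is used.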

Notes

According to the pandas indexing docs, your code should work.

netc["DeltaAMPP"] = netc.LOAD_AM - netc.VPP12_AM 

gets translated to

netc.__setitem__('DeltaAMPP', netc.LOAD_AM - netc.VPP12_AM) 

This should behave predictably. The SettingWithCopyWarning exists only to warn users about unexpected behaviour during chained assignment, which is not what you're doing. However, as mentioned in the docs,

Sometimes a SettingWithCopy warning will arise at times when there’s no obvious chained indexing going on. These are the bugs that SettingWithCopy is designed to catch! Pandas is probably trying to warn you that you’ve done this:

The docs then go on to give an example of when one might get that warning even when it's not expected. Without more context on how netc was created, I can't tell why it's happening here.
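One situation of that kind, sketched here under the assumption that netc was itself built by filtering or slicing another DataFrame (the question does not show this), is assigning a column to the derived frame. Taking an explicit .copy() when creating netc makes it independent of its source, so the later assignment is unambiguous. The names raw and REGION and all values below are hypothetical:

```python
import pandas as pd

# Hypothetical source frame; only LOAD_AM and VPP12_AM come from the question.
raw = pd.DataFrame({
    "LOAD_AM": [10.0, 12.5, 9.0, 11.0],
    "VPP12_AM": [8.0, 12.0, 9.5, 7.0],
    "REGION": ["N", "S", "N", "S"],
})

# Assumption: netc is a filtered subset of a larger frame.
# .copy() makes it an independent DataFrame instead of a slice of raw.
netc = raw[raw["REGION"] == "N"].copy()

# Adding a column now modifies only netc.
netc["DeltaAMPP"] = netc["LOAD_AM"] - netc["VPP12_AM"]

print(netc)
print("DeltaAMPP" in raw.columns)  # raw is left untouched
```

If netc in the question was created along these lines, adding .copy() at the point where it is derived is worth trying before changing the assignment itself.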

Filip Kilibarda answered Sep 18 '22 21:09