
Pandas: Chained assignments [duplicate]

I have been reading this link on "Returning a view versus a copy". I do not really understand how chained assignment works in pandas, or how using .ix, .iloc, or .loc affects it.

I get SettingWithCopyWarning for the following lines of code, where data is a pandas DataFrame and amount is the name of a column (Series) in it:

data['amount'] = data['amount'].astype(float)

data["amount"].fillna(data.groupby("num")["amount"].transform("mean"), inplace=True)

data["amount"].fillna(mean_avg, inplace=True)

Looking at this code, is it obvious that I am doing something suboptimal? If so, can you let me know the replacement code lines?

I am aware of the caveat below and would like to think that the warnings in my case are false positives:

The chained assignment warnings / exceptions are aiming to inform the user of a possibly invalid assignment. There may be false positives; situations where a chained assignment is inadvertently reported.

EDIT: the code leading to the first SettingWithCopyWarning.

# lookup, date and qty are defined elsewhere in the script.
def function1(row, date, qty):
    try:
        if row['currency'] == 'A':
            result = row[qty]
        else:
            rate = lookup[lookup['Date'] == row[date]][row['currency']]
            result = float(rate) * float(row[qty])
        return result
    except ValueError:  # only ValueError is caught; the function then returns None
        print("The current row causes an exception:")

data['amount'] = data.apply(lambda row: function1(row, date, qty), axis=1)
data['amount'] = data['amount'].astype(float)
Zhubarb asked Jan 30 '14 17:01



1 Answer

The point of SettingWithCopyWarning is to warn you that you may be doing something that will not update the original DataFrame as you might expect.

Here, data is a DataFrame, possibly of a single dtype (or not). You then take a reference to data['amount'], which is a Series, and update it. This probably works in your case because the values you write back have the same dtype as the existing ones.

However, the indexing step could return a copy, so the update would land on a copy of data['amount'] that you never see. Then you would be wondering why data is not updating.
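
A minimal sketch of that failure mode, using a made-up three-row frame (the values and the mask are assumptions for illustration, not the asker's data). Boolean-mask indexing returns a new object, so the chained write lands on a temporary copy, while a single .loc call writes to data itself. The warnings filter is only there to keep the demonstration quiet.

```python
import warnings

import pandas as pd

data = pd.DataFrame({"num": [1, 1, 2], "amount": [1.0, 2.0, 3.0]})
mask = data["num"] == 1

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # silence the chained-assignment warning
    # Two indexing steps: data[mask] builds a copy, and the write hits that copy.
    data[mask]["amount"] = 0.0
after_chained = data["amount"].tolist()

# One indexing step: the write goes to data itself.
data.loc[mask, "amount"] = 0.0
after_loc = data["amount"].tolist()

print(after_chained)  # original values are untouched
print(after_loc)      # masked rows are now 0.0
```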

pandas returns a copy of the object in almost all method calls. The inplace operations are a convenience that works, but in general they do not make it clear that data is being modified, and they could potentially operate on copies.

It is much clearer to do this:

data['amount'] = data["amount"].fillna(data.groupby("num")["amount"].transform("mean"))

data["amount"] = data['amount'].fillna(mean_avg)
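
For a version that can be run end to end, here is a sketch on toy data. The frame contents are invented, and mean_avg is assumed to be the overall mean of the column after the group-wise fill, since the question does not show how it is computed.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({
    "num": [1, 1, 2, 2, 3],
    "amount": [10.0, np.nan, 4.0, 6.0, np.nan],
})

# Mean of each row's "num" group (NaN where the whole group is missing).
group_mean = data.groupby("num")["amount"].transform("mean")
data["amount"] = data["amount"].fillna(group_mean)

# Hypothetical fallback: overall column mean for rows that are still missing.
mean_avg = data["amount"].mean()
data["amount"] = data["amount"].fillna(mean_avg)

filled = data["amount"].tolist()
print(filled)
```

Each step assigns the result back to data["amount"] explicitly, so there is no question about which object was modified.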

One further plus to working on copies: you can chain operations, which is not possible with inplace ones.

e.g.

data['amount'] = data['amount'].fillna(mean_avg)*2
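
As a sketch of a longer chain, several steps can be composed into one expression that is assigned back once. The data and the mean_avg value here are invented for illustration.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame({"amount": [1.0, np.nan, 3.0]})
mean_avg = 2.0  # hypothetical fill value

# Each method returns a new Series, so the calls compose left to right.
data["amount"] = data["amount"].fillna(mean_avg).mul(2).astype(int)

doubled = data["amount"].tolist()
print(doubled)
```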

And just an FYI: inplace operations are neither faster nor more memory efficient. My 2c: they should be banned, but it is too late to change that API.

You can of course turn the warning off:

pd.set_option('mode.chained_assignment', None)

FYI, pandas runs its entire test suite with this option set to 'raise', so we know if chained assignment is happening.

Jeff answered Sep 30 '22 08:09