
pandas chaining and the use of "inplace" parameter

For pandas DataFrames in Python, many methods have an inplace parameter that purportedly lets you modify the original object directly rather than create a copy of it*.

[*Edited to add: however, this proves to not be the case as pointed out by @juanpa.arrivillaga. inplace=True DOES copy data and merely updates a pointer associated with the modified object, so has few advantages over a manual re-assignment to the name of the original object.]

Examples that I have seen online for the use of inplace=True do not include examples where chaining is used. This comment in a related SO thread may be an answer to why I don't see such examples anywhere:

you can't method chain and operate in-place. in-place ops return None and break the chain
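A minimal sketch of what that comment describes, assuming a pandas version in which dropna and sort_values return None when called with inplace=True (the small frame and the names result and error_message are made up for illustration):

```python
import pandas as pd

# Hypothetical three-row frame; None becomes NaN in a float column.
df = pd.DataFrame({"value": [2.0, None, 1.0]})

# The in-place call mutates df itself and hands back None.
result = df.dropna(inplace=True)
print(result)
print(len(df))

# Anything chained after an in-place call is therefore called on None.
error_message = None
try:
    df.sort_values("value", inplace=True).reset_index()
except AttributeError as exc:
    error_message = str(exc)
print(error_message)
```

The AttributeError comes from calling reset_index on the None returned by the in-place sort, which is why an in-place call can only ever be the last link of a chain.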

But, would "inplace chaining" work if you put an inplace=True in the last entry in the chain? [Edited to add: no] Or would that be equivalent to trying to change a copy created in an earlier link in the chain, which, as it is no longer your original object, is "lost" after the chain statement is complete? [Edited to add: yes; see answer here]

Working with large data objects would seem to rule out chaining unless it can be done in place, at least if one wants to keep memory overhead low and computation fast. Is there an alternative implementation of pandas or, e.g., an equivalent of R's data.table available in Python that might suit my needs? Or are my only options to not chain (and compute quickly) or to chain but make redundant copies of the data, at least transiently?

mpag asked Dec 21 '25 09:12


1 Answer

Let's try it.

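import pandas as pd
import numpy as np

# Sample frame with duplicate values and one missing value
df = pd.DataFrame({'value': [2, 2, 1, 1, 3, 4, 5, np.nan]})

# inplace=True is passed only to the last method in the chain
df.sort_values('value').drop_duplicates().dropna(inplace=True)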

Expected contents of df:

   value
2    1.0
0    2.0
4    3.0
5    4.0
6    5.0

Actual contents of df:

   value
0    2.0
1    2.0
2    1.0
3    1.0
4    3.0
5    4.0
6    5.0
7    NaN

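Answer: No, inplace=True at the end of the chain does not modify the original dataframe. The in-place dropna acts on the temporary object returned by drop_duplicates(), and that object is discarded once the statement finishes.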
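For comparison, here is a hedged sketch of the usual workaround: leave inplace out of the chain and rebind the name to the result. The variable names chained and rows_after_inplace_chain are introduced here only for illustration, and the sketch still creates intermediate copies, so it does not address the memory concern in the question.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"value": [2, 2, 1, 1, 3, 4, 5, np.nan]})

# With inplace=True on the last link, the statement evaluates to None
# and df keeps all of its original rows.
chained = df.sort_values("value").drop_duplicates().dropna(inplace=True)
rows_after_inplace_chain = len(df)

# Without inplace, the chain returns a new frame; rebinding df keeps it.
df = df.sort_values("value").drop_duplicates().dropna()
print(df)
```

After the reassignment, df holds the sorted, de-duplicated values with the NaN row removed, which is the frame the test above expected.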

Stu Sztukowski answered Dec 24 '25 00:12


