How to self-reference a column in a pandas DataFrame?

In pandas, I load a DataFrame like this:

drinks = pandas.read_csv(data_url)

where data_url is a string holding the URL of a CSV file.

To select the "light drinkers", defined as the entries where light_drinker equals 1, I write:

drinks.light_drinker[drinks.light_drinker == 1]

Is there a more DRY way to reference the "parent" object, i.e. something like:

drinks.light_drinker[self == 1]

894

asked Jan 23 '15 00:01

3 Answers

You can now use query or assign depending on what you need:

drinks.query('light_drinker == 1')

or, to add a derived column (assign returns a new DataFrame and leaves the original unchanged):

drinks.assign(strong_drinker=lambda x: x.light_drinker + 100)
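As a rough sketch of both calls side by side, assuming a small made-up frame in place of the CSV from the question (the country column and all values are invented for illustration):

```python
import pandas as pd

# Hypothetical stand-in for the CSV loaded in the question
drinks = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "light_drinker": [1, 3, 1, 0],
})

# query() resolves column names inside the expression string,
# so the frame is only named once
light = drinks.query("light_drinker == 1")

# assign() passes the frame to the lambda and returns a new DataFrame;
# drinks itself is left untouched
with_strong = drinks.assign(strong_drinker=lambda x: x.light_drinker + 100)
```

Here light holds only the rows at index 0 and 2, and drinks still has its original two columns after the assign call.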

Old answer

Not at the moment, but an enhancement along the lines of your idea is being discussed here. For simple cases, where() might be enough. The new API might look like this:

df.set(new_column=lambda self: self.light_drinker*2)

145

answered Oct 23 '22 09:10

elyase

Since pandas 0.18.1, .where() also accepts a callable:

http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.where.html?highlight=where#pandas.DataFrame.where

So, the following is now possible:

drinks.light_drinker.where(lambda x: x == 1)

which is particularly useful in method chains. Note, however, that this returns a Series (not the DataFrame filtered on the values in the light_drinker column), and that entries failing the condition are kept as NaN rather than dropped. Returning a Series is consistent with your question, but I will elaborate on the DataFrame case as well.
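To make the difference from plain boolean indexing concrete, here is a minimal sketch on an invented Series. The .loc variant is not part of the answer above; it is shown as one possible way to get the same result as s[s == 1] while still naming the Series only once.

```python
import pandas as pd

# Hypothetical data standing in for drinks.light_drinker
s = pd.Series([1, 3, 1, 0], name="light_drinker")

# where() with a callable keeps the original length;
# entries that fail the condition become NaN
masked = s.where(lambda x: x == 1)

# .loc also accepts a callable and drops the non-matching entries,
# which matches the behaviour of s[s == 1]
selected = s.loc[lambda x: x == 1]
```

masked still has four entries, two of them NaN, while selected contains only the two matching values.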

To get a filtered DataFrame, use:

drinks.where(lambda x: x.light_drinker == 1)

Note that this preserves the shape of the original DataFrame: rows for which the condition on light_drinker fails are kept, with every entry set to NaN.

If you don't want to preserve the shape of the DataFrame (i.e., you wish to drop the NaN rows), use:
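A small sketch of that shape-preserving behaviour, again on made-up data. The dropna(how="all") step is an addition for illustration, one way to discard the fully masked rows afterwards:

```python
import pandas as pd

# Hypothetical stand-in for the frame in the question
drinks = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "light_drinker": [1, 3, 1, 0],
})

# Rows failing the condition stay in place but are filled with NaN
masked = drinks.where(lambda x: x.light_drinker == 1)

# Dropping rows that are entirely NaN leaves only the matching rows
filtered = masked.dropna(how="all")
```

Be aware that the NaN fill can change column dtypes (the integer light_drinker column becomes float here), which is one more reason to prefer a direct filter when the masked shape is not needed.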

drinks.query('light_drinker == 1')

Note that the items in DataFrame.index and DataFrame.columns are placed in the query namespace by default, so you don't have to reference the DataFrame itself.
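The namespace behaviour can be sketched as follows, assuming the same invented frame as a stand-in. The threshold variable is hypothetical and only there to show how a local Python name is referenced with the @ prefix:

```python
import pandas as pd

# Hypothetical stand-in for the frame in the question
drinks = pd.DataFrame({
    "country": ["A", "B", "C", "D"],
    "light_drinker": [1, 3, 1, 0],
})

# Column names are resolved inside the expression; local variables need "@"
threshold = 1
by_column = drinks.query("light_drinker == @threshold")

# An unnamed index is available under the identifier "index"
by_index_and_column = drinks.query("index < 2 and light_drinker == 1")
```

by_column matches what drinks[drinks.light_drinker == 1] would return, without repeating the frame name.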

answered Oct 23 '22 09:10

WindChimes

I don't know of any way to reference a parent object like self or this in pandas, but another approach that could be considered more DRY is where():

drinks.where(drinks.light_drinker == 1, inplace=True)

answered Oct 23 '22 11:10

alacy

Tags:

python

pandas

scipy

James Graham
