Drop few rows of a pandas dataframe using lambda

Tags:

python

pandas

dataframe

lambda

I'm having a problem with method chaining when manipulating data frames in pandas. Here is the structure of my data:

import pandas as pd

lst1 = range(100)
lst2 = range(100)
lst3 = range(100)
df = pd.DataFrame(
    {'Frenquency': lst1,
     'lst2Tite': lst2,
     'lst3Tite': lst3
    })

The goal is to select the rows where the frequency is less than 6, but it has to be done within a method chain.

I know the traditional way is easy; I could just do

df[df["Frenquency"]<6]

to get the answer.

However, the question is how to do it with method chaining. I tried something like

df.drop(lambda x:x.index if x["Frequency"] <6 else null)

but it raised the error "[<function <lambda> at 0x7faf529d3510>] not contained in axis".

Could anyone shed some light on this issue?

240

asked Nov 23 '17 16:11

jinge zhang

3 Answers

This is an old question, but since there is no accepted answer I will answer it for future reference.

df[df.apply(lambda x: x.Frenquency < 6, axis=1)]

Explanation: the lambda checks the frequency of each row and returns True if it is less than 6 and False otherwise. The resulting Series of True and False values is then used to index df, which keeps only the rows marked True. Note that the column name Frenquency is a typo, but I kept it as it is because that is how the question spells it.
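A row-wise apply is not strictly needed to build the mask. As a minimal sketch, assuming the same df as in the question, .loc also accepts a callable that receives the frame at that point in the chain, which keeps the selection chainable. The names small and chained are only illustrative.

```python
import pandas as pd

# Same data as in the question (the 'Frenquency' spelling is kept on purpose).
df = pd.DataFrame({
    "Frenquency": list(range(100)),
    "lst2Tite": list(range(100)),
    "lst3Tite": list(range(100)),
})

# The callable gets the current DataFrame and returns a boolean mask.
# The comparison is vectorized, so no per-row lambda is involved.
small = df.loc[lambda d: d["Frenquency"] < 6]

# The same selection placed in the middle of a longer chain.
chained = (
    df.rename(columns={"Frenquency": "F"})
      .loc[lambda d: d["F"] < 6]
      .reset_index(drop=True)
)
```

Because the callable is evaluated against the intermediate result, it still works after a rename or any other step that changes the frame.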

152

answered Oct 22 '22 09:10

dougMcarthy

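Or maybe this, which passes drop the index labels of the rows to remove:

df.drop(df.index[df.Frenquency >= 6])

Or drop in place (this returns None, so nothing can be chained after it):

df.drop(df.index[df.Frenquency >= 6], inplace=True)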
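drop expects index labels rather than column values, so the labels to drop have to be derived from the condition. The following is a sketch of doing that inside a chain. It assumes a hypothetical non-default index (1000 to 1099) to make the difference between labels and values visible, and it uses pipe because drop itself does not accept a callable.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Frenquency": list(range(100)),
        "lst2Tite": list(range(100)),
        "lst3Tite": list(range(100)),
    },
    index=list(range(1000, 1100)),  # hypothetical non-default index labels
)

# Look up the labels of the rows that fail the condition, then drop them.
# The rename step is only there to show that this sits inside a chain.
kept = (
    df.rename(columns={"lst2Tite": "second"})
      .pipe(lambda d: d.drop(index=d.index[d["Frenquency"] >= 6]))
)

# inplace=True mutates the frame and returns None, which ends any chain.
scratch = df.copy()
returned = scratch.drop(
    index=scratch.index[scratch["Frenquency"] >= 6], inplace=True
)
```

Here kept holds the six rows labelled 1000 to 1005, while returned is None even though scratch itself was modified.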

answered Oct 22 '22 10:10

Anton vBR

For this sort of selection, you can maintain a fluent interface and use method-chaining by using the query method:

>>> df.query('Frenquency < 6')
   Frenquency  lst2Tite  lst3Tite
0           0         0         0
1           1         1         1
2           2         2         2
3           3         3         3
4           4         4         4
5           5         5         5
>>>

So something like:

df.rename(<something>).query('Frenquency <6').assign(<something>)

Or more concretely:

>>> (df.rename(columns={'Frenquency':'F'})
...    .query('F < 6')
...    .assign(FF=lambda x: x.F**2))
   F  lst2Tite  lst3Tite  FF
0  0         0         0   0
1  1         1         1   1
2  2         2         2   4
3  3         3         3   9
4  4         4         4  16
5  5         5         5  25
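If the cutoff is not a constant, query can read it from a Python variable by prefixing the name with @. A small sketch, assuming the question's df; the helper rows_below and its threshold parameter are hypothetical names used only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "Frenquency": list(range(100)),
    "lst2Tite": list(range(100)),
    "lst3Tite": list(range(100)),
})


def rows_below(frame, threshold):
    # "@threshold" refers to the local Python variable of that name,
    # so the cutoff does not have to be formatted into the query string.
    return frame.query("Frenquency < @threshold")


# The result of query is a DataFrame, so the chain can continue.
result = rows_below(df, 6).assign(FF=lambda d: d["Frenquency"] ** 2)
```

This avoids building the expression with string formatting and leaves the original frame untouched.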

answered Oct 22 '22 09:10

juanpa.arrivillaga
