
Drop a few rows of a pandas DataFrame using a lambda

I'm having trouble with method chaining when manipulating DataFrames in pandas. Here is the structure of my data:

import pandas as pd

lst1 = range(100)
lst2 = range(100)
lst3 = range(100)
df = pd.DataFrame(
    {'Frenquency': lst1,
     'lst2Tite': lst2,
     'lst3Tite': lst3
    })

The task is to get the entries (rows) where the frequency is less than 6, but it needs to be done with method chaining.

I know this is easy the traditional way; I could just do

df[df["Frenquency"]<6]

to get the answer.

However, the question is how to do it with method chaining. I tried something like

df.drop(lambda x:x.index if x["Frequency"] <6 else null)

but it raised an error "[<function <lambda> at 0x7faf529d3510>] not contained in axis"

Could anyone shed some light on this issue?

jinge zhang asked Nov 23 '17 16:11


3 Answers

This is an old question, but since there is no accepted answer, I will answer it for future reference.

df[df.apply(lambda x: True if x.Frenquency < 6 else False, axis=1)]

Explanation: the lambda checks each row's frequency and returns True if it is less than 6, otherwise False. The resulting Series of True and False values is then used to index df, which keeps only the rows marked True. Note that the column name Frenquency is a typo, but I kept it as is to match the question.
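
If the goal is to keep this inside a method chain, one option is to pass the lambda to .loc, which accepts a callable and calls it with the DataFrame as it exists at that point in the chain. This is only a minimal sketch, assuming the same Frenquency column as in the question; the name result is just for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    'Frenquency': range(100),
    'lst2Tite': range(100),
    'lst3Tite': range(100),
})

# .loc accepts a callable: it receives the DataFrame and must return
# something valid for indexing, here a boolean Series.
result = (
    df
    .loc[lambda d: d['Frenquency'] < 6]
    .reset_index(drop=True)
)
print(result)
```

Because the mask is computed on the whole column at once, this also avoids the row-by-row apply.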

dougMcarthy answered Oct 22 '22 09:10


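Or maybe this (drop expects index labels, so pass the labels of the rows you want to remove; the column is spelled Frenquency in the question):

df.drop(df.index[df.Frenquency >= 6])

Or use inplace, keeping in mind that it returns None and so cannot be chained any further:

df.drop(df.index[df.Frenquency >= 6], inplace=True)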
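
drop treats whatever it is given as index labels, which is why the lambda in the question was reported as "not contained in axis". If drop has to stay inside a chain, one possibility is to wrap it with pipe so the labels are computed from the DataFrame at that step. The sketch below assumes the Frenquency spelling from the question; drop_where is a hypothetical helper name, not a pandas function.

```python
import pandas as pd

df = pd.DataFrame({
    'Frenquency': range(100),
    'lst2Tite': range(100),
    'lst3Tite': range(100),
})


def drop_where(frame, mask_func):
    """Return a copy of frame without the rows where mask_func(frame) is True.

    Hypothetical helper for illustration only.
    """
    mask = mask_func(frame)
    # Translate the boolean mask into index labels, which is what drop expects.
    return frame.drop(frame.index[mask])


kept = (
    df
    .pipe(drop_where, lambda d: d['Frenquency'] >= 6)
    .assign(double=lambda d: d['Frenquency'] * 2)
)
print(kept)
```

The original df is left untouched, since drop returns a new DataFrame unless inplace=True is passed.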

Anton vBR answered Oct 22 '22 10:10


For this sort of selection, you can maintain a fluent interface and use method-chaining by using the query method:

>>> df.query('Frenquency < 6')
   Frenquency  lst2Tite  lst3Tite
0           0         0         0
1           1         1         1
2           2         2         2
3           3         3         3
4           4         4         4
5           5         5         5
>>>

So something like:

df.rename(<something>).query('Frenquency <6').assign(<something>)

Or more concretely:

>>> (df.rename(columns={'Frenquency':'F'})
...    .query('F < 6')
...    .assign(FF=lambda x: x.F**2))
   F  lst2Tite  lst3Tite  FF
0  0         0         0   0
1  1         1         1   1
2  2         2         2   4
3  3         3         3   9
4  4         4         4  16
5  5         5         5  25
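
If the cutoff lives in a Python variable rather than in the string itself, query can refer to it with the @ prefix. A small sketch of the same chain under that assumption; threshold and small are illustrative names that do not appear in the question.

```python
import pandas as pd

df = pd.DataFrame({
    'Frenquency': range(100),
    'lst2Tite': range(100),
    'lst3Tite': range(100),
})

threshold = 6  # illustrative cutoff, referenced inside query as @threshold

small = (
    df
    .rename(columns={'Frenquency': 'F'})
    .query('F < @threshold')
    .assign(FF=lambda x: x.F ** 2)
)
print(small)
```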

juanpa.arrivillaga answered Oct 22 '22 09:10