
How to remove a row from pandas dataframe based on the length of the column values?

In the following pandas.DataFrame:

df = 
    alfa    beta   ceta
    a,b,c   c,d,e  g,e,h
    a,b     d,e,f  g,h,k
    j,k     c,k,l  f,k,n

How can I drop the rows in which the value in column alfa has more than 2 elements? I know this can be done using the length function, but I cannot find a specific answer. My attempt so far:

df = df[['alfa'].str.split(',').map(len) < 3]
everestial007 asked Mar 20 '17 02:03




1 Answer

You can apply that test to each value of the column in turn using pandas.Series.apply():

print(df[df['alfa'].apply(lambda x: len(x.split(',')) < 3)])

Gives:

  alfa   beta   ceta
1  a,b  d,e,f  g,h,k
2  j,k  c,k,l  f,k,n
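
As a possible alternative, the same filter can be written with the vectorized string accessor instead of a lambda. The sketch below is self-contained: it rebuilds the example frame from the question (the dictionary construction and the names n_elements and filtered are assumptions made for illustration), counts the comma-separated elements in alfa, and keeps the rows with fewer than 3.

```python
import pandas as pd

# Rebuild the example frame from the question (construction is illustrative).
df = pd.DataFrame({
    "alfa": ["a,b,c", "a,b", "j,k"],
    "beta": ["c,d,e", "d,e,f", "c,k,l"],
    "ceta": ["g,e,h", "g,h,k", "f,k,n"],
})

# Split each 'alfa' value on commas and count the resulting elements.
n_elements = df["alfa"].str.split(",").str.len()

# Boolean-mask the frame: keep only rows with fewer than 3 elements.
filtered = df[n_elements < 3]
print(filtered)
```

This should select the same two rows as the apply() version. Note that the original index labels are kept; chain .reset_index(drop=True) if a fresh 0-based index is wanted.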
Stephen Rauch answered Sep 28 '22 04:09