
How to remove rows that have an empty value in a column of a pandas DataFrame

Tags: python, pandas

I need to remove every row in which the Name column has no value. My dataframe looks like this:

Name   place    phonenum

mike   china     12344
       ireland    897654
suzzi  japan      09876
chang  china      897654
       Australia  897654
       india      876543

The required output is:

Name   place    phonenum

mike   china     12344
suzzi  japan      09876
chang  china      897654

I tried df1 = df[df.Name == ''], but the output contains only the header:

  Name   place    phonenum

Please help me

tiru asked Jul 17 '18 05:07



2 Answers

If Name is a column:

print (df.columns)
Index(['Name', 'place', 'phonenum'], dtype='object')

If the missing values are empty strings, change == to != (not equal) so that the rows with a non-empty Name are kept:

print (df)
    Name      place  phonenum
0   mike      china     12344
1           ireland    897654
2  suzzi      japan      9876
3  chang      china    897654
4         Australia    897654
5             india    876543

df1 = df[df.Name != '']
print (df1)
    Name  place  phonenum
0   mike  china     12344
2  suzzi  japan      9876
3  chang  china    897654
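
For reference, a minimal self-contained sketch of the filter above. The DataFrame is rebuilt by hand from the question's sample, assuming the missing names really are empty strings; the variable df1_reset is only illustrative and not part of the original answer.

```python
import pandas as pd

# Sample data modeled on the question; missing names are assumed to be
# empty strings here (not NaN).
df = pd.DataFrame({
    "Name": ["mike", "", "suzzi", "chang", "", ""],
    "place": ["china", "ireland", "japan", "china", "Australia", "india"],
    "phonenum": [12344, 897654, 9876, 897654, 897654, 876543],
})

# Keep only the rows whose Name is not an empty string.
df1 = df[df["Name"] != ""]

# Optional: renumber the index 0..n-1 after filtering.
df1_reset = df1.reset_index(drop=True)

print(df1)
```

The filtered frame keeps the original index labels (0, 2, 3); reset_index(drop=True) is only needed if a contiguous index matters.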

If the missing values in the first column are NaN, use dropna and specify the column to check with subset:

print (df)
    Name      place  phonenum
0   mike      china     12344
1    NaN    ireland    897654
2  suzzi      japan      9876
3  chang      china    897654
4    NaN  Australia    897654
5    NaN      india    876543

df1 = df.dropna(subset=['Name'])
print (df1)
    Name  place  phonenum
0   mike  china     12344
2  suzzi  japan      9876
3  chang  china    897654
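
If it is not clear whether the blanks are NaN, empty strings, or strings containing only spaces (one possible reason a comparison with '' matches nothing), a single boolean mask can cover all three. This is a sketch under that assumption, with hand-made sample data; the name blank is hypothetical.

```python
import numpy as np
import pandas as pd

# Sample data with three kinds of "missing" names (an assumption for
# illustration): an empty string, a whitespace-only string, and NaN.
df = pd.DataFrame({
    "Name": ["mike", "", "suzzi", "chang", "   ", np.nan],
    "place": ["china", "ireland", "japan", "china", "Australia", "india"],
    "phonenum": [12344, 897654, 9876, 897654, 897654, 876543],
})

# True where Name is NaN, or is empty after stripping surrounding whitespace.
blank = df["Name"].isna() | df["Name"].str.strip().eq("")

# Keep the rows that are not blank.
df1 = df[~blank]

print(df1)
```

Checking df["Name"].isna().sum() and (df["Name"] == "").sum() first shows which kind of missing value the column actually contains.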
jezrael answered Oct 04 '22 12:10


In my case, I had a bunch of fields with dates, strings, and one column for values (also called "Value"). I tried all suggestions above, but what actually worked was to drop NA records for the "Value" field.

df = df.dropna(subset=['Value'])
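
A small sketch of why subset matters in a frame like this one. Only the "Value" column name comes from this answer; the date and label columns, the data, and the names raw and kept are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical frame: dates, strings, and a "Value" column, with gaps
# scattered across all three columns.
raw = pd.DataFrame({
    "date": pd.to_datetime(["2022-01-01", None, "2022-01-03", "2022-01-04"]),
    "label": ["a", "b", None, "d"],
    "Value": [1.5, np.nan, 3.0, np.nan],
})

# Only "Value" is checked, so a row with a missing label but a valid
# Value is kept.
kept = raw.dropna(subset=["Value"])

# Without subset, any missing field in any column drops the row.
strict = raw.dropna()

print(kept)
print(strict)
```

Here dropna(subset=['Value']) keeps two rows, while a bare dropna() keeps only the one row that is complete in every column.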

StudentAtLU answered Oct 04 '22 13:10