Drop rows if value in a specific column is not an integer in pandas dataframe

Tags: python, pandas

If I have a dataframe and want to drop any rows where the value in one column is not an integer, how would I do this?

The alternative is to drop rows if the value is not within the range 0-2, but since I am not sure how to do either of them I was hoping someone else might.

Here is what I tried, but it didn't work and I'm not sure why:

df = df[(df['entrytype'] != 0) | (df['entrytype'] !=1) | (df['entrytype'] != 2)].all(1)
azuric asked Feb 13 '15 12:02


3 Answers

There are 2 approaches I propose:

In [212]:

df = pd.DataFrame({'entrytype':[0,1,np.NaN, 'asdas',2]})
df
Out[212]:
  entrytype
0         0
1         1
2       NaN
3     asdas
4         2

If the range of values is as restricted as you say then using isin will be the fastest method:

In [216]:

df[df['entrytype'].isin([0,1,2])]
Out[216]:
  entrytype
0         0
1         1
4         2
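
As for why the filter in the question did not work: every value differs from at least one of 0, 1 and 2, so chaining `!=` comparisons with `|` is true for every row, and the trailing `.all(1)` then collapses the result instead of filtering it. A minimal sketch of the comparison written the other way round, reusing the sample frame above (the variable names are only illustrative):

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'entrytype': [0, 1, np.nan, 'asdas', 2]})
col = df['entrytype']

# The question's condition: true for every row, so nothing is filtered out.
always_true = (col != 0) | (col != 1) | (col != 2)

# Keep a row when the value equals any of the allowed values.
keep = (col == 0) | (col == 1) | (col == 2)
filtered = df[keep]

# isin expresses the same test more compactly.
same_as_isin = keep.tolist() == col.isin([0, 1, 2]).tolist()
```

Here `filtered` holds the rows with index 0, 1 and 4, matching the isin result.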

Otherwise we could cast each value to str and then call .isdigit():

In [215]:

df[df['entrytype'].apply(lambda x: str(x).isdigit())]
Out[215]:
  entrytype
0         0
1         1
4         2
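
Note that the isdigit() check only accepts non-negative integers written as plain digits. If negative values or floats can occur, one possible alternative (not part of the approaches above) is to coerce the column with pd.to_numeric and test for whole numbers. This is a sketch under the assumption that numeric-looking strings such as '3' should count as integers too; the extra sample values -1 and 1.5 are made up for illustration:

```python
import numpy as np
import pandas as pd

# Sample frame from above, extended with a negative value and a float (illustrative).
df = pd.DataFrame({'entrytype': [0, 1, np.nan, 'asdas', 2, -1, 1.5]})

# Anything that cannot be parsed as a number becomes NaN.
numeric = pd.to_numeric(df['entrytype'], errors='coerce')

# A value is a whole number when it is not NaN and has no fractional part.
is_int = numeric.notna() & (numeric % 1 == 0)

ints_only = df[is_int]                          # rows 0, 1, 4, 5
in_range = df[is_int & numeric.between(0, 2)]   # rows 0, 1, 4
```

The `between(0, 2)` call is inclusive on both ends, which covers the 0-2 range mentioned in the question.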
EdChum answered Sep 28 '22 09:09


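"-1".isdigit() is False, so the isdigit() approach rejects negative integers.

"-1".lstrip("-").isdigit() works but is not nice.

A regular expression that allows an optional sign handles this (note the raw string, so that \d is not treated as a string escape):

df.loc[df['Feature'].str.match(r'^[+-]?\d+$')]

The reverse set (the rows whose value is not an integer) is:

df.loc[~(df['Feature'].str.match(r'^[+-]?\d+$'))]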
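
One caveat: the .str accessor returns NaN for values that are not strings, and a mask containing NaN cannot be used for boolean indexing. A hedged sketch that sidesteps this by casting to str first and passing na=False is shown below. The sample data is hypothetical, and str.fullmatch (available in pandas 1.1 and later) is used here in place of match with ^ and $ anchors:

```python
import numpy as np
import pandas as pd

# Hypothetical mixed-type column: strings, a missing value and a real int.
df = pd.DataFrame({'Feature': ['12', '-7', '+3', 'abc', '4.5', np.nan, 8]})

# Cast everything to text so the regex sees every value.
as_text = df['Feature'].astype(str)

# Optional sign followed by digits only; missing values count as no match.
is_int = as_text.str.fullmatch(r'[+-]?\d+', na=False)

kept = df[is_int]       # rows whose value looks like an integer
dropped = df[~is_int]   # the reverse set
```

With this data, `kept` contains the rows at index 0, 1, 2 and 6.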

InLaw answered Sep 28 '22 10:09


There are several ways to do this, but I found the following easy and efficient.

Quick Examples

# Using drop() to delete rows where Fee >= 24000 (modifies df in place)
df.drop(df[df['Fee'] >= 24000].index, inplace=True)

# Boolean indexing: keep only the rows where Fee >= 24000
df2 = df[df.Fee >= 24000]

# If the column name contains a space, attribute access does not work;
# use bracket notation with the quoted name
df2 = df[df['column name'] >= 24000]

# The same filter using loc
df2 = df.loc[df["Fee"] >= 24000]

# Keep rows that satisfy conditions on multiple columns
df2 = df[(df['Fee'] >= 22000) & (df['Discount'] == 2300)]

# Drop rows where Discount is None/NaN
df2 = df[df.Discount.notnull()]
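
Note that drop() removes the rows that match the condition, while boolean indexing keeps them, so the two only agree when the mask is negated. A small sketch with made-up Fee and Discount values illustrates this:

```python
import numpy as np
import pandas as pd

# Hypothetical data; the column names follow the examples above.
df = pd.DataFrame({
    'Fee': [20000, 25000, 22000, 30000],
    'Discount': [1000, 2300, 2300, np.nan],
})

cond = df['Fee'] >= 24000

# Remove the matching rows by index label (returns a new frame).
via_drop = df.drop(df[cond].index)

# Same result with a negated boolean mask.
via_mask = df[~cond]

# Rows that have a Discount value.
with_discount = df[df['Discount'].notna()]
```

Both `via_drop` and `via_mask` contain the rows at index 0 and 2.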
Sachin answered Sep 28 '22 10:09