More efficient pandas Python command to drop NaN rows?

Question

I have a DataFrame called TI. I want to drop the rows where the #Book_Date column is NaN, so I run:

TI = TI.dropna(subset=['#Book_Date'])

When I run this, memory gets eaten up for some reason. I'm on a machine with 100 GB of RAM, and about 50% of it is used to hold TI. When I run the dropna line, usage goes to 100% and the command never finishes executing. Is it making a whole new copy? TI is a 64-million-row DataFrame, so this needs to be more efficient.

PhysicalChemist · Accepted Answer

By far the best way to do this is to require that the column be finite. You'll need numpy for this.

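import numpy as np
import pandas as pd

# Keep only the rows whose '#Book_Date' value is finite.
# Note: this drops NaN and also +/-inf, and it requires a numeric column.
TI = TI[np.isfinite(TI['#Book_Date'])]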
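
To make the masking approach above concrete, here is a minimal sketch on a tiny, made-up stand-in for TI (the sample values and the extra "value" column are assumptions for illustration, not the asker's data). It compares the np.isfinite mask with a Series.notna() mask, which is an alternative not mentioned in the answer, and with the original dropna call.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature stand-in for TI; the real frame has about 64 million rows.
TI = pd.DataFrame({
    "#Book_Date": [20140101.0, np.nan, 20140103.0, np.inf, np.nan],
    "value": [1, 2, 3, 4, 5],
})

# Mask from the answer: True where the value is finite (drops NaN and +/-inf).
finite_mask = np.isfinite(TI["#Book_Date"])
by_isfinite = TI[finite_mask]

# Alternative mask (an assumption, not part of the answer): True where the
# value is not missing. This drops NaN only and also works on object columns.
notna_mask = TI["#Book_Date"].notna()
by_notna = TI[notna_mask]

# Reference result from the question's original call.
by_dropna = TI.dropna(subset=["#Book_Date"])

print(by_isfinite.index.tolist())   # rows kept by the isfinite mask
print(by_notna.index.tolist())      # rows kept by the notna mask
print(by_notna.equals(by_dropna))   # whether notna masking matches dropna
```

The notna mask matches dropna exactly, while the isfinite mask also discards the infinite value. Boolean indexing still returns a new DataFrame, so either form needs room for the surviving rows in addition to the original frame until TI is reassigned.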

Tags: python, pandas, nan

wolfsatthedoor
