python - pandas - check if date exists in dataframe

I have a dataframe like this:

      category  date            number
0      Cat1     2010-03-01      1
1      Cat2     2010-09-01      1
2      Cat3     2010-10-01      1
3      Cat4     2010-12-01      1
4      Cat5     2012-04-01      1
5      Cat2     2013-02-01      1
6      Cat3     2013-07-01      1
7      Cat4     2013-11-01      2
8      Cat5     2014-11-01      5
9      Cat2     2015-01-01      1
10     Cat3     2015-03-01      1

I would like to check whether a date exists in this dataframe, but I am unable to. I tried the approaches below, without success:

if pandas.Timestamp("2010-03-01 00:00:00", tz=None) in df['date'].values:
    print('date exist')

if datetime.strptime('2010-03-01', '%Y-%m-%d') in df['date'].values:
    print('date exist')

if '2010-03-01' in df['date'].values:
    print('date exist')

'date exist' is never printed. How can I check whether the date exists? I want to insert each missing date, with number equal to 0, for all categories so that I can plot a continuous line chart (one line per category). Help is appreciated. Thanks in advance.

The last attempt gives me this warning: FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison. 'date exist' is still not printed.

Leo asked Oct 06 '16 10:10


1 Answer

I think you need to convert the column to datetime first with to_datetime. Then, if you need to select all matching rows, use boolean indexing:

df.date = pd.to_datetime(df.date)

print (df.date == pd.Timestamp("2010-03-01 00:00:00"))
0      True
1     False
2     False
3     False
4     False
5     False
6     False
7     False
8     False
9     False
10    False
Name: date, dtype: bool

print (df[df.date == pd.Timestamp("2010-03-01 00:00:00")])
  category       date  number
0     Cat1 2010-03-01       1
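
For anyone who wants to reproduce the steps above, here is a minimal self-contained sketch. It rebuilds the dataframe from the question and assumes the date column starts out as plain strings; the question does not state the original dtype, so that part is a guess. The names target, mask and matches are only illustrative.

```python
import pandas as pd

# Data copied from the question; assumption: dates start out as plain strings
df = pd.DataFrame({
    "category": ["Cat1", "Cat2", "Cat3", "Cat4", "Cat5", "Cat2",
                 "Cat3", "Cat4", "Cat5", "Cat2", "Cat3"],
    "date": ["2010-03-01", "2010-09-01", "2010-10-01", "2010-12-01",
             "2012-04-01", "2013-02-01", "2013-07-01", "2013-11-01",
             "2014-11-01", "2015-01-01", "2015-03-01"],
    "number": [1, 1, 1, 1, 1, 1, 1, 2, 5, 1, 1],
})

# Convert the column so that comparisons are made between datetimes
df["date"] = pd.to_datetime(df["date"])

target = pd.Timestamp("2010-03-01")
mask = df["date"] == target   # boolean Series, one entry per row
matches = df[mask]            # boolean indexing keeps the rows where mask is True

print(matches)
```

Using df["date"] instead of df.date for the assignment avoids surprises if a column name ever clashes with a DataFrame attribute.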

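To get a single True, test membership against the NumPy array returned by values (this relies on NumPy coercing the string to a datetime, which may differ between NumPy versions):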

if ('2010-03-01' in df['date'].values):
    print ('date exist')

Or check for at least one True with any, as EdChum commented:

if (df.date == pd.Timestamp("2010-03-01 00:00:00")).any():
    print ('date exist')  
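
Since the stated goal is to add the missing dates with number 0 for every category, the existence check can be skipped in favour of a reindex. The following is only a sketch under one assumption that the question does not confirm: the data is monthly, with every date stamped on the first day of the month. The names all_months, missing and wide are illustrative.

```python
import pandas as pd

# Data copied from the question
df = pd.DataFrame({
    "category": ["Cat1", "Cat2", "Cat3", "Cat4", "Cat5", "Cat2",
                 "Cat3", "Cat4", "Cat5", "Cat2", "Cat3"],
    "date": ["2010-03-01", "2010-09-01", "2010-10-01", "2010-12-01",
             "2012-04-01", "2013-02-01", "2013-07-01", "2013-11-01",
             "2014-11-01", "2015-01-01", "2015-03-01"],
    "number": [1, 1, 1, 1, 1, 1, 1, 2, 5, 1, 1],
})
df["date"] = pd.to_datetime(df["date"])

# Assumption: one observation per month, stamped on the first of the month
all_months = pd.date_range(df["date"].min(), df["date"].max(), freq="MS")

# Months inside the range that never appear in the data
missing = all_months[~all_months.isin(df["date"])]

# One row per date, one column per category; absent combinations become 0
wide = df.pivot_table(index="date", columns="category", values="number",
                      aggfunc="sum", fill_value=0)

# Add a zero-filled row for every month that was missing
wide = wide.reindex(all_months, fill_value=0)

print(wide.head())
```

With this layout, wide.plot() should draw one continuous line per category. If the dates are not monthly, the freq argument of date_range would need to change accordingly.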

jezrael answered Sep 29 '22 12:09