I have a dataframe like this:
   category       date  number
0      Cat1 2010-03-01       1
1      Cat2 2010-09-01       1
2      Cat3 2010-10-01       1
3      Cat4 2010-12-01       1
4      Cat5 2012-04-01       1
5      Cat2 2013-02-01       1
6      Cat3 2013-07-01       1
7      Cat4 2013-11-01       2
8      Cat5 2014-11-01       5
9      Cat2 2015-01-01       1
10     Cat3 2015-03-01       1
I would like to check whether a date exists in this dataframe, but I am unable to. I tried the approaches below, and none of them worked:
if pandas.Timestamp("2010-03-01 00:00:00", tz=None) in df['date'].values:
    print('date exist')
if datetime.strptime('2010-03-01', '%Y-%m-%d') in df['date'].values:
    print('date exist')
if '2010-03-01' in df['date'].values:
    print('date exist')
'date exist' never got printed. How can I check whether the date exists? I want to insert the missing dates, with number equal to 0, for all categories so that I can plot a continuous line chart (one line per category). Help is appreciated. Thanks in advance.
The last one gives me this:
FutureWarning: elementwise comparison failed; returning scalar instead, but in the future will perform elementwise comparison
and 'date exist' does not get printed.
You can check whether a pandas DataFrame column contains a particular value (string or int), or any of a list of values, by using pd.Series, the in operator, pandas.Series.isin(), or the str accessor.
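isinstance() serves a different purpose: it checks whether an object is of a given type (for example, whether it is an RDD or a DataFrame) and returns a boolean. It does not test whether a value is present in a column.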
Series.isin() checks whether values are contained in a Series. It returns a boolean Series showing whether each element in the Series exactly matches an element in the passed sequence of values.
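As a rough illustration of isin() applied to dates, here is a minimal sketch. The small frame and the `wanted` list are made-up sample data in the shape of the question's dataframe, and the column is assumed to already hold datetimes.

```python
import pandas as pd

# Illustrative sample data (not the asker's full frame).
df = pd.DataFrame({
    "category": ["Cat1", "Cat2", "Cat3"],
    "date": pd.to_datetime(["2010-03-01", "2010-09-01", "2010-10-01"]),
    "number": [1, 1, 1],
})

# Dates to look for; converting them to datetimes keeps the comparison exact.
wanted = pd.to_datetime(["2010-03-01", "2011-01-01"])

# Boolean Series: True where the row's date is one of the wanted dates.
mask = df["date"].isin(wanted)
print(mask.tolist())  # [True, False, False]

# Collapse to a single answer: does any wanted date appear at all?
found = bool(mask.any())
print(found)  # True
```

Passing the mask to `df[mask]` would return the matching rows rather than a yes/no answer.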
I think you need to convert the column to datetime first with to_datetime, and then, if you need to select all matching rows, use boolean indexing:
df.date = pd.to_datetime(df.date)
print (df.date == pd.Timestamp("2010-03-01 00:00:00"))
0 True
1 False
2 False
3 False
4 False
5 False
6 False
7 False
8 False
9 False
10 False
Name: date, dtype: bool
print (df[df.date == pd.Timestamp("2010-03-01 00:00:00")])
  category       date  number
0     Cat1 2010-03-01       1
To get a single True, check the value against the column converted to a NumPy array with values:
if '2010-03-01' in df['date'].values:
    print('date exist')
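The question reports a FutureWarning for the bare-string form of this check, so a variant that passes an explicit np.datetime64 may be more predictable. This is only a sketch on made-up sample dates, and it assumes the column has already been converted with to_datetime.

```python
import numpy as np
import pandas as pd

# Illustrative dates, already converted to datetime64.
dates = pd.Series(pd.to_datetime(["2010-03-01", "2010-09-01", "2010-10-01"]))

# Build the lookup values as datetime64 so NumPy compares like with like.
target = np.datetime64("2010-03-01")
missing = np.datetime64("2011-01-01")

# `in` on a NumPy array is True if any element equals the value.
print(target in dates.values)   # True
print(missing in dates.values)  # False
```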
Or check for at least one True with any, as Edchum commented:
if (df.date == pd.Timestamp("2010-03-01 00:00:00")).any():
    print('date exist')
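Putting the any() check together with the goal stated in the question (a row with number 0 for every category on every date), one possible sketch follows. The sample rows, the helper name `date_exists`, and the choice of a month-start ("MS") date range are all assumptions for illustration, not something the original post specifies.

```python
import pandas as pd

# Small sample in the shape of the question's data (values are illustrative).
df = pd.DataFrame({
    "category": ["Cat1", "Cat2", "Cat2"],
    "date": ["2010-03-01", "2010-09-01", "2013-02-01"],
    "number": [1, 1, 1],
})
df["date"] = pd.to_datetime(df["date"])


def date_exists(frame, date_like):
    """Hypothetical helper: True if any row's date equals date_like."""
    return bool((frame["date"] == pd.Timestamp(date_like)).any())


print(date_exists(df, "2010-03-01"))  # True
print(date_exists(df, "2011-01-01"))  # False

# Instead of testing and inserting dates one by one, pivot to one column
# per category; absent (date, category) pairs are filled with 0.
wide = df.pivot_table(index="date", columns="category", values="number",
                      aggfunc="sum", fill_value=0)

# Assumption: the data is monthly, so reindex onto every month start
# between the first and last date, again filling gaps with 0.
all_months = pd.date_range(wide.index.min(), wide.index.max(), freq="MS")
full = wide.reindex(all_months, fill_value=0)

print(full.shape)  # (36, 2)
```

With one column per category and no gaps in the index, `full.plot()` would then draw one continuous line per category, provided matplotlib is installed.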