
TypeError: Cannot compare type 'Timestamp' with type 'date'

The problem is in line 22 of my script:

if start_date <= data_entries.iloc[j, 1] <= end_date:

Here I want to check whether data_entries.iloc[j, 1], a value from a column of the pandas DataFrame, falls between start_date and end_date. I converted the column to datetime using:

data_entries['VOUCHER DATE'] = pd.to_datetime(data_entries['VOUCHER DATE'], format="%m/%d/%Y")

But I am unsure how to convert it to a date so that the comparison works. The full script:

import pandas as pd
import datetime

entries_csv = "C:\\Users\\Pops\\Desktop\\Entries.csv"

data_entries = pd.read_csv(entries_csv)
data_entries['VOUCHER DATE'] = pd.to_datetime(data_entries['VOUCHER DATE'], format="%m/%d/%Y")

start_date = datetime.date(2018, 4, 1)
end_date = datetime.date(2018, 10, 30)

for j in range(0, len(data_entries)):
    if start_date <= data_entries.iloc[j, 1] <= end_date:
        print('Hello')
Pherdindy asked Jul 23 '18 08:07


2 Answers

Just use pd.Timestamp objects without any conversion:

start_date = pd.Timestamp('2018-04-01')
end_date = pd.Timestamp('2018-10-30')

res = data_entries[data_entries['VOUCHER DATE'].between(start_date, end_date)]
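As a minimal, self-contained sketch of the filter above: the sample rows and the "VOUCHER NO" column are made up to stand in for Entries.csv, whose real contents are not shown in the question. Note that between includes both endpoints by default.

```python
import pandas as pd

# Hypothetical sample data standing in for Entries.csv (not from the question).
data_entries = pd.DataFrame({
    "VOUCHER NO": [1, 2, 3, 4],
    "VOUCHER DATE": ["03/15/2018", "04/01/2018", "07/23/2018", "11/05/2018"],
})

# Same conversion as in the question: the column becomes datetime64.
data_entries["VOUCHER DATE"] = pd.to_datetime(
    data_entries["VOUCHER DATE"], format="%m/%d/%Y"
)

start_date = pd.Timestamp("2018-04-01")
end_date = pd.Timestamp("2018-10-30")

# Vectorised boolean mask; both endpoints are included by default.
mask = data_entries["VOUCHER DATE"].between(start_date, end_date)
res = data_entries[mask]
print(res)
```

With this sample data, only the rows dated 04/01/2018 and 07/23/2018 remain, and no explicit loop over the rows is needed.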

Explanation

Don't store datetime.datetime or datetime.date objects in a pandas Series. This is inefficient because the Series falls back to object dtype and you lose vectorised functionality. The benefit of pd.Timestamp objects is that the column stays datetime64, so calculations and comparisons remain vectorised. As described here:

numpy.datetime64 is essentially a thin wrapper around an int64. It has almost no date/time specific functionality.

pd.Timestamp is a wrapper around a numpy.datetime64. It is backed by the same int64 value, but supports the entire datetime.datetime interface, along with useful pandas-specific functionality.
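The quoted description can be checked with a short sketch. It only illustrates the relationship between pd.Timestamp and the standard library types; the variable names are chosen for illustration.

```python
import datetime
import pandas as pd

ts = pd.Timestamp("2018-04-01")

# pd.Timestamp subclasses datetime.datetime, so the usual interface is available.
is_datetime = isinstance(ts, datetime.datetime)

# It compares equal to the matching datetime.datetime value.
same_instant = ts == datetime.datetime(2018, 4, 1)

# The underlying integer (nanoseconds since the Unix epoch) is exposed as .value.
nanos = ts.value

# .date() gives a plain datetime.date, which can be compared with other dates.
# This is the scalar conversion the original loop was missing.
in_range = datetime.date(2018, 4, 1) <= ts.date() <= datetime.date(2018, 10, 30)

print(is_datetime, same_instant, in_range)
```

Calling .date() on each element would make the original loop run, but it still processes one row at a time, which is why the vectorised between call is preferable.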

jpp answered Nov 19 '22 02:11


This converts the column to date objects:

data_entries['VOUCHER DATE'] = pd.to_datetime(data_entries['VOUCHER DATE'], format="%m/%d/%Y").dt.date

However, I would not recommend filtering row by row in a loop. A vectorised filter is much faster:

data_entries[data_entries['VOUCHER DATE'].between(start_date, end_date)]
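A hedged sketch of both routes, using made-up sample rows in place of Entries.csv: after .dt.date the column holds plain datetime.date objects (object dtype), so the datetime.date bounds from the question compare cleanly. Alternatively, the column can stay datetime64 and the bounds can be wrapped in pd.Timestamp.

```python
import datetime
import pandas as pd

# Hypothetical sample data standing in for Entries.csv (not from the question).
data_entries = pd.DataFrame(
    {"VOUCHER DATE": ["03/15/2018", "04/01/2018", "07/23/2018", "11/05/2018"]}
)
parsed = pd.to_datetime(data_entries["VOUCHER DATE"], format="%m/%d/%Y")

start_date = datetime.date(2018, 4, 1)
end_date = datetime.date(2018, 10, 30)

# Route 1: convert the column to datetime.date objects (object dtype),
# then compare against datetime.date bounds.
as_dates = parsed.dt.date
by_date = data_entries[as_dates.between(start_date, end_date)]

# Route 2: keep the datetime64 column and convert the bounds instead.
by_timestamp = data_entries[
    parsed.between(pd.Timestamp(start_date), pd.Timestamp(end_date))
]

print(by_date)
print(by_timestamp)
```

Both routes select the same rows here. The second keeps the column in its native datetime64 dtype, which is generally the better fit for further date arithmetic.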

Read this article.

moshevi answered Nov 19 '22 02:11