The problem is in line 22:
if start_date <= data_entries.iloc[j, 1] <= end_date: 
Here I want to compare start_date and end_date against data_entries.iloc[j, 1], which accesses a column of the pandas DataFrame. I converted the column to datetime using:
data_entries['VOUCHER DATE'] = pd.to_datetime(data_entries['VOUCHER DATE'], format="%m/%d/%Y") 
But I am unsure how to convert it to a date. The full code:
import pandas as pd
import datetime
entries_csv = "C:\\Users\\Pops\\Desktop\\Entries.csv"
data_entries = pd.read_csv(entries_csv)
data_entries['VOUCHER DATE'] = pd.to_datetime(data_entries['VOUCHER DATE'], format="%m/%d/%Y")
start_date = datetime.date(2018, 4, 1)
end_date = datetime.date(2018, 10, 30)
for j in range(0, len(data_entries)):
    if start_date <= data_entries.iloc[j, 1] <= end_date:
        print('Hello')
Just use pd.Timestamp objects without any conversion:
start_date = pd.Timestamp('2018-04-01')
end_date = pd.Timestamp('2018-10-30')
res = data_entries[data_entries['VOUCHER DATE'].between(start_date, end_date)]
Explanation
Don't use datetime.datetime or datetime.date objects in a pandas Series. This is inefficient because the column falls back to generic Python objects and you lose vectorised functionality. With pd.Timestamp objects, comparisons and other calculations run on the whole column at once. As described here:
numpy.datetime64 is essentially a thin wrapper around an int64. It has almost no date/time specific functionality.
pd.Timestamp is a wrapper around a numpy.datetime64. It is backed by the same int64 value, but supports the entire datetime.datetime interface, along with useful pandas-specific functionality.
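To make the vectorised filter concrete, here is a minimal sketch. The sample rows and the "VOUCHER NO" column are hypothetical stand-ins for the contents of Entries.csv, which are not shown in the question; only "VOUCHER DATE" and the date bounds come from the original code.

```python
import pandas as pd

# Hypothetical sample data standing in for Entries.csv (assumption).
data_entries = pd.DataFrame({
    "VOUCHER NO": [1, 2, 3, 4],
    "VOUCHER DATE": ["03/15/2018", "04/01/2018", "07/20/2018", "11/05/2018"],
})

# Parse the strings into a datetime64 column, as in the question.
data_entries["VOUCHER DATE"] = pd.to_datetime(
    data_entries["VOUCHER DATE"], format="%m/%d/%Y"
)

# Bounds as pd.Timestamp so they compare directly with the column.
start_date = pd.Timestamp("2018-04-01")
end_date = pd.Timestamp("2018-10-30")

# between() includes both endpoints by default and returns a boolean mask.
mask = data_entries["VOUCHER DATE"].between(start_date, end_date)
res = data_entries[mask]
print(res)
```

No explicit loop over rows is needed: the mask is computed for the whole column in one call and then used to select the matching rows.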
This converts it to date:
data_entries['VOUCHER DATE'] = pd.to_datetime(data_entries['VOUCHER DATE'], format="%m/%d/%Y").dt.date
However, I would not recommend filtering like this. Keeping the column as datetime and filtering as follows is much faster:
data_entries[data_entries['VOUCHER DATE'].between(start_date, end_date)]
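If the bounds already exist as datetime.date objects, as in the question, one option is to convert the bounds rather than the column. The sketch below uses two made-up dates purely for illustration and contrasts the .dt.date route with the datetime64 route.

```python
import datetime
import pandas as pd

# Two illustrative dates (assumption, not taken from the original CSV).
dates = pd.Series(pd.to_datetime(["04/01/2018", "12/25/2018"], format="%m/%d/%Y"))

start_date = datetime.date(2018, 4, 1)
end_date = datetime.date(2018, 10, 30)

# Route 1: .dt.date yields plain datetime.date objects in an object column,
# so the comparison has to run element by element in Python.
as_dates = dates.dt.date
in_range_slow = as_dates.apply(lambda d: start_date <= d <= end_date)

# Route 2: keep the datetime64 column and convert only the two bounds.
# pd.Timestamp accepts a datetime.date and treats it as midnight of that day.
in_range_fast = dates.between(pd.Timestamp(start_date), pd.Timestamp(end_date))

print(in_range_slow.tolist())
print(in_range_fast.tolist())
```

Both routes select the same rows here, but only the second stays vectorised. Note that the upper bound becomes midnight on end_date, so rows with a later time on that day would fall outside the range.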
read this article