
datetime difference in python adjusted for night time

I have two datetime objects in Python, d1 and d2, and I want to take the time difference between them. I want something slightly more sophisticated than (d1 - d2): time during the night should count less than time during the day by a constant fraction c, e.g. one hour at night counts as only half an hour of day time.

Is there an easy way to do this in Python (pandas and/or numpy)?

Thanks!

Edit: Night time is, say, from 9pm to 7am. But ideally I am looking for a solution where you can choose arbitrary weights for arbitrary periods of the day.

maroxe asked Apr 11 '17 05:04



5 Answers

This solution calculates the weighted duration of the full days and then adds or subtracts the residuals from the first and last dates. It does not account for daylight saving time.

import pandas as pd


def timediff(t1, t2):

    DAY_SECS = 24 * 60 * 60
    DUSK = pd.Timedelta("21h")
    # Dawn is chosen as 7 a.m.
    FRAC_NIGHT = 10 / 24
    FRAC_DAY = 14 / 24
    DAY_WEIGHT = 1
    NIGHT_WEIGHT = 0.5

    full_days = ((t2.date() - t1.date()).days * DAY_SECS *
                 (FRAC_NIGHT * NIGHT_WEIGHT + FRAC_DAY * DAY_WEIGHT))

    def time2dusk(t):
        time = (pd.Timestamp(t.date()) + DUSK) - t
        time = time.total_seconds()
        wtime = (min(time * NIGHT_WEIGHT, 0) +
                 min(max(time, 0), FRAC_DAY * DAY_SECS) * DAY_WEIGHT +
                 max(time - DAY_SECS * FRAC_DAY, 0) * NIGHT_WEIGHT)
        return wtime

    t1time2dusk = time2dusk(t1)
    t2time2dusk = time2dusk(t2)
    return full_days + t1time2dusk - t2time2dusk

This returns the result in weighted seconds, which you can convert to whatever unit is convenient afterwards:

times = [(pd.Timestamp("20170101T12:00:00"), pd.Timestamp("20170101T15:00:00")),
         (pd.Timestamp("20170101T12:00:00"), pd.Timestamp("20170101T23:00:00")),
         (pd.Timestamp("20170101T12:00:00"), pd.Timestamp("20170102T12:00:00")),
         (pd.Timestamp("20170101T22:00:00"), pd.Timestamp("20170101T23:00:00")),
         (pd.Timestamp("20170101T22:00:00"), pd.Timestamp("20170102T05:00:00")),
         (pd.Timestamp("20170101T06:00:00"), pd.Timestamp("20170101T08:00:00"))]

exp_diff_hours = [3, 9 + 2*0.5, 9 + 10*0.5 + 5, 1*0.5, 7*0.5, 1 + 1*0.5]

for i, ts in enumerate(times):
    t1, t2 = ts
    print("\n")
    print("Time1: %s" % t1)
    print("Time2: %s" % t2)
    print("Weighted Time2 - Time1: %s" % (timediff(t1, t2) / 3600))
    print("Weighted Time2 - Time1 Expected: %s" % exp_diff_hours[i])

for i, ts in enumerate(times):
    t2, t1 = ts
    print("\n")
    print("Time1: %s" % t1)
    print("Time2: %s" % t2)
    print("Weighted Time2 - Time1: %s" % (timediff(t1, t2) / 3600))
    print("Weighted Time2 - Time1 Expected: %s" % -exp_diff_hours[i])

Time1: 2017-01-01 12:00:00
Time2: 2017-01-01 15:00:00
Weighted Time2 - Time1: 3.000000000000001
Weighted Time2 - Time1 Expected: 3


Time1: 2017-01-01 12:00:00
Time2: 2017-01-01 23:00:00
Weighted Time2 - Time1: 10.0
Weighted Time2 - Time1 Expected: 10.0


Time1: 2017-01-01 12:00:00
Time2: 2017-01-02 12:00:00
Weighted Time2 - Time1: 19.0
Weighted Time2 - Time1 Expected: 19.0


Time1: 2017-01-01 22:00:00
Time2: 2017-01-01 23:00:00
Weighted Time2 - Time1: 0.5
Weighted Time2 - Time1 Expected: 0.5


Time1: 2017-01-01 22:00:00
Time2: 2017-01-02 05:00:00
Weighted Time2 - Time1: 3.5
Weighted Time2 - Time1 Expected: 3.5


Time1: 2017-01-01 06:00:00
Time2: 2017-01-01 08:00:00
Weighted Time2 - Time1: 1.5
Weighted Time2 - Time1 Expected: 1.5


Time1: 2017-01-01 15:00:00
Time2: 2017-01-01 12:00:00
Weighted Time2 - Time1: -3.000000000000001
Weighted Time2 - Time1 Expected: -3


Time1: 2017-01-01 23:00:00
Time2: 2017-01-01 12:00:00
Weighted Time2 - Time1: -10.0
Weighted Time2 - Time1 Expected: -10.0


Time1: 2017-01-02 12:00:00
Time2: 2017-01-01 12:00:00
Weighted Time2 - Time1: -19.0
Weighted Time2 - Time1 Expected: -19.0


Time1: 2017-01-01 23:00:00
Time2: 2017-01-01 22:00:00
Weighted Time2 - Time1: -0.5
Weighted Time2 - Time1 Expected: -0.5


Time1: 2017-01-02 05:00:00
Time2: 2017-01-01 22:00:00
Weighted Time2 - Time1: -3.5
Weighted Time2 - Time1 Expected: -3.5


Time1: 2017-01-01 08:00:00
Time2: 2017-01-01 06:00:00
Weighted Time2 - Time1: -1.5
Weighted Time2 - Time1 Expected: -1.5
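
As a sanity check on the full-day term in timediff, the weighted length of one whole day can be worked out by hand. This is only a minimal sketch, assuming the same 9 p.m. to 7 a.m. night and the weights 0.5 and 1 used above; the names NIGHT_HOURS, DAY_HOURS and weighted_day_secs are illustrative and not part of the function.

```python
# Assumed split of a 24-hour day: night is 9 p.m. to 7 a.m., day is the rest.
NIGHT_HOURS, DAY_HOURS = 10, 14
NIGHT_WEIGHT, DAY_WEIGHT = 0.5, 1

# Weighted seconds contributed by one full calendar day.
weighted_day_secs = (NIGHT_HOURS * NIGHT_WEIGHT + DAY_HOURS * DAY_WEIGHT) * 3600

print(weighted_day_secs)         # 68400.0
print(weighted_day_secs / 3600)  # 19.0
```

This agrees with the 12:00 to 12:00 test case, which reports 19 weighted hours for exactly one day.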

mgilbert answered Oct 21 '22 20:10


Here is a solution.

It does two things. First, it calculates the number of full days between the two dates; since we know (well, we can approximate) that each day is 24 hours, it is rather trivial to weight "day time" and "night time" (calculations are done in hours). That leaves only the remaining interval of less than 24 hours. Here the trick is to "fold" the time so that "dawn" is not in the middle of a day but at 0, which leaves "dusk" as the single delimiter. Within one folded day there are then only three cases: both times are day time, both are night time, or the earlier time is day time and the later one is night time.

Updated based on comments.

Runtime for 1 million function calls was 4.588s on my laptop.

from datetime import datetime,timedelta

def weighteddiff(d2,d1,dawn,dusk,night_weight):

    #if dusk is "before" dawn, switch roles
    day_weight = 1
    if dusk < dawn:
        day_weight = night_weight
        night_weight = 1
        placeholder = dawn
        dawn = dusk
        dusk = placeholder

    nighttime = dawn.total_seconds()/3600 + 24 - dusk.total_seconds()/3600
    daytime = 24-nighttime


    dt = d2-d1

    total_hours = 0
    total_hours += dt.days*daytime*day_weight + dt.days*nighttime*night_weight

    d1 += timedelta(days=dt.days)
    d1 -= dawn
    d2 -= dawn

    # Weighted hours from the start of d1's folded day (i.e. dawn) up to x.
    # Each folded day is `daytime` hours of day followed by `nighttime` hours
    # of night, so a remainder that crosses dawn is split correctly.
    base = datetime(d1.year,d1.month,d1.day,0)

    def weighted_since_base(x):
        days, rem = divmod((x-base).total_seconds()/3600, 24)
        return (days*(daytime*day_weight + nighttime*night_weight)
                + min(rem, daytime)*day_weight
                + max(rem-daytime, 0)*night_weight)

    total_hours += weighted_since_base(d2) - weighted_since_base(d1)

    return total_hours


weight = 0.5 #weight of nightime hours

#dawn and dusk supplied as timedelta from midnight
dawn = timedelta(hours=5,minutes=0,seconds=0)
dusk = timedelta(hours=19,minutes=4,seconds=0)

d1 = datetime(2017,10,23, 14)
d2 = datetime(2017,10,23, 22)
print("test1",weighteddiff(d2,d1,dawn,dusk,weight))

d1 = datetime(2016,10,22, 20)
d2 = datetime(2016,10,23, 20) 
print("test2",weighteddiff(d2,d1,dawn,dusk,weight))

dawn = timedelta(hours=6,minutes=0,seconds=0)
dusk = timedelta(hours=1,minutes=4,seconds=0)

d1 = datetime(2017,10,22, 2)
d2 = datetime(2017,10,23, 19)
print("test3",weighteddiff(d2,d1,dawn,dusk,weight))

d1 = datetime(2016,10,22, 20)
d2 = datetime(2016,10,23, 20) 
print("test4",weighteddiff(d2,d1,dawn,dusk,weight))
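
The folding step can also be looked at on its own. The following is only a sketch, assuming dawn and dusk are given as timedeltas from midnight with dawn < dusk; the helpers fold and is_daytime are hypothetical names used for illustration and are not part of weighteddiff.

```python
from datetime import datetime, timedelta

def fold(t, dawn):
    """Shift t back by dawn, so that the folded day starts at dawn."""
    return t - dawn

def is_daytime(t, dawn, dusk):
    """True if t lies in [dawn, dusk); assumes dawn < dusk."""
    folded = fold(t, dawn)
    # Time elapsed since the start of the folded day, i.e. since dawn.
    since_dawn = folded - datetime(folded.year, folded.month, folded.day)
    # After folding, dusk is the only boundary left inside the day.
    return since_dawn < dusk - dawn

dawn = timedelta(hours=5)
dusk = timedelta(hours=19, minutes=4)

print(is_daytime(datetime(2017, 10, 23, 14), dawn, dusk))  # True
print(is_daytime(datetime(2017, 10, 23, 22), dawn, dusk))  # False
print(is_daytime(datetime(2017, 10, 23, 2), dawn, dusk))   # False
```

Note that 02:00 folds onto the previous day (21:00 after subtracting dawn), which is why a single comparison against dusk - dawn is enough.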

fbence answered Oct 21 '22 20:10


Below are two approaches. I assumed the second would be faster on large date ranges (e.g. 5 years apart), but it turns out the first one is:

  1. Loop through all the minutes between your datetimes.
  2. Create a date-range series, then a series of weights (using np.where() conditional logic), and sum them.

Approach 1: Loop through minutes and update weighted-timedelta.
4.2 seconds (Laptop runtime on 5-year dt range)

import datetime    
def weighted_timedelta(start_dt, end_dt,
                       nights_start = datetime.time(21,0),
                       nights_end   = datetime.time(7,0),
                       night_weight = 0.5):

    # initialize counters
    weighted_timedelta = 0
    i = start_dt

    # loop through minutes in datetime-range, updating weighted_timedelta
    while i < end_dt:
        # the minute starting at i is night time if it starts at or after
        # nights_start, or before nights_end
        if i.time() >= nights_start or i.time() < nights_end:
            weighted_timedelta += night_weight
        else:
            weighted_timedelta += 1

        i += datetime.timedelta(minutes=1)

    return weighted_timedelta

Approach 2: Create a pandas series of weights using date_range & np.where().
15 seconds (Laptop runtime on 5-year dt range)

import datetime
import numpy as np
import pandas as pd

def weighted_timedelta(start_dt, end_dt,
                       nights_start = datetime.time(21,0),
                       nights_end   = datetime.time(7,0),
                       night_weight = 0.5):

    # convert dts to pandas date-range series, minute-resolution;
    # keep only minutes that start before end_dt, so the end point is not counted
    dt_range = pd.date_range(start=start_dt, end=end_dt, freq='min')
    dt_range = dt_range[dt_range < end_dt]

    # Assign 'weight' as -night_weight- or 1, for each minute, depending on day/night
    dt_weights = np.where((dt_range.time >= nights_start) |  # | is element-wise 'or' for arrays of booleans
                          (dt_range.time < nights_end),
                          night_weight, 1)

    # return value as weighted minutes
    return dt_weights.sum()

Both were also tested for accuracy with:

d1 = datetime.datetime(2016,1,22,20,30)
d2 = datetime.datetime(2016,1,22,21,30)

weighted_timedelta(d1, d2)
45.0

Max Power answered Oct 21 '22 20:10


A solution letting you define as many periods as you want, with their respective weights.

First, a helper function slicing the interval between our datetimes:

from datetime import date, time, datetime, timedelta

def slice_datetimes_interval(start, end):
    """
    Slices the interval between the datetimes start and end.

    If start and end are on different days:
    start time -> midnight | number of full days | midnight -> end time
    ----------------------   -------------------   --------------------
               ^                     ^                      ^
          day_part_1             full_days              day_part_2

    If start and end are on the same day:
    start time -> end time
    ----------------------
              ^
         day_part_1              full_days = 0

    Returns full_days and the list of day_parts (as tuples of time objects).
    """

    if start > end:
        raise ValueError("Start time must be before end time")

    # Number of full days between the end of start day and the beginning of end day
    # If start and end are on the same day, it will be -1
    full_days = (datetime.combine(end, time.min) - 
                 datetime.combine(start, time.max)).days
    if full_days >= 0:
        day_parts = [(start.time(), time.max),
                     (time.min, end.time())]
    else:
        full_days = 0
        day_parts = [(start.time(), end.time())]

    return full_days, day_parts

The class calculating weighted durations for a given list of periods and weights:

class WeightedDuration:
    def __init__(self, periods):
        """
        periods is a list of tuples (start_time, end_time, weight)
        where start_time and end_time are datetime.time objects.

        For a period including midnight, like 22:00 -> 6:30,
        we create two periods:
          - midnight (start of day) -> 6:30,
          - 22:00 -> midnight(end of day)

        so periods will be:
          [(time.min, time(6, 30), 0.5),
           (time(22, 0), time.max, 0.5)]

        """
        self.periods = periods
        # We store the weighted duration of a whole day for later reuse
        self.day_duration = self.time_interval_duration(time.min, time.max)

    def time_interval_duration(self, start_time, end_time):
        """ 
        Returns the weighted duration, in seconds, between the datetime.time objects
        start_time and end_time - so, two times on the *same* day.
        """
        dummy_date = date(2000, 1, 1)

        # First, we calculate the total duration, *without weight*.
        # time objects can't be subtracted, so
        # we turn them into datetimes on dummy_date
        duration = (datetime.combine(dummy_date, end_time) -
                    datetime.combine(dummy_date, start_time)).total_seconds()

        # Then, we calculate the reductions during all periods
        # intersecting our interval
        reductions = 0
        for period in self.periods:
            period_start, period_end, weight = period
            if period_end < start_time or period_start > end_time:
                # the period and our interval don't intersect
                continue

            # Intersection of the period and our interval
            start = max(start_time, period_start)
            end = min(end_time, period_end)

            reductions += ((datetime.combine(dummy_date, end) -
                           datetime.combine(dummy_date, start)).total_seconds()
                           * (1 - weight))
        # as time.max is midnight minus a µs, we round the result
        return round(duration - reductions)

    def duration(self, start, end):
        """
        Returns the weighted duration, in seconds, between the datetime.datetime
        objects start and end.
        """
        full_days, day_parts = slice_datetimes_interval(start, end)
        dur = full_days * self.day_duration
        for day_part in day_parts:
            dur += self.time_interval_duration(*day_part)
        return dur

We create a WeightedDuration instance, defining our periods and their weights. We can have as many periods as we want, with weights smaller or greater than 1.

wd = WeightedDuration([(time.min, time(7, 0), 0.5),      # from midnight to 7, 50%
                       (time(12, 0), time(13, 0), 0.75), # from 12 to 13, 75%
                       (time(21, 0), time.max, 0.5)])    # from 21 to midnight, 50%

Let's calculate the weighted duration between datetimes:

# 1 hour at 50%, 1 at 100%: that should be 3600 + 1800 = 5400 s
print(wd.duration(datetime(2017, 1, 3, 6, 0), datetime(2017, 1, 3, 8)))
# 5400

# a few tests
intervals = [
    (datetime(2017, 1, 3, 9, 0), datetime(2017, 1, 3, 10)),  # 1 hour with weight 1
    (datetime(2017, 1, 3, 23, 0), datetime(2017, 1, 4, 1)),  # 2 hours, weight 0.5
    (datetime(2017, 1, 3, 5, 0), datetime(2017, 1, 4, 5)),   # 1 full day
    (datetime(2017, 1, 3, 5, 0), datetime(2017, 1, 3, 23)),  # same day
    (datetime(2017, 1, 3, 5, 0), datetime(2017, 1, 4, 23)),  # next day
    (datetime(2017, 1, 3, 5, 0), datetime(2017, 1, 5, 23)),  # 1 full day in between
            ]
for interval in intervals:
    print(interval)
    print(wd.duration(*interval))  

# (datetime.datetime(2017, 1, 3, 9, 0), datetime.datetime(2017, 1, 3, 10, 0))
# 3600
# (datetime.datetime(2017, 1, 3, 23, 0), datetime.datetime(2017, 1, 4, 1, 0))
# 3600
# (datetime.datetime(2017, 1, 3, 5, 0), datetime.datetime(2017, 1, 4, 5, 0))
# 67500
# (datetime.datetime(2017, 1, 3, 5, 0), datetime.datetime(2017, 1, 3, 23, 0))
# 56700
# (datetime.datetime(2017, 1, 3, 5, 0), datetime.datetime(2017, 1, 4, 23, 0))
# 124200
# (datetime.datetime(2017, 1, 3, 5, 0), datetime.datetime(2017, 1, 5, 23, 0))
# 191700
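
Since a period that includes midnight has to be entered as two periods, a small helper can do the split. This is only a sketch: split_period is a hypothetical name, not part of the class above, and it assumes a period whose start is later than its end is meant to wrap past midnight.

```python
from datetime import time

def split_period(start_time, end_time, weight):
    """Return a list of (start, end, weight) tuples that never wrap past midnight."""
    if start_time <= end_time:
        # The period lies within a single day: nothing to split.
        return [(start_time, end_time, weight)]
    # The period wraps: midnight -> end_time, then start_time -> end of day.
    return [(time.min, end_time, weight),
            (start_time, time.max, weight)]

night = split_period(time(22, 0), time(6, 30), 0.5)
lunch = split_period(time(12, 0), time(13, 0), 0.75)
print(night + lunch)
```

The combined list has the shape expected by WeightedDuration, so it could be passed as WeightedDuration(night + lunch).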

Thierry Lathuille answered Oct 21 '22 22:10


Try this code:

from pandas import date_range
from pandas import Series
from pandas import Timedelta
from datetime import datetime
from datetime import time
from dateutil.relativedelta import relativedelta

# initial dates
d1 = datetime(2017, 1, 1, 8, 0, 0)
d2 = d1 + relativedelta(days=10)
print(d1, d2)

Method 1: slow but easy to understand.

# one entry per second, weighted 1 during the day and .5 at night
ts = Series(1.0, date_range(d1, d2, freq='s'))
c1 = ts.index.time >= time(21, 0, 0)
c2 = ts.index.time < time(7, 0, 0)
ts[c1 | c2] = .5
ts.iloc[-1] = 0    # the end point itself does not start a new second
print(ts.sum())    # result in seconds

Method 2: faster, but a bit complicated.

def get_seconds(ti):
    # one entry per step of ti, weighted 1 during the day and .5 at night
    ts = Series(1.0, ti)
    c1 = ts.index.time >= time(21, 0, 0)
    c2 = ts.index.time < time(7, 0, 0)
    ts[c1 | c2] = .5
    ts.iloc[-1] = 0
    # scale the weighted step count by the step length in seconds
    return ts.sum() * Timedelta(ti.freq).total_seconds()

# whole hours from midnight of d1's day to midnight of d2's day,
# corrected at second resolution for the partial days at both ends
ti0 = date_range(d1, d2, freq='h', normalize=True)
ti1 = date_range(ti0[0], d1, freq='s')
ti2 = date_range(ti0[-1], d2, freq='s')
print(get_seconds(ti0) - get_seconds(ti1) + get_seconds(ti2))  # result in seconds

xmduhan answered Oct 21 '22 21:10