
Computing the mean of a Python datetime column

I have a datetime attribute:

import datetime

import numpy as np
import pandas as pd

d = {
    'DOB': pd.Series([
        datetime.datetime(2014, 7, 9),
        datetime.datetime(2014, 7, 15),
        np.datetime64('NaT')
    ], index=['a', 'b', 'c'])
}
df_test = pd.DataFrame(d)

I would like to compute the mean for that attribute. Running mean() causes an error:

TypeError: reduction operation 'mean' not allowed for this dtype

I also tried the solution proposed elsewhere, but running the function proposed there raises:

OverflowError: Python int too large to convert to C long

What would you propose? The result for the above dataframe should be equivalent to

datetime.datetime(2014, 7, 12).
Nick asked May 15 '18 20:05



2 Answers

You can take the mean of a Timedelta series. Find the minimum value and subtract it from the series to get a series of Timedelta values, then take the mean and add it back to the minimum.

dob = df_test.DOB
m = dob.min()
(m + (dob - m).mean()).to_pydatetime()

datetime.datetime(2014, 7, 12, 0, 0)
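
For reference, here is the same idea as a minimal, self-contained sketch. The helper name `mean_datetime` is hypothetical (it is not a pandas function), and the sample series is rebuilt with `pd.to_datetime` so the snippet runs on its own.

```python
import pandas as pd


def mean_datetime(s: pd.Series) -> pd.Timestamp:
    """Mean of a datetime64 Series, ignoring NaT (hypothetical helper)."""
    anchor = s.min()                # min() skips NaT
    offsets = s - anchor            # timedelta Series; NaT rows stay NaT
    return anchor + offsets.mean()  # mean() skips NaT by default


# Same values as df_test.DOB in the question, rebuilt for a standalone run.
dob = pd.Series(
    pd.to_datetime(["2014-07-09", "2014-07-15", None]),
    index=["a", "b", "c"],
)

result = mean_datetime(dob).to_pydatetime()
print(result)
```

Subtracting an anchor first keeps the intermediate values small (days rather than nanoseconds since 1970), which is the reason for working with Timedelta values instead of raw timestamps.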

As a one-liner:

df_test.DOB.pipe(lambda d: (lambda m: m + (d - m).mean())(d.min())).to_pydatetime()

To address @ALollz's point, you can use the epoch pd.Timestamp(0) as the reference instead of the minimum:

df_test.DOB.pipe(lambda d: (lambda m: m + (d - m).mean())(pd.Timestamp(0))).to_pydatetime()
piRSquared answered Oct 22 '22 15:10


You can convert the datetimes to epoch nanoseconds using astype with np.int64, take the mean, and convert the result back to a datetime with pd.to_datetime:

pd.to_datetime(df_test.DOB.dropna().astype(np.int64).mean())

Output:

Timestamp('2014-07-12 00:00:00')
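
A standalone sketch of this approach, with the intermediate steps spelled out. Two assumptions are made here that are not in the answer above: the series is cast to `datetime64[ns]` explicitly so the integers are known to be nanoseconds, and `unit="ns"` is passed to `pd.to_datetime` to make that interpretation explicit.

```python
import datetime

import pandas as pd

# Same values as df_test.DOB; force nanosecond resolution (assumption, see above).
dob = pd.Series(
    [datetime.datetime(2014, 7, 9), datetime.datetime(2014, 7, 15), pd.NaT],
    index=["a", "b", "c"],
).astype("datetime64[ns]")

# Drop NaT first: a missing value has no meaningful integer representation.
ns = dob.dropna().astype("int64")  # nanoseconds since 1970-01-01

mean_ns = ns.mean()                # float64 result
result = pd.to_datetime(mean_ns, unit="ns")
print(result)
```

One caveat: the mean comes back as a float64, and at this magnitude (about 1.4e18) adjacent float64 values are 256 apart, so the result can be off by a few hundred nanoseconds for arbitrary timestamps. That is harmless for dates like these, but the Timedelta approach is probably safer if sub-microsecond precision matters.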
Scott Boston answered Oct 22 '22 16:10