How to move the timestamp bounds for datetime in pandas (working with historical data)?

I'm working with historical data and have some very old dates that fall outside the timestamp bounds for pandas. I've consulted the pandas Time series / date functionality documentation, which has some information on out-of-bounds spans, but it still wasn't clear to me what, if anything, I could do to convert my data into a datetime type.

I've also seen a few threads on Stack Overflow about this, but they either just point out the problem (i.e. nanosecond resolution, a maximum range of roughly 584 years) or suggest setting errors='coerce', which turns 80% of my data into NaTs.

Is it possible to convert dates earlier than the default pandas lower bound into a date type? Here's a sample of my data:

import pandas as pd

df = pd.DataFrame({'id': ['836', '655', '508', '793', '970', '1075', '1119', '969', '1166', '893'], 
                   'date': ['1671-11-25', '1669-11-22', '1666-05-15','1673-01-18','1675-05-07','1677-02-08','1678-02-08', '1675-02-15', '1678-11-28', '1673-12-23']})
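
A minimal sketch, assuming the default nanosecond resolution, of how the bounds mentioned above can be inspected and how many of the sample dates fall below the lower one (roughly 1677 to 2262). The names `lower_date` and `out_of_bounds` are illustrative only and do not come from pandas.

```python
import pandas as pd

dates = ['1671-11-25', '1669-11-22', '1666-05-15', '1673-01-18', '1675-05-07',
         '1677-02-08', '1678-02-08', '1675-02-15', '1678-11-28', '1673-12-23']

# Bounds of the nanosecond-resolution Timestamp.
lower, upper = pd.Timestamp.min, pd.Timestamp.max
print(lower, upper)

# ISO-8601 date strings sort chronologically, so a plain string comparison
# against the lower bound's calendar date is enough for a rough count.
# '<=' is used because midnight on the bound's own date is still too early.
lower_date = lower.date().isoformat()
out_of_bounds = [d for d in dates if d <= lower_date]

share = len(out_of_bounds) / len(dates)
print(f"{len(out_of_bounds)} of {len(dates)} dates are below the lower bound")
```

This matches the 80% of values that end up as NaT with errors='coerce' in the sample above.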
anguyen1210 asked Nov 01 '19 13:11



1 Answer

Period objects are not limited by the nanosecond Timestamp bounds, so you can convert each date to a daily period with a lambda function:

df['date'] = df['date'].apply(lambda x: pd.Period(x, freq='D'))

Or, as @Erfan mentioned in a comment (thank you):

df['date'] = df['date'].apply(pd.Period)

print(df)
     id        date
0   836  1671-11-25
1   655  1669-11-22
2   508  1666-05-15
3   793  1673-01-18
4   970  1675-05-07
5  1075  1677-02-08
6  1119  1678-02-08
7   969  1675-02-15
8  1166  1678-11-28
9   893  1673-12-23
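
As a hedged follow-up sketch (using only the first three sample rows), the converted column can then be used much like a datetime column: the `.dt` accessor, sorting, and subtraction should all work on periods. The variable names below are illustrative, and the period dtype of the result is what I would expect from `apply` here rather than something stated above.

```python
import pandas as pd

df = pd.DataFrame({'id': ['836', '655', '508'],
                   'date': ['1671-11-25', '1669-11-22', '1666-05-15']})
df['date'] = df['date'].apply(lambda x: pd.Period(x, freq='D'))

# The column is expected to have a period dtype, so .dt is available.
dtype_name = str(df['date'].dtype)
years = df['date'].dt.year.tolist()

# Periods are ordered, so min/max and sorting behave as for datetimes.
earliest = df['date'].min()
sorted_ids = df.sort_values('date')['id'].tolist()

# Subtracting two daily periods yields an offset; .n is the number of days.
span_days = (df['date'].max() - df['date'].min()).n

print(dtype_name, years, earliest, sorted_ids, span_days)
```

Note that periods represent spans rather than instants, so operations that need an exact timestamp (for example time-zone handling) are not covered by this approach.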
jezrael answered Jan 03 '23 16:01
