I'm working with historical data and have some very old dates that fall outside the timestamp bounds for pandas. I've consulted the pandas Time series / date functionality documentation, which has some information on out-of-bounds spans, but from that information it still wasn't clear to me what, if anything, I could do to convert my data into a datetime type.
I've also seen a few threads on Stack Overflow about this, but they either just point out the problem (i.e. nanosecond resolution, a maximum range of about 584 years) or suggest setting errors='coerce', which turns 80% of my data into NaTs.
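For reference, the bounds in question can be inspected directly. The following is a minimal sketch, assuming the default nanosecond-resolution Timestamp; the names `dates`, `lower_bound` and `out_of_range` are made up for illustration. It compares ISO-formatted strings rather than parsing them, so no out-of-bounds conversion is attempted.

```python
import pandas as pd

# Bounds of the default nanosecond-resolution Timestamp
print(pd.Timestamp.min)
print(pd.Timestamp.max)

# Hypothetical sample: two dates before the lower bound, one after it
dates = ['1671-11-25', '1677-02-08', '1678-02-08']

# ISO 'YYYY-MM-DD' strings sort chronologically, so a plain string
# comparison against the date part of the lower bound is enough here.
lower_bound = pd.Timestamp.min.date().isoformat()

# '<=' because midnight on the bound's own date is still before Timestamp.min
out_of_range = [d for d in dates if d <= lower_bound]
print(out_of_range)
```

Every date flagged this way is one that a nanosecond-resolution conversion cannot represent.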
Is it possible to convert dates earlier than the default pandas lower bound into a date type? Here's a sample of my data:
import pandas as pd
df = pd.DataFrame({'id': ['836', '655', '508', '793', '970', '1075', '1119', '969', '1166', '893'],
                   'date': ['1671-11-25', '1669-11-22', '1666-05-15', '1673-01-18', '1675-05-07',
                            '1677-02-08', '1678-02-08', '1675-02-15', '1678-11-28', '1673-12-23']})
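You can convert the strings to daily periods with a lambda function. Unlike a Timestamp, a Period is not restricted to the nanosecond-resolution bounds: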
df['date'] = df['date'].apply(lambda x: pd.Period(x, freq='D'))
Or, as @Erfan mentioned in a comment (thank you):
df['date'] = df['date'].apply(pd.Period)
print(df)
     id        date
0   836  1671-11-25
1   655  1669-11-22
2   508  1666-05-15
3   793  1673-01-18
4   970  1675-05-07
5  1075  1677-02-08
6  1119  1678-02-08
7   969  1675-02-15
8  1166  1678-11-28
9   893  1673-12-23
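As a possible alternative to `apply`, the column can also be built in one step with `pd.PeriodIndex`. This is a sketch rather than part of the original answer: it uses a shortened copy of the sample data, and `earliest` is just an illustrative name. It also shows that the resulting column has a period dtype and supports the usual `.dt` accessors.

```python
import pandas as pd

# Shortened copy of the sample data from the question
df = pd.DataFrame({'id': ['836', '655', '508'],
                   'date': ['1671-11-25', '1669-11-22', '1666-05-15']})

# Build daily periods for the whole column at once
df['date'] = pd.PeriodIndex(df['date'], freq='D')

print(df['date'].dtype)             # period[D]
print(df['date'].dt.year.tolist())  # [1671, 1669, 1666]

# Periods are ordered, so min/max and sorting work as expected
earliest = df['date'].min()
print(earliest)                     # 1666-05-15
```

Once the column is a period dtype, comparisons, sorting and component access behave much like they do for datetimes, without running into the Timestamp bounds.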