infer_datetime_format with parse_dates taking more time

Question

I was going through the pandas documentation for read_csv, and it says this about infer_datetime_format: [image: screenshot of the quoted documentation passage]

I have a sample CSV data file.

Next I tried:

In [174]: %timeit df = pd.read_csv("a.csv", parse_dates=["Date"])
1.5 ms ± 178 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

In [175]: %timeit df = pd.read_csv("a.csv", parse_dates=["Date"], infer_datetime_format=True)
1.73 ms ± 45 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

According to the documentation, passing infer_datetime_format=True should make parsing faster, but here it takes slightly longer. Is my understanding correct? If so, for what kind of data does the statement hold?

Update: Pandas version - '1.0.5'

Serge de Gosson de Varennes · Accepted Answer

What you actually want to do is also pass dayfirst=True:

%timeit df = pd.read_csv("C:/Users/k_sego/Dates.csv", parse_dates=["Date"], dayfirst=True, infer_datetime_format=True)
1.96 ms ± 115 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

Compared to

%timeit df = pd.read_csv("C:/Users/k_sego/Dates.csv", parse_dates=["Date"])
2.38 ms ± 182 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)

and

%timeit df = pd.read_csv("C:/Users/k_sego/Dates.csv", parse_dates=["Date"], infer_datetime_format=True)
3.02 ms ± 670 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)

The solution is to reduce the number of choices read_csv has to make: the fewer date layouts it has to consider, the less work the parser does.
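Taking the same idea one step further, one option is to skip inference altogether and tell pandas the exact format. This is only a minimal sketch: the original file isn't shown, so the in-memory sample data and the "%d/%m/%Y" format below are assumptions, and you would substitute whatever layout your Date column really uses.

```python
import io

import pandas as pd

# Hypothetical sample data: the real a.csv is not shown in the question,
# so the day-first "%d/%m/%Y" layout here is an assumption.
csv_text = "Date,Value\n13/01/2020,1\n14/01/2020,2\n01/02/2020,3\n"

# Read the column as plain strings first, with no date inference at all.
df = pd.read_csv(io.StringIO(csv_text), dtype={"Date": str})

# Parse with one explicit format, so pandas has nothing to guess.
df["Date"] = pd.to_datetime(df["Date"], format="%d/%m/%Y")

print(df.dtypes)
print(df)
```

With an explicit format there is no ambiguity between day-first and month-first dates either. Whether this is measurably faster on your file is something to check with %timeit, preferably on a file large enough that the parsing time is not lost in the fixed overhead of read_csv.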

Tags: python, python-3.x, pandas, python-datetime

bigbounty
