
Difference between data type 'datetime64[ns]' and '<M8[ns]'?

I have created a TimeSeries in pandas:

In [346]: from datetime import datetime

In [347]: dates = [datetime(2011, 1, 2), datetime(2011, 1, 5), datetime(2011, 1, 7),
   .....:          datetime(2011, 1, 8), datetime(2011, 1, 10), datetime(2011, 1, 12)]

In [348]: ts = Series(np.random.randn(6), index=dates)

In [349]: ts
Out[349]:
2011-01-02    0.690002
2011-01-05    1.001543
2011-01-07   -0.503087
2011-01-08   -0.622274
2011-01-10   -0.921169
2011-01-12   -0.726213

I'm following the example from the book 'Python for Data Analysis'.

In the following paragraph, the author checks the index type:

In [353]: ts.index.dtype
Out[353]: dtype('datetime64[ns]')

When I do exactly the same operation in the console I get:

ts.index.dtype
dtype('<M8[ns]')

What is the difference between the two types 'datetime64[ns]' and '<M8[ns]'?

And why do I get a different type?

asked Mar 23 '15 09:03 by LLaP




2 Answers

datetime64[ns] is a general dtype, while <M8[ns] is a specific dtype. A general dtype maps to a specific dtype, but which one it maps to may differ from one installation of NumPy to the next.

On a machine whose byte order is little endian, there is no difference between np.dtype('datetime64[ns]') and np.dtype('<M8[ns]'):

In [6]: np.dtype('datetime64[ns]') == np.dtype('<M8[ns]')
Out[6]: True

However, on a big endian machine, np.dtype('datetime64[ns]') would equal np.dtype('>M8[ns]').

So datetime64[ns] maps to either <M8[ns] or >M8[ns] depending on the endian-ness of the machine.
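The mapping can be checked without knowing the machine's byte order in advance. The following is a minimal sketch, assuming only NumPy and the standard library; the names `general`, `native_prefix`, `specific`, and `swapped` are illustrative and not part of any API.

```python
import sys

import numpy as np

# The general dtype, written without an explicit byte order.
general = np.dtype('datetime64[ns]')

# '<' on a little-endian machine, '>' on a big-endian one.
native_prefix = '<' if sys.byteorder == 'little' else '>'

# The specific dtype that matches this machine's native byte order.
specific = np.dtype(native_prefix + 'M8[ns]')

print(general == specific)  # the general dtype equals the native specific one
print(general.str)          # the specific spelling, including the byte-order prefix

# The same dtype with the opposite (non-native) byte order.
swapped = general.newbyteorder()
print(swapped.str)
```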

There are many other similar examples of general dtypes mapping to specific dtypes: int64 maps to <i8 or >i8, and int maps to either int32 or int64 depending on the bit architecture of the OS and how NumPy was compiled.
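The integer case can be inspected the same way. This is a small sketch under the same assumptions (NumPy plus the standard library, illustrative variable names). The output for `int` is deliberately not shown, since it depends on the platform and on how NumPy was built.

```python
import sys

import numpy as np

# '<' on a little-endian machine, '>' on a big-endian one.
native_prefix = '<' if sys.byteorder == 'little' else '>'

# int64 always has 8 bytes; only the byte order is machine dependent.
int64_general = np.dtype('int64')
int64_specific = np.dtype(native_prefix + 'i8')
print(int64_general == int64_specific)
print(int64_general.str)

# Python's int maps to a platform-dependent NumPy integer type.
default_int = np.dtype(int)
print(default_int.name, default_int.itemsize)
```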


Apparently, the repr of the datetime64 dtype has changed since the book was written, so that it now shows the endian-ness of the dtype.

answered Oct 10 '22 18:10 by unutbu


A bit of background will help in understanding the nuances of the output.

NumPy has an elaborate hierarchy of data types. The type information is stored as attributes of a data type object, which is an instance of the numpy.dtype class. It describes how the bytes in the fixed-size block of memory corresponding to an array item should be interpreted (order of bytes, number of bytes, etc.).
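To make "how the bytes should be interpreted" concrete, here is a minimal sketch, assuming only NumPy. It stores one datetime64[ns] value and then reinterprets the same 8 bytes as a 64-bit integer; the array name `arr` and the chosen timestamp are illustrative.

```python
import numpy as np

# One nanosecond after the Unix epoch, stored as datetime64[ns].
arr = np.array(['1970-01-01T00:00:00.000000001'], dtype='datetime64[ns]')

print(arr.dtype.kind)      # 'M' marks the datetime family
print(arr.dtype.itemsize)  # number of bytes per array item
print(arr.dtype.str)       # byte order, kind, size and unit in one string

# The same bytes viewed as int64: a count of nanoseconds since the epoch.
as_int = arr.view('int64')
print(as_int[0])
```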

Create a datetime64 value and look at its dtype

In [1]: import numpy as np

In [2]: dt = np.datetime64('1980', 'ns')

In [3]: dt
Out[3]: numpy.datetime64('1980-01-01T00:00:00.000000000')

In [4]: dt.dtype
Out[4]: dtype('<M8[ns]')

Examine the attributes

In [5]: dt.dtype.char
Out[5]: 'M'

In [6]: dt.dtype.name
Out[6]: 'datetime64[ns]'

In [7]: dt.dtype.str
Out[7]: '<M8[ns]'

In [8]: dt.dtype.type
Out[8]: numpy.datetime64

repr and str are string representations of an object, and each can have a different output for the same underlying data type.

In [9]: repr(dt.dtype)
Out[9]: "dtype('<M8[ns]')"

In [10]: str(dt.dtype)
Out[10]: 'datetime64[ns]'

An application (shell, console, debugger etc.) may invoke either one of them, so the output may look different for the same type.
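Because the displayed text can vary, code that needs to know whether something is a datetime dtype is probably better off not comparing against the printed form. A minimal sketch, assuming only NumPy (the variable names are illustrative):

```python
import numpy as np

dtype = np.dtype('datetime64[ns]')

# Two string forms of the same dtype object.
as_repr = repr(dtype)  # may include the byte-order prefix, e.g. on recent NumPy
as_str = str(dtype)
print(as_repr)
print(as_str)

# Checks that do not depend on how the dtype happens to be printed.
is_datetime = np.issubdtype(dtype, np.datetime64)
same_dtype = dtype == np.dtype('datetime64[ns]')
print(is_datetime, dtype.kind, same_dtype)
```

Comparing dtype objects, or checking `kind` and `np.issubdtype`, gives the same answer whichever representation a console chooses to show.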

As confusing as this is, there are still more nuances in terms of bit width, type aliases etc. See Data types in Python, Numpy and Pandas for the gory details.

answered Oct 10 '22 19:10 by ap-osd