
Convert float32 array to datetime64 in Numpy 1.6.1

What is the proper way of converting integer dates to datetime64 in numpy? I tried:

import numpy
a = numpy.array([20090913, 20101020, 20110125])
numpy.datetime64(a.astype("S8"))

but I get an incorrect conversion. How can I read them in correctly as numpy.datetime64 objects using numpy.loadtxt (they come from a CSV file)?

Benjamin asked Mar 08 '12 20:03




2 Answers

Your problem is that datetime64 expects a string in the format yyyy-mm-dd, while the type conversion produces strings in the format yyyymmdd. I would suggest something like this:

conversion = lambda x: "%s-%s-%s" % (x[:4], x[4:6], x[6:])
np_conversion = numpy.frompyfunc(conversion,1,1)
b = np_conversion(a.astype('S10'))
numpy.datetime64(b)

However, this does not work for me (I have numpy 1.6.1); it fails with the message "NotImplementedError: Not implemented for this type". Unless that is implemented in 1.7, I can only suggest a pure Python solution:

numpy.datetime64(numpy.array([conversion(str(x)) for x in a], dtype="S10"))

...or pre-processing your input, to deliver the dates in the expected format.
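As a minimal sketch of the pure Python route, assuming NumPy 1.7 or later (the answer above was written against 1.6.1), the reformatted strings can be handed to numpy.array with an explicit datetime64[D] dtype. The helper name yyyymmdd_to_iso is hypothetical and only used for illustration:

```python
import numpy as np

a = np.array([20090913, 20101020, 20110125])

def yyyymmdd_to_iso(value):
    """Format an integer such as 20090913 as the string '2009-09-13'."""
    text = str(int(value))
    return "%s-%s-%s" % (text[:4], text[4:6], text[6:])

# Build ISO 8601 strings in plain Python, then let NumPy parse them.
iso_strings = [yyyymmdd_to_iso(x) for x in a]
dates = np.array(iso_strings, dtype="datetime64[D]")
print(dates)
```

This assumes every value has exactly eight digits; shorter or malformed integers would need validation first.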

Edit: I can also offer an alternative using vectorize, but I don't understand it well enough to say what is going wrong:

>>> conversion = numpy.vectorize(lambda x: "%s-%s-%s" % (x[:4], x[4:6], x[6:]), otypes=['S10'])
>>> conversion(a.astype('S10'))
array(['2009', '2010', '2011'],
      dtype='|S4')

For some reason it's ignoring the otypes and outputting |S4 instead of |S10. Sorry I can't help more, but this should provide a starting point for searching other solutions.

Update: Thanks to OP feedback, I thought of a new possibility. This should work as expected:

>>> conversion = lambda x: numpy.datetime64(str(x))
>>> np_conversion = numpy.frompyfunc(conversion, 1, 1)
>>> np_conversion(a)
array([2009-09-13 00:00:00, 2010-10-20 00:00:00, 2011-01-25 00:00:00], dtype=object)

# Works too:
>>> conversion = lambda x: numpy.datetime64("%04d-%02d-%02d" % (x // 10000, x // 100 % 100, x % 100))

Weird how, in this case, datetime64 works fine with or without the dashes...
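The integer arithmetic in the last snippet can also be done on the whole array at once, with no strings involved. This is a sketch that assumes a newer NumPy (1.7 or later) where integer arrays can be cast to datetime64 and timedelta64 units; it is not something the 1.6.1 session above was tested with:

```python
import numpy as np

a = np.array([20090913, 20101020, 20110125])

# Split each yyyymmdd integer into its calendar components.
years = a // 10000
months = a // 100 % 100
days = a % 100

# datetime64[Y] counts years since 1970; months and days are added as
# zero-based offsets from the first month and first day.
dates = ((years - 1970).astype("datetime64[Y]")
         + (months - 1).astype("timedelta64[M]")
         + (days - 1).astype("timedelta64[D]"))
print(dates)
```

The result has dtype datetime64[D]. Note that an out-of-range day (say 31 in a 30-day month) is not rejected here; it rolls over into the following month.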

mgibsonbr answered Oct 19 '22 09:10



Oddly, this works: numpy.datetime64(a.astype("S8").tolist()), while this does not: numpy.datetime64(a.astype("S8")). The first method is still a bit less convoluted than: numpy.array([numpy.datetime64(str(i)) for i in a]). I asked why in this question.
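For the numpy.loadtxt part of the question, one possible approach is to load the column as plain integers and convert afterwards. This is a sketch only: it assumes a newer NumPy (1.7 or later), and the in-memory csv_text stands in for the real CSV file, whose layout is not shown in the question:

```python
import io
from datetime import datetime

import numpy as np

# Hypothetical stand-in for the CSV file: one yyyymmdd value per line.
csv_text = io.StringIO("20090913\n20101020\n20110125\n")

# Read the column as integers first.
raw = np.loadtxt(csv_text, dtype=np.int64)

# Parse each value with strptime, which also rejects impossible dates,
# then build a datetime64[D] array from the resulting date objects.
parsed = [datetime.strptime(str(value), "%Y%m%d").date() for value in raw]
dates = np.array(parsed, dtype="datetime64[D]")
print(dates)
```

Parsing through strptime is slower than array arithmetic, but it raises a ValueError on malformed input instead of silently producing a wrong date.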

Benjamin answered Oct 19 '22 09:10
