What is the proper way of converting integer dates to datetime64 in numpy? I tried:
import numpy
a = numpy.array([20090913, 20101020, 20110125])
numpy.datetime64(a.astype("S8"))
but get an incorrect conversion. How about reading them in correctly as numpy.datetime64 objects using numpy.loadtxt (they are coming from a csv file)?
With the numpy.datetime64() method, we can get a date in year-month-day format. Syntax: numpy.datetime64(date). Return: the date in the format 'yyyy-mm-dd'.
Pandas Convert Date to String Format: to convert a pandas datetime (datetime64[ns]) from the default format to a string/object or a custom format, use the pandas.Series.dt.strftime() method.
To list all the dates in a particular month in NumPy, use numpy.arange(): pass the month as the first argument, the following month as the second argument, and the dtype datetime64[D] as the third. It returns all the dates in that month.
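As a small illustration of the arange approach just described, the sketch below lists every day of one month. The month used (January 2011) and the variable name are arbitrary choices for the example, not something taken from the question.

```python
import numpy as np

# The start month is inclusive and the stop month is exclusive, so this
# yields every day of January 2011 as datetime64[D] values.
january = np.arange("2011-01", "2011-02", dtype="datetime64[D]")

print(january[0], january[-1], len(january))
```

The stop value is excluded, which is why the next month, rather than the last day of the month, is passed as the second argument.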
Your problem is that datetime64 expects a string in the format yyyy-mm-dd, while the type conversion produces strings in the format yyyymmdd. I would suggest something like this:
conversion = lambda x: "%s-%s-%s" % (x[:4], x[4:6], x[6:])
np_conversion = numpy.frompyfunc(conversion,1,1)
b = np_conversion(a.astype('S10'))
numpy.datetime64(b)
However, it's not working for me (I have numpy 1.6.1); it fails with the message "NotImplementedError: Not implemented for this type". Unless that is implemented in 1.7, I can only suggest a pure Python solution:
numpy.datetime64(numpy.array([conversion(str(x)) for x in a], dtype="S10"))
...or pre-processing your input, to deliver the dates in the expected format.
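For readers on a more recent NumPy than the 1.6.1 mentioned above, the same "insert the dashes first" idea can be sketched as follows. This is only an illustration under the assumption that every value is a well-formed eight-digit yyyymmdd integer; the names iso_strings and b are made up for the example.

```python
import numpy as np

a = np.array([20090913, 20101020, 20110125])

# Turn each integer into text and insert the dashes, so every element
# has the yyyy-mm-dd form that datetime64 expects.
iso_strings = ["%s-%s-%s" % (s[:4], s[4:6], s[6:]) for s in a.astype(str)]

# Build the datetime64 array directly from the formatted strings.
b = np.array(iso_strings, dtype="datetime64[D]")
print(b)
```

Requesting the dtype in numpy.array, rather than calling numpy.datetime64 on a whole array of strings, keeps the conversion element-wise.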
Edit: I can also offer an alternative solution, using vectorize, but I don't know very well how it works, so I don't know what's going wrong:
>>> conversion = numpy.vectorize(lambda x: "%s-%s-%s" % (x[:4], x[4:6], x[6:]), otypes=['S10'])
>>> conversion(a.astype('S10'))
array(['2009', '2010', '2011'],
dtype='|S4')
For some reason it's ignoring the otypes and outputting |S4 instead of |S10. Sorry I can't help more, but this should provide a starting point for searching for other solutions.
Update: Thanks to OP feedback, I thought of a new possibility. This should work as expected:
>>> conversion = lambda x: numpy.datetime64(str(x))
>>> np_conversion = numpy.frompyfunc(conversion, 1, 1)
>>> np_conversion(a)
array([2009-09-13 00:00:00, 2010-10-20 00:00:00, 2011-01-25 00:00:00], dtype=object)
# Works too:
>>> conversion = lambda x: numpy.datetime64("%s-%s-%s" % (x/10000, x/100%100, x%100))
Weird how, in this case, datetime64 works fine with or without the dashes...
Oddly, this works: numpy.datetime64(a.astype("S8").tolist()), while this does not: numpy.datetime64(a.astype("S8")). The first method is still a bit less convoluted than: numpy.array([numpy.datetime64(str(i)) for i in a]). I asked why in this question.
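Since the question also asks about reading the values with numpy.loadtxt, here is one possible sketch that avoids strings entirely: load the column as plain integers, then split each value into year, month and day with integer arithmetic. The helper name yyyymmdd_to_datetime64 and the in-memory CSV are assumptions made for the example, and the approach assumes a reasonably recent NumPy in which integer arrays can be cast to datetime64 and timedelta64 units.

```python
import io

import numpy as np


def yyyymmdd_to_datetime64(values):
    """Convert integers such as 20090913 to datetime64[D] (hypothetical helper)."""
    values = np.asarray(values, dtype=np.int64)
    years = values // 10000 - 1970      # whole years since the 1970 epoch
    months = values // 100 % 100 - 1    # zero-based month offset
    days = values % 100 - 1             # zero-based day offset
    # Year-resolution dates plus month and day offsets give day-resolution dates.
    return (years.astype("datetime64[Y]")
            + months.astype("timedelta64[M]")
            + days.astype("timedelta64[D]"))


# Stand-in for the CSV file from the question: one integer date per line.
csv_file = io.StringIO("20090913\n20101020\n20110125\n")
raw = np.loadtxt(csv_file, dtype=np.int64)
dates = yyyymmdd_to_datetime64(raw)
print(dates)
```

Note that this does no validation: an out-of-range day or month (for example 20090932) simply rolls over into the following month or year instead of raising an error.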