I have a 2d numpy.array, where the first column contains datetime.datetime objects, and the second column integers:
A = array([[2002-03-14 19:57:38, 197],
[2002-03-17 16:31:33, 237],
[2002-03-17 16:47:18, 238],
[2002-03-17 18:29:31, 239],
[2002-03-17 20:10:11, 240],
[2002-03-18 16:18:08, 252],
[2002-03-23 23:44:38, 327],
[2002-03-24 09:52:26, 334],
[2002-03-25 16:04:21, 352],
[2002-03-25 18:53:48, 353]], dtype=object)
What I would like to do is select all rows for a specific date, something like
A[first_column.date()==datetime.date(2002,3,17)]
array([[2002-03-17 16:31:33, 237],
[2002-03-17 16:47:18, 238],
[2002-03-17 18:29:31, 239],
[2002-03-17 20:10:11, 240]], dtype=object)
How can I achieve this?
Thanks for your insight :)
In NumPy, you filter an array using a boolean index, often called a mask: a sequence of booleans with one entry per element along the axis being indexed. Where the mask is True, the element is kept in the filtered array; where it is False, the element is excluded.
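As a minimal sketch of boolean indexing (the names `values` and `mask` are made up for illustration and do not come from the question):

```python
import numpy as np

# Illustrative data: a few of the integers from the question's second column.
values = np.array([197, 237, 238, 252])

# One boolean per element: True keeps the element, False drops it.
mask = np.array([False, True, True, False])
filtered = values[mask]

# A comparison on the array builds such a mask automatically.
above_200 = values[values > 200]

print(filtered)
print(above_200)
```

The mask must have the same length as the axis it indexes, which is why the answers below build it from the array's own first column.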
With numpy.datetime64() we can represent a date in a NumPy array in year-month-day form. Syntax: numpy.datetime64(date). Return: the date in the format 'yyyy-mm-dd'.
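One way to apply this to the question, sketched under the assumption that the first column can be converted to NumPy's native datetime type (the names `stamps` and `days` are illustrative): truncate each timestamp to day precision and compare against a single numpy.datetime64 date.

```python
import datetime
import numpy as np

# A few timestamps shaped like the question's first column (object dtype).
stamps = np.array([
    datetime.datetime(2002, 3, 14, 19, 57, 38),
    datetime.datetime(2002, 3, 17, 16, 31, 33),
    datetime.datetime(2002, 3, 17, 20, 10, 11),
    datetime.datetime(2002, 3, 18, 16, 18, 8),
], dtype=object)

# Convert to microsecond-resolution datetime64, then truncate to whole days.
days = stamps.astype('datetime64[us]').astype('datetime64[D]')

# Compare every day against one target date to get a boolean mask.
target = np.datetime64('2002-03-17')
mask = days == target

print(stamps[mask])
```

This avoids comparing Python objects one by one, at the cost of an extra conversion step.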
You could do this:
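import datetime
from_date=datetime.datetime(2002,3,17,0,0,0)
to_date=from_date+datetime.timedelta(days=1)
# half-open interval [from_date, to_date): keeps midnight of the 17th, drops midnight of the 18th
idx=(A[:,0]>=from_date) & (A[:,0]<to_date)
print(A[idx])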
# array([[2002-03-17 16:31:33, 237],
# [2002-03-17 16:47:18, 238],
# [2002-03-17 18:29:31, 239],
# [2002-03-17 20:10:11, 240]], dtype=object)
A[:,0] is the first column of A.
Unfortunately, comparing A[:,0] with a datetime.date object raises a TypeError. However, comparison with a datetime.datetime object works:
In [63]: A[:,0]>datetime.datetime(2002,3,17,0,0,0)
Out[63]: array([False, True, True, True, True, True, True, True, True, True], dtype=bool)
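Also, unfortunately,
datetime.datetime(2002,3,17,0,0,0)<A[:,0]<=datetime.datetime(2002,3,18,0,0,0)
raises an error too. In the session above it is a TypeError, since the expression calls datetime.datetime's __lt__ method instead of the numpy array's __lt__ method. More generally, a chained comparison a < x <= b is evaluated as (a < x) and (x <= b), and the "and" needs a single truth value that a multi-element array cannot provide, so chaining does not work with NumPy arrays in any case.
It's not hard to work around; you can say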
In [69]: (A[:,0]>=datetime.datetime(2002,3,17,0,0,0)) & (A[:,0]<datetime.datetime(2002,3,18,0,0,0))
Out[69]: array([False, True, True, True, True, False, False, False, False, False], dtype=bool)
Since this gives you a boolean array, you can use it as a "fancy index" into A, which yields the desired result.
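Putting the pieces together, here is a self-contained sketch of the approach described above. The helper name `rows_on_date` is hypothetical and not part of NumPy. It takes a datetime.date, expands it to the half-open interval from midnight to the following midnight, and uses the resulting boolean array as an index.

```python
import datetime
import numpy as np

# The array from the question, rebuilt with real datetime objects.
A = np.array([
    [datetime.datetime(2002, 3, 14, 19, 57, 38), 197],
    [datetime.datetime(2002, 3, 17, 16, 31, 33), 237],
    [datetime.datetime(2002, 3, 17, 16, 47, 18), 238],
    [datetime.datetime(2002, 3, 17, 18, 29, 31), 239],
    [datetime.datetime(2002, 3, 17, 20, 10, 11), 240],
    [datetime.datetime(2002, 3, 18, 16, 18, 8), 252],
    [datetime.datetime(2002, 3, 23, 23, 44, 38), 327],
    [datetime.datetime(2002, 3, 24, 9, 52, 26), 334],
    [datetime.datetime(2002, 3, 25, 16, 4, 21), 352],
    [datetime.datetime(2002, 3, 25, 18, 53, 48), 353],
], dtype=object)


def rows_on_date(arr, day):
    """Return the rows of arr whose first-column datetime falls on day.

    Hypothetical helper for illustration; day is a datetime.date.
    """
    # Midnight at the start of the day, and midnight of the next day.
    start = datetime.datetime.combine(day, datetime.time.min)
    end = start + datetime.timedelta(days=1)
    col = arr[:, 0]
    # Two separate comparisons joined with &, not a chained comparison.
    mask = (col >= start) & (col < end)
    return arr[mask]


selected = rows_on_date(A, datetime.date(2002, 3, 17))
print(selected[:, 1])  # the integers 237 through 240
```

Passing a datetime.date to the helper keeps the call close to what the question asked for, while the comparisons themselves still use datetime.datetime objects.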