I am looking for a way to implement an "as of" operator in numpy. Specifically, if:
t1 is an n-vector of timestamps in strictly increasing order;
d1 is an n x p matrix of observations, with the i-th row corresponding to t1[i];
t2 is an m-vector of timestamps, also in strictly increasing order;
I need to create an m x p matrix d2, where d2[i] is simply d1[j] for the largest value of j such that t1[j] <= t2[i].
In other words, I need to get the rows of d1 as of the timestamps in t2.
It is easy to write this in pure Python, but I am wondering if there's a way to avoid interpreted loops (n, m and p are quite large).
The timestamps are datetime.datetime objects. The observations are floating-point values.
Edit: For entries where t1[j] <= t2[i] can't be satisfied (i.e. where a timestamp in t2 precedes all timestamps in t1), I would ideally like to get rows of NaNs.
Your best choice is numpy.searchsorted():
d1[numpy.searchsorted(t1, t2, side="right") - 1]
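This finds the indices at which the values of t2 would have to be inserted into t1 to maintain order. With side="right", each returned index equals the number of elements of t1 that are <= t2[i], so subtracting 1 gives the largest j with t1[j] <= t2[i], which is exactly the specified behaviour.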
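To make the index arithmetic concrete, here is a minimal sketch with made-up data. The integer timestamps and the values in d1 are assumptions chosen for illustration only; they stand in for the real datetime values in the question.

```python
import numpy as np

# Illustrative data (assumed, not from the question): integer "timestamps".
t1 = np.array([10, 20, 30])
d1 = np.array([[1.0, 1.5],
               [2.0, 2.5],
               [3.0, 3.5]])
t2 = np.array([10, 15, 29, 30, 45])

# side="right" counts how many entries of t1 are <= each t2[i];
# subtracting 1 turns that count into the index of the last such entry.
idx = np.searchsorted(t1, t2, side="right") - 1
d2 = d1[idx]

print(idx)  # [0 0 1 2 2]
print(d2)
```

Note that a timestamp in t2 earlier than t1[0] would produce index -1, which NumPy interprets as the last row, so this plain form is only correct when every t2[i] >= t1[0].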
Edit: To get rows of NaNs where the condition t1[j] <= t2[i] can't be satisfied, you could use:
nan_row = numpy.repeat(numpy.nan, d1.shape[1])
d1_nan = numpy.vstack((nan_row, d1))
d2 = d1_nan[numpy.searchsorted(t1, t2, side="right")]
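The NaN-padding approach can be wrapped into a single function and checked on a small example. This is only a sketch: the name as_of is hypothetical, the sample data is invented, and the conversion to datetime64[us] is an assumption that the timestamps are naive (timezone-free) datetime.datetime objects, which lets the search run on a native NumPy dtype rather than on Python objects.

```python
import datetime as dt
import numpy as np

def as_of(t1, d1, t2):
    """Hypothetical helper: rows of d1 as of each timestamp in t2.

    Assumes t1 and t2 are sorted sequences of naive datetime.datetime
    objects and d1 is an n x p array-like of floats.
    """
    t1 = np.asarray(t1, dtype="datetime64[us]")
    t2 = np.asarray(t2, dtype="datetime64[us]")
    d1 = np.asarray(d1, dtype=float)
    # Prepend a NaN row so that index 0 means "no earlier observation".
    nan_row = np.full((1, d1.shape[1]), np.nan)
    d1_nan = np.vstack((nan_row, d1))
    # No "- 1" here: the extra leading row shifts every index by one.
    return d1_nan[np.searchsorted(t1, t2, side="right")]

# Invented sample data for illustration.
base = dt.datetime(2020, 1, 1)
t1 = [base + dt.timedelta(hours=h) for h in (1, 2, 3)]
d1 = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]
t2 = [base + dt.timedelta(hours=h) for h in (0, 1, 2.5, 5)]

d2 = as_of(t1, d1, t2)
print(d2)  # first row is NaN because t2[0] precedes all of t1
```

The vstack makes a copy of d1, so for very large inputs it may be worth building d1_nan once and reusing it across calls.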