I have a 2D numpy array S representing a state space, with 80000000 rows (as states) and 5 columns (as state variables).
I initialize K0 with S. At each iteration, I apply a state transition function f(x) to all of the states in Ki and delete the states whose f(x) is not in Ki, which gives Ki+1. This repeats until it converges, i.e. Ki+1 = Ki.
Going like this would take ages:
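K = S
to_delete = [0]
while to_delete:
    to_delete = []
    for i in range(len(K)):
        # pseudocode: test whether the row f(K[i]) is one of the rows of K
        if f(K[i]) not in K:
            to_delete.append(i)
    # np.delete expects row indices, not the rows themselves
    K = np.delete(K, to_delete, 0)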
So I wanted to make a vectorized implementation: slice K into columns, apply f, and join them again, thus obtaining f(K).
The question now is how to get a boolean array of length len(K), say Sel, where each entry Sel[i] indicates whether f(K[i]) is in K, exactly like the function in1d works.
Then it would be simple to do
K = K[Sel]
Your question is difficult to understand because it contains extraneous information and typos. If I understand correctly, you simply want an efficient way to perform a set operation on the rows of a 2D array (in this case, the intersection of the rows of K and f(K)).
You can do this with numpy.in1d if you create a structured array view.
If this is K:
In [50]: k
Out[50]:
array([[6, 6],
       [3, 7],
       [7, 5],
       [7, 3],
       [1, 3],
       [1, 5],
       [7, 6],
       [3, 8],
       [6, 1],
       [6, 0]])
and this is f(K) (for this example I subtract 1 from the first column and add 1 to the second):
In [51]: k2
Out[51]:
array([[5, 7],
       [2, 8],
       [6, 6],
       [6, 4],
       [0, 4],
       [0, 6],
       [6, 7],
       [2, 9],
       [5, 2],
       [5, 1]])
then you can find all rows in K that are also found in f(K) by doing something like this:
In [55]: k[np.in1d(k.view(dtype='i,i').reshape(k.shape[0]), k2.view(dtype='i,i').reshape(k2.shape[0]))]
Out[55]: array([[6, 6]])
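view and reshape create flat structured views so that each row appears as a single element to in1d. in1d creates a boolean index of k marking the matched items, which is then used to fancy index k and return the filtered array. Note that dtype='i,i' describes a row of two 32-bit integers, so the field types have to match the actual dtype and column count of your array.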
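The same idea can be sketched without hard-coding 'i,i', by building the structured dtype from the array itself. This is only a minimal sketch: as_row_view is a hypothetical helper name (not part of NumPy), and np.isin is used in place of in1d, which newer NumPy releases deprecate. It also shows the mask the question calls Sel, which tests the rows of f(K) against K rather than the other way round.

```python
import numpy as np

def as_row_view(a):
    """Hypothetical helper: view a 2D array as a 1D structured array, one element per row."""
    a = np.ascontiguousarray(a)
    # One field per column, each using the array's own dtype (instead of a fixed 'i,i').
    row_dtype = np.dtype([(f"f{j}", a.dtype) for j in range(a.shape[1])])
    return a.view(row_dtype).reshape(a.shape[0])

k = np.array([[6, 6], [3, 7], [7, 5], [7, 3], [1, 3],
              [1, 5], [7, 6], [3, 8], [6, 1], [6, 0]])
k2 = k + np.array([-1, 1])  # the example f(K): first column - 1, second column + 1

# Rows of k that also occur in f(k), as in the session above.
in_both = k[np.isin(as_row_view(k), as_row_view(k2))]

# Sel[i] is True when f(k[i]) is a row of k, as asked in the question.
sel = np.isin(as_row_view(k2), as_row_view(k))
kept = k[sel]
```

Which of the two masks is wanted depends on how the question is read: in_both keeps the states that are the image of some state in K, while kept holds the states whose own image stays in K.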
Not sure if I understand your question entirely, but if Paul's interpretation is correct, it can be solved efficiently and fully vectorized in a single readable line using the numpy_indexed package:
import numpy_indexed as npi
K = npi.intersection(K, f(K))
Also, this works for rows of any type or shape.
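Either approach gives one filtering step, whereas the question repeats that step until K stops changing. A rough sketch of the whole loop, using plain NumPy so it is self-contained, might look like the following. The names as_row_view, largest_invariant_subset and step are hypothetical, the transition function is a toy example, and the mask follows the question's definition of Sel (keep K[i] when f(K[i]) is a row of K).

```python
import numpy as np

def as_row_view(a):
    """Hypothetical helper: view a 2D array as a 1D structured array, one element per row."""
    a = np.ascontiguousarray(a)
    row_dtype = np.dtype([(f"f{j}", a.dtype) for j in range(a.shape[1])])
    return a.view(row_dtype).reshape(a.shape[0])

def largest_invariant_subset(S, f):
    """Drop rows whose image under f is not a row of K, until nothing is dropped."""
    K = S
    while True:
        # f must return an array with the same dtype and column count as K,
        # otherwise the two structured views are not comparable.
        sel = np.isin(as_row_view(f(K)), as_row_view(K))
        if sel.all():
            return K
        K = K[sel]

def step(K):
    # Toy vectorized transition (an assumption for illustration): (x, y) -> (x + y, y)
    return np.column_stack((K[:, 0] + K[:, 1], K[:, 1]))

# Toy state space: x in 0..3, y in 0..1
S = np.array([[x, y] for x in range(4) for y in range(2)])
K = largest_invariant_subset(S, step)
```

In this toy case the states with y == 1 are removed one per iteration, because each of them eventually steps outside S, and only the fixed points with y == 0 remain. For 80 million rows each pass sorts or scans the whole array, so the number of iterations needed to converge matters as much as the cost of a single pass.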