I have a 3D NumPy array and I want to keep only the unique 2D sub-arrays.
Input:
[[[ 1 2]
[ 3 4]]
[[ 5 6]
[ 7 8]]
[[ 9 10]
[11 12]]
[[ 5 6]
[ 7 8]]]
Output:
[[[ 1 2]
[ 3 4]]
[[ 5 6]
[ 7 8]]
[[ 9 10]
[11 12]]]
I tried converting the sub-arrays to strings (with the tostring() method) and then using np.unique, but after converting the result to a NumPy array, the trailing \x00 bytes were stripped, so I can't convert it back with np.fromstring().
Example:
import numpy as np
a = np.array([[[1,2],[3,4]],[[5,6],[7,8]],[[9,10],[11,12]],[[5,6],[7,8]]])
b = [x.tostring() for x in a]
print(b)
c = np.array(b)
print(c)
print(np.array([np.fromstring(x) for x in c]))
Output:
[b'\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00', b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00', b'\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00\x0c\x00\x00\x00', b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00']
[b'\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04'
b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08'
b'\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00\x0c'
b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08']
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-86-6772b096689f> in <module>()
5 c = np.array(b)
6 print(c)
----> 7 print(np.array([np.fromstring(x) for x in c]))
<ipython-input-86-6772b096689f> in <listcomp>(.0)
5 c = np.array(b)
6 print(c)
----> 7 print(np.array([np.fromstring(x) for x in c]))
ValueError: string size must be a multiple of element size
I also tried view, but I really don't know how to use it. Can you help me, please?
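The stripped bytes can be reproduced in isolation. The following is a minimal sketch, assuming a little-endian int32 array for illustration (the dtype and the variable names are not from the question): a fixed-width bytes array (dtype S) drops trailing \x00 bytes when an element is read back, while an object array stores the bytes objects untouched.

```python
import numpy as np

# Assumption: explicit little-endian int32 so the byte layout is predictable.
a = np.array([[1, 2], [3, 4]], dtype="<i4")
raw = a.tobytes()                          # 16 bytes, ending in \x00\x00\x00

as_bytes = np.array([raw])                 # NumPy picks the fixed-width dtype S16
as_object = np.array([raw], dtype=object)  # keeps the Python bytes object as-is

# Reading an element of the S16 array strips the trailing null bytes.
print(len(raw), len(as_bytes[0]), len(as_object[0]))

# The object array still holds all 16 bytes, so the round trip works.
restored = np.frombuffer(as_object[0], dtype="<i4").reshape(a.shape)
print(np.array_equal(restored, a))
```

This only explains the error; it does not by itself remove duplicates.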
With np.unique(), we can get the unique values of the array passed as its argument.
The unique() function finds the unique elements of an array and returns them sorted.
To find unique rows in a NumPy array, use the numpy.unique() function with axis=0.
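As a rough illustration of that, np.unique accepts an axis argument (available since NumPy 1.13), and with axis=0 it treats each 2D sub-array of a 3D array as one item. A minimal sketch using the array from the question; the name unique_blocks is only illustrative:

```python
import numpy as np

a = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]],
              [[9, 10], [11, 12]],
              [[5, 6], [7, 8]]])

# axis=0 compares whole 2D sub-arrays instead of individual elements.
unique_blocks = np.unique(a, axis=0)
print(unique_blocks.shape)
print(unique_blocks)
```

The result is sorted, which here happens to match the original order.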
Using @Jaime's post, to solve our case of finding unique 2D subarrays, I came up with this solution that basically adds a reshaping to the view step -
def unique2D_subarray(a):
    dtype1 = np.dtype((np.void, a.dtype.itemsize * np.prod(a.shape[1:])))
    b = np.ascontiguousarray(a.reshape(a.shape[0], -1)).view(dtype1)
    return a[np.unique(b, return_index=True)[1]]
Sample run -
In [62]: a
Out[62]:
array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7, 8]],
[[ 9, 10],
[11, 12]],
[[ 5, 6],
[ 7, 8]]])
In [63]: unique2D_subarray(a)
Out[63]:
array([[[ 1, 2],
[ 3, 4]],
[[ 5, 6],
[ 7, 8]],
[[ 9, 10],
[11, 12]]])
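Note that np.unique returns its results in sorted order, so when the input is not already sorted the sub-arrays come back reordered. If the original order matters, one option is to sort the first-occurrence indices before indexing. The sketch below is an assumption-laden variant, not the function above: it swaps the void view for the axis=0 argument of np.unique (NumPy 1.13+), and the helper name is hypothetical.

```python
import numpy as np

def unique_subarrays_in_order(a):
    """Hypothetical helper: drop duplicate sub-arrays along axis 0,
    keeping the first occurrence of each in its original position."""
    # return_index gives the index of the first occurrence of each unique item.
    _, first_idx = np.unique(a, axis=0, return_index=True)
    # Sorting those indices restores the order of appearance.
    return a[np.sort(first_idx)]

a = np.array([[[9, 10], [11, 12]],
              [[1, 2], [3, 4]],
              [[9, 10], [11, 12]]])

result = unique_subarrays_in_order(a)
print(result)
```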
The numpy_indexed package (disclaimer: I am its author) is designed to do operations such as these in an efficient and vectorized manner:
import numpy_indexed as npi
npi.unique(a)
One solution would be to use a set to keep track of which sub arrays you have seen:
seen = set()
new_a = []
for j in a:
    # A tuple of the flattened values is hashable, so it can go in a set.
    f = tuple(j.flatten())
    if f not in seen:
        new_a.append(j)
        seen.add(f)
print(np.array(new_a))
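A variation on the same idea, sketched here under the assumption that all sub-arrays share one dtype and shape, uses the raw bytes of each sub-array as the set key instead of a tuple. The function name is hypothetical.

```python
import numpy as np

def unique_subarrays_by_bytes(a):
    """Hypothetical helper: keep the first occurrence of each sub-array,
    using its raw bytes as a hashable key."""
    seen = set()
    keep = []
    for i, block in enumerate(a):
        key = block.tobytes()   # bytes objects are hashable
        if key not in seen:
            seen.add(key)
            keep.append(i)
    return a[keep]

a = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]],
              [[9, 10], [11, 12]],
              [[5, 6], [7, 8]]])

result = unique_subarrays_by_bytes(a)
print(result)
```

Because the comparison is byte-wise, values that compare equal but have different bit patterns (for example 0.0 and -0.0 in float arrays) are treated as different.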
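Or using numpy only (this works only when no value is shared between different sub-arrays, because np.unique(a) deduplicates individual elements, not sub-arrays):
unique = np.unique(a)
print(unique.reshape((len(unique) // 4, 2, 2)))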
>>> [[[ 1 2]
[ 3 4]]
[[ 5 6]
[ 7 8]]
[[ 9 10]
[11 12]]]