Numpy unique 2D sub-array [duplicate]

Tags: python, unique, numpy, sub-array

I have a 3D NumPy array and I want to keep only the unique 2D sub-arrays.

Input:

[[[ 1  2]
  [ 3  4]]

 [[ 5  6]
  [ 7  8]]

 [[ 9 10]
  [11 12]]

 [[ 5  6]
  [ 7  8]]]

Output:

[[[ 1  2]
  [ 3  4]]

 [[ 5  6]
  [ 7  8]]

 [[ 9 10]
  [11 12]]]

I tried converting each sub-array to a byte string (with the tostring() method) and then using np.unique, but once the byte strings are put into a NumPy array, the trailing \x00 bytes are stripped, so I can't convert them back with np.fromstring().

Example:

import numpy as np
a = np.array([[[1,2],[3,4]],[[5,6],[7,8]],[[9,10],[11,12]],[[5,6],[7,8]]])
b = [x.tostring() for x in a]
print(b)
c = np.array(b)
print(c)
print(np.array([np.fromstring(x) for x in c]))

Output:

[b'\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04\x00\x00\x00', b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00', b'\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00\x0c\x00\x00\x00', b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08\x00\x00\x00']
[b'\x01\x00\x00\x00\x02\x00\x00\x00\x03\x00\x00\x00\x04'
 b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08'
 b'\t\x00\x00\x00\n\x00\x00\x00\x0b\x00\x00\x00\x0c'
 b'\x05\x00\x00\x00\x06\x00\x00\x00\x07\x00\x00\x00\x08']

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-86-6772b096689f> in <module>()
      5 c = np.array(b)
      6 print(c)
----> 7 print(np.array([np.fromstring(x) for x in c]))

<ipython-input-86-6772b096689f> in <listcomp>(.0)
      5 c = np.array(b)
      6 print(c)
----> 7 print(np.array([np.fromstring(x) for x in c]))

ValueError: string size must be a multiple of element size

I also tried view, but I really don't know how to use it. Can you help me, please?

asked Nov 18 '16 10:11

Peťan

3 Answers

Building on @Jaime's post, this solution finds the unique 2D sub-arrays by adding a reshape before the view step:

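import numpy as np

def unique2D_subarray(a):
    # View each flattened sub-array as a single opaque np.void item, so np.unique compares whole sub-arrays.
    dtype1 = np.dtype((np.void, a.dtype.itemsize * np.prod(a.shape[1:])))
    b = np.ascontiguousarray(a.reshape(a.shape[0], -1)).view(dtype1)
    # return_index gives the position of the first occurrence of each unique sub-array in a.
    return a[np.unique(b, return_index=True)[1]]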

Sample run -

In [62]: a
Out[62]: 
array([[[ 1,  2],
        [ 3,  4]],

       [[ 5,  6],
        [ 7,  8]],

       [[ 9, 10],
        [11, 12]],

       [[ 5,  6],
        [ 7,  8]]])

In [63]: unique2D_subarray(a)
Out[63]: 
array([[[ 1,  2],
        [ 3,  4]],

       [[ 5,  6],
        [ 7,  8]],

       [[ 9, 10],
        [11, 12]]])
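
To see why the void view helps where byte strings failed, here is a minimal sketch; the variable names are illustrative and not taken from the answer above. NumPy's fixed-width bytes dtype (S) drops trailing null bytes when an element is read back, whereas np.void items keep every byte. Sorting the indices returned by np.unique is an optional extra, assumed here to be desirable, that keeps the sub-arrays in first-occurrence order.

```python
import numpy as np

a = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]],
              [[9, 10], [11, 12]],
              [[5, 6], [7, 8]]])

# Bytes ('S') arrays strip trailing NUL bytes when an element is read back.
s = np.array([b'\x01\x00\x00\x00'])
stripped = s[0]  # only b'\x01' survives

# One np.void item per sub-array: all bytes are preserved.
flat = np.ascontiguousarray(a.reshape(a.shape[0], -1))
row_bytes = flat.dtype.itemsize * flat.shape[1]
as_void = flat.view(np.dtype((np.void, row_bytes)))  # shape (4, 1)

# Index of the first occurrence of each distinct sub-array.
_, first_idx = np.unique(as_void, return_index=True)

# Sorting the indices keeps the original (first-seen) order.
unique_in_order = a[np.sort(first_idx)]
print(unique_in_order)
```

Without the np.sort call, the result follows the byte-wise sort order of the void items, which matches numeric order only in simple cases such as this one.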

answered Oct 07 '22 10:10

Divakar

The numpy_indexed package (disclaimer: I am its author) is designed to do operations such as these in an efficient and vectorized manner:

import numpy_indexed as npi
npi.unique(a)

answered Oct 07 '22 09:10

Eelco Hoogendoorn

One solution would be to use a set to keep track of which sub-arrays you have already seen:

import numpy as np

seen = set()  # flattened sub-arrays encountered so far
new_a = []

for j in a:
    f = tuple(j.flatten())  # hashable key for this sub-array
    if f not in seen:
        new_a.append(j)
        seen.add(f)

print(np.array(new_a))
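
The same idea can be wrapped in a function. The sketch below is one possible variant, not the answerer's code: the function name unique_subarrays_ordered is hypothetical, and it uses each sub-array's raw bytes (tobytes()) as the set key instead of a tuple. It assumes all sub-arrays share one dtype and shape, which holds for slices of a single array.

```python
import numpy as np

def unique_subarrays_ordered(arr):
    """Return the distinct sub-arrays along axis 0, in first-seen order."""
    seen = set()
    keep = []
    for i, sub in enumerate(arr):
        # Raw bytes are hashable and identify the sub-array's contents.
        key = sub.tobytes()
        if key not in seen:
            seen.add(key)
            keep.append(i)
    return arr[keep]

a = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]],
              [[9, 10], [11, 12]],
              [[5, 6], [7, 8]]])
result = unique_subarrays_ordered(a)
print(result)
```

Because the comparison is byte-wise, floating-point values that compare equal but have different bit patterns (such as 0.0 and -0.0) would be treated as distinct.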

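Or using NumPy only. This works here only because no value is shared between different sub-arrays: np.unique(a) without an axis argument flattens the array and returns the unique scalars. In NumPy 1.13 and later, np.unique(a, axis=0) handles the general case.

unique = np.unique(a)  # sorted unique scalar values of the flattened array
print(unique.reshape((len(unique) // 4, 2, 2)))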

>>> [[[ 1  2]
      [ 3  4]]

     [[ 5  6]
      [ 7  8]]

     [[ 9 10]
      [11 12]]]
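
For reference, a short sketch of the axis-based call, assuming NumPy 1.13 or later. The second array b is a made-up counter-example showing where flattening first would go wrong: its sub-arrays share values, so the unique scalars no longer regroup into 2x2 blocks.

```python
import numpy as np

a = np.array([[[1, 2], [3, 4]],
              [[5, 6], [7, 8]],
              [[9, 10], [11, 12]],
              [[5, 6], [7, 8]]])

# axis=0 treats each 2x2 sub-array as one item to deduplicate.
unique_subs = np.unique(a, axis=0)
print(unique_subs)

# Hypothetical counter-example: distinct sub-arrays that share values.
b = np.array([[[1, 2], [3, 4]],
              [[1, 2], [5, 6]],
              [[1, 2], [3, 4]]])
n_scalars = np.unique(b).size        # 6 unique scalars, not a multiple of 4
unique_b = np.unique(b, axis=0)      # two distinct 2x2 sub-arrays
print(n_scalars, unique_b.shape)
```

Note that the result is returned in sorted order rather than in order of first appearance.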

answered Oct 07 '22 11:10

kezzos
