How do you find and save duplicated rows in a NumPy array?

I have an array, e.g.:

Array = [[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,5,5],[1,1,1],[2,2,2]]

And I would like something that outputs the following:

Repeated = [[1,1,1],[2,2,2]]

Preserving the number of repeated rows would work too, e.g.:

Repeated = [[1,1,1],[1,1,1],[2,2,2],[2,2,2]]

I thought the solution might include numpy.unique, but I can't get it to work. Is there a native Python / NumPy function?

How do you delete duplicate rows in a NumPy array?

numpy.unique() is a built-in NumPy function that takes an array and returns its sorted unique elements. By default it flattens the input, so to remove duplicate rows from a 2-D array, pass axis=0; the result then contains each distinct row exactly once.

Which of the following is a possible way to find unique rows in a NumPy array?

To find the unique rows of a NumPy array, call numpy.unique() with axis=0.
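
As a rough illustration of the difference between the default behaviour and axis=0, here is a minimal sketch. The array b and the variable names are made up for this example; the last step (restoring first-appearance order with return_index) is one possible approach, not the only one.

```python
import numpy as np

# Hypothetical example array with one duplicated row.
b = np.array([[3, 3, 3], [1, 1, 1], [3, 3, 3], [2, 2, 2]])

# Without axis, np.unique flattens the array and returns unique scalars.
flat_unique = np.unique(b)

# With axis=0, each row is treated as one item; the rows come back sorted.
unique_rows = np.unique(b, axis=0)

# return_index gives the index of the first occurrence of each unique row.
# Sorting those indices keeps the rows in their original order of appearance.
_, first_idx = np.unique(b, axis=0, return_index=True)
rows_in_original_order = b[np.sort(first_idx)]

print(flat_unique)
print(unique_rows)
print(rows_in_original_order)
```

Note that np.unique sorts its result, so the axis=0 output is generally not in the same row order as the input.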

How do you repeat rows in NumPy?

To repeat the elements of a NumPy array, use the numpy.repeat() function. It returns a new array in which each element is repeated the given number of times. With axis=0 it repeats whole rows and with axis=1 whole columns; without an axis argument it flattens the input first.
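
A small sketch of repeating rows with numpy.repeat(), assuming a 2-D input; the array and variable names are illustrative only.

```python
import numpy as np

rows = np.array([[1, 1, 1], [2, 2, 2]])

# Repeat every row twice along axis 0.
twice = np.repeat(rows, 2, axis=0)

# Give one repeat count per row: first row once, second row three times.
varied = np.repeat(rows, [1, 3], axis=0)

print(twice)
print(varied)
```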

Using the axis argument of np.unique (added in NumPy 1.13) along with return_counts=True gives us the unique rows and the corresponding count for each of those rows. We can then select the rows with counts > 1 to get the desired output, like so:

In [688]: a = np.array([[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,5,5],[1,1,1],[2,2,2]])

In [689]: unq, count = np.unique(a, axis=0, return_counts=True)

In [690]: unq[count>1]
Out[690]: 
array([[1, 1, 1],
       [2, 2, 2]])
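
The question also mentions that preserving the number of repeated rows would be acceptable. One way to get that from the same unq and count arrays, sketched here as a possible extension rather than part of the original answer, is to expand each duplicated row by its count with np.repeat:

```python
import numpy as np

a = np.array([[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,5,5],[1,1,1],[2,2,2]])

# Unique rows and how many times each one occurs.
unq, count = np.unique(a, axis=0, return_counts=True)

# Boolean mask over the unique rows: True where a row occurs more than once.
dup = count > 1

# Repeat each duplicated row as many times as it appeared in a.
repeated = np.repeat(unq[dup], count[dup], axis=0)

print(repeated)
```

For the example array this yields each of [1, 1, 1] and [2, 2, 2] twice, grouped together in sorted order.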

If you need to get the indices of the repeated rows:

import numpy as np

a = np.array([[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,5,5],[1,1,1],[2,2,2]])
unq, count = np.unique(a, axis=0, return_counts=True)
repeated_groups = unq[count > 1]

for repeated_group in repeated_groups:
    repeated_idx = np.argwhere(np.all(a == repeated_group, axis=1))
    print(repeated_idx.ravel())

# [0 5]
# [1 6]
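
A possible variation, assuming you would rather collect the indices in a dictionary than print them: return_inverse maps every row of a to its position in unq, so the indices of each repeated group can be looked up without comparing rows again. The name repeated_idx and the dictionary layout are choices made for this sketch.

```python
import numpy as np

a = np.array([[1,1,1],[2,2,2],[3,3,3],[4,4,4],[5,5,5],[1,1,1],[2,2,2]])

unq, inverse, count = np.unique(
    a, axis=0, return_inverse=True, return_counts=True
)
# Flatten defensively: the shape of inverse with axis set has varied
# between NumPy releases.
inverse = np.ravel(inverse)

# Map each repeated row (as a tuple) to the indices where it occurs in a.
repeated_idx = {
    tuple(unq[g].tolist()): np.flatnonzero(inverse == g)
    for g in np.flatnonzero(count > 1)
}

for row, idx in repeated_idx.items():
    print(row, idx)
```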

Tags:

python

numpy

rows

Ben Bird

2 Answers

Divakar

v.grabovets
