How can I remove duplicate rows from a 2-dimensional NumPy array?
data = np.array([[1,8,3,3,4], [1,8,9,9,4], [1,8,3,3,4]])
The answer should be as follows:
ans = array([[1,8,3,3,4], [1,8,9,9,4]])
If there are two rows that are the same, then I would like to remove one "duplicate" row.
A general way to remove duplicates is to convert the collection to a set, which discards repeated elements, and then convert the set back to an array. For a 2-D NumPy array, each row must first be converted to a hashable tuple, and the original row order is not preserved.
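In pandas, drop_duplicates() keeps the first occurrence of each duplicated row by default. Setting keep=False drops every row that has a duplicate, including the first occurrence, so use it only when no copy of a repeated row should remain.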
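The difference between the default and keep=False can be sketched as follows. This is a minimal illustration that assumes pandas is installed; the names df, keep_first and drop_all are only illustrative.

```python
import numpy as np
import pandas as pd

data = np.array([[1, 8, 3, 3, 4], [1, 8, 9, 9, 4], [1, 8, 3, 3, 4]])
df = pd.DataFrame(data)

# Default (keep="first"): one copy of each duplicated row survives.
keep_first = df.drop_duplicates().to_numpy()

# keep=False: every row that has a duplicate is dropped entirely.
drop_all = df.drop_duplicates(keep=False).to_numpy()

print(keep_first)
print(drop_all)
```

For the question above, the default behaviour is the one that matches the desired answer. keep=False would leave only the row [1, 8, 9, 9, 4].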
You can use np.unique. Since you want the unique rows, we need to put them into tuples:
import numpy as np
data = np.array([[1,8,3,3,4], [1,8,9,9,4], [1,8,3,3,4]])
Just applying np.unique to the data array will result in this:
>>> uniques
array([1, 3, 4, 8, 9])
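These are the sorted unique elements of the flattened array, not the unique rows. Passing a list of tuples to np.unique gives the same flattened result, because np.unique converts the list back into a 2-D array first. Collecting the tuples in a set does compare whole rows:
new_array = [tuple(row) for row in data]
uniques = np.array(sorted(set(new_array)))
which prints:
>>> uniques
array([[1, 8, 3, 3, 4],
       [1, 8, 9, 9, 4]])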
UPDATE
In NumPy 1.13 and later, np.unique accepts an axis argument, so np.unique(data, axis=0) returns the unique rows directly, sorted lexicographically.
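A minimal sketch of the axis=0 approach, assuming NumPy 1.13 or later. The helper name unique_rows and its keep_order flag are hypothetical and not part of NumPy; they only wrap np.unique to show how first-appearance order can be kept when the sorted output is not wanted.

```python
import numpy as np

def unique_rows(arr, keep_order=False):
    """Return the distinct rows of a 2-D array (hypothetical helper)."""
    if not keep_order:
        # axis=0 treats each row as one item; rows come back sorted.
        return np.unique(arr, axis=0)
    # return_index gives the position of the first occurrence of each
    # unique row; sorting those positions restores the input order.
    _, first_idx = np.unique(arr, axis=0, return_index=True)
    return arr[np.sort(first_idx)]

data = np.array([[1, 8, 3, 3, 4], [1, 8, 9, 9, 4], [1, 8, 3, 3, 4]])
ans = unique_rows(data)
print(ans)
```

For the data in the question both modes give the same rows, because the first occurrences already appear in sorted order.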