I'm translating some code from MATLAB to Python.
NumPy has a unique(a) function, but the MATLAB program also passes the 'rows' option, so it returns something a little different.
Is there a similar command in Python, or should I write an algorithm that does the same thing?
Use df.drop_duplicates() to select only the unique rows of a pandas DataFrame. To select unique rows over certain columns, use DataFrame.drop_duplicates(subset=None) with subset assigned to a list of columns to get unique rows over those columns.
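As a small illustration of the pandas route, here is a sketch on a made-up DataFrame (the column names a, b, c and the values are assumptions for the example only):

```python
import pandas as pd

# Hypothetical data: rows 0 and 2 are identical; row 3 matches them on a and b only
df = pd.DataFrame({"a": [1, 2, 1, 1],
                   "b": [2, 3, 2, 2],
                   "c": [3, 4, 3, 9]})

# Unique rows over all columns (keeps the first occurrence by default)
unique_rows = df.drop_duplicates()

# Unique rows judged only on columns a and b
unique_ab = df.drop_duplicates(subset=["a", "b"])

print(unique_rows)
print(unique_ab)
```

Note that drop_duplicates keeps the original index labels, so call reset_index(drop=True) afterwards if you want them renumbered.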
C = unique(A, setOrder) returns the unique values of A in a specific order. setOrder can be 'sorted' (default) or 'stable'. C = unique(A, occurrence) specifies which indices to return in case of repeated values. occurrence can be 'first' (default) or 'last'.
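NumPy's unique has no setOrder or occurrence arguments, but both behaviours can be approximated with return_index. The following is only a sketch on a made-up 1-D array, not an official equivalent of the MATLAB options:

```python
import numpy as np

a = np.array([3, 1, 3, 2, 1])  # hypothetical sample values

# 'sorted' + 'first': np.unique sorts and reports first occurrences
values, first_idx = np.unique(a, return_index=True)
# values -> [1, 2, 3], first_idx -> [1, 3, 0]

# 'stable': reorder the unique values by where they first appeared
stable = a[np.sort(first_idx)]
# stable -> [3, 1, 2]

# 'last': find first occurrences in the reversed array, then map back
_, rev_idx = np.unique(a[::-1], return_index=True)
last_idx = len(a) - 1 - rev_idx
# last_idx -> [4, 3, 2]
```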
To find the unique rows of a NumPy array, use the numpy.unique() function with axis=0.
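For example, assuming a NumPy version recent enough to support the axis argument of numpy.unique, something like this should give the sorted unique rows, much like MATLAB's unique(a, 'rows'):

```python
import numpy as np

a = np.array([[1, 2, 3], [2, 3, 4], [1, 2, 3], [3, 4, 5]])

# axis=0 treats each row as one item; rows come back sorted lexicographically
rows, first_idx = np.unique(a, axis=0, return_index=True)

print(rows)       # [[1 2 3] [2 3 4] [3 4 5]]
print(first_idx)  # [0 1 3], index of the first occurrence of each unique row
```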
Assuming your 2D array is stored in the usual C order (that is, row-major order, where each row is an array or list within the main array), or that you transpose the array beforehand otherwise, you could do something like:
>>> import numpy as np
>>> a = np.array([[1, 2, 3], [2, 3, 4], [1, 2, 3], [3, 4, 5]])
>>> a
array([[1, 2, 3],
       [2, 3, 4],
       [1, 2, 3],
       [3, 4, 5]])
>>> np.array([np.array(x) for x in set(tuple(x) for x in a)]) # or "list(x) for x in set[...]"
array([[3, 4, 5],
       [2, 3, 4],
       [1, 2, 3]])
Of course, this doesn't really work if you need the unique rows in their original order.
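If the original order does matter, one possible variation on the same tuple trick is to use a dict instead of a set, since dicts keep insertion order in Python 3.7+. This is just a sketch with a made-up array:

```python
import numpy as np

a = np.array([[3, 4, 5], [1, 2, 3], [3, 4, 5], [2, 3, 4]])  # hypothetical input

# dict.fromkeys drops repeated keys but remembers the order of first appearance
ordered_unique = np.array(list(dict.fromkeys(map(tuple, a))))

print(ordered_unique)  # [[3 4 5] [1 2 3] [2 3 4]]
```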
By the way, to emulate something like unique(a, 'columns'), you'd just transpose the original array, do the step shown above, and then transpose back.
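A quick sketch of the transpose idea, here using np.unique with axis=0 as the row step (an assumption; the set-based step would work the same way apart from ordering), next to the direct axis=1 call for comparison:

```python
import numpy as np

a = np.array([[1, 2, 1],
              [3, 4, 3]])  # hypothetical input: columns 0 and 2 are identical

# Transpose, take unique rows, transpose back
cols_via_transpose = np.unique(a.T, axis=0).T

# Same thing asked for directly
cols_direct = np.unique(a, axis=1)

print(cols_via_transpose)  # [[1 2] [3 4]]
```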
You can try:
import numpy

# your_arr is your 2D NumPy array
ii = 0
wrk_arr = your_arr
idx = numpy.arange(len(wrk_arr))  # original index of each row still in wrk_arr
while ii < len(wrk_arr):
    # Flag every row that equals the candidate row ii
    i_dup = numpy.all(wrk_arr == wrk_arr[ii, :], axis=1)
    i_dup[ii] = False  # keep the candidate itself (its first appearance)
    # Drop the later duplicates from the rows and from their original indices
    wrk_arr = wrk_arr[~i_dup, :]
    idx = idx[~i_dup]
    ii += 1
The result is wrk_arr, which holds the unique rows of your_arr in their original order (not sorted). The relation is:
your_arr[idx,:] == wrk_arr
It works like MATLAB's 'stable' option in the sense that the returned array (wrk_arr) keeps the order of the original array (your_arr). The idx array contains the indices of first appearance, which differs from MATLAB versions that return the LAST appearance.
From my experience it worked as fast as MATLAB on a 10000 X 4 matrix.
And a transpose will do the trick for the column case.
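To check the relation between your_arr, idx and wrk_arr, here is a sketch that wraps a cleaned-up version of the loop in a function (unique_rows_first is a made-up name) and compares it against np.unique on random 10000 x 4 data. The random input and its value range are assumptions for the test only:

```python
import numpy as np

def unique_rows_first(arr):
    """Hypothetical helper: unique rows in order of first appearance, plus their indices."""
    wrk_arr = arr
    idx = np.arange(len(arr))
    ii = 0
    while ii < len(wrk_arr):
        # Flag later rows equal to candidate row ii, then drop them
        i_dup = np.all(wrk_arr == wrk_arr[ii, :], axis=1)
        i_dup[ii] = False
        wrk_arr = wrk_arr[~i_dup, :]
        idx = idx[~i_dup]
        ii += 1
    return wrk_arr, idx

rng = np.random.default_rng(0)
your_arr = rng.integers(0, 4, size=(10000, 4))  # many repeated rows on purpose

wrk_arr, idx = unique_rows_first(your_arr)

# The relation stated above: indexing with idx reproduces wrk_arr
relation_holds = np.array_equal(your_arr[idx, :], wrk_arr)

# Cross-check against NumPy's own first-occurrence indices
_, first_idx = np.unique(your_arr, axis=0, return_index=True)
matches_numpy = np.array_equal(idx, np.sort(first_idx))

print(relation_holds, matches_numpy)
```

The loop runs once per unique row, so it should slow down on data with few duplicates; the np.unique call is probably the safer default there.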