How to turn Numpy array to set efficiently?

Tags: python, set, numpy

I used:

df['ids'] = df['ids'].values.astype(set)

to turn the lists into sets, but the result still held lists rather than sets. A small example shows that astype(set) only changes the dtype to object:

>>> x = np.array([[1, 2, 2.5],[12,35,12]])
>>> x.astype(set)
array([[1.0, 2.0, 2.5],
       [12.0, 35.0, 12.0]], dtype=object)

Is there an efficient way to turn a list into a set in NumPy?

EDIT 1:
My input is as large as the following:
I have 3,000 records, each with 30,000 ids: [[1,...,12,13,...,30000], [1,..,43,45,...,30000],...,[...]]


asked Oct 18 '15 08:10

Alireza

2 Answers

First flatten your ndarray to obtain a one-dimensional array, then apply set() to it:

set(x.flatten())

Edit: since it seems you want one set per row rather than a single set of the whole array, you can use value = [set(v) for v in x] to obtain a list of sets.
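A minimal sketch of both variants, using the small array from the question. The names all_values and row_sets are illustrative only and do not come from the original post:

```python
import numpy as np

x = np.array([[1, 2, 2.5], [12, 35, 12]])

# One set built from every element of the flattened array.
all_values = set(x.flatten())

# One set per row; the sets may differ in size,
# so the result is a plain Python list rather than a 2-D array.
row_sets = [set(row) for row in x]

print(all_values)
print(row_sets)
```

The second row contains 12 twice, so its set has only two elements. For a pandas column whose cells are lists, as in the question, df['ids'].apply(set) should give the same per-row result, although that case is not tested here.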


answered Sep 22 '22 19:09

P. Camilleri

A couple of earlier 'row-wise' unique questions:

vectorize numpy unique for subarrays

Numpy: Row Wise Unique elements

Count unique elements row wise in an ndarray

In a couple of these the count is more interesting than the actual unique values.

If the number of unique values per row differs, then the result cannot be a (2d) array. That's a pretty good indication that the problem cannot be fully vectorized. You need some sort of iteration over the rows.
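As a rough illustration of that point, the sketch below iterates over the rows and calls np.unique on each one. The names unique_rows and counts are made up for this example:

```python
import numpy as np

x = np.array([[1, 2, 2.5], [12, 35, 12]])

# np.unique returns the sorted distinct values of a row as a 1-D array.
# The rows yield arrays of different lengths, so they are kept in a list.
unique_rows = [np.unique(row) for row in x]

# If only the number of distinct values per row matters, take the lengths.
counts = [len(u) for u in unique_rows]

print(unique_rows)
print(counts)
```

Here the first row keeps three values and the second only two, which is why the results cannot be stacked back into a regular 2-D array.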

answered Sep 22 '22 19:09

hpaulj
