
Creating a numpy array from a set

I noticed the following behaviour exhibited by numpy arrays:

>>> import numpy as np
>>> s = {1,2,3}
>>> l = [1,2,3]
>>> np.array(l)
array([1, 2, 3])
>>> np.array(s)
array({1, 2, 3}, dtype=object)
>>> np.array(l, dtype='int')
array([1, 2, 3])
>>> np.array(l, dtype='int').dtype
dtype('int64')
>>> np.array(s, dtype='int')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: int() argument must be a string, a bytes-like object or a number, not 'set'

There are 2 things to notice:

  1. Creating an array from a set results in the array dtype being object.
  2. Trying to specify dtype results in an error, which suggests that the set is being treated as a single element rather than as an iterable.
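
The second observation can be checked directly. This is a minimal sketch, assuming the same s and l as above; the names arr_from_set and arr_from_list are only illustrative, and the isinstance test against collections.abc.Sequence is a rough stand-in for whatever check np.array performs internally, not a description of it.

```python
import numpy as np
from collections.abc import Sequence

s = {1, 2, 3}
l = [1, 2, 3]

arr_from_set = np.array(s)
arr_from_list = np.array(l)

# A set is iterable, but it is not a Sequence: it has no indexing and no defined order.
set_is_sequence = isinstance(s, Sequence)
list_is_sequence = isinstance(l, Sequence)
print(set_is_sequence, list_is_sequence)

# The set ends up as the single element of a 0-dimensional object array,
# while the list becomes a 1-dimensional array of three integers.
print(arr_from_set.ndim, arr_from_set.shape)
print(arr_from_set.item() == s)
print(arr_from_list.ndim, arr_from_list.shape)
```

If the shape of the set-based array is () while the list-based one is (3,), the set was indeed treated as one element.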

What am I missing? I don't fully understand which bit of Python I'm overlooking. A set is a mutable object, much like a list.

EDIT: tuples work fine:

>>> t = (1,2,3)
>>> np.array(t)
array([1, 2, 3])
>>> np.array(t).dtype
dtype('int64')

s5s asked Oct 29 '25 01:10


1 Answer

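The np.array factory works best with sequence objects, which a set is not: a set is iterable but has no indexing or defined order, so np.array stores it whole as a single object element. If you do not care about the order of the elements and know they are all ints or convertible to int, you can use np.fromiter: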

np.fromiter({1, 2, 3}, dtype=int, count=3)
# array([1, 2, 3])

The second (dtype) argument is mandatory. The last (count) argument is optional, but providing it can improve performance because it lets fromiter preallocate the output array.
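
When the set is held in a variable, the count can be taken from len(). The sketch below also shows two alternatives that go through a list first; these are not part of the approach above, just common ways to hand np.array a real sequence. The variable names are illustrative only.

```python
import numpy as np

s = {3, 1, 2}

# count=len(s) tells fromiter how many items to expect;
# the element order follows the set's iteration order.
a = np.fromiter(s, dtype=int, count=len(s))

# Converting to a list first gives np.array a real sequence to work with.
b = np.array(list(s))

# sorted() returns a list in ascending order, for when a deterministic order matters.
c = np.array(sorted(s))

print(a.shape, b.shape)
print(c)  # [1 2 3]
```

Only the sorted() variant guarantees a particular element order; the other two depend on how the set happens to iterate.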

Paul Panzer answered Oct 31 '25 17:10