The following snippet creates a "typical test array" whose purpose is to test an assortment of things in my program. Is it possible to change the type of the elements in an array?
import numpy as np
import random
from random import uniform, randrange, choice

# ... bunch of silly code ...

def gen_test_array( ua, low_inc, med_inc, num_of_vectors ):
    #typical_array = [ zone_id, ua, inc, veh, pop, hh, with_se, is_cbd, re, se=0, oe]
    typical_array = np.zeros( shape = ( num_of_vectors, 11 ) )

    for i in range( 0, num_of_vectors ):
        typical_array[i] = [i, int( ua ), uniform( low_inc / 2, med_inc * 2 ), uniform( 0, 6 ),
                            randrange( 100, 5000 ), randrange( 100, 500 ),
                            choice( [True, False] ), choice( [True, False] ),
                            randrange( 100, 5000 ), randrange( 100, 5000 ),
                            randrange( 100, 5000 ) ]

    return typical_array
The way to do this in numpy is to use a structured array.
However, in many cases where you're using heterogeneous data, a simple Python list is a much better choice. (Or, though it wasn't widely available when this answer was written, a pandas.DataFrame is absolutely ideal for this scenario.)
Regardless, the example you gave above will work perfectly as a "normal" numpy array. You can just make everything a float in the example you gave. (Everything appears to be an int, except for two columns of floats... The bools can easily be represented as ints.)
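As a rough sketch of that "just make everything a float" approach, the function from the question could be written without the Python loop. The `seed` parameter and the use of `np.random.default_rng` are my own additions for reproducibility, not something from the question, and the bools are stored as 0.0 / 1.0:

```python
import numpy as np

def gen_test_array(ua, low_inc, med_inc, num_of_vectors, seed=None):
    """Build the test array as a plain 2-D float array, one row per zone."""
    rng = np.random.default_rng(seed)  # `seed` is a hypothetical extra parameter
    n = num_of_vectors
    columns = [
        np.arange(n),                              # zone_id
        np.full(n, int(ua)),                       # ua
        rng.uniform(low_inc / 2, med_inc * 2, n),  # inc
        rng.uniform(0, 6, n),                      # veh
        rng.integers(100, 5000, n),                # pop
        rng.integers(100, 500, n),                 # hh
        rng.integers(0, 2, n),                     # with_se, stored as 0 or 1
        rng.integers(0, 2, n),                     # is_cbd, stored as 0 or 1
        rng.integers(100, 5000, n),                # re
        rng.integers(100, 5000, n),                # se
        rng.integers(100, 5000, n),                # oe
    ]
    # Stack the 1-D columns side by side and force a single float dtype.
    return np.column_stack(columns).astype(float)

test_array = gen_test_array(ua=5, low_inc=0.5, med_inc=2.0,
                            num_of_vectors=100, seed=0)
```

The trade-off is that you have to remember which column index means what, and to convert the bool columns back yourself (for example with `test_array[:, 7].astype(bool)`).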
Nonetheless, to illustrate using structured dtypes...
import numpy as np

ua = 5 # No idea what "ua" is in your code above...
low_inc, med_inc = 0.5, 2.0 # Again, no idea what these are...

num = 100
num_fields = 11

# Use more descriptive names than "col1"! I'm just generating the names as placeholders
# (The np.int / np.float / np.bool aliases were removed in NumPy 1.24, so use the builtins.)
dtype = {'names':['col%i'%i for i in range(num_fields)],
         'formats':2*[int] + 2*[float] + 2*[int] + 2*[bool] + 3*[int]}
data = np.zeros(num, dtype=dtype)

# Being rather verbose...
data['col0'] = np.arange(num, dtype=int)
data['col1'] = int(ua) * np.ones(num)
data['col2'] = np.random.uniform(low_inc / 2, med_inc * 2, num)
data['col3'] = np.random.uniform(0, 6, num)
data['col4'] = np.random.randint(100, 5000, num)
data['col5'] = np.random.randint(100, 500, num)
data['col6'] = np.random.randint(0, 2, num).astype(bool)
data['col7'] = np.random.randint(0, 2, num).astype(bool)
data['col8'] = np.random.randint(100, 5000, num)
data['col9'] = np.random.randint(100, 5000, num)
data['col10'] = np.random.randint(100, 5000, num)

print(repr(data))
Which yields a 100-element array with 11 fields:
array([ (0, 5, 2.0886534380436226, 3.0111285613794276, 3476, 117, False, False, 4704, 4372, 4062),
(1, 5, 2.0977199579338115, 1.8687472941590277, 4635, 496, True, False, 4079, 4263, 3196),
...
...
(98, 5, 1.1682309811443277, 1.4100766819689299, 1213, 135, False, False, 1250, 2534, 1160),
(99, 5, 1.746554619056416, 5.210411489007637, 1387, 352, False, False, 3520, 3772, 3249)],
dtype=[('col0', '<i8'), ('col1', '<i8'), ('col2', '<f8'), ('col3', '<f8'), ('col4', '<i8'), ('col5', '<i8'), ('col6', '|b1'), ('col7', '|b1'), ('col8', '<i8'), ('col9', '<i8'), ('col10', '<i8')])
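Once the data is in a structured array, you index it by field name to get a whole column and by integer to get a whole record. Here is a small sketch of that, using a shortened, hypothetical layout with just three of the question's fields and made-up values:

```python
import numpy as np

# Hypothetical, shortened layout: three of the fields from the question.
dtype = [('zone_id', int), ('inc', float), ('is_cbd', bool)]
data = np.zeros(4, dtype=dtype)
data['zone_id'] = np.arange(4)
data['inc'] = [1.5, 0.25, 3.0, 2.0]
data['is_cbd'] = [True, False, True, False]

incomes = data['inc']              # one field: a 1-D float array
first_row = data[0]                # one record holding all three fields
cbd_zones = data[data['is_cbd']]   # boolean mask selects whole records
mean_cbd_inc = cbd_zones['inc'].mean()
```

Each field keeps its own type, so `data['is_cbd']` can be used directly as a mask without any conversion.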
Quoting the first line of chapter 1 of the NumPy reference:
NumPy provides an N-dimensional array type, the ndarray, which describes a collection of “items” of the same type.
So every member of the array has to be of the same type. The loss of generality here, as compared to regular Python lists, is the trade-off that allows high-speed operations on arrays: loops can run without testing the type of each member.
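A quick way to see this for yourself (a minimal sketch, with values I made up): a list keeps each element's own type, while NumPy picks one common type when it builds the array and converts anything you assign later.

```python
import numpy as np

mixed_list = [1, 2.5, True]

# A Python list keeps each element's own type.
list_types = [type(x).__name__ for x in mixed_list]

# NumPy converts everything to one common dtype (float64 here).
arr = np.array(mixed_list)

# Assigning into an existing array converts the value to the array's dtype,
# so the fractional part is dropped when storing into an int array.
ints = np.zeros(3, dtype=int)
ints[0] = 2.9
```

Here `arr.dtype` is `float64` and `True` has become `1.0`, while `ints[0]` ends up as `2`.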