I'm a complete newcomer to Python, but it seems that a given string can be of (effectively) arbitrary length, i.e. you can take a string str and keep adding to it: str += "some stuff...". Is there a way to make an array of such strings?
When I try this, each element only stores a single character:
strArr = numpy.empty(10, dtype='string')
for i in range(0,10):
    strArr[i] = "test"
On the other hand, I know I can initialize an array of strings of a certain length, e.g.
strArr = numpy.empty(10, dtype='S256')
which can store 10 strings of up to 256 characters.
The elements of a NumPy array, or simply an array, are usually numbers, but can also be booleans, strings, or other objects.
You can get the number of dimensions, the shape (length of each dimension), and the size (total number of elements) of a NumPy array with the ndim, shape, and size attributes of numpy.ndarray. The built-in function len() returns the size of the first dimension.
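As a small illustration of those attributes, here is a minimal sketch. The array `a` and its 3x4 shape are made up for the example and do not come from the question.

```python
import numpy as np

# Hypothetical 3x4 array, used only to show the attributes.
a = np.zeros((3, 4))

ndim = a.ndim      # number of dimensions
shape = a.shape    # length of each dimension
size = a.size      # total number of elements
first = len(a)     # size of the first dimension only

print(ndim, shape, size, first)  # 2 (3, 4) 12 3
```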
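The size of a NumPy array can be changed with the resize() function of the NumPy library. The ndarray.resize() method additionally takes a refcheck argument, a boolean that controls whether the array's reference count is checked before resizing in place.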
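A rough sketch of the two resize variants follows. The arrays and target shapes are invented for the example, and passing refcheck=False is just one choice that skips the reference count check; it is only safe when nothing else shares the array's memory.

```python
import numpy as np

a = np.arange(6)

# numpy.resize returns a NEW array; if the new shape is larger,
# it is filled with repeated copies of the original data.
b = np.resize(a, (2, 4))

# ndarray.resize works in place; extra slots are filled with zeros.
# refcheck=False skips the reference count check.
c = np.arange(6)
c.resize((2, 4), refcheck=False)

print(b.tolist())  # [[0, 1, 2, 3], [4, 5, 0, 1]]
print(c.tolist())  # [[0, 1, 2, 3], [4, 5, 0, 0]]
```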
You can do so by creating an array of dtype=object. If you try to assign a long string to a normal numpy array, it truncates the string:
>>> a = numpy.array(['apples', 'foobar', 'cowboy'])
>>> a[2] = 'bananas'
>>> a
array(['apples', 'foobar', 'banana'],
      dtype='|S6')
But when you use dtype=object, you get an array of python object references. So you can have all the behaviors of python strings:
>>> a = numpy.array(['apples', 'foobar', 'cowboy'], dtype=object)
>>> a
array([apples, foobar, cowboy], dtype=object)
>>> a[2] = 'bananas'
>>> a
array([apples, foobar, bananas], dtype=object)
Indeed, because it's an array of objects, you can assign any kind of python object to the array:
>>> a[2] = {1:2, 3:4}
>>> a
array([apples, foobar, {1: 2, 3: 4}], dtype=object)
However, this undoes a lot of the benefits of using numpy, which is so fast because it works on large contiguous blocks of raw memory. Working with python objects adds a lot of overhead. A simple example:
>>> a = numpy.array(['abba' for _ in range(10000)])
>>> b = numpy.array(['abba' for _ in range(10000)], dtype=object)
>>> %timeit a.copy()
100000 loops, best of 3: 2.51 us per loop
>>> %timeit b.copy()
10000 loops, best of 3: 48.4 us per loop
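The %timeit lines above only work inside IPython. Outside of it, a comparable measurement could be sketched with the standard timeit module, as below. The names fixed, objs, t_fixed and t_objs are chosen for this example, and the actual timings will vary by machine and NumPy version, so no expected numbers are shown.

```python
import timeit

import numpy as np

# Same data twice: once as a fixed-width string array, once as object references.
fixed = np.array(['abba' for _ in range(10000)])
objs = np.array(['abba' for _ in range(10000)], dtype=object)

# Time 1000 copies of each array.
t_fixed = timeit.timeit(fixed.copy, number=1000)
t_objs = timeit.timeit(objs.copy, number=1000)

print(f"fixed-width copy: {t_fixed:.4f} s for 1000 runs")
print(f"object copy:      {t_objs:.4f} s for 1000 runs")
```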
You could use the object data type:
>>> import numpy
>>> s = numpy.array(['a', 'b', 'dude'], dtype='object')
>>> s[0] += 'bcdef'
>>> s
array([abcdef, b, dude], dtype=object)
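Applied to the loop from the question, the same idea might look like the sketch below. The name str_arr and the appended text are made up for illustration; the only change from the question's code is dtype=object.

```python
import numpy as np

# An object array holds references, so each slot can point to a string of any length.
# With dtype=object, numpy.empty fills every slot with None.
str_arr = np.empty(10, dtype=object)

for i in range(10):
    str_arr[i] = "test"

# Appending rebinds the slot to a new, longer Python string; nothing is truncated.
str_arr[0] += " plus a suffix that is much longer than the original four characters"

print(str_arr[0])
print(str_arr[1])
```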