
Numpy converting array from float to strings

I have an array of floats that I have normalised to one (i.e. the largest number in the array is 1), and I want to use it as colour indices for a graph. Matplotlib specifies grayscale colours as strings of numbers between 0 and 1, so I wanted to convert the array of floats to an array of strings. I attempted this with "astype('str')", but that appears to produce some values that are not the same as (or even close to) the originals.

I notice this because matplotlib complains about finding the number 8 in the array, which is odd as it was normalised to one!

In short, I have an array phis, of float64, such that:

numpy.where(phis.astype('str').astype('float64') != phis) 

is non-empty. This is puzzling: (hopefully naively) it appears to be a bug in numpy. Is there anything I could have done wrong to cause this?

Edit: after investigation, this appears to be due to the way the string conversion handles high-precision floats. The same happens with a vectorized toString function (as in robbles' answer); however, if the lambda function is:

lambda x: "%.2f" % x 

Then the graphing works - curiouser and curiouser. (Obviously the arrays are no longer equal however!)
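To make the last remark concrete, here is a small sketch in plain Python 3 (the sample values are invented for illustration). A full-precision repr can be converted back to exactly the same float, whereas a two-decimal string usually cannot:

```python
# Hypothetical sample values between 0 and 1.
values = [1.0 / 3.0, 0.1, 8.1e-05, 1.0]

# Full-precision strings: in Python 3, float(repr(v)) == v for finite floats.
full = [repr(v) for v in values]
same_full = [float(s) == v for s, v in zip(full, values)]

# Two-decimal strings: compact, but information is lost.
short = ["%.2f" % v for v in values]
same_short = [float(s) == v for s, v in zip(short, values)]

print(full, same_full)
print(short, same_short)
```

The two-decimal strings are still valid numbers between 0 and 1, which is all a grayscale colour needs.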

V.S. asked Mar 19 '11 23:03



2 Answers

You seem a bit confused as to how numpy arrays work behind the scenes. Each item in an array must be the same size.

The string representation of a float doesn't work this way. For example, repr(1.3) yields '1.3', but on older Python versions repr(1.33) yields '1.3300000000000001' (since Python 2.7 and 3.1, repr returns the shortest string that round-trips, here '1.33').

An accurate string representation of a floating-point number is a variable-length string.

Because numpy arrays consist of elements that are all the same size, numpy requires you to specify the length of the strings within the array when you're using string arrays.

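If you use x.astype('str'), it will always convert things to an array of strings of length 1. (This is the behaviour of the NumPy version this was written against; more recent releases choose a longer string length automatically.)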

For example, using x = np.array(1.344566), x.astype('str') yields '1'!
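The truncation can be imitated in plain Python by keeping only the first character of each float's string form. This is only a sketch of the effect, not NumPy's actual conversion code, and the sample values are made up. It also suggests, as an assumption rather than a confirmed diagnosis, how a stray 8 could appear in data normalised to one: a small value such as 8.1e-05 is written in scientific notation, so its first character is '8'.

```python
# Hypothetical sample values, all between 0 and 1.
values = [1.0, 0.25, 8.1e-05]

# Full string form of each float.
full = [str(v) for v in values]

# Keep only the first character, imitating a length-1 string dtype.
truncated = [s[:1] for s in full]

print(full)       # ['1.0', '0.25', '8.1e-05']
print(truncated)  # ['1', '0', '8']
```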

You need to be more explicit and use the '|Sx' dtype syntax, where x is the length of the string for each element of the array.

For example, use x.astype('|S10') to convert the array to strings of length 10.
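A minimal sketch of that conversion, assuming NumPy is installed and using invented sample values:

```python
import numpy as np

# Hypothetical sample values.
x = np.array([1.344566, 0.25, 0.5])

# Fixed-width byte strings: 10 bytes are reserved for every element.
strings = x.astype('|S10')

print(strings.dtype.itemsize)  # bytes reserved per element
print(strings)
```

On Python 3 the elements of an 'S' array are bytes rather than str, and the width has to be large enough for the longest value you expect.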

Even better, just avoid using numpy arrays of strings altogether. It's usually a bad idea, and there's no reason I can see from your description of your problem to use them in the first place...

Joe Kington answered Sep 17 '22 12:09


If you have an array of numbers and you want an array of strings, you can write:

strings = ["%.2f" % number for number in numbers] 

If your numbers are floats, the result is a list of the same numbers formatted as strings with two decimal places.

>>> a = [1,2,3,4,5]
>>> min_a, max_a = min(a), max(a)
>>> a_normalized = [float(x-min_a)/(max_a-min_a) for x in a]
>>> a_normalized
[0.0, 0.25, 0.5, 0.75, 1.0]
>>> a_strings = ["%.2f" % x for x in a_normalized]
>>> a_strings
['0.00', '0.25', '0.50', '0.75', '1.00']

Notice that it also works with numpy arrays:

>>> a = numpy.array([0.0, 0.25, 0.5, 0.75, 1.0])
>>> print(["%.2f" % x for x in a])
['0.00', '0.25', '0.50', '0.75', '1.00']

A similar methodology can be used if you have a multi-dimensional array:

new_array = numpy.array(["%.2f" % x for x in old_array.reshape(old_array.size)])
new_array = new_array.reshape(old_array.shape)

Example:

>>> x = numpy.array([[0,0.1,0.2],[0.3,0.4,0.5],[0.6, 0.7, 0.8]])
>>> y = numpy.array(["%.2f" % w for w in x.reshape(x.size)])
>>> y = y.reshape(x.shape)
>>> print(y)
[['0.00' '0.10' '0.20']
 ['0.30' '0.40' '0.50']
 ['0.60' '0.70' '0.80']]
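As an alternative to flattening and reshaping by hand, NumPy's numpy.char.mod applies a format string to every element and keeps the array's shape. This is a sketch of a different route to the same result, not part of the approach above:

```python
import numpy as np

x = np.array([[0, 0.1, 0.2], [0.3, 0.4, 0.5], [0.6, 0.7, 0.8]])

# Apply "%.2f" % value to every element; the shape of x is preserved.
y = np.char.mod("%.2f", x)

print(y.shape)
print(y)
```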

If you check the Matplotlib example for the function you are using, you will notice it uses a similar methodology: build an empty array and fill it with strings, one element at a time. The relevant part of the referenced code is:

colortuple = ('y', 'b')
colors = np.empty(X.shape, dtype=str)
for y in range(ylen):
    for x in range(xlen):
        colors[x, y] = colortuple[(x + y) % len(colortuple)]

surf = ax.plot_surface(X, Y, Z, rstride=1, cstride=1, facecolors=colors,
        linewidth=0, antialiased=False)

Escualo answered Sep 20 '22 12:09