Numpy converting array from float to strings

I have an array of floats that I have normalised to one (i.e. the largest number in the array is 1), and I want to use it as colour indices for a graph. Matplotlib's grayscale colours are given as strings of numbers between 0 and 1, so I wanted to convert the array of floats to an array of strings. I attempted this with astype('str'), but it appears to create some values that are not the same as (or even close to) the originals.

I noticed this because matplotlib complains about finding the number 8 in the array, which is odd as it was normalised to one!

In short, I have an array phis, of float64, such that:

numpy.where(phis.astype('str').astype('float64') != phis)

is non-empty. This is puzzling as (hopefully naively) it appears to be a bug in numpy. Is there anything that I could have done wrong to cause this?

Edit: after investigation, this appears to be due to the way the string function handles high-precision floats. The same happens with a vectorized toString function (as in robbles' answer). However, if the lambda function is:

lambda x: "%.2f" % x

then the graphing works. Curiouser and curiouser. (Obviously the arrays are no longer equal, however!)

625

asked Mar 19 '11 23:03

V.S.

2 Answers

You seem a bit confused as to how numpy arrays work behind the scenes. Each item in an array must be the same size.

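The string representation of a float doesn't work this way. For example, repr(1.3) yields '1.3', but repr(1.33) yields '1.3300000000000001' (on Python versions before 2.7 and 3.1; newer versions print '1.33', but the length of the result still depends on the value).

An accurate string representation of a floating point number produces a variable-length string.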
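To make the point concrete, here is a small pure-Python sketch (the variable names and sample values are mine, not from the question). It assumes Python 3, where repr gives the shortest string that converts back to the same float. It shows that the faithful string varies in length from value to value, and that a fixed format such as "%.2f" buys a uniform width at the cost of exactness.

```python
# Sample values chosen for illustration only.
values = [0.5, 1.0 / 3.0, 0.1 + 0.2]

# repr() gives a string that converts back to exactly the same float,
# but the number of characters needed differs from value to value.
exact = [repr(v) for v in values]
lengths = [len(s) for s in exact]
round_trips = all(float(s) == v for s, v in zip(exact, values))

# A fixed format gives a uniform width but discards precision.
fixed = ["%.2f" % v for v in values]

print(exact, lengths, round_trips)
print(fixed)
```

This is also why the comparison in the question can come out non-empty once the strings are shortened: the shortened text no longer parses back to the original float.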

Because numpy arrays consist of elements that are all the same size, numpy requires you to specify the length of the strings within the array when you're using string arrays.

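If you use x.astype('str'), it will always convert things to an array of strings of length 1 (at least in the NumPy versions current when this was written; newer releases pick a wider string type automatically).

For example, using x = np.array(1.344566), x.astype('str') then yields '1'!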

You need to be more explicit and use the '|Sx' dtype syntax, where x is the length of the string for each element of the array.

For example, use x.astype('|S10') to convert the array to strings of length 10.
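A minimal sketch of the explicit-width approach, assuming Python 3 and a reasonably recent NumPy. Under Python 3 the '|S10' dtype holds byte strings, so the sketch uses the Unicode equivalent 'U10', which gives ordinary text strings. The array contents are made up for illustration.

```python
import numpy as np

x = np.array([1.344566, 0.5, 0.25])  # made-up data for the sketch

# Ask for room for 10 characters per element instead of relying on
# whatever default width 'str' implies.
as_text = x.astype('U10')

print(as_text)
print(as_text.dtype)
```

Any value whose text needs more than 10 characters would not fit, so the width has to be chosen with the data in mind.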

Even better, just avoid using numpy arrays of strings altogether. It's usually a bad idea, and there's no reason I can see from your description of your problem to use them in the first place...

121

answered Sep 17 '22 12:09

Joe Kington

If you have an array of numbers and you want an array of strings, you can write:

strings = ["%.2f" % number for number in numbers]

If your numbers are floats, the result is a list of the same numbers as strings with two decimal places.

>>> a = [1,2,3,4,5]
>>> min_a, max_a = min(a), max(a)
>>> a_normalized = [float(x-min_a)/(max_a-min_a) for x in a]
>>> a_normalized
[0.0, 0.25, 0.5, 0.75, 1.0]
>>> a_strings = ["%.2f" % x for x in a_normalized]
>>> a_strings
['0.00', '0.25', '0.50', '0.75', '1.00']

Notice that it also works with numpy arrays:

>>> a = numpy.array([0.0, 0.25, 0.5, 0.75, 1.0])
>>> print(["%.2f" % x for x in a])
['0.00', '0.25', '0.50', '0.75', '1.00']

A similar methodology can be used if you have a multi-dimensional array:

new_array = numpy.array(["%.2f" % x for x in old_array.reshape(old_array.size)])
new_array = new_array.reshape(old_array.shape)

Example:

>>> x = numpy.array([[0,0.1,0.2],[0.3,0.4,0.5],[0.6, 0.7, 0.8]])
>>> y = numpy.array(["%.2f" % w for w in x.reshape(x.size)])
>>> y = y.reshape(x.shape)
>>> print(y)
[['0.00' '0.10' '0.20']
 ['0.30' '0.40' '0.50']
 ['0.60' '0.70' '0.80']]
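As an alternative sketch, NumPy's np.char.mod applies %-formatting element-wise and keeps the array's shape, so the flatten-and-reshape step isn't needed. This is not part of the answer above, and it assumes a NumPy version that provides the np.char namespace.

```python
import numpy as np

x = np.array([[0, 0.1, 0.2], [0.3, 0.4, 0.5], [0.6, 0.7, 0.8]])

# Apply "%.2f" % value to every element; the result has the same shape as x.
y = np.char.mod("%.2f", x)

print(y.shape)
print(y)
```

The list comprehension from the answer does the same job and may be easier to read; this form is mainly convenient when the array has more than one dimension.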

If you check the Matplotlib example for the function you are using, you will notice it uses a similar methodology: build an empty array and fill it with strings element by element (there, single-character colour codes rather than formatted numbers). The relevant part of the referenced code is:

colortuple = ('y', 'b')
colors = np.empty(X.shape, dtype=str)
for y in range(ylen):
    for x in range(xlen):
        colors[x, y] = colortuple[(x + y) % len(colortuple)]

surf = ax.plot_surface(X, Y, Z, rstride=1, cstride=1, facecolors=colors,
        linewidth=0, antialiased=False)
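One caveat with this pattern, shown as a hedged sketch rather than anything taken from the Matplotlib example: np.empty(..., dtype=str) allocates one-character strings. That is fine for codes like 'y' and 'b', but a grayscale string such as '0.25' is silently cut down to its first character. Giving the dtype an explicit width avoids that; the 'U4' here is my assumption, sized for "%.2f" output between 0 and 1.

```python
import numpy as np

# dtype=str with no length means one character per element.
narrow = np.empty((2, 2), dtype=str)
narrow[:] = "0.25"   # only the first character is kept

# Room for four characters, enough for strings like '0.25' or '1.00'.
wide = np.empty((2, 2), dtype="U4")
wide[:] = "0.25"

print(narrow[0, 0], wide[0, 0])
```

A plain Python list of strings, or an array with dtype=object, sidesteps the width question entirely.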

answered Sep 20 '22 12:09

Escualo
