
What is the difference between dtype= and .astype() in numpy?

Tags:

python-3.x

numpy

Context: I would like to use numpy ndarrays with float32 instead of float64.

Edit: Additional context - I'm concerned about how numpy is executing these calls because they will be happening repeatedly as part of a backpropagation routine in a neural net. I'd like the net to carry out all addition/subtraction/multiplication/division in float32 for validation purposes, as I want to compare results with another group's work. It seems like initialization for methods like randn will always go from float64 -> float32 with .astype() casting. Once my ndarray is of type float32, if I use np.dot, for example, will those multiplications happen in float32? How can I verify?

The documentation is not clear to me - http://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html

I figured out I can just add .astype('float32') to the end of a numpy call, for example, np.random.randn(y, 1).astype('float32').

I also see that dtype=np.float32 is an option, for example, np.zeros(5, dtype=np.float32). However, trying np.random.randn((y, 1), dtype=np.float32) returns the following error:

    b = np.random.randn((3,1), dtype=np.float32)
TypeError: randn() got an unexpected keyword argument 'dtype'

What is the difference between declaring the type as float32 using dtype and using .astype()?

Both b = np.zeros(5, dtype=np.float32) and b = np.zeros(5).astype('float32'), when evaluated with:

print(b)
print(type(b))
print(b[0])
print(type(b[0]))

print:

[ 0.  0.  0.  0.  0.]
<class 'numpy.ndarray'>
0.0
<class 'numpy.float32'>


asked Sep 21 '16 17:09

phoenixdown

1 Answer

Let's see if I can address some of the confusion I'm seeing in the comments.

Make an array:

In [609]: x=np.arange(5)
In [610]: x
Out[610]: array([0, 1, 2, 3, 4])
In [611]: x.dtype
Out[611]: dtype('int32')

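On this machine the default for arange is to make an int32 array. The default integer dtype is platform-dependent; on many systems it is int64.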

astype is an array method; it can be used on any array:

In [612]: x.astype(np.float32)
Out[612]: array([ 0.,  1.,  2.,  3.,  4.], dtype=float32)

arange also takes a dtype parameter:

In [614]: np.arange(5, dtype=np.float32)
Out[614]: array([ 0.,  1.,  2.,  3.,  4.], dtype=float32)

Whether it creates the int array first and converts it, or makes the float32 array directly, isn't any concern to me. This is a basic operation, done in compiled code.

I can also give it a float stop value, in which case it will give me a float array - the default float type.

In [615]: np.arange(5.0)
Out[615]: array([ 0.,  1.,  2.,  3.,  4.])
In [616]: _.dtype
Out[616]: dtype('float64')

zeros is similar; the default dtype is float64, but with a parameter I can change that. Since its primary task is to allocate memory, and it doesn't have to do any calculation, I'm sure it creates the desired dtype right away, without further conversion. But again, this is compiled code, and I shouldn't have to worry about what it is doing under the covers.

In [618]: np.zeros(5)
Out[618]: array([ 0.,  0.,  0.,  0.,  0.])
In [619]: _.dtype
Out[619]: dtype('float64')
In [620]: np.zeros(5,dtype=np.float32)
Out[620]: array([ 0.,  0.,  0.,  0.,  0.], dtype=float32)

randn involves a lot of calculation, and evidently it is compiled to work with the default float type. It does not take a dtype. But since the result is an array, it can be cast with astype.

In [623]: np.random.randn(3)
Out[623]: array([-0.64520949,  0.21554705,  2.16722514])
In [624]: _.dtype
Out[624]: dtype('float64')
In [625]: __.astype(np.float32)
Out[625]: array([-0.64520949,  0.21554704,  2.16722512], dtype=float32)
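
As an aside on randn: the newer Generator interface can produce float32 samples without the intermediate float64 array. This is a minimal sketch, assuming NumPy 1.17 or later; the seed and shape are arbitrary and not taken from the question.

```python
import numpy as np

# Assumption: NumPy >= 1.17, which provides np.random.default_rng.
rng = np.random.default_rng(0)  # arbitrary seed, for repeatability only

# Generator.standard_normal accepts dtype=np.float32 (or np.float64) directly.
w = rng.standard_normal((3, 1), dtype=np.float32)

# Legacy route from the question: a float64 array first, then a cast.
w_cast = np.random.randn(3, 1).astype(np.float32)

print(w.dtype, w_cast.dtype)
```

The two routes draw from different generators, so the values will not match; only the resulting dtype is comparable.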

Let me stress that astype is a method of an array. It takes the values of the array and produces a new array with the desired dtype. It does not act retroactively (or in-place) on the array itself, or on the function that created that array.
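
A short sketch of that point, with made-up variable names: the original array keeps its dtype, and the result of astype has to be assigned to something to be of any use.

```python
import numpy as np

a = np.zeros(3)              # float64 by default
b = a.astype(np.float32)     # a new array; a itself is untouched

print(a.dtype, b.dtype)          # float64 float32
print(np.shares_memory(a, b))    # False

# By default astype copies even when the dtype already matches.
c = b.astype(np.float32)
print(c is b)                    # False

# With copy=False, astype may return the same array when no conversion is needed.
d = b.astype(np.float32, copy=False)
print(d is b)                    # True
```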

The effect of astype is often (always?) the same as a dtype parameter, but the sequence of actions is different.

In https://stackoverflow.com/a/39625960/901925 I describe a sparse matrix creator that takes a dtype parameter, and implements it with an astype method call at the end.

When you do calculations such as dot or *, NumPy tries to match the output dtype with the inputs. In the case of mixed types it goes with the higher-precision alternative.

In [642]: np.arange(5,dtype=np.float32)*np.arange(5,dtype=np.float64)
Out[642]: array([  0.,   1.,   4.,   9.,  16.])
In [643]: _.dtype
Out[643]: dtype('float64')
In [644]: np.arange(5,dtype=np.float32)*np.arange(5,dtype=np.float32)
Out[644]: array([  0.,   1.,   4.,   9.,  16.], dtype=float32)
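
To check what dot returns for float32 inputs, inspecting the result dtype is a reasonable first step. This is a small sketch with made-up shapes; it only shows the dtype of the result, not the precision used internally.

```python
import numpy as np

a = np.ones((4, 3), dtype=np.float32)
b = np.ones((3, 2), dtype=np.float32)

print(np.dot(a, b).dtype)      # float32
print((a @ b).dtype)           # float32
print((a * 2.0).dtype)         # float32: a Python float does not upcast the array

# An integer array of the default size does force an upcast.
idx = np.arange(3)             # int32 or int64, depending on platform
print((a * idx).dtype)         # float64
print(np.result_type(a, idx))  # float64
```

In a training loop, an assertion such as `assert out.dtype == np.float32` after each step is a cheap way to catch an accidental upcast.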

There are casting rules. One way to look those up is with the can_cast function:

In [649]: np.can_cast(np.float64,np.float32)
Out[649]: False
In [650]: np.can_cast(np.float32,np.float64)
Out[650]: True
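
The False above reflects can_cast's default rule, casting='safe'. A brief sketch of the related calls, for illustration only:

```python
import numpy as np

# The default casting='safe' forbids casts that can lose precision.
print(np.can_cast(np.float64, np.float32))                       # False

# 'same_kind' allows downcasts within a kind, e.g. float64 -> float32.
print(np.can_cast(np.float64, np.float32, casting='same_kind'))  # True

# promote_types returns the smallest dtype both inputs can be safely cast to.
print(np.promote_types(np.float32, np.float64))                  # float64
print(np.promote_types(np.float32, np.float32))                  # float32
```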

It is possible that some calculations cast the float32 inputs to float64, do the calculation, and then cast the result back to float32. The purpose would be to avoid rounding errors. But I don't know how you would find that out from the documentation or tests.
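
One empirical check, offered as a sketch rather than a definitive test, is to compare a float32 result against a float64 reference. This does not reveal what the underlying routine does internally; it only measures how far the float32 result is from the higher-precision one. The seed and shapes are arbitrary, and NumPy 1.17 or later is assumed for default_rng.

```python
import numpy as np

rng = np.random.default_rng(0)   # arbitrary seed
a = rng.standard_normal((50, 50), dtype=np.float32)
b = rng.standard_normal((50, 50), dtype=np.float32)

c32 = a @ b                                         # result stays float32
c64 = a.astype(np.float64) @ b.astype(np.float64)   # float64 reference

# Largest absolute deviation of the float32 product from the reference.
max_diff = np.max(np.abs(c32 - c64))
print(c32.dtype, c64.dtype, max_diff)

# Reductions accept a dtype argument that sets the accumulator type.
s32 = a.sum(dtype=np.float32)
s64 = a.sum(dtype=np.float64)
print(s32.dtype, s64.dtype)
```

When comparing against another group's float32 results, the same kind of difference measurement can be applied to their output and yours.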


answered Nov 15 '22 04:11

hpaulj
