Python & Pandas - pd.Series difference between int32 and int64

Tags: python, pandas, numpy, data-analysis

I'm starting to learn Python, numpy and pandas, and I have a really basic question about integer sizes.

Please see the following two code blocks:

1. Length: 6, dtype: int64

# create a Series from a dict
pd.Series({key: value for key, value in zip('abcdef', range(6))})

vs.

2. Length: 6, dtype: int32

# but why does this generate a smaller integer size???
pd.Series(range(6), index=list('abcdef'))

Question: I think that when you pass a list, numpy array, dictionary, etc. to pd.Series you get int64, but when you pass just range(6) you get int32. Can someone please explain why?

Sorry for the very basic question.

Edit: I'm using pandas 0.20.1 and numpy 1.12.1.

asked Sep 15 '17 13:09

Mike Evers

1 Answer

They're semantically different. In the first version you pass a dict of scalar values, so pandas works out the dtype itself and it becomes int64. In the second, you pass a range, which can be trivially converted to a numpy array, and that array is int32:

In[57]:
np.array(range(6)).dtype

Out[57]: dtype('int32')

So constructing the pandas Series involves dtype matching in the first case and none in the second, because a range is directly convertible to a numpy array and numpy has determined that int32 is preferred in this case.
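As a minimal sketch of how to compare the two constructions on your own machine (the variable names are only illustrative), you can print both dtypes and, if the result matters, request the dtype explicitly instead of relying on inference. The inferred dtypes may or may not match, depending on your platform and library versions:

```python
import numpy as np
import pandas as pd

# The two constructions from the question.
from_dict = pd.Series({key: value for key, value in zip('abcdef', range(6))})
from_range = pd.Series(range(6), index=list('abcdef'))

# These may or may not agree, depending on OS and numpy/pandas versions.
print(from_dict.dtype, from_range.dtype)

# Passing dtype explicitly removes the dependence on inference.
explicit = pd.Series(range(6), index=list('abcdef'), dtype='int64')
print(explicit.dtype)
```

Whichever dtype is inferred, the values and index are the same in both Series; only the storage width differs.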

Update

It looks like this is dependent on your numpy version and maybe your pandas version. I'm running Python 3.6, numpy 1.12.1 and pandas 0.20.3 on 64-bit Windows 7, and I get the above result.

@jeremycg is running pandas 0.19.2 and numpy 1.11.2 and observes the same result, whilst @coldspeed is running numpy 1.13.1 and observes int64.

The takeaway from this is that the dtype will largely be determined by what numpy does.

I believe this is the line that gets called when we pass a range in this case:

subarr = np.array(arr, dtype=object, copy=copy)

The returned type is determined by numpy and the OS: in my case, Windows defines a C long as 32 bits. See the related question: "numpy array dtype is coming as int32 by default in a windows 10 64 bit machine".
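A quick way to check this on a given machine is to compare the size of a C long with the integer dtype numpy picks by default. This is only a sketch using the standard ctypes module. As far as I know, numpy releases before 2.0 map the default integer to C long, and numpy 2.0 switched the default to 64 bits on 64-bit Windows, so the two numbers need not agree on newer versions:

```python
import ctypes

import numpy as np

# Size of a C long on this platform (32 bits on Windows, 64 bits on most 64-bit Linux/macOS builds).
c_long_bits = ctypes.sizeof(ctypes.c_long) * 8

# The integer dtype numpy chooses when none is requested.
default_int = np.array(range(6)).dtype

print(f"C long: {c_long_bits} bits, numpy default integer: {default_int}")
```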

answered Sep 21 '22 13:09

EdChum
