I'm not sure when to use int() and when to use .astype('int'). Can anyone explain?
Is it just that int() is used for single values and .astype('int') is used for vectors? I'm coming from an R background, so I'm used to using as.integer.
The astype() method returns a new DataFrame where the data types have been changed to the specified type. You can cast the entire DataFrame to one specific data type, or you can use a Python dictionary to specify a data type for each column, like this: { 'Duration': 'int64', 'Pulse': 'float', 'Calories': 'int64' }
Use a numpy.dtype or Python type to cast the entire pandas object to the same type. Alternatively, use {col: dtype, …}, where col is a column label and dtype is a numpy.dtype or Python type, to cast one or more of the DataFrame's columns to column-specific types.
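As a rough illustration of the two forms described above, here is a minimal sketch. The DataFrame contents are made up for the example; only the column names and target types come from the dictionary quoted above.

```python
import pandas as pd

# Hypothetical workout data, invented for illustration only.
df = pd.DataFrame({
    "Duration": [30.0, 45.0, 60.0],
    "Pulse": [80, 95, 110],
    "Calories": [240.5, 282.0, 300.9],
})

# Form 1: cast every column to the same type.
all_float = df.astype("float64")

# Form 2: cast selected columns to column-specific types.
# astype returns a new DataFrame; df itself is left unchanged.
converted = df.astype({"Duration": "int64", "Pulse": "float", "Calories": "int64"})

print(converted.dtypes)
print(converted)
```

Note that casting a float column to int64 this way drops the fractional part rather than rounding.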
.astype() is a method within numpy.ndarray, as well as the Pandas Series class, so can be used to convert vectors, matrices and columns within a DataFrame. However, int() is a pure-Python function that can only be applied to scalar values.
For example, you can do int(3.14), but can't do (2.7).astype('int'), because Python native types don't have any such method. However, numpy.array([1.1, 2.2, 3.3]).astype('int') is valid.
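The cases above can be checked with a short script. This is just a sketch of the behaviour described, using the same values as the examples in the text.

```python
import numpy as np

# int() converts a single scalar, truncating toward zero.
print(int(3.14))  # 3

# A plain Python float has no astype method.
print(hasattr(2.7, "astype"))  # False

# astype converts every element and returns a new array.
arr = np.array([1.1, 2.2, 3.3])
as_ints = arr.astype("int")
print(as_ints)  # [1 2 3]

# int() cannot convert an array with more than one element.
try:
    int(arr)
    int_on_array_failed = False
except TypeError:
    int_on_array_failed = True
print(int_on_array_failed)  # True
```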
(Strictly, it is also possible to define an __int__() method within one's own classes, which would allow int() to be applied to non-native types. Thanks to @juanpa.arrivillaga for pointing this out.)
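A minimal sketch of that point, assuming a made-up class called Metres (the name and its value attribute are hypothetical, chosen only for illustration):

```python
class Metres:
    """Hypothetical wrapper type, used only to illustrate __int__."""

    def __init__(self, value):
        self.value = value

    def __int__(self):
        # int(obj) calls this method when it is defined.
        return int(self.value)


length = Metres(2.7)
print(int(length))  # 2
```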
astype is a NumPy method, as @rwp points out. It is defined as:
def astype(self, typecode):
    ""
    return self._rc(self.array.astype(typecode))
._rc is defined as:
def _rc(self, a):
    if len(shape(a)) == 0:
        return a
    else:
        return self.__class__(a)
In English, this means that the underlying array is first converted to the indicated type. If the converted result has an empty shape, i.e. it is zero-dimensional (a scalar), it is returned as is; otherwise it is wrapped in a new instance of the same class.
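To see the branching in _rc in action, here is a minimal sketch. The Wrapper class is a hypothetical stand-in written for this example, not NumPy's own code; it only reuses the two method bodies quoted above.

```python
import numpy as np


class Wrapper:
    """Hypothetical container class that mimics the two quoted methods."""

    def __init__(self, data):
        self.array = np.asarray(data)

    def _rc(self, a):
        # A zero-dimensional result has an empty shape tuple: return it unwrapped.
        if len(np.shape(a)) == 0:
            return a
        # Anything with at least one dimension is wrapped in the same class.
        return self.__class__(a)

    def astype(self, typecode):
        # The actual conversion is done by the underlying ndarray.
        return self._rc(self.array.astype(typecode))


vec = Wrapper([1.9, 2.9]).astype("int")
zero_d = Wrapper(3.7).astype("int")

print(isinstance(vec, Wrapper), vec.array)  # True [1 2]
print(isinstance(zero_d, Wrapper), np.shape(zero_d))  # False ()
```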
int is a Python builtin. It only deals with scalars.