
Convert all columns from int64 to int32

Tags: python, pandas

We all know the question "Change data type of columns in Pandas", which explains really nicely how to change the data type of a column. But what if I have a dataframe df with the following df.dtypes:

A  object
B   int64
C   int32
D   object
E   int64
F  float32

How can I convert all int64 columns to int32 without explicitly mentioning the column names? The desired outcome is:

A  object
B   int32
C   int32
D   object
E   int32
F  float32
PV8 asked Jan 29 '20 12:01



2 Answers

You can build a dictionary of all columns with int64 dtype using DataFrame.select_dtypes and convert them to int32 with DataFrame.astype. I am not sure whether this fails if the columns contain integers too big for int32:

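import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': list('abcdef'),
    'B': [4, 5, 4, 5, 5, 4],
    'C': [7, 8, 9, 4, 2, 3],
    'D': [1, 3, 5, 7, 1, 0],
    'E': [5, 3, 6, 9, 2, 4],
    'F': list('aaabbb')
})

# map every int64 column name to int32, then cast them all in one call
d = dict.fromkeys(df.select_dtypes(np.int64).columns, np.int32)
df = df.astype(d)
print(df.dtypes)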
A    object
B     int32
C     int32
D     int32
E     int32
F    object
dtype: object
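
The open question above is what happens when a value does not fit into int32. As far as I know, astype on NumPy-backed integer columns does not range-check, so a guard can be added before casting. The following is only a sketch under that assumption: the helper name downcast_int64_to_int32 and the frames small and big are hypothetical, introduced purely for illustration.

```python
import numpy as np
import pandas as pd


def downcast_int64_to_int32(df):
    """Return a copy of df with int64 columns cast to int32.

    Hypothetical helper: raises OverflowError instead of casting
    when a value lies outside the int32 range.
    """
    info = np.iinfo(np.int32)
    int64_cols = df.select_dtypes(include="int64").columns
    for col in int64_cols:
        # compare the column's extremes against the int32 limits
        if df[col].min() < info.min or df[col].max() > info.max:
            raise OverflowError(f"column {col!r} has values outside the int32 range")
    return df.astype(dict.fromkeys(int64_cols, np.int32))


# made-up example data
small = pd.DataFrame({"A": list("abc"), "B": np.array([1, 2, 3], dtype="int64")})
big = pd.DataFrame({"B": np.array([1, 2, 2**40], dtype="int64")})

converted = downcast_int64_to_int32(small)
print(converted.dtypes)
```

Calling downcast_int64_to_int32(big) raises OverflowError, because 2**40 is larger than the int32 maximum of 2**31 - 1.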
jezrael answered Nov 15 '22 06:11


Use DataFrame.select_dtypes and DataFrame.astype:

# example dataframe
import pandas as pd

df = pd.DataFrame({'A': list('abc'),
                   'B': [1, 2, 3],
                   'C': [4, 5, 6]})
print(df)

   A  B  C
0  a  1  4
1  b  2  5
2  c  3  6
# as we can see, the integer columns are int64
print(df.dtypes)
A    object
B     int64
C     int64
dtype: object

# cast every int64 column to int32 via a {column: dtype} mapping
df = df.astype({col: 'int32' for col in df.select_dtypes('int64').columns})

# int64 columns have been converted to int32
print(df.dtypes)
A    object
B     int32
C     int32
dtype: object
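
To check the same approach against the mixed dtypes from the question (object, int64, int32, float32), here is a minimal sketch. The cell values are made up and chosen only so that the dtypes match the question's listing.

```python
import numpy as np
import pandas as pd

# Hypothetical data: only the dtypes mirror the question, the values are invented.
df = pd.DataFrame({
    "A": pd.Series(["x", "y"], dtype="object"),
    "B": np.array([1, 2], dtype="int64"),
    "C": np.array([3, 4], dtype="int32"),
    "D": pd.Series(["u", "v"], dtype="object"),
    "E": np.array([5, 6], dtype="int64"),
    "F": np.array([0.5, 1.5], dtype="float32"),
})

# select_dtypes("int64") picks only B and E; C is already int32
int64_cols = df.select_dtypes(include="int64").columns
df = df.astype({col: "int32" for col in int64_cols})
print(df.dtypes)
```

Columns that are not in the mapping (A, C, D and F here) keep their original dtype, which gives the outcome the question asks for.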
Erfan answered Nov 15 '22 04:11