
pandas dtype conversion from object to string

Tags: python, pandas

I have a CSV file in which a few columns are numbers and a few are strings. When I check myDF.dtypes, all the string columns are shown as object.

  1. Someone asked a related question before here about why this is done. Is it possible to recast the dtype from object to string?

  2. Also, in general, is there any easy way to recast the dtype from int64 and float64 to int32 and float32 and save on the size of the data (in memory / on disk)?

uday asked Feb 17 '14 23:02


1 Answer

pandas represents all strings as variable-length Python objects, which is what the object dtype holds. You can call series.astype('S32') if you want, but the result will be recast if you then store it in a DataFrame or do much with it. This is done for simplicity.
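As a rough illustration of the difference between object and fixed-length storage, here is a minimal sketch. The sample names are made up, and the dedicated "string" extension dtype shown at the end only exists in newer pandas releases (1.0 and later), so treat that part as an assumption about your installed version rather than something the original answer covered.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, created explicitly as object dtype.
names = pd.Series(["alice", "bob", "charlotte"], dtype=object)
print(names.dtype)

# A plain NumPy array can hold fixed-length byte strings (32 bytes per item here).
fixed = names.to_numpy().astype("S32")
print(fixed.dtype, fixed.itemsize)

# Newer pandas versions offer a dedicated string extension dtype.
as_string = names.astype("string")
print(as_string.dtype)
```

The fixed-length form lives comfortably in NumPy, while inside pandas the object dtype (or the newer string dtype) is usually the more practical choice.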

Certain serialization formats, e.g. HDFStore, do store strings as fixed-length strings on disk, though.

You can call series.astype('int32') (or pass np.int32) if you would like, and the result will be stored with the new dtype. The same works for float32.
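To get a feel for the memory saving from narrower numeric types, a small sketch along these lines may help. The column names and values are invented for illustration, and the downcast is only safe when the values fit in int32 and float32 precision is acceptable for your data.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with explicit 64-bit columns.
df = pd.DataFrame({
    "quantity": np.arange(1000, dtype="int64"),
    "price": np.linspace(0.0, 1.0, 1000, dtype="float64"),
})
before = df.memory_usage(index=False).sum()

# Recast both columns to their 32-bit counterparts in one call.
small = df.astype({"quantity": np.int32, "price": np.float32})
after = small.memory_usage(index=False).sum()

print(small.dtypes)
print(before, after)
```

Each value drops from 8 bytes to 4, so the column data takes half the memory. Whether the saving carries over to disk depends on the file format you write to.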

Jeff answered Oct 15 '22 20:10