Is it possible to read categorical columns with pandas' read_csv?

I have tried passing the dtype parameter to read_csv as dtype={n: pandas.Categorical}, but this does not work properly (the resulting column has dtype object). The manual is unclear.

1 Answer

Since pandas 0.19.0 you can pass dtype='category' to read_csv:

    from io import StringIO

    import pandas as pd

    data = 'col1,col2,col3\na,b,1\na,b,2\nc,d,3'

    # StringIO lets read_csv treat the string as a file
    df = pd.read_csv(StringIO(data), dtype='category')
    print(df)
      col1 col2 col3
    0    a    b    1
    1    a    b    2
    2    c    d    3

    print(df.dtypes)
    col1    category
    col2    category
    col3    category
    dtype: object
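One thing worth checking with dtype='category' is that the categories are parsed as strings, so col3 holds '1', '2', '3' rather than integers. The following is a minimal sketch, assuming a reasonably recent pandas and using io.StringIO in place of a real file. Converting the categories afterwards with pd.to_numeric and rename_categories is one possible approach, not the only one:

```python
from io import StringIO

import pandas as pd

data = 'col1,col2,col3\na,b,1\na,b,2\nc,d,3'
df = pd.read_csv(StringIO(data), dtype='category')

# Categories are read as strings, even for the numeric-looking column.
print(list(df['col3'].cat.categories))

# Optionally convert the categories to numbers after reading.
numeric_categories = pd.to_numeric(df['col3'].cat.categories)
df['col3'] = df['col3'].cat.rename_categories(numeric_categories)
print(list(df['col3'].cat.categories))
```

The column stays categorical after the conversion; only the category labels change type.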

To convert only specific columns to category, pass a dictionary to dtype:

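    from io import StringIO

    import pandas as pd

    data = 'col1,col2,col3\na,b,1\na,b,2\nc,d,3'

    # Only col1 is read as category; the other columns are inferred as usual
    df = pd.read_csv(StringIO(data), dtype={'col1': 'category'})
    print(df)
      col1 col2  col3
    0    a    b     1
    1    a    b     2
    2    c    d     3

    print(df.dtypes)
    col1    category
    col2      object
    col3       int64
    dtype: object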
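The question passed pandas.Categorical, which is the array class rather than a dtype, and that may be why the attempt did not behave as expected. If you need to fix the set of categories or their order up front, newer pandas versions (0.21 and later, as far as I know) accept a CategoricalDtype instance in the dtype dictionary. A minimal sketch, where the category list is a hypothetical one chosen for illustration:

```python
from io import StringIO

import pandas as pd
from pandas.api.types import CategoricalDtype

data = 'col1,col2,col3\na,b,1\na,b,2\nc,d,3'

# Hypothetical category list for illustration; 'z' never occurs in the data.
col1_type = CategoricalDtype(categories=['a', 'c', 'z'], ordered=True)

df = pd.read_csv(StringIO(data), dtype={'col1': col1_type})
print(df['col1'].cat.categories)
print(df['col1'].cat.ordered)
```

Categories that never appear in the file, such as 'z' here, are still kept in the dtype, which can be useful when several files should share the same categories.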
