
Pandas - make a column dtype object or Factor

Tags:

python

pandas

In pandas, how can I convert a column of a DataFrame into dtype object? Or better yet, into a factor? (For those who speak R, in Python, how do I as.factor()?)

Also, what's the difference between pandas.Factor and pandas.Categorical?

N. McA. asked Mar 30 '13 21:03



2 Answers

You can use the astype method to cast a Series (one column):

df['col_name'] = df['col_name'].astype(object) 

Or the entire DataFrame:

df = df.astype(object) 
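As a minimal sketch of both calls, assuming a small made-up DataFrame (the column names `a` and `b` and the variable names below are illustrative, not from the question):

```python
import pandas as pd

# Hypothetical example data; any DataFrame would do.
df = pd.DataFrame({"a": [1, 2, 3], "b": ["yes", "no", "yes"]})

# Cast a single column (a Series) to dtype object.
df["a"] = df["a"].astype(object)
single_col_dtype = df["a"].dtype

# Cast every column of the DataFrame to dtype object.
df_obj = df.astype(object)
all_object = bool((df_obj.dtypes == object).all())

print(single_col_dtype)  # object
print(all_object)        # True
```

Note that astype returns a new object rather than modifying in place, which is why the result is assigned back.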

Update

Since version 0.15, you can use the category datatype in a Series/column:

df['col_name'] = df['col_name'].astype('category') 

Note: pd.Factor has been deprecated and removed in favor of pd.Categorical.
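For those coming from R, here is a rough sketch of how a category column behaves, assuming a made-up Series of strings (the data and variable names are illustrative). The categories play the role of R's levels, and the integer codes are exposed through the .cat accessor:

```python
import pandas as pd

# Hypothetical example data.
s = pd.Series(["yes", "no", "yes", "no", "absent"])

# Convert to the category dtype.
cat = s.astype("category")

# The distinct values ("levels" in R terms); for strings these are
# inferred in sorted order by default.
categories = list(cat.cat.categories)

# The integer code of each element, indexing into the categories.
codes = cat.cat.codes.tolist()

print(categories)  # ['absent', 'no', 'yes']
print(codes)       # [2, 1, 2, 1, 0]
```

If a specific level order is needed, pd.Categorical accepts a categories argument, e.g. pd.Categorical(s, categories=["yes", "no", "absent"]).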

Andy Hayden answered Oct 25 '22 18:10


There's also the pd.factorize function, which returns the integer codes together with the unique values:

# use the df data from @herrfz
In [150]: pd.factorize(df.b)
Out[150]: (array([0, 1, 0, 1, 2]), array(['yes', 'no', 'absent'], dtype=object))

In [152]: df['c'] = pd.factorize(df.b)[0]

In [153]: df
Out[153]:
   a       b  c
0  1     yes  0
1  2      no  1
2  3     yes  0
3  4      no  1
4  5  absent  2
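Since the referenced data isn't included in this answer, here is a self-contained sketch that rebuilds an equivalent DataFrame (the construction is an assumption based on the output shown) and also demonstrates the sort option. In recent pandas versions the second return value is an Index rather than a plain array when a Series is passed in:

```python
import pandas as pd

# Assumed reconstruction of the example data from the output in this answer.
df = pd.DataFrame({
    "a": [1, 2, 3, 4, 5],
    "b": ["yes", "no", "yes", "no", "absent"],
})

# Codes are assigned in order of first appearance by default.
codes, uniques = pd.factorize(df["b"])
df["c"] = codes

# With sort=True the unique values are sorted first, so the codes
# match what astype('category') would produce for this data.
sorted_codes, sorted_uniques = pd.factorize(df["b"], sort=True)

# The original values can be recovered from the codes.
decoded = list(uniques.take(codes))

print(codes.tolist())         # [0, 1, 0, 1, 2]
print(list(uniques))          # ['yes', 'no', 'absent']
print(sorted_codes.tolist())  # [2, 1, 2, 1, 0]
```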
piggybox answered Oct 25 '22 16:10