Appending to Pandas DataFrame with categorical columns

Question

How do I append to a Pandas DataFrame containing predefined columns of categorical datatype:

df=pd.DataFrame([],columns=['a','b'])
df['a']=pd.Categorical([],categories=[0,1])

new_df=pd.DataFrame.from_dict({'a':[1],'b':[0]})
df.append(new_df)

The above raises an error:

ValueError: all the input arrays must have same number of dimensions

Update: if the categories are strings as opposed to ints, appending seems to work:

df['a']=pd.Categorical([],categories=['Left','Right'])

new_df=pd.DataFrame.from_dict({'a':['Left'],'b':[0]})
df.append(new_df)

So, how do I append to DataFrames with categories of int values? Secondly, I presumed that with binary values (0/1), storing the column as Categorical instead of numeric datatype would be more efficient or faster. Is this true? If not, I may not even bother to convert my columns to Categorical type.

Anwar Shaikh · Accepted Answer

You have to keep both DataFrames consistent. Since column a in the first DataFrame is categorical, you need to do the same for the second DataFrame. You can do it as follows:

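import pandas as pd

# Empty frame whose column 'a' is categorical with categories 0 and 1
df = pd.DataFrame([], columns=['a', 'b'])
df['a'] = pd.Categorical([], categories=[0, 1])

new_df = pd.DataFrame.from_dict({'a': [0, 1, 1, 1, 0, 0], 'b': [1, 1, 8, 4, 0, 0]})
# Give the new frame's column 'a' the same categorical dtype before appending
new_df['a'] = pd.Categorical(new_df['a'], categories=[0, 1])

# append returns a new DataFrame, so keep the result
# (DataFrame.append was removed in pandas 2.0; use pd.concat there)
df = df.append(new_df, ignore_index=True)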
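
On newer pandas versions, where DataFrame.append is no longer available, the same idea can be written with pd.concat. This is only a sketch and not part of the original answer: the name binary and the explicit int64 dtype for column b are my own choices for illustration. Sharing a single CategoricalDtype object between both frames keeps the categories identical, so the result should stay categorical.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# One shared dtype (hypothetical name) so both frames carry identical categories.
binary = CategoricalDtype(categories=[0, 1])

# Empty frame with predefined column dtypes.
df = pd.DataFrame({
    "a": pd.Series([], dtype=binary),
    "b": pd.Series([], dtype="int64"),
})

new_df = pd.DataFrame({"a": [0, 1, 1, 1, 0, 0], "b": [1, 1, 8, 4, 0, 0]})
# Cast the incoming column to the same categorical dtype before combining.
new_df["a"] = new_df["a"].astype(binary)

# concat returns a new frame; ignore_index renumbers the rows from 0.
result = pd.concat([df, new_df], ignore_index=True)
print(result.dtypes)
```

If the categories of the two columns differ, pandas generally falls back to a non-categorical dtype for the combined column, so it is worth checking result.dtypes after combining.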

Hope this helps.
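
On the second part of the question (whether Categorical is more efficient for 0/1 values), one way to check is to measure it directly. The sketch below is an illustration rather than a benchmark: the column length n and the random data are made up. A categorical with only two categories stores small integer codes plus the category list, so it should be much smaller than an int64 column but no smaller than a plain int8 column.

```python
import numpy as np
import pandas as pd

n = 1_000_000  # arbitrary size chosen for illustration
values = np.random.default_rng(0).integers(0, 2, size=n)

as_int64 = pd.Series(values, dtype="int64")
as_int8 = pd.Series(values, dtype="int8")
as_cat = pd.Series(pd.Categorical(values, categories=[0, 1]))

# Compare the memory taken by the data alone (index excluded).
sizes = {
    "int64": as_int64.memory_usage(index=False, deep=True),
    "int8": as_int8.memory_usage(index=False, deep=True),
    "category": as_cat.memory_usage(index=False, deep=True),
}
for name, nbytes in sizes.items():
    print(f"{name:>8}: {nbytes} bytes")
```

So for binary data a small integer (or bool) dtype gives similar memory savings without the extra bookkeeping of categories. Whether operations run faster depends on what you do with the column, so I would time the specific operations you care about rather than assume a gain.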

Tags: python, pandas

wenhoo
