
Appending to a DataFrame converts dtypes

Tags: python, pandas

I am appending rows to a pandas.DataFrame, and the dtype of a column changes unexpectedly:

import pandas as pd
df = pd.DataFrame({'a': 1.0, 'b': 'x'}, index=[0])
print(df.dtypes)
df = df.append({'a': 3.0}, ignore_index=True)
print(df.dtypes)
df = df.append({'a': 3.0, 'b': 'x'}, ignore_index=True)
print(df.dtypes)

Output:

a    float64
b     object
dtype: object
a    float64
b     object
dtype: object
a    object         <- ???
b    object
dtype: object

I expected column a to stay float64 rather than become object. How can I avoid that conversion?

I am using pandas 0.11.

Yariv asked Jan 22 '14 11:01


1 Answer

Convert the dict to a DataFrame before appending it:

import pandas as pd
df = pd.DataFrame({'a': 1.0, 'b': 'x'}, index=[0])
print(df.dtypes)
df = df.append({'a': 3.0}, ignore_index=True)
print(df.dtypes)
# Wrap the dict in a one-row DataFrame so each column keeps its own dtype
df = df.append(pd.DataFrame([{'a': 3.0, 'b': 'x'}]), ignore_index=True)
print(df.dtypes)

Or append a list of dicts:

df = df.append([{'a':3.0, 'b':'x'}], ignore_index=True)

If you pass a dict, it is first converted to a Series. A Series that contains both 3.0 and 'x' must have object dtype, so the float value is upcast to object when the row is appended.

If you pass a list of dicts, it is converted to a DataFrame, and a DataFrame can have a different dtype for each column.
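
The difference can be checked directly with the constructors, without calling append at all. The following is a minimal sketch, not part of the original answer. The variable names are illustrative, and pd.concat is used for the final step on the assumption that you are on a recent pandas release, where DataFrame.append is no longer available.

```python
import pandas as pd

row = {'a': 3.0, 'b': 'x'}

# A dict becomes one Series; a float and a string together force object dtype.
as_series = pd.Series(row)
print(as_series.dtype)  # object

# A list of dicts becomes a DataFrame; column 'a' keeps its float dtype.
as_frame = pd.DataFrame([row])
print(as_frame['a'].dtype)  # float64

# Combining two DataFrames preserves the per-column dtype of 'a'.
df = pd.DataFrame({'a': 1.0, 'b': 'x'}, index=[0])
combined = pd.concat([df, as_frame], ignore_index=True)
print(combined['a'].dtype)  # float64
```

Because the new row arrives as a DataFrame rather than as a mixed-type Series, column a is never pushed through an object-dtype intermediate.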

HYRY answered Sep 27 '22 22:09