Why does pandas.DataFrame.update change the dtypes of the updated dataframe?

Tags:

python

pandas

I would like to keep the dtypes of the columns as int after the update for obvious reasons. Any ideas why this doesn't work as expected?

import pandas as pd

df1 = pd.DataFrame([
    {'a': 1, 'b': 2, 'c': 'foo'},
    {'a': 3, 'b': 4, 'c': 'baz'},
])

df2 = pd.DataFrame([
    {'a': 1, 'b': 8, 'c': 'bar'},
])

print('dtypes before update:\n%s\n%s' % (df1.dtypes, df2.dtypes))

df1.update(df2)

print('\ndtypes after update:\n%s\n%s' % (df1.dtypes, df2.dtypes))

The output looks like this:

dtypes before update:
a     int64
b     int64
c    object
dtype: object
a     int64
b     int64
c    object
dtype: object

dtypes after update:
a    float64
b    float64
c     object
dtype: object
a     int64
b     int64
c    object
dtype: object

Thanks to anyone who has some advice.

Brendan Maguire asked Jan 29 '15 14:01



1 Answer

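This is a known issue: https://github.com/pydata/pandas/issues/4094. As far as I can tell, update() aligns df2 to df1's index before writing, and the rows missing from df2 become NaN during that alignment; NaN cannot be stored in an int64 column, so the affected columns are upcast to float64. I think your only option currently is calling astype(int) after the update.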
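A minimal sketch of the astype workaround, reusing df1 and df2 from the question. The name original_dtypes is only illustrative, and restoring every column's dtype from a saved mapping (rather than calling astype(int) on selected columns) is an assumption about what is convenient here, not something the linked issue prescribes.

```python
import pandas as pd

df1 = pd.DataFrame([
    {'a': 1, 'b': 2, 'c': 'foo'},
    {'a': 3, 'b': 4, 'c': 'baz'},
])
df2 = pd.DataFrame([
    {'a': 1, 'b': 8, 'c': 'bar'},
])

# Remember each column's dtype before the update (illustrative helper name).
original_dtypes = df1.dtypes.to_dict()

# update() modifies df1 in place and may upcast the int columns to float64.
df1.update(df2)

# Cast every column back to the dtype it had before the update.
# This works here because update() only writes non-NaN values from df2,
# so the integer columns contain no NaN after the update.
df1 = df1.astype(original_dtypes)

print(df1.dtypes)
print(df1)
```

The cast back to an integer dtype would raise if df1 itself contained NaN in those columns, so this sketch only applies when the columns were complete integers to begin with.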

JAB answered Nov 04 '22 00:11