
Pandas to_json changing data type

I noticed this behavior and am not sure whether it's a bug. I create a DataFrame with two integer columns and one float column:

import pandas as pd
df = pd.DataFrame([[1,2,0.2],[3,2,0.1]])
df.info()


<class 'pandas.core.frame.DataFrame'>
Int64Index: 2 entries, 0 to 1
Data columns (total 3 columns):
0    2 non-null int64
1    2 non-null int64
2    2 non-null float64
dtypes: float64(1), int64(2)

If I output that to JSON, the dtype information is lost:

df.to_json(orient= 'records')

'[{"0":1.0,"1":2.0,"2":0.2},{"0":3.0,"1":2.0,"2":0.1}]'

All data is converted to float. This is a problem if, for example, one column contains nanosecond timestamps, because they are written in exponential notation and the sub-second information is lost.
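To illustrate the precision concern, here is a minimal sketch in plain Python. The timestamp value is made up for illustration and does not come from real data; the point is only that an integer of this size does not survive a round trip through a 64-bit float.

```python
# Hypothetical nanosecond-resolution epoch timestamp (illustrative value only).
ts_ns = 1403827200123456789

# A 64-bit float has a 53-bit significand, so not every integer of this
# magnitude can be represented exactly; converting rounds to a nearby value.
as_float = float(ts_ns)
round_tripped = int(as_float)

print(ts_ns)
print(as_float)                # shown in exponential notation
print(round_tripped == ts_ns)  # False: the low-order digits were rounded away
```

The same rounding would apply to any integer column holding values above 2**53 if it were serialized as float.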

I also filed the issue here: https://github.com/pydata/pandas/issues/7583

The result I was expecting is:

'[{"0":1,"1":2,"2":0.2},{"0":3,"1":2,"2":0.1}]'
asked Jun 27 '14 at 00:06 by Fra



1 Answer

One way is to cast the DataFrame columns to object dtype before serializing:

In [11]: df1 = df.astype(object)

In [12]: df1.to_json()
Out[12]: '{"0":{"0":1,"1":3},"1":{"0":2,"1":2},"2":{"0":0.2,"1":0.1}}'

In [13]: df1.to_json(orient='records')
Out[13]: '[{"0":1,"1":2,"2":0.2},{"0":3,"1":2,"2":0.1}]'
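As a self-contained check of this workaround, the sketch below casts to object dtype, serializes, and then parses the JSON back with the standard json module to confirm that the integer columns come through as integers. The variable names (payload, records, ints_preserved) are illustrative only.

```python
import json

import pandas as pd

df = pd.DataFrame([[1, 2, 0.2], [3, 2, 0.1]])

# Cast every column to object dtype so integer values are kept as
# Python ints when the frame is serialized.
payload = df.astype(object).to_json(orient="records")

# Parse the JSON back and check the types of the two integer columns.
records = json.loads(payload)
ints_preserved = all(
    isinstance(rec[col], int) for rec in records for col in ("0", "1")
)

print(payload)
print(ints_preserved)
```

Later pandas releases may already preserve integer columns without the cast, so it is worth running the plain df.to_json(orient='records') on the installed version first.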
answered Oct 01 '22 at 18:10 by Andy Hayden