I have a DataFrame with one int column and one float column:
df
# a b
# 0 3 42.00
# 1 2 3.14
df.dtypes
# a int64
# b float64
# dtype: object
I want a list of dicts like the one provided by df.to_dict(orient='records'):
df.to_dict(orient='records')
[{'a': 3.0, 'b': 42.0}, {'a': 2.0, 'b': 3.1400000000000001}]
But with column a as int, not cast to float.
The to_dict() method converts a DataFrame into a dictionary whose layout depends on the orient parameter. Syntax: DataFrame.to_dict(orient='dict', into=dict). The orient parameter is a string, one of 'dict', 'list', 'series', 'split', 'records', 'index', and determines how the columns (Series) are laid out in the result.
By default orient is 'dict', which returns the DataFrame in the format {column -> {index -> value}}. This is what to_dict() returns when no orient is specified.
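As a small illustration of the default layout described above, here is a sketch using the two-column frame from the question. The variable names by_column and as_lists are only for this example.

```python
import pandas as pd

# Same data as in the question: one int column, one float column.
df = pd.DataFrame({'a': [3, 2], 'b': [42.0, 3.14]})

# Default orient='dict': {column -> {index -> value}}
by_column = df.to_dict()
print(by_column)

# orient='list': {column -> [values]}
as_lists = df.to_dict(orient='list')
print(as_lists)
```

Both of these orients convert each column separately, so column a does not need to share a dtype with column b.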
Pandas can create dataframes from many kinds of data structures—without you having to write lots of lengthy code. One of those data structures is a dictionary.
Currently (as of Pandas version 0.18), df.to_dict('records') accesses the NumPy array df.values. This property upcasts the dtype of the int column to float so that the array can have a single common dtype. After this point there is no hope of returning the desired result -- all the ints have been converted to floats.
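The upcast can be observed directly. This is a minimal sketch on the frame from the question; arr and per_column are illustrative names, not part of pandas.

```python
import pandas as pd

df = pd.DataFrame({'a': [3, 2], 'b': [42.0, 3.14]})

# The frame itself keeps one dtype per column.
print(df.dtypes)

# df.values must fit every column into one 2-D array with a single dtype,
# so the int column is promoted to float.
arr = df.values
print(arr.dtype)

# Taken one column at a time, no promotion is needed.
per_column = {col: df[col].values.dtype for col in df}
print(per_column)
```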
So instead, building on ayhan's and Tom Augspurger's suggestions, you could use a list and dict comprehension:
import pandas as pd
df = pd.DataFrame({'a':[3,2], 'b':[42.0,3.14]})
result = [{col:getattr(row, col) for col in df} for row in df.itertuples()]
print(result)
# [{'a': 3, 'b': 42.0}, {'a': 2, 'b': 3.1400000000000001}]
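A possible variant of the comprehension above, offered as a sketch rather than a replacement: passing index=False, name=None to itertuples yields plain tuples, so the column names can be zipped in without getattr. That should also cope with column names that are not valid Python identifiers. The name records is just for this example.

```python
import pandas as pd

df = pd.DataFrame({'a': [3, 2], 'b': [42.0, 3.14]})

# Each row comes back as a plain tuple of per-column values,
# so ints stay ints and floats stay floats.
records = [dict(zip(df.columns, row))
           for row in df.itertuples(index=False, name=None)]
print(records)
```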
Another horrible workaround is to (temporarily) add a non-numeric column, e.g. starting with:
df = pd.DataFrame([[1, 2.4], [3, 4.0]], columns='a b'.split())
then df.to_dict(orient='records') promotes the ints to floats, but if you do:
df['foo'] = 'bar'
[{k: v for (k, v) in row.items() if k != 'foo'} for row in df.to_dict(orient='records')]
you preserve the original types. I notice that df.reindex() behaves similarly, as explained in the Pandas gotchas, but you can't work around it unless you fill with non-null values, e.g. fill_value=0.
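To illustrate the reindex() remark, here is a small sketch on the frame from the question. The extra index label 2 and the names grown and filled are made up for the example.

```python
import pandas as pd

df = pd.DataFrame({'a': [3, 2], 'b': [42.0, 3.14]})

# Reindexing onto a label that does not exist introduces NaN,
# which forces the int column to float.
grown = df.reindex([0, 1, 2])
print(grown.dtypes)

# Filling the new row with a non-null value avoids the NaN,
# so the int column can keep an integer dtype.
filled = df.reindex([0, 1, 2], fill_value=0)
print(filled.dtypes)
```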