get python pandas to_dict with orient='records' but without float cast

Tags:

python

pandas

I have a DataFrame with one int column and one float column:

df
#    a      b
# 0  3  42.00
# 1  2   3.14

df.dtypes
# a      int64
# b    float64
# dtype: object

I want a list of dicts like the one provided by df.to_dict(orient='records')

df.to_dict(orient='records')
[{'a': 3.0, 'b': 42.0}, {'a': 2.0, 'b': 3.1400000000000001}]

But with a kept as int, not cast to float.

285

asked Jun 18 '16 13:06

user3313834

2 Answers

Currently (as of Pandas version 0.18), df.to_dict('records') accesses the NumPy array df.values. This property upcasts the dtype of the int column to float so that the array can have a single common dtype. After this point there is no hope of returning the desired result -- all the ints have been converted to floats.
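The upcast itself is easy to see in isolation. The snippet below is a minimal sketch that only inspects df.values; it does not show what to_dict does internally, and whether to_dict still goes through df.values may depend on your Pandas version.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': [3, 2], 'b': [42.0, 3.14]})

# A 2-D NumPy array has a single dtype, so the int column and the
# float column are merged into their common type, float64.
values = df.values
print(values.dtype)        # float64

# The element that came from the int column 'a' is now a float.
first_a = values[0, 0]
print(type(first_a))       # <class 'numpy.float64'>
```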

So instead, building on ayhan's and Tom Augspurger's suggestions, you could use a list comprehension with a nested dict comprehension:

import pandas as pd

df = pd.DataFrame({'a': [3, 2], 'b': [42.0, 3.14]})
# itertuples() yields one namedtuple per row, keeping each column's own type
result = [{col: getattr(row, col) for col in df} for row in df.itertuples()]
print(result)
# [{'a': 3, 'b': 42.0}, {'a': 2, 'b': 3.1400000000000001}]
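One caveat with getattr is that it relies on the column names being usable as namedtuple fields. A possible variant, sketched here under the assumption that plain tuples are acceptable, passes name=None to itertuples and zips each row with df.columns. The column name 'b col' is a hypothetical example chosen because it is not a valid Python identifier.

```python
import pandas as pd

# 'b col' contains a space, so it cannot be read with getattr(row, 'b col')
# from a namedtuple row.
df = pd.DataFrame({'a': [3, 2], 'b col': [42.0, 3.14]})

# index=False drops the index from each row; name=None makes itertuples
# return plain tuples, which are then paired with the column labels.
records = [dict(zip(df.columns, row))
           for row in df.itertuples(index=False, name=None)]
print(records)
```

Each column is iterated on its own here, so the values in a are not merged with the floats in b.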

142

answered Oct 18 '22 22:10

unutbu

Another horrible workaround is to (temporarily) add a non-numeric column, e.g. starting with:

df = pd.DataFrame([[1, 2.4], [3, 4.0]], columns='a b'.split())

then df.to_dict(orient='records') promotes to floats, but if you do:

df['foo'] = 'bar'
[{k: v for (k, v) in row.items() if k != 'foo'} for row in df.to_dict(orient='records')]

you preserve the original types. I notice that df.reindex() behaves similarly, as explained in the Pandas gotchas, but you can't work around it unless you fill with non-null values, e.g. fill_value=0
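Both observations can be checked by looking at dtypes. This is a rough sketch, not a description of Pandas internals: df.assign is used here (instead of df['foo'] = 'bar') only to avoid modifying df, and the index labels [0, 1, 2] are illustrative.

```python
import pandas as pd

df = pd.DataFrame([[1, 2.4], [3, 4.0]], columns='a b'.split())

# Adding a string column forces the 2-D array to object dtype,
# so the ints in 'a' are no longer upcast to float.
with_foo = df.assign(foo='bar')
print(with_foo.values.dtype)   # object

# Reindexing to a label that does not exist introduces NaN,
# which upcasts the int column to float ...
grown = df.reindex([0, 1, 2])
print(grown['a'].dtype)        # float64

# ... unless the new row is filled with a non-null value.
filled = df.reindex([0, 1, 2], fill_value=0)
print(filled['a'].dtype)       # int64
```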

answered Oct 18 '22 22:10

patricksurry
