How can one modify the format for the output from a groupby operation in pandas that produces scientific notation for very large numbers?
I know how to do string formatting in python but I'm at a loss when it comes to applying it here.
df1.groupby('dept')['data1'].sum() dept value1 1.192433e+08 value2 1.293066e+08 value3 1.077142e+08
This suppresses the scientific notation if I convert to string but now I'm just wondering how to string format and add decimals.
sum_sales_dept.astype(str)
There is no direct way to configure and stop scientific notation in spark however you can apply format_number function to display number in proper decimal format rather than exponential format.
strip() function is used to remove leading and trailing characters. Strip whitespaces (including newlines) or a set of specified characters from each string in the Series/Index from left and right sides. Equivalent to str. strip().
Granted, the answer I linked in the comments is not very helpful. You can specify your own string converter like so.
In [25]: pd.set_option('display.float_format', lambda x: '%.3f' % x) In [28]: Series(np.random.randn(3))*1000000000 Out[28]: 0 -757322420.605 1 -1436160588.997 2 -1235116117.064 dtype: float64
I'm not sure if that's the preferred way to do this, but it works.
Converting numbers to strings purely for aesthetic purposes seems like a bad idea, but if you have a good reason, this is one way:
In [6]: Series(np.random.randn(3)).apply(lambda x: '%.3f' % x) Out[6]: 0 0.026 1 -0.482 2 -0.694 dtype: object
Here is another way of doing it, similar to Dan Allan's answer but without the lambda function:
>>> pd.options.display.float_format = '{:.2f}'.format >>> Series(np.random.randn(3)) 0 0.41 1 0.99 2 0.10
or
>>> pd.set_option('display.float_format', '{:.2f}'.format)
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With