
Format / Suppress Scientific Notation from Python Pandas Aggregation Results

How can one modify the format for the output from a groupby operation in pandas that produces scientific notation for very large numbers?

I know how to do string formatting in Python, but I'm at a loss when it comes to applying it here.

df1.groupby('dept')['data1'].sum()

dept
value1    1.192433e+08
value2    1.293066e+08
value3    1.077142e+08

Converting to string, as below, suppresses the scientific notation, but now I'm wondering how to format the strings and add decimal places.

sum_sales_dept.astype(str) 
horatio1701d asked Jan 15 '14 12:01



2 Answers

Granted, the answer I linked in the comments is not very helpful. You can specify your own string converter like so.

In [25]: pd.set_option('display.float_format', lambda x: '%.3f' % x)

In [28]: Series(np.random.randn(3))*1000000000
Out[28]:
0    -757322420.605
1   -1436160588.997
2   -1235116117.064
dtype: float64

I'm not sure if that's the preferred way to do this, but it works.
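For the groupby case in the question, a minimal self-contained sketch might look like the following. The column names `dept` and `data1` come from the question; the numbers in the frame are made up for illustration. The option only changes how floats are displayed, so the aggregated values stay numeric.

```python
import pandas as pd

# Hypothetical data: column names follow the question, values are invented.
df1 = pd.DataFrame({
    "dept": ["value1", "value1", "value2", "value3"],
    "data1": [100_000_000.25, 19_243_300.5, 129_306_600.0, 107_714_200.125],
})

totals = df1.groupby("dept")["data1"].sum()

# Display-only setting: every float is rendered with three decimals.
pd.set_option("display.float_format", lambda x: "%.3f" % x)
rendered = repr(totals)  # same text that print(totals) would show
print(rendered)

# Restore the default so later output is unaffected.
pd.reset_option("display.float_format")
```

Because the setting is global, resetting it afterwards avoids surprising formatting elsewhere in the session.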

Converting numbers to strings purely for aesthetic purposes seems like a bad idea, but if you have a good reason, this is one way:

In [6]: Series(np.random.randn(3)).apply(lambda x: '%.3f' % x)
Out[6]:
0     0.026
1    -0.482
2    -0.694
dtype: object
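Applied to totals like the ones in the question, the string route could be sketched as below. The values are the rounded figures from the question's output, and the `,` thousands separator in the format spec is an optional extra, not something the question asked for. The result holds strings, so it is only suitable for presentation, not further arithmetic.

```python
import pandas as pd

# Rounded totals taken from the output shown in the question.
totals = pd.Series(
    [1.192433e08, 1.293066e08, 1.077142e08],
    index=pd.Index(["value1", "value2", "value3"], name="dept"),
)

# Format each float as text with two decimals and thousands separators.
as_text = totals.map("{:,.2f}".format)
print(as_text)

# The original Series is untouched and still numeric.
print(totals.dtype)
```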
Dan Allan answered Sep 24 '22 15:09



Here is another way of doing it, similar to Dan Allan's answer but without the lambda function:

>>> pd.options.display.float_format = '{:.2f}'.format
>>> Series(np.random.randn(3))
0    0.41
1    0.99
2    0.10

or

>>> pd.set_option('display.float_format', '{:.2f}'.format) 
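If changing the option for the whole session is undesirable, one possible variation (not part of either answer) is to scope it with `pd.option_context`, which restores the previous value when the block exits. A small sketch, again using the rounded figures from the question:

```python
import pandas as pd

s = pd.Series([1.192433e08, 1.293066e08, 1.077142e08])

# The format applies only inside the with-block.
with pd.option_context("display.float_format", "{:.2f}".format):
    inside = repr(s)

print(inside)

# After the block, the option is back to whatever it was before
# (None in a fresh session).
print(pd.get_option("display.float_format"))
```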
tfhans answered Sep 25 '22 15:09