
Rounding to the nearest 0.05 removes one digit from the results

Tags:

python

rounding

pandas

I have a pandas table with two columns of numerical data (dtype float64). I rounded each column to 2 digits after the decimal point and then used a function to round it to the nearest 0.05, but for some reason only one column shows values rounded to 0.05 with two digits; the second one got rounded but is missing the 2nd digit.

This is a made-up example which works and shows the flow:

import pandas as pd

table = pd.DataFrame({'A': [0.62435, 0.542345, 0.213452],
                      'B': [0.22426, 0.15779, 0.30346]})

# round x to the nearest multiple of base (0.05 in the calls below)
def custom_round(x, base=5):
    return base * round(float(x) / base)

table['A'] = table['A'].astype(float).round(2).apply(lambda x: custom_round(x, base=.05))
table['B'] = table['B'].astype(float).round(2).apply(lambda x: custom_round(x, base=.05))
table

>>>
      A     B
0  0.60  0.20
1  0.55  0.15
2  0.20  0.30

But on my real table I end up with:

[Screenshot of the resulting table: https://i.stack.imgur.com/WBrgC.png]

When I run the script without the function that rounds to the nearest 0.05, I still get the two digits:

table['B'] = table['B'].round(2)

[Screenshot of column B after round(2): https://i.stack.imgur.com/5K1gE.png]

My question is: why is this happening, and how can I fix it so that both columns are rounded to 0.05 and both digits appear?

Edit: I have been asked how I apply it to my real table, so:

df['A'] = df['A'].astype(float).round(2).apply(lambda x: custom_round(x, base=.05))
df['B']= df['B'].round(2).apply(lambda x: custom_round(x, base=.05))


asked Jul 09 '20 08:07

Reut

3 Answers

Your numbers are rounded correctly. Below I will explain:

1. How to show 2 digits precision?
2. What was happening with the example data?

1. How to show 2 digits precision?

If you really want just to show two digits, you can skip the rounding function (custom_round) altogether, and just run this* before printing your dataframes:

pd.options.display.float_format = '{:,.2f}'.format

This makes float-valued data print with two digits of precision. Example:

table=pd.DataFrame({'A': [0.62435, 0.542345,0.213452],
                   'B': [0.22426,0.18779,0.30346]})
In [1]: table
Out[1]:
     A    B
0 0.62 0.22
1 0.54 0.19
2 0.21 0.30
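
If you would rather not change the display format globally, one option is to scope it with pd.option_context. The snippet below is only a sketch of that idea; the variable names formatted and default are made up for illustration, and it assumes DataFrame.to_string picks up the active display.float_format option.

```python
import pandas as pd

table = pd.DataFrame({'A': [0.62435, 0.542345, 0.213452],
                      'B': [0.22426, 0.18779, 0.30346]})

# The two-decimal format applies only inside the with-block;
# the previous display setting is restored when the block exits.
with pd.option_context('display.float_format', '{:,.2f}'.format):
    formatted = table.to_string()

# Rendered again outside the block, using the default display settings.
default = table.to_string()

print(formatted)
print(default)
```

The underlying data is untouched in both cases; only the rendered text differs.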

2. What is happening with the example data?

Using the same data as given in the question

table=pd.DataFrame({'A': [0.62435, 0.542345,0.213452],
                   'B': [0.22426,0.15779,0.30346]})

# execute code with custom_round in the question

In [1]: table
Out[1]:
      A     B
0  0.60  0.20
1  0.55  0.15
2  0.20  0.30

Setting the middle value of B to 0.18779 (rounded to 0.20)

table=pd.DataFrame({'A': [0.62435, 0.542345,0.213452],
                   'B': [0.22426,0.18779,0.30346]})

# execute code with custom_round in the question

In [1]: table
Out[1]:
      A    B
0  0.60  0.2
1  0.55  0.2
2  0.20  0.3

Why does this happen?

Internally, each number is rounded to two digits of precision. When you print the table to the console / Jupyter notebook, pandas picks one number of decimals per column and drops a trailing digit if it is zero for every value in that column. So the data has two digits of precision (for example, 0.20), but it is shown with one digit, since 0.20 = 0.2. In column A the value 0.55 needs the second digit, so the whole column keeps it.
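
A small check of this, reusing custom_round from the question on the modified column B. This is an illustrative sketch, and the names b and as_text are made up here: the stored values are ordinary floats, and formatting them explicitly as text always yields two decimals.

```python
import pandas as pd

def custom_round(x, base=5):
    # Same helper as in the question: round x to the nearest multiple of base.
    return base * round(float(x) / base)

b = pd.Series([0.22426, 0.18779, 0.30346]).round(2).apply(custom_round, base=0.05)

# The stored floats; 0.2 and 0.20 are the same number, so nothing was lost.
# Tiny binary floating-point noise in the last bits is possible here.
print(b.tolist())

# Turning each value into text with an explicit format keeps both digits.
as_text = b.map('{:.2f}'.format)
print(as_text.tolist())
```

Note that as_text holds strings, so it is suitable for display or export, not for further arithmetic.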

* You may also use another printing scheme: pd.options.display.float_format can be set to any callable that must

[...] accept a floating point number and return a string with the desired format of the number. This is used in some places like SeriesFormatter. See core.format.EngFormatter for an example.

answered Oct 22 '22 09:10

np8

In your second screenshot, the second value in column B is 0.22, which is then correctly rounded to 0.2. All values in the second screenshot round to 0.x0, so the missing last digit is a display feature of the GUI, which suppresses a trailing 0.

The error is likely not in the rounding to 0.05. It is before that.

It appears as if the rounding to two digits using round(2) is not applied to the input in your example (the second value in B in your example is 0.15779).

answered Oct 22 '22 07:10

Christian Fries

Pandas drops trailing zeros after the decimal point when it displays floats. I guess it's sort of a feature or a bug. If you just want to see the output at the right precision on your display/print, have you tried the display precision option, like

pd.set_option('display.precision', 2)

Or change 2 to 3 or 4 to play around. I think this is a global display precision option though, so if you want to display different precision for different columns, that will be a problem.
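
For per-column precision, one possible workaround is the formatters argument of DataFrame.to_string, which takes one formatting function per column. The following is only a sketch; the choice of two decimals for A and three for B is arbitrary, picked for illustration.

```python
import pandas as pd

table = pd.DataFrame({'A': [0.60, 0.55, 0.20],
                      'B': [0.2, 0.2, 0.3]})

# One formatting function per column; columns not listed keep the default.
text = table.to_string(formatters={'A': '{:.2f}'.format,
                                   'B': '{:.3f}'.format})
print(text)
```

This affects only the rendered string, not the values stored in the DataFrame.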

answered Oct 22 '22 08:10

Jim
