SQL Server numbers rounding issue

This is a follow-up to [TSQL number rounding issue]. It's the same code:

IF OBJECT_ID(N'dbo.rounding_testing') IS NOT NULL
    DROP FUNCTION dbo.rounding_testing;
GO
CREATE FUNCTION dbo.rounding_testing
(
    @value FLOAT,
    @digit INT
)
RETURNS FLOAT
BEGIN
    DECLARE
        @factor FLOAT,
        @result FLOAT;
    SELECT @factor = POWER(10, @digit);

    SELECT @result = FLOOR(@value * @factor + 0.4);

    RETURN @result;
END;
GO

SELECT dbo.rounding_testing(5.7456, 3);
SELECT FLOOR(5.7456 * 1000 + 0.4);

When you execute the code you will get:

5745
5746
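SQL Server's FLOAT is an IEEE 754 double, which is also what Python's float is, so the FLOAT behavior of the function can be reproduced outside SQL Server (a sketch, not SQL Server itself):

```python
import math

# Python's float is an IEEE 754 double, the same representation as
# SQL Server's FLOAT, so this mirrors the function body step by step.
value = 5.7456          # stored as the nearest double, which is slightly BELOW 5.7456
product = value * 1000  # correctly-rounded double product, just under 5745.6
shifted = product + 0.4  # still just under 5746

print(product < 5745.6)     # True: the double product is just under 5745.6
print(shifted < 5746.0)     # True: adding 0.4 still lands just under 5746
print(math.floor(shifted))  # 5745, matching SELECT dbo.rounding_testing(5.7456, 3)
```

The bare `SELECT FLOOR(5.7456 * 1000 + 0.4)` gives 5746 in SQL Server because the literal 5.7456 is typed as NUMERIC(5,4) there and evaluated in exact decimal arithmetic; the conversion to FLOAT happens only inside the function.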

However, when you change the data type from float to real in the function like this:

IF OBJECT_ID(N'dbo.rounding_testing') IS NOT NULL
    DROP FUNCTION dbo.rounding_testing;
GO
CREATE FUNCTION dbo.rounding_testing
(
    @value REAL,
    @digit INT
)
RETURNS REAL
BEGIN
    DECLARE
        @factor REAL,
        @result REAL;
    SELECT @factor = POWER(10, @digit);
    SELECT @result = FLOOR(@value * @factor + 0.4);
    RETURN @result;
END;
GO

SELECT dbo.rounding_testing(5.7456, 3);
SELECT FLOOR(5.7456 * 1000 + 0.4);

You will get this when executed:

5746
5746
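REAL is an IEEE 754 single. Python has no built-in single-precision type, but a struct pack/unpack round trip rounds a double to the nearest single, which lets us mimic each step of the REAL version (a sketch under that assumption):

```python
import math
import struct

def to_real(x: float) -> float:
    """Round x to the nearest IEEE 754 single, like a cast to SQL Server's REAL."""
    return struct.unpack('<f', struct.pack('<f', x))[0]

value = to_real(5.7456)            # 5.745600223541259765625, slightly ABOVE 5.7456
product = to_real(value * 1000.0)  # 5745.60009765625
shifted = to_real(product + 0.4)   # exactly 5746.0

print(value > 5.7456)        # True: the nearest single rounds UP
print(shifted == 5746.0)     # True: the final sum rounds to exactly 5746
print(math.floor(shifted))   # 5746, matching both SELECTs
```

The key difference from the FLOAT case is the direction of the very first rounding: the nearest single to 5.7456 is above it, while the nearest double is below it.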

Regarding the two answers under that question, I did some more testing and am still not clear. First, I have read the MSDN documentation on the float and real types and on the numeric and decimal types, and I now understand how SQL Server stores them internally: float and real use the IEEE 754 standard, and for decimal and numeric, see How does SQL Server store decimal type values internally?. What I want to know is which EXACT step causes the precision loss in the float case. So I created a table like this:

USE tempdb;
GO

IF OBJECT_ID('dbo.mytable') IS NOT NULL
    DROP TABLE dbo.mytable;
CREATE TABLE dbo.mytable
(
    a NUMERIC(5, 4),
    b FLOAT,
    c FLOAT,
    d FLOAT,
    e FLOAT,
    f REAL,
    g REAL,
    h REAL,
    i REAL
);
GO

Then I manually inserted the intermediate data into this table:

INSERT INTO dbo.mytable
VALUES(
    5.7456,
    CAST(5.7456 AS FLOAT),
    CAST(POWER(10, 3) AS FLOAT),
    CAST(CAST(5.7456 AS FLOAT) * CAST(POWER(10, 3) AS FLOAT) AS FLOAT),
    CAST(CAST(5.7456 AS FLOAT) * CAST(POWER(10, 3) AS FLOAT) + 0.4 AS FLOAT),
    CAST(5.7456 AS REAL),
    CAST(POWER(10, 3) AS REAL),
    CAST(CAST(5.7456 AS REAL) * CAST(POWER(10, 3) AS REAL) AS REAL),
    CAST(CAST(5.7456 AS REAL) * CAST(POWER(10, 3) AS REAL) + 0.4 AS REAL));

After that I used DBCC PAGE to inspect the row I inserted. Below is the raw data of the row:

0000000000000000:   10003900 0170e000 002497ff 907efb16 40000000  ..9..pà..$ÿ.~û.@...
0000000000000014:   0000408f 40999999 999971b6 40ffffff ffff71b6  ..@[email protected]¶@ÿÿÿÿÿq¶
0000000000000028:   40f5dbb7 4000007a 44cd8cb3 450090b3 45090000  @õÛ·@..zDͳE..³E  ..
000000000000003C:   00                                            .      

This is the interpretation of the raw data:

Column Stuff inserted                                                          Hex (little endian)     Interpretation
------ ----------------------------------------------------------------------- ----------------------- --------------                                                                                  
a      5.7456                                                                  01 70 E0 00 00          Decimal 57456, the decimal point position is stored in catalog view                                                                    
b      CAST(5.7456 AS FLOAT)                                                   24 97 FF 90 7E FB 16 40 IEEE 754 double precision format, 5.7456                                                                             
c      CAST(POWER(10, 3) AS FLOAT)                                             00 00 00 00 00 40 8F 40 IEEE 754 double precision format, 1000                                                                           
d      CAST(CAST(5.7456 AS FLOAT) * CAST(POWER(10, 3) AS FLOAT) AS FLOAT)      99 99 99 99 99 71 B6 40 IEEE 754 double precision format, 5745.6                                                                             
e      CAST(CAST(5.7456 AS FLOAT) * CAST(POWER(10, 3) AS FLOAT) + 0.4 AS FLOAT) FF FF FF FF FF 71 B6 40 IEEE 754 double precision format, 5746                                                                             
f      CAST(5.7456 AS REAL)                                                    F5 DB B7 40             IEEE 754 single precision format, 5.7456                                                                 
g      CAST(POWER(10, 3) AS REAL)                                              00 00 7A 44             IEEE 754 single precision format, 1000                                                                   
h      CAST(CAST(5.7456 AS REAL) * CAST(POWER(10, 3) AS REAL) AS REAL)         CD 8C B3 45             IEEE 754 single precision format, 5745.6                            
i      CAST(CAST(5.7456 AS REAL) * CAST(POWER(10, 3) AS REAL) + 0.4 AS REAL)    00 90 B3 45             IEEE 754 single precision format, 5746                            

From the hex interpretation, it seems to me there is no precision loss in any of the steps, whether it's float or real. So where exactly does the precision loss come from?

asked Nov 08 '22 04:11 by Just a learner

1 Answer

The closest real (single-precision) value to 5.7456 is hex 40b7dbf5 which is 5.745600223541259765625 in decimal.

The closest float (double-precision) value to 5.7456 is hex 4016fb7e90ff9724 which is 5.745599999999999596411726088263094425201416015625 in decimal.

(Use my floating-point converter to verify those: enter 5.7456 and check the "Double" and "Single" precision boxes and select the "Decimal" and "Raw hexadecimal" output boxes.)

You can see that the double-precision value is less than 5.7456, which is the root of your problem (that is, why you get 5745 as your answer).

The calculation 5.7456 * 1000 is 5745.60009765625 in single-precision and 5745.5999999999994543031789362430572509765625 in double-precision.

0.4 is 0.4000000059604644775390625 in single-precision and 0.40000000000000002220446049250313080847263336181640625 in double-precision.

5.7456 * 1000 + 0.4 is 5746 in single-precision and 5745.9999999999990905052982270717620849609375 in double-precision.

(I used a C program to do those calculations.)
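The same calculations can be reproduced in Python. Decimal(x) converts a binary float to its exact decimal expansion, and a struct round trip (an assumption of this sketch, since Python has no native single-precision type) rounds a value to the nearest single:

```python
import struct
from decimal import Decimal

def to_single(x: float) -> float:
    # Round a double to the nearest IEEE 754 single via a 32-bit round trip.
    return struct.unpack('<f', struct.pack('<f', x))[0]

# Exact decimal expansions of the closest binary values to 5.7456:
print(Decimal(5.7456))             # 5.74559999999999959641... (double, below 5.7456)
print(Decimal(to_single(5.7456)))  # 5.745600223541259765625  (single, above 5.7456)

# 5.7456 * 1000 in each precision:
print(Decimal(5.7456 * 1000))                        # 5745.59999999999945430...
print(Decimal(to_single(to_single(5.7456) * 1000)))  # 5745.60009765625

# ... + 0.4 in each precision:
print(Decimal(5.7456 * 1000 + 0.4))                        # 5745.99999999999909050...
print(Decimal(to_single(to_single(5.7456) * 1000 + 0.4)))  # 5746
```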

So the difference is due to a combination of how the values were converted and how the calculations were rounded in the two precisions.

(You said "From the hex interpretation, it seems to me there is no precision loss in any of the steps"...I don't know what you meant by that.)
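Decoding the byte patterns from the question's DBCC PAGE dump with Python's struct module (bytes copied from the question) makes this concrete: the "interpretation" column shows the rounded display value, not the exact stored value, and the FLOAT in column e is in fact just below 5746:

```python
import struct

# Column e: FLOAT result of 5.7456 * 1000 + 0.4, bytes as stored (little endian).
e = struct.unpack('<d', bytes.fromhex('ffffffffff71b640'))[0]

# Column i: REAL result of the same expression, bytes as stored.
i = struct.unpack('<f', bytes.fromhex('0090b345'))[0]

print(e == 5746.0)               # False: e is one ULP below 5746
print(i == 5746.0)               # True: the single-precision sum is exactly 5746
print(e == 5.7456 * 1000 + 0.4)  # True: Python's double arithmetic reproduces e
```

So the "loss" is already present in the stored bytes of columns b, d, and e; it is only hidden by the rounded decimal display in the interpretation table.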

answered Nov 15 '22 07:11 by Rick Regan