 

Extended double precision

Is it possible to obtain more than 16 digits with double precision without using quadruple? If it is possible, does it depend on compiler or something else? Because I know someone said he was working with double precision and had 22 digit precision.

Shibli asked Feb 09 '12 12:02





1 Answer

The double precision data type dates back to Fortran 77 and earlier, and the only requirement for that type is that it has more precision than real. You shouldn't use it any more.

In Fortran 90/95 and beyond, at least two sizes of real numbers are supported. The precision is determined by the kind parameter, whose value depends on the compiler.

real(kind=8) :: a, b

To have a portable way of defining precision, you can obtain a kind value that allows a certain precision by using:

integer, parameter :: long_double = SELECTED_REAL_KIND(22)

then you can declare your variables as

real(kind=long_double) :: a, b

However, your compiler may not support that precision, in which case the SELECTED_REAL_KIND function returns a negative number and the declaration above will not compile.
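One way to guard against an unsupported request is to fall back to a kind that is known to exist. The following is only a minimal sketch: the module name `precision_kinds` and the names `dp`, `requested`, `has_extended` and `wp` are hypothetical, and it assumes a Fortran 2003 or later compiler that accepts `merge` in a constant expression and provides a kind with at least 15 decimal digits.

```fortran
module precision_kinds
  implicit none
  private
  public :: dp, requested, has_extended, wp

  ! Kind with at least 15 significant decimal digits
  ! (assumed to exist; typically 64-bit IEEE double).
  integer, parameter :: dp = selected_real_kind(15)

  ! Ask for 22 digits, as in the answer above. The result is
  ! negative when the compiler has no kind with that precision.
  integer, parameter :: requested = selected_real_kind(22)

  ! .true. only if the 22-digit request was satisfied.
  logical, parameter :: has_extended = requested > 0

  ! Working precision: the requested kind when available,
  ! otherwise dp, so that declarations using wp still compile.
  integer, parameter :: wp = merge(requested, dp, has_extended)
end module precision_kinds
```

A program can then `use precision_kinds`, declare `real(kind=wp) :: a, b`, and call `precision(a)` to see how many decimal digits it actually got. Note that on compilers whose only kind with 22 or more digits is quadruple precision, `wp` will be that quadruple kind.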

see also this post

steabert answered Sep 26 '22 12:09
