Let's say I have 3 double-precision arrays,
real*8, dimension(n) :: x, y, z
which are initialized as
x = 1.
y = (/ (1., i=1,n) /)
z = (/ (1. +0*i, i=1,n) /)
They should initialize all elements of all arrays to 1
. In ifort
(16.0.0 20150815), this works as intended for any n
within the range of the declared precision. That is, if we initialize n
as
integer*4, parameter :: n
then as long as n < 2147483647
, the initialization works as intended for all declarations.
In gfortran
(4.8.5 20150623 Red Hat 4.8.5-16), the initialization fails for y
(array comprehension with constant argument) as long as n>65535
, independent of its precision. AFAIK, 65535
is the maximum of a unsigned short int
, aka unsigned int*2
which is well within the range of integer*4
.
Below is an MWE:
program test
implicit none
integer*4, parameter :: n = 65536
integer*4, parameter :: m = 65535
real*8, dimension(n) :: x, y, z
real*8, dimension(m) :: a, b, c
integer*4 :: i
print *, huge(n)
x = 1.
y = (/ (1., i=1,n) /)
z = (/ (1.+0*i, i=1,n) /)
print *, x(n), y(n), z(n)
a = 1.
b = (/ (1., i=1,m) /)
c = (/ (1.+0*i, i=1,m) /)
print *, a(m), c(m), c(m)
end program test
Compiling with gfortran
(gfortran test.f90 -o gfortran_test
), it outputs:
2147483647
1.0000000000000000 0.0000000000000000 1.0000000000000000
1.0000000000000000 1.0000000000000000 1.0000000000000000
Compiling with ifort
(ifort test.f90 -o ifort_test
), it outputs:
2147483647
1.00000000000000 1.00000000000000 1.00000000000000
1.00000000000000 1.00000000000000 1.00000000000000
What gives?
There is indeed a big difference in how the compiler treats the array constructors. For n<=65535
there is the actual array of [1., 1., 1.,...] stored in the object file (or in some of the intermediate representations).
For a larger array the compiler generates a loop:
(*(real(kind=8)[65536] * restrict) atmp.0.data)[offset.1] = 1.0e+0;
offset.1 = offset.1 + 1;
{
integer(kind=8) S.2;
S.2 = 0;
while (1)
{
if (S.2 > 65535) goto L.1;
y[S.2] = (*(real(kind=8)[65536] * restrict) atmp.0.data)[S.2];
S.2 = S.2 + 1;
}
L.1:;
}
it appears to me, that first it sets only one element of a temporary array and then it copies the (mostly undefined) temporary array to y
. And that is wrong. Valgrind also reports usage of uninitialized memory.
For a default real we have
while (1)
{
if (shadow_loopvar.2 > 65536) goto L.1;
(*(real(kind=4)[65536] * restrict) atmp.0.data)[offset.1] = 1.0e+0;
offset.1 = offset.1 + 1;
shadow_loopvar.2 = shadow_loopvar.2 + 1;
}
L.1:;
{
integer(kind=8) S.3;
S.3 = 0;
while (1)
{
if (S.3 > 65535) goto L.2;
y[S.3] = (*(real(kind=4)[65536] * restrict) atmp.0.data)[S.3];
S.3 = S.3 + 1;
}
L.2:;
}
We have two loops now, one sets the whole temporary array and the second one copies that to y
and everything is fine.
The issue was fixed by GCC developers who read this question. The bug is tracked at https://gcc.gnu.org/bugzilla/show_bug.cgi?id=84931
They also identified that the problem is connected to type conversion. The constructor has default precision 1.
and with single precision array there is no type conversion, but for a double precision array there is some type conversion. That caused the difference for these two cases.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With