What code is generated for Ada Array of Records loop?

Question

For example for:

type PERSONCV is
   record
      name: String ( 1..4 );
      age: Integer;
      cvtext: String ( 1..2000 );
   end record;

N: constant := 40000;
persons : array ( 1..N ) of PERSONCV;

function jobcandidate return Boolean is
   iscandidate: Boolean := False;
begin
   for p of persons loop   -- what code is generated for this?
      if p.age >= 18 then
         iscandidate := true;
         exit;
      end if;
   end loop;
   return iscandidate;
end;

In C/C++ the loop part would typically be:

PERSONCV * p; // address pointer
int k = 0;
while ( k < N )
   {
   p = &persons [ k ]; // pointer to k'th record
   if ( p->age >= 18 )...  
   ...
   k++ ;
   }

I have read that Ada uses Value semantics for records. Does the Ada loop above copy the k'th record to loop variable p? e.g. like this is in C/C++ :

PERSONCV p; // object/variable
int k = 0;
while ( k < N )
   {
   memcpy ( &p, &persons [ k ], sizeof ( PERSONCV ) ); // copies k'th elem 
   if ( p.age >= 18 )...  
   ...
   k++ ;
   }

Simon Wright · Accepted Answer

Assuming you are using GNAT, there are two avenues of investigation.

The switch -gnatG will regenerate an Ada-like representation of what the front end of the compiler is going to pass to the back end (before any optimisations). In this case, I see

   function resander__jobcandidate return boolean is
      iscandidate : boolean := false;
      L_1 : label
   begin
      L_1 : for C8b in 1 .. 40000 loop
         p : resander__personcv renames resander__persons (C8b);
         if p.age >= 18 then
            iscandidate := true;
            exit;
         end if;
      end loop L_1;
      return iscandidate;
   end resander__jobcandidate;

so the question is, how does renames get translated? Given that the record size is 2008 bytes, the chances of the compiler generating a copy is pretty much zero.

The second investigatory approach is to keep the assembly code that the compiler normally emits to the assembler and then deletes, using the switch -S. This confirms that the code generated is like your first C++ version (for macOS).

As an interesting sidelight, Ada 2012 allows an alternate implementation of jobcandidate:

   function jobcandidate2 return Boolean is
     (for some p of persons => p.age >= 18);

which generates identical code.

user_1818839 · Answer

I suspect that what you have read about Ada is wrong, and probably worse, is encouraging you to think about Ada in the wrong way.

Ada's intent is to encourage thinking in the problem domain, i.e. to specify what should happen, rather than thinking in the solution domain, i.e. to implement the fine details of exactly how.

So here the intent is to loop over all Persons, exit returning True on meeting the first over 18, otherwise return False.

And that's it.

By and large Ada mandates nothing about the details of how it's done, provided those semantics are satisfied.

Then, the intent is, you just expect the compiler to do the right thing.

Now an individual compiler may choose one implementation over another - or may switch between implementations according to optimisation heuristics, taking into account which CPU it's compiling for, as well as the size of the objects (will they fit into a register?) etc.

You could imagine a CPU with many registers, where a single cache line read makes the copy implementation faster than operating in place (especially if there are no modifications to write back to P's contents), or other target CPUs where the reverse was true. Why would you want to stop the compiler picking the better implementation?

A good example of this is Ada's approach to parameter passing to subprograms - name, value or reference semantics really don't apply - instead, you specify the parameter passing mode - in, out, or in out describing the information flow to (or from) the subprogram. Intuitive, provides semantics that can be more rigorously checked, and leaves the compiler free to pick the best (fastest, smallest, depending on your goal) implementation that correctly obeys those semantics.

Now it would be possible for a specific Ada compiler to make poor choices, and 30 years ago when computers were barely big enough to run an Ada compiler at all, you might well have found performance compromised for simplicity in early releases of a compiler.

But we have thirty more years of compiler development now, running on more powerful computers. So, today, I'd expect the compiler to normally make the best choice. And if you find a specific compiler missing out on performance optimisations, file an enhancement request. Ada compilers aren't perfect, just like any other compiler.

In this specific example, I'd normally expect P to be a cursor into the array, and operations to happen in-place, i.e. reference semantics. Or possibly a hybrid between forms, where one memory fetch into a register serves several operations, like a partial form of value semantics.

If your interest is academic, you can easily look at the assembly output from whatever compiler you're using and find out. Or write all three versions above and benchmark them.

What code is generated for Ada Array of Records loop?

Tags:

ada

resander

2 Answers

Simon Wright

user_1818839

Recent Activity

Donate For Us

What code is generated for Ada Array of Records loop?

Tags:

ada

resander

2 Answers

Simon Wright

user_1818839

Related questions

Recent Activity

Donate For Us