Fortran: RAM needed for plain arrays vs objects storing the same amount of data

Trying to store the same data using dynamic memory allocation in two different ways, I noticed a huge difference in RAM requirements that I cannot explain. Some insight would be appreciated.

In the following examples, the goal is to create a database that stores the IDs of the edges connected to each node of a polygon mesh. However, the nature of the problem is irrelevant.

Case 1, using "plain" arrays:

program memorytest

implicit none

integer, dimension(:, :), allocatable :: node_edges
integer :: i

allocate(node_edges(10, 10000000)) ! 10000000 nodes with 10 edges each
node_edges(:, :) = 0

read *, i ! pause

deallocate(node_edges)

end program memorytest

RAM needed: ~395,500 K

Case 2, using a node type:

program memorytest

implicit none

type node
  integer, dimension(:), allocatable :: edges  
end type

type(node), dimension(:), allocatable :: nodes
integer :: i

allocate(nodes(10000000)) ! 10000000 nodes
do i = 1, 10000000
    allocate(nodes(i)%edges(10)) ! with 10 edges each
end do

do i = 1, 10000000
    nodes(i)%edges(:) = 0
end do

read *, i ! pause    

do i = 1, 10000000
    deallocate(nodes(i)%edges)
end do
deallocate(nodes)

end program memorytest

RAM needed: ~1,060,500 K

For a comparison, I tried equivalent approaches in C++.

Case 1, using "plain" arrays:

#include "stdafx.h"
#include <iostream>

int main()
{
  int** node_edges;
  int i, j;

  node_edges = new int*[10000000]; // 10000000 nodes
  for(i = 0; i < 10000000; i++) node_edges[i] = new int[10]; // with 10 edges each

  for(i = 0; i < 10000000; i++)
    for(j = 0; j < 10; j++) node_edges[i][j] = 0;

  std::cin >> i; // pause

  for(i = 0; i < 10000000; i++) delete [] node_edges[i];
  delete [] node_edges;

  return 0;
}

RAM needed: ~510,000 K

Case 2, using a node class:

#include "stdafx.h"
#include <iostream>

class node
{
  public:
    int* edges;
};

int main()
{
  node* nodes;
  int i, j;

  nodes = new node[10000000]; // 10000000 nodes
  for(i = 0; i < 10000000; i++) nodes[i].edges = new int[10]; // with 10 edges each

  for(i = 0; i < 10000000; i++)
    for(j = 0; j < 10; j++) nodes[i].edges[j] = 0;

  std::cin >> i; // pause

  for(i = 0; i < 10000000; i++) delete [] nodes[i].edges;
  delete [] nodes;

  return 0;
}

RAM needed: ~510,000 K

Development environment used: Intel Visual Fortran Studio XE 2013 and MS Visual C++ 2010 respectively, both producing 32-bit executables in the default "Release" configuration.

As shown, C++ uses exactly the same amount of RAM for both approaches. In Fortran I could have justified some minor difference, but a difference this large I cannot explain. To me, this looks like something to do either with Fortran itself, or with some Intel Fortran compiler flag that I am unaware of.

Any ideas why this happens and / or any suggestions to avoid this excessive RAM requirement in an object oriented approach in Fortran?

Thank you in advance.

asked Nov 26 '13 by georg.balafas


1 Answer

One thing to bear in mind is that allocating a single two-dimensional array is different from allocating many one-dimensional arrays. For example:

  allocate(node_edges(10, 100)) 

allocates a single block of memory that can contain 1000 items, whereas

  allocate(nodes(100)) ! 100 nodes
  do i = 1, 100
      allocate(nodes(i)%edges(10)) ! with 10 edges each
  end do

allocates one block that can contain 100 items, plus 100 separate blocks of 10 sub-items each. Same number of items, so same memory usage?

No. In the second case you have allocated 100 new arrays, and each one carries overhead. In Fortran this overhead can be quite high, because the runtime has to keep a descriptor tracking each array's bounds (you may want to take an array section later). It is especially noticeable when the allocation is small: here each array holds only 10 integers, and the descriptor plus allocator padding can roughly double the allocated size, which is what you observe in your case.

answered Nov 03 '22 by Rob