
Opening Binary Files in Fortran: Status, Form, Access

I have been working with Fortran for years, but the file I/O is still hazy to me. My understanding of status, form, access, recl is limited, because I only needed certain use-cases in grad school.
I know that Fortran binary files have extra information at the top of the file that describes the size of the file. That has never been an issue for me before, because I have only had to deal with Fortran files in Fortran code, where the extra information is necessary but invisible.

But how do I open a flat, binary file in Fortran?

In the past, I might open a Fortran binary using Fortran by doing something like this:

      open(id,file=file_name,status='old',
     +     form='unformatted',access='direct',recl=4,iostat=ok)
      if (ok .ne. 0) then
        write(1,20) id,ok,file_name
      else
        write(1,21) id,file_name
      end if

But how does this change for a flat, binary file that doesn't have the Fortran header information? More importantly, where is a good link to describe these terms in greater detail: status, form, access, recl?

john_science asked Apr 03 '12 17:04



3 Answers

I hate to answer my own question, but I feel that a reader who came to this post hoping to find answers would not see a clear way forward. So here is the way forward.

The Short Version

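In Fortran 77/90, the shortest OPEN statement relies entirely on defaults, and those defaults give you a formatted (text), sequential file:

OPEN (5, FILE="myFile.txt")

To open a standard Fortran binary file (unformatted, sequential, with record headers/footers) you have to set the form explicitly:

OPEN (5, FILE="myFile.txt", FORM="UNFORMATTED")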

But to open a flat, non-Fortran binary file you would have to write something more like this:

OPEN(5, file="myFile.txt", form='unformatted', access='direct', recl=1)

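This difference exists because sequential unformatted files written by Fortran wrap each "record" in a header and footer (typically 4 bytes each, though the size depends on the compiler). These headers/footers store the length of the data contained in the record. (Each unformatted WRITE statement produces one record, so a file written with a single WRITE contains just one record.) Direct-access files have no such headers or footers, which is why they can be used to read flat binary data.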
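To see the headers/footers for yourself, here is a minimal sketch that writes the same three integers twice, once as a sequential unformatted record and once as a flat stream, and compares the file sizes. The file names and unit number are made up for illustration, and it assumes a compiler that supports Fortran 2003 features (access='stream' and the SIZE= inquiry).

```fortran
program record_markers
  implicit none
  integer, parameter :: u = 10          ! arbitrary unit number (assumption)
  integer :: vals(3), seq_size, raw_size

  vals = (/ 1, 2, 3 /)

  ! Sequential unformatted: this single WRITE produces one record,
  ! which the compiler wraps in a length header and footer.
  open (u, file='demo_seq.bin', form='unformatted', &
        access='sequential', status='replace')
  write (u) vals
  close (u)

  ! Stream unformatted: only the raw data bytes are written.
  open (u, file='demo_raw.bin', form='unformatted', &
        access='stream', status='replace')
  write (u) vals
  close (u)

  ! SIZE= reports the size in file storage units (normally bytes).
  inquire (file='demo_seq.bin', size=seq_size)
  inquire (file='demo_raw.bin', size=raw_size)

  print *, 'sequential file size:', seq_size
  print *, 'stream file size:    ', raw_size
  print *, 'header/footer bytes: ', seq_size - raw_size
end program record_markers
```

If your compiler uses a 4-byte header and a 4-byte footer, the last line should report 8 extra bytes, but the exact number is compiler-dependent.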

The Long Version

Fortran assumes defaults for most OPEN arguments. In fact, the bare statement

      OPEN (5, FILE="myFile.txt")

is equivalent to the following verbose form, which spells out all the defaults that were assumed:

      OPEN (5, FILE="myFile.txt", FORM="FORMATTED",
     +   ACCESS="SEQUENTIAL", STATUS="UNKNOWN")

Let us look at each argument:

  • FORM defines if a file consists of text (form='formatted') or binary data (form='unformatted').

  • ACCESS defines if you are reading data from the file in order (access='sequential') or in any order you want (access='direct').

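  • RECL defines the length of each record and is required for direct access. The unit is compiler-dependent: many compilers count bytes, but some count 4-byte words for unformatted files. Where the unit is bytes, recl=1 just says that the record lengths are 1 byte each; perhaps they are 1-byte integers.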

  • STATUS defines if the file already exists. The STATUS="UNKNOWN" argument means that the file might not exist yet; on most compilers it will be opened if it exists and created if it doesn't. If the file must already exist (for example, because you are about to read it), use STATUS="OLD"; the OPEN fails if the file is missing. If you want to protect against the possibility of writing over an old file, use STATUS="NEW"; the OPEN fails if the file already exists.
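The arguments above can be tried out together. This is only a sketch under a few assumptions: the file name and unit number are invented, and it uses two Fortran 90 features not mentioned above, STATUS="REPLACE" (create or overwrite) and INQUIRE(IOLENGTH=...), which asks the compiler for a record length in whatever units RECL uses.

```fortran
program direct_access_demo
  implicit none
  integer, parameter :: u = 10      ! arbitrary unit number (assumption)
  integer :: rec_len, ios, i
  real :: x, y

  ! Record length of one default REAL, in this compiler's RECL units.
  inquire (iolength=rec_len) x

  ! STATUS='REPLACE' creates the file, overwriting any existing copy.
  open (u, file='demo_direct.bin', form='unformatted', access='direct', &
        recl=rec_len, status='replace', iostat=ios)
  if (ios /= 0) stop 'could not create demo_direct.bin'
  do i = 1, 5
    x = 10.0 * i
    write (u, rec=i) x              ! record i holds 10*i
  end do
  close (u)

  ! STATUS='OLD' fails (nonzero IOSTAT) unless the file already exists.
  open (u, file='demo_direct.bin', form='unformatted', access='direct', &
        recl=rec_len, status='old', iostat=ios)
  if (ios /= 0) stop 'demo_direct.bin not found'
  read (u, rec=4) y                 ! jump straight to record 4
  close (u)
  print *, 'record 4 =', y

  ! STATUS='NEW' fails if the file exists, guarding against overwrites.
  open (u, file='demo_direct.bin', form='unformatted', access='direct', &
        recl=rec_len, status='new', iostat=ios)
  print *, 'iostat for STATUS=NEW on an existing file:', ios
end program direct_access_demo
```

Computing the record length with INQUIRE instead of hard-coding a number sidesteps the question of whether RECL counts bytes or words on a given compiler.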

For More Information:

These open statements also have an impact on the read/write/close statements that will follow. In my original post, what I needed to know was that a file opened for direct access must also be read and written as a direct access file, one record at a time. (That is, there will be no Fortran headers/footers included in your binary.) However, Fortran’s default functionality is to create sequential access files, and unformatted sequential files have the Fortran headers and footers included.

For more information on open statements in Fortran 77/90, there are good resources online:

A nice page by Lin Jinsen of Bishop University (thank you so much).

Slightly more official documentation by IBM for its compilers.

john_science answered Nov 06 '22 09:11



One caveat: for unformatted records, the record length given in recl is not always measured in bytes. At least on Intel compilers it defaults to the number of 4-byte words (use byterecl to specify otherwise), so for 4-byte records you may have to either specify that compiler option or use recl=1.

As your code stands, using unformatted and direct, all you need to do to ensure you read data properly is to choose an appropriate record length. But, some FORTRAN compilers do not always play nice with unformatted binary files and I would suggest adopting HDF5 going forward.

If available, your compiler may allow access='stream', which reads the file as a plain sequence of bytes with no record structure:

open (id, file=file_name, status='old', form='unformatted' &
        , access='stream', iostat=ios)
! read (id, pos=1) someValue
user7116 answered Nov 06 '22 09:11



You can tell open to use the new Stream IO mode in Fortran 2003 with access='stream'.
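As a rough sketch of what that looks like in practice, the program below first creates a small flat binary file and then reads one value back from the middle of it using POS=. The file name, unit number, and sample values are invented for illustration, and the byte offset assumes 4-byte default reals and byte-sized file storage units.

```fortran
program stream_read_demo
  implicit none
  integer, parameter :: u = 10      ! arbitrary unit number (assumption)
  real :: samples(4), third
  integer :: ios

  samples = (/ 1.5, 2.5, 3.5, 4.5 /)

  ! Create a flat binary file: stream access writes no headers/footers.
  open (u, file='demo_flat.bin', form='unformatted', access='stream', &
        status='replace')
  write (u) samples
  close (u)

  ! Reopen the existing file and read it as a raw stream of bytes.
  open (u, file='demo_flat.bin', form='unformatted', access='stream', &
        status='old', iostat=ios)
  if (ios /= 0) stop 'could not open demo_flat.bin'

  ! POS= is 1-based. Assuming 4-byte reals, the third value
  ! starts at byte 2*4 + 1 = 9.
  read (u, pos=9) third
  close (u)

  print *, 'third value =', third
end program stream_read_demo
```

Unlike direct access, stream access needs no recl at all, so the bytes-versus-words question never comes up.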

Kleist answered Nov 06 '22 09:11
