
Read text file line by line but only specific columns

How do we read a specific file line by line while skipping some columns in it?

For example, I have a text file with data arranged in 5 columns, but I need to read only two of them. They could be the first two or any other combination, so I need a solution that works for any pair of columns (for example, the first and third only).

My code looks something like this:

        open(1, file=data_file)
        read (1,*) ! to skip first line, with metadata
        lmax = 0
        do while (.true.)
                ! read column 1 and 3 here, either write
                ! that to an array or just loop through each row
        end do
99      continue        
        close (1)

Any explanation or example would help a lot.

Indigo asked Mar 17 '14 10:03




2 Answers

High Performance Mark's answer gives the essentials of simple selective column reading: one still reads the column but transfers it to a then-ignored variable.

To extend that answer, then, consider that we want to read the second and fourth columns of a five-column line:

read(*,*) junk, x, junk, y

The first value is transferred into junk, then the second into x, then the third (replacing the one just acquired a moment ago) into junk and finally the fourth into y. The fifth is ignored because we've run out of input items and the transfer statement terminates (and the next read in a loop will go to the next record).
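As a minimal, self-contained sketch of that statement, the program below reads from a character variable (an internal read) instead of standard input, so it runs without a data file. The sample record and the program name are assumptions made purely for illustration.

```fortran
program skip_columns
  implicit none
  character(len=32) :: line   ! hypothetical stand-in for one record of the file
  real :: junk, x, y

  line = '1.0 2.0 3.0 4.0 5.0'

  ! junk receives column 1 and is then overwritten by column 3;
  ! x receives column 2 and y receives column 4.
  ! Column 5 is never transferred because the input list is exhausted.
  read(line, *) junk, x, junk, y

  print *, x, y
end program skip_columns
```

With this sample record, x ends up as 2.0 and y as 4.0; the exact formatting of the printed values depends on the compiler.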

Of course, this is fine when we know it's those columns we want. Let's generalize to when we don't know in advance:

integer col1, col2   ! The columns we require, defined somehow (assume col1<col2)
<type>, dimension(nrows) :: x, y, junk(3)  ! x, y: one value per row; junk(3) is enough scratch space for five columns
integer i

do i=1,nrows
  read(*,*) junk(:col1-1), x(i), junk(:col2-col1-1), y(i)
end do

Here, we transfer a number of values (which may be zero) up to just before the first column of interest, then the value of interest. After that, more to-be-ignored values (possibly zero), then the final value of interest. The rest of the row is skipped.
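A runnable sketch of the same idea follows, assuming real-valued data, three rows, and five columns. The `lines` array is a hypothetical stand-in for the file, and the column choice (1 and 3) is only an example.

```fortran
program pick_two_columns
  implicit none
  integer, parameter :: nrows = 3       ! assumed number of data rows
  integer :: col1, col2, i
  real :: x(nrows), y(nrows), junk(3)   ! junk(3) covers any pair among five columns
  character(len=32) :: lines(nrows)     ! hypothetical stand-in for the data file

  lines(1) = '11 12 13 14 15'
  lines(2) = '21 22 23 24 25'
  lines(3) = '31 32 33 34 35'

  col1 = 1                              ! requires col1 < col2
  col2 = 3

  do i = 1, nrows
    ! junk(:col1-1) is a zero-sized section here, so nothing is skipped before x(i);
    ! junk(:col2-col1-1) absorbs the single column between the two of interest.
    read(lines(i), *) junk(:col1-1), x(i), junk(:col2-col1-1), y(i)
  end do

  print *, x
  print *, y
end program pick_two_columns
```

For this sample data, x holds 11, 21, 31 and y holds 13, 23, 33. Because junk has a single type, the sketch assumes every skipped column can be read as a real.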

This is still very basic and ignores many complications that real requirements might bring. In fact, the approach is basic enough that one may as well just consider:

do i=1,nrows
  read(*,*) allofthem(:5)
  x(i) = allofthem(col1)
  y(i) = allofthem(col2)
end do

(where allofthem is a temporary that is overwritten on every row), but variety and options are good.

francescalus answered Sep 30 '22 11:09


This is very easy. You simply read 5 variables from each line and ignore the ones you have no further use for. Something like

do i = 1, 100
    read(*,*) a(i), b, c(i), d, e
end do

This will overwrite the values in b, d, and e at every iteration.

Incidentally, your line

99 continue

is redundant; it's not used as the closing line for the do loop and you're not branching to it from anywhere else. If you are branching to it from unseen code you could just attach the label 99 to the next line and delete the continue statement. Generally, continue is redundant in modern Fortran; specifically it seems redundant in your code.
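If the row count is not known in advance, one label-free way to end the loop is to test iostat, as in the sketch below. This is an illustration under assumptions rather than a drop-in replacement for your code: the file name sample_data.txt and its contents are hypothetical, and the program writes the file first only so that the example is self-contained. It also uses newunit, which needs a Fortran 2008 compiler.

```fortran
program read_until_eof
  implicit none
  integer :: u, ios, n
  real :: a, b, c

  ! Create a small hypothetical data file: one metadata line, two data rows.
  open(newunit=u, file='sample_data.txt', status='replace', action='write')
  write(u, '(a)') 'col1 col2 col3 col4 col5'
  write(u, '(a)') '1.0 2.0 3.0 4.0 5.0'
  write(u, '(a)') '6.0 7.0 8.0 9.0 10.0'
  close(u)

  open(newunit=u, file='sample_data.txt', status='old', action='read')
  read(u, *)                          ! skip the first line, with metadata
  n = 0
  do
    read(u, *, iostat=ios) a, b, c    ! b is read and then ignored
    if (ios /= 0) exit                ! end of file (or a read error) ends the loop
    n = n + 1
    print *, a, c                     ! columns 1 and 3
  end do
  close(u)

  print *, 'rows read:', n
end program read_until_eof
```

Here the exit statement does the job that a branch to label 99 would otherwise do, so neither the label nor the continue statement is needed.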

High Performance Mark answered Sep 30 '22 11:09