I have two text files that contain columnar data of the form position-value, sorted by position.
Here is an example of the first file (file A):
100 1
101 1
102 0
103 2
104 1
...
Here is an example of the second file (file B):
20 0
21 0
...
100 2
101 1
192 3
193 1
...
Instead of reading one of the two files into a hash table, which is prohibitive due to memory constraints, what I would like to do is walk through two files simultaneously, in a stepwise fashion.
What this means is that I would like to stream through lines of either A or B and compare position values.
If the two positions are equal, then I perform a calculation on the values associated with that position.
Otherwise, if the positions are not equal, I move through lines of file A or file B until the positions are equal (when I again perform my calculation) or I reach EOF of both files.
Is there a way to do this in Perl?
This looks like a problem one is likely to run into, for example with database table data made up of keys and values. Here's an implementation of the pseudocode provided by rjp.
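#!/usr/bin/perl
use strict;
use warnings;

# Read the next line from a filehandle and return [position, value],
# or nothing at end of file.
sub read_file_line {
    my $fh = shift;
    return unless $fh;
    my $line = <$fh>;
    # Test definedness so that a final line such as "0" is not treated as EOF
    return unless defined $line;
    chomp $line;
    # split ' ' splits on any whitespace, so spaces and tabs both work
    return [ split ' ', $line ];
}

sub compute {
    my ($value1, $value2) = @_;
    # do something with the 2 values
}

open(my $f1, '<', "file1") or die "Cannot open file1: $!";
open(my $f2, '<', "file2") or die "Cannot open file2: $!";

my $pair1 = read_file_line($f1);
my $pair2 = read_file_line($f2);

# Advance whichever file has the smaller position; on equal positions,
# compute and advance both. Stop as soon as either file is exhausted.
while ($pair1 and $pair2) {
    if ($pair1->[0] < $pair2->[0]) {
        $pair1 = read_file_line($f1);
    } elsif ($pair2->[0] < $pair1->[0]) {
        $pair2 = read_file_line($f2);
    } else {
        compute($pair1->[1], $pair2->[1]);
        $pair1 = read_file_line($f1);
        $pair2 = read_file_line($f2);
    }
}

close($f1);
close($f2);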
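To try the stepwise walk without creating files on disk, here is a minimal sketch that runs the same idea against in-memory filehandles holding the sample rows from the question (the rows elided with "..." are simply left out). The names walk_sorted and $on_match are hypothetical, and summing the two values is only a stand-in for whatever calculation you actually need.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: walk two position-sorted filehandles in step and
# call $on_match->($position, $value_a, $value_b) for each shared position.
sub walk_sorted {
    my ($fh_a, $fh_b, $on_match) = @_;

    # Return the next [position, value] record, or nothing at EOF.
    my $next = sub {
        my $fh   = shift;
        my $line = <$fh>;
        return unless defined $line;
        chomp $line;
        return [ split ' ', $line ];
    };

    my ($rec_a, $rec_b) = ($next->($fh_a), $next->($fh_b));
    while ($rec_a and $rec_b) {
        my $cmp = $rec_a->[0] <=> $rec_b->[0];
        if ($cmp < 0) {
            $rec_a = $next->($fh_a);    # A is behind, advance A
        } elsif ($cmp > 0) {
            $rec_b = $next->($fh_b);    # B is behind, advance B
        } else {
            $on_match->($rec_a->[0], $rec_a->[1], $rec_b->[1]);
            $rec_a = $next->($fh_a);
            $rec_b = $next->($fh_b);
        }
    }
    return;
}

# Sample rows from the question, opened as in-memory files.
my $data_a = "100 1\n101 1\n102 0\n103 2\n104 1\n";
my $data_b = "20 0\n21 0\n100 2\n101 1\n192 3\n193 1\n";
open(my $fh_a, '<', \$data_a) or die "Cannot open in-memory file A: $!";
open(my $fh_b, '<', \$data_b) or die "Cannot open in-memory file B: $!";

# Assumed calculation for illustration only: add the two values.
my @matches;
walk_sorted($fh_a, $fh_b, sub {
    my ($pos, $val_a, $val_b) = @_;
    push @matches, "$pos: $val_a + $val_b = " . ($val_a + $val_b);
});
print "$_\n" for @matches;
```

With these rows only positions 100 and 101 appear in both inputs, so the callback fires twice. Only one record per file is held in memory at a time, which is the point of walking the files instead of loading one into a hash.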
Hope this helps!