I have a large FASTA file (a genetic sequence for an entire chromosome) in which each line contains 50 characters (the bases a, g, t, and c). The file has about 4 million lines.
I want to reorganize the file so that each character is placed on its own line of a new file. That is, turn each 50-character line in the original file into 50 single-character lines, so the entire sequence is rewritten as a single column. Ultimately, I want the sequence as a single column so I can then place an adjacent column containing the genomic coordinate of each base.
This is how I am doing it, using Perl and a pair of nested for loops.
unless (@ARGV) {
    # $0 is the name of the program being executed
    print "\n usage: $0 filename\n\n";
    exit;
}

# use shift to pull the filename off @ARGV
my $fastafile = shift;

open( FASTA, "<$fastafile" );
my @count = (<FASTA>);
close FASTA;

for ( my $i = 0; $i < scalar @count; $i++ ) {
    my @seq = split( "", $count[$i] );
    print " line = $i ";
    for ( my $j = 0; $j < scalar @seq; $j++ ) {
        print "$seq[$j] for count = $j \n";
    }
}
It seems to work, but it is very slow. I am wondering whether it is slow because the FASTA file has 4 million lines, because of my code, or both. I am looking for advice to speed up this process. Thanks!
The problem is that you are slurping the whole file into memory. While the huge file is being slurped, the process waits until all the I/O is over before it starts processing, and building a 4-million-element array is itself expensive. One option is to process the file line by line:
open my $fh, '<', $fastafile or die "Error opening file: $!";

while ( my $line = <$fh> ) {
    chomp $line;    # remove the newline from the end of each line
    my @seq = split //, $line;

    # loop from 0 to the last index of @seq
    for my $i ( 0 .. $#seq ) {
        print "$seq[$i] for count = $i\n";
    }
}
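Since the question's ultimate goal is one base per line with an adjacent genomic-coordinate column, here is a minimal sketch that extends the same line-by-line approach with a running position counter. It assumes a simple 1-based coordinate counted from the start of the file and tab-separated output; adjust the starting offset and separator to match your actual coordinate system.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# One base per output line, with an adjacent column holding its
# genomic coordinate. Assumption: the coordinate is a 1-based count
# from the start of the sequence.
my $fastafile = shift or die "usage: $0 filename\n";
open my $fh, '<', $fastafile or die "Error opening file: $!";

my $pos = 0;
while ( my $line = <$fh> ) {
    chomp $line;
    for my $base ( split //, $line ) {
        $pos++;
        print "$base\t$pos\n";    # base, then its coordinate
    }
}
close $fh;
```

Whichever version you run, redirect the output to a file (e.g. `perl script.pl chr1.fa > chr1_column.txt`): with 4 million 50-character lines you are printing about 200 million output lines, and writing those to a terminal is itself a major source of slowness.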