Array of arrays in Perl

Question

I'm new to Perl, but I need it to extract some text from an awful HTML file. So far, my code extracts all the values I need (I verified this with Data::Dumper).

For every data record (i.e. each row of a 2D table), the values are called:

$org, $gene_name, $number, $motif_num, $pos, $strand, $seq

I have many data entries and each one would be a row, with the above values as the columns.

To do other stuff with them later, I want to make a 2D array structure, so I can loop through each entry (row) and pick out values I need and so on.

I thought the best way of doing this would be to use the loop: for each data entry, after extracting the values with regexp matching, combine the values/columns into an array for the individual data record:

my @seidl_array_row = ($org, $gene_name, $number, $motif_num, $pos, $strand, $seq);

Then push this array onto the finished 2D array of arrays:

push @seidl_array, [ @seidl_array_row ];

(@seidl_array was defined with my before the loop.)

So in effect I get a 2D data table, where each element of the array @seidl_array is an array containing the values $org, $gene_name, $number, $motif_num, $pos, $strand, and $seq.

I'm new to Perl, so I don't know whether this is the right way to do it programmatically, and I'm having issues when working with this data later. I wondered if the issue was with how I constructed the array of arrays in the first place. The examples in my book build the structure statically from simple data sets, whereas this is a much larger genomic data GTF file, so doing it statically is not really feasible.

tauli · Accepted Answer

As far as I can see, there is nothing wrong with your approach. Using a reference to the array instead of copying the array, as choroba suggested, has the benefit that the data isn't copied unnecessarily (but remember: that only works if you declare @seidl_array_row inside the loop, otherwise you would just make several references to the same array).

You can have that same advantage by skipping the row array completely like so:

push @seidl_array, [ $org, $gene_name, $number, $motif_num, $pos, $strand, $seq ];
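To illustrate, here is a minimal sketch of building the array of arrays in a loop and reading it back. The sample lines in @records and the split call are made-up stand-ins for your real file and regexp matching; only the push line is taken from above.

```perl
use strict;
use warnings;

# Hypothetical sample records standing in for the values
# extracted from the real file by regexp matching.
my @records = (
    "orgA geneX 1 3 120 + ACGT",
    "orgB geneY 2 5 340 - TTGA",
);

my @seidl_array;
for my $line (@records) {
    # Assumption: whitespace-separated fields, purely for illustration.
    my ($org, $gene_name, $number, $motif_num, $pos, $strand, $seq)
        = split ' ', $line;

    # The anonymous array constructor [ ... ] makes a fresh row each time.
    push @seidl_array, [ $org, $gene_name, $number, $motif_num, $pos, $strand, $seq ];
}

# Each element of @seidl_array is an array reference (one row).
for my $row (@seidl_array) {
    my ($org, $gene_name) = @{$row}[0, 1];    # slice of the row
    print "$org\t$gene_name\t$row->[6]\n";    # single column via ->
}

# Direct access by row and column index.
print scalar(@seidl_array), " rows; first strand: $seidl_array[0][5]\n";
```

Note that $seidl_array[0][5] is shorthand for $seidl_array[0]->[5]; the arrow is optional between adjacent subscripts.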

For some extra convenience in accessing the data, I often use arrays of hashes like so:

push @seidl_array, {
    org    => $org,
    name   => $gene_name,
    number => $number,
    motif  => $motif_num,
    pos    => $pos,
    strand => $strand,
    seq    => $seq,
};

This has the advantage that you don't have to remember the positions of the respective values in the array, but can access them by name.
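As a rough sketch of what that access looks like (the two records below are invented sample data, not values from your file):

```perl
use strict;
use warnings;

my @seidl_array;

# Invented sample records, using the same keys as above.
push @seidl_array, {
    org => 'orgA', name => 'geneX', number => 1,
    motif => 3, pos => 120, strand => '+', seq => 'ACGT',
};
push @seidl_array, {
    org => 'orgB', name => 'geneY', number => 2,
    motif => 5, pos => 340, strand => '-', seq => 'TTGA',
};

# Each element is a hash reference; fields are read by name.
for my $rec (@seidl_array) {
    print "$rec->{name} at $rec->{pos} ($rec->{strand})\n";
}

# Selecting rows by a named field, e.g. all minus-strand entries.
my @minus = grep { $_->{strand} eq '-' } @seidl_array;
print scalar(@minus), " record(s) on the minus strand\n";
```

The trade-off is a bit more memory per row, since every record carries its own set of keys.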

choroba · Answer

Your solution seems correct to me. Using [ @seidl_array_row ] creates a copy of the list. If you are correctly declaring the row with my inside the loop, you can store a reference to it directly and avoid the unnecessary copying:

push @seidl_array, \@seidl_array_row;
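The reason the declaration has to be inside the loop can be seen in this small sketch. The @input data and the names @shared_row, @wrong, @right and @row are made up for the demonstration.

```perl
use strict;
use warnings;

# Made-up input: two rows of two values each.
my @input = ( [ 1, 2 ], [ 3, 4 ] );

# Pitfall: the row array is declared once, outside the loop,
# so every pushed reference points at the same array.
my ( @shared_row, @wrong );
for my $values (@input) {
    @shared_row = @$values;
    push @wrong, \@shared_row;
}
print "wrong: $wrong[0][0] $wrong[1][0]\n";    # both show the last row's value

# Correct: "my" inside the loop creates a new array on each iteration,
# so each reference points at its own row.
my @right;
for my $values (@input) {
    my @row = @$values;
    push @right, \@row;
}
print "right: $right[0][0] $right[1][0]\n";    # 1 and 3
```

With [ @seidl_array_row ] this problem cannot occur, because the copy is always a new anonymous array, which is why that form is safe regardless of where the row is declared.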

Tags:

multidimensional-array

perl

Ward9250
