
Modifying then Slicing 2D Array of Unknown Size in Perl

I know that similar topics have been covered here, but I am running into a problem that I think arises from my misunderstanding how array slices are interpolated in the context of a foreach loop. I can't figure out where I've gone wrong, so I'm looking for some insight.

I have a 2D array with a variable number of rows. For example purposes:

@2DArray = (['str1', 1, 2, 'E', val1, val2, val3],
            ['str2', 3, 4, 'E', val4, val5, val6],
            ['str4', 5, 6, 'F', val7, val8, val9]);

I want to build a new array, with additional columns, that incorporates some of the rows of the original array only if they contain the string 'E' in column 3. Additionally, for the rows I do wish to incorporate in my new array, I only want a subset of the columns and I want that subset in a different order. The end goal is to generate output of the correct format required by downstream scripts.

Here is my attempt to do that:

my $projName = 'test';

my $i = 1;
my @Newarray;
my @Newarray_element;
     foreach (@2DArray) {
         if  (${$_}[3] eq 'E') {
             ${$_}[3] = $i; 
             ${$_}[5] = '+'; 
             @Newarray_element = ("$projName$i", @$_[0,1,2,5,3], 'STR', 11, 11);
             $i++;
             push (@Newarray, \@Newarray_element);
         }

         next;
     }

print (join("\t", @$_), "\n") for @Newarray;

However, if I do that, what I get is:

#(original) col nums:      0       1    2    5    3

                  test2    str2    3    4    +    2    STR    11    11
                  test2    str2    3    4    +    2    STR    11    11

I.e., my new array will have a row for every row in the original array with an 'E' in column 3, but each row is populated with the values from the last row to be processed by the loop.

The reason I think the issue has to do with slicing a 2D array in a foreach loop is that I know if I merely loop through a 2D array, find all the rows with an 'E' in column 3, modify some values in other columns for those rows, and then return that to a new array it all works perfectly. That is to say, if I instead do this:

my @Newarray;
my $i = 1;
foreach (@2DArray) {
    if  (${$_}[3] eq "E") {
        ${$_}[3] = $i;
        ${$_}[5] = '+';
        $i++;
        push (@Newarray, \@$_);
    }
    next;   
}
print (join("\t", @$_), "\n") for @Newarray;

I get exactly the output I would expect:

                  *            &
str1    1    2    1    val1    +    val3
str2    3    4    2    val4    +    val6

where the columns marked by * and & are the modified columns 3 and 5. Let the onslaught begin: where did my newbie self go wrong?

MCor asked Oct 05 '22 08:10


1 Answer

The variable @Newarray_element is declared once, outside the loop, so it refers to the same array throughout your program. Every \@Newarray_element you push is a reference to that one array, so when a later iteration assigns new values to it, the rows you pushed in earlier iterations show the new values too.
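
To see the effect in isolation, here is a minimal sketch. The names @rows and @element and the sample values are made up for illustration and are not from the question; the only point is that one array declared outside the loop is pushed by reference several times.

```perl
use strict;
use warnings;

# Hypothetical miniature of the loop in the question: one array declared
# outside the loop, and a reference to it pushed on every iteration.
my @rows;
my @element;
for my $n (1 .. 3) {
    @element = ("row$n", $n);   # overwrites the contents of the one array
    push @rows, \@element;      # pushes a reference to that same array
}

# Comparing references numerically compares their addresses.
my $same = ($rows[0] == $rows[1] && $rows[1] == $rows[2]) ? 'yes' : 'no';
print "same array: $same\n";

# Every stored row is the same array, so each line shows the last values.
print join("\t", @$_), "\n" for @rows;
```

Run as written, this should report that all three references are the same array, and each printed row holds the values from the final iteration, which is the symptom described in the question.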

Two possible fixes:

One: change the scope of the variable so that it uses different memory in each iteration. Change

my @Newarray_element;
foreach (@2DArray) {
    ...

to

foreach (@2DArray) {
    my @Newarray_element;
    ...

or even

foreach (@2DArray) {
    ...
    my @Newarray_element = ("$projName$i", @$_[0,1,2,5,3], 'STR', 11, 11);

Two: reuse @Newarray_element but assign a copy of its data to each row of @Newarray. Change

push (@Newarray, \@Newarray_element);

to

push (@Newarray, [ @Newarray_element ]);

The latter creates a new anonymous array holding a copy of the current contents of @Newarray_element, and appends a reference to that copy to @Newarray.
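
Putting the pieces together, a runnable sketch of the loop with the first fix applied might look like the following. It makes a few assumptions: @table is a stand-in name (a Perl identifier cannot start with a digit followed by letters, so @2DArray will not compile as written), the val1 through val9 placeholders are quoted as plain strings, and a named loop variable $row replaces $_ purely for readability.

```perl
use strict;
use warnings;

# Sample data shaped like the question's example. @table is a stand-in name
# and the val* entries are placeholder strings (both are assumptions).
my @table = (
    ['str1', 1, 2, 'E', 'val1', 'val2', 'val3'],
    ['str2', 3, 4, 'E', 'val4', 'val5', 'val6'],
    ['str4', 5, 6, 'F', 'val7', 'val8', 'val9'],
);

my $projName = 'test';
my $i = 1;
my @Newarray;

foreach my $row (@table) {
    next unless $row->[3] eq 'E';   # keep only rows with 'E' in column 3

    # Same in-place modifications as in the question.
    $row->[3] = $i;
    $row->[5] = '+';

    # Fix one: "my" inside the loop gives each iteration its own array.
    my @Newarray_element =
        ("$projName$i", @{$row}[0, 1, 2, 5, 3], 'STR', 11, 11);
    push @Newarray, \@Newarray_element;
    # Fix two would keep the outer declaration and instead use:
    #   push @Newarray, [ @Newarray_element ];

    $i++;
}

print join("\t", @$_), "\n" for @Newarray;
```

With this sample data the output should be two distinct rows, one starting with test1 and one with test2, rather than two copies of the last row.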

mob answered Oct 10 '22 20:10