I have a fixed-size array whose length is always a multiple of 3.
my @array = ('foo', 'bar', 'qux', 'foo1', 'bar', 'qux2', 3, 4, 5);
How can I group the members of the array so that I get an array of arrays, with three elements per group:
$VAR = [ ['foo','bar','qux'],
['foo1','bar','qux2'],
[3, 4, 5] ];
For comparison, PHP's array_chunk() function splits an array into chunks of new arrays.
In Perl, the splice() function removes and returns a given number of elements from an array. A list of elements can be inserted in place of the removed elements.
Syntax: splice(@array, offset, length, replacement_list)
Parameters: @array – The array in consideration.
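To make the splice() description concrete, here is a minimal sketch using the array from the question. The replacement values 'x' and 'y' are made up purely for illustration.

```perl
use strict;
use warnings;

my @array = ('foo', 'bar', 'qux', 'foo1', 'bar', 'qux2');

# Remove 3 elements starting at offset 0; splice returns what it removed.
my @removed = splice(@array, 0, 3);

# Replace 1 element at offset 1 with two illustrative values.
my @replaced = splice(@array, 1, 1, 'x', 'y');

print "@removed\n";   # foo bar qux
print "@replaced\n";  # bar
print "@array\n";     # foo1 x y qux2
```

Note that splice modifies the array in place, which is why the chunking loops in the answers can use an empty array as their stop condition.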
I really like List::MoreUtils and use it frequently. However, I have never liked the natatime function. It doesn't produce output that can be used with a for loop or map or grep.
I like to chain map/grep/apply operations in my code. Once you understand how these functions work, they can be very expressive and very powerful.
But it is easy to make a function to work like natatime that returns a list of array refs.
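use Carp qw(croak);

sub group_by ($@) {
    my $n = shift;
    my @array = @_;    # copy, so the caller's array is not consumed
    croak "group_by count argument must be a non-zero positive integer"
        unless $n > 0 and int($n) == $n;
    my @groups;
    # peel off $n elements at a time until the copy is empty
    push @groups, [ splice @array, 0, $n ] while @array;
    return @groups;
}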
Now you can do things like this:
my @grouped = map [ reverse @$_ ],
              group_by 3, @array;
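As a complete script, the chained call might look like the sketch below. It repeats the group_by definition so that it runs on its own, and it places the sub before its first use so the parser already knows the prototype when it sees the parenthesis-free call.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Same group_by as in the answer, defined before its first use.
sub group_by ($@) {
    my $n = shift;
    my @array = @_;
    croak "group_by count argument must be a non-zero positive integer"
        unless $n > 0 and int($n) == $n;
    my @groups;
    push @groups, [ splice @array, 0, $n ] while @array;
    return @groups;
}

my @array = ('foo', 'bar', 'qux', 'foo1', 'bar', 'qux2', 3, 4, 5);

# Group by 3, then reverse each group.
my @grouped = map [ reverse @$_ ], group_by 3, @array;

print join(',', @$_), "\n" for @grouped;
# qux,bar,foo
# qux2,bar,foo1
# 5,4,3
```

Because group_by works on a copy of its arguments, @array still holds all nine elements afterwards.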
** Update re Chris Lutz's suggestions **
Chris, I can see merit in your suggested addition of a code ref to the interface. That way a map-like behavior is built in.
# equivalent to my map/group_by above
group_by { [ reverse @_ ] } 3, @array;
This is nice and concise. But to keep the nice {} code ref semantics, we have put the count argument 3 in a hard-to-see spot.
I think I like things better as I wrote it originally.
A chained map isn't that much more verbose than what we get with the extended API. With the original approach a grep or other similar function can be used without having to reimplement it.
For example, if the code ref is added to the API, then you have to do:
my @result = group_by { $_[0] =~ /foo/ ? [@_] : () } 3, @array;
to get the equivalent of:
my @result = grep $_->[0] =~ /foo/,
             group_by 3, @array;
Since I suggested this for the sake of easy chaining, I like the original better.
Of course, it would be easy to allow either form:
use Carp qw(croak);
use Scalar::Util qw(reftype);

sub _copy_to_ref { [ @_ ] }
sub group_by ($@) {
    my $code = \&_copy_to_ref;
    my $n = shift;
    # reftype returns undef for a plain count, so default to '' (needs Perl 5.10+)
    if( (reftype($n) // '') eq 'CODE' ) {
        $code = $n;
        $n = shift;
    }
    my @array = @_;
    croak "group_by count argument must be a non-zero positive integer"
        unless $n > 0 and int($n) == $n;
    my @groups;
    push @groups, $code->(splice @array, 0, $n) while @array;
    return @groups;
}
Now either form should work (untested). I'm not sure whether I like the original API or this one with the built-in map capabilities better.
Thoughts anyone?
** Updated again **
Chris is correct to point out that the optional code ref version would force users to do:
group_by sub { foo }, 3, @array;
Which is not so nice, and violates expectations. Since there is no way to have a flexible prototype (that I know of), that puts the kibosh on the extended API, and I'd stick with the original.
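One way around the fixed prototype, if the block syntax is still wanted, is to offer it under a separate name rather than overloading group_by. The sketch below assumes a hypothetical function called group_map with a (&$@) prototype; the name and the split into two functions are illustrative assumptions, not something proposed in the discussion above.

```perl
use strict;
use warnings;
use Carp qw(croak);

# Hypothetical companion to group_by: the & prototype lets callers
# pass a bare block as the first argument.
sub group_map (&$@) {
    my ($code, $n, @array) = @_;
    croak "group_map count argument must be a non-zero positive integer"
        unless $n > 0 and int($n) == $n;
    my @results;
    # call the block with each chunk of $n elements in @_
    push @results, $code->(splice @array, 0, $n) while @array;
    return @results;
}

my @array = ('foo', 'bar', 'qux', 'foo1', 'bar', 'qux2', 3, 4, 5);

my @reversed = group_map { [ reverse @_ ] } 3, @array;

print join(',', @$_), "\n" for @reversed;
```

The count argument still sits after the block, so the readability concern raised earlier applies here as well.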
On a side note, I started with an anonymous sub in the alternate API, but I changed it to a named sub because I was subtly bothered by how the code looked. No real good reason, just an intuitive reaction. I don't know if it matters either way.
my @VAR;
push @VAR, [ splice @array, 0, 3 ] while @array;
or you could use natatime from List::MoreUtils
use List::MoreUtils qw(natatime);
my @VAR;
{
    my $iter = natatime 3, @array;
    while( my @tmp = $iter->() ){
        push @VAR, \@tmp;
    }
}
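One thing worth keeping in mind about the splice loop at the top of this answer: it empties @array as it goes. If the original array is needed afterwards, a minimal sketch is to run the loop on a copy (the @copy variable is introduced here only for illustration):

```perl
use strict;
use warnings;

my @array = ('foo', 'bar', 'qux', 'foo1', 'bar', 'qux2', 3, 4, 5);

# Work on a copy so the original @array is left intact.
my @copy = @array;
my @VAR;
push @VAR, [ splice @copy, 0, 3 ] while @copy;

printf "%d groups, original still has %d elements\n",
    scalar @VAR, scalar @array;
```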