I'm writing a function Chunk-Object that can chunk an array of objects into sub-arrays. For example, if I pass it an array @(1, 2, 3, 4, 5) and specify 2 elements per chunk, then it will return 3 arrays: @(1, 2), @(3, 4) and @(5). The user can also provide an optional scriptblock parameter if they want to process each element before chunking them into sub-arrays. Now my code is:
function Chunk-Object()
{
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory = $true,
                   ValueFromPipeline = $true,
                   ValueFromPipelineByPropertyName = $true)] [object[]] $InputObject,
        [Parameter()] [scriptblock] $Process,
        [Parameter()] [int] $ElementsPerChunk
    )
    Begin {
        $cache = @();
        $index = 0;
    }
    Process {
        foreach($o in $InputObject) {
            $current_element = $o;
            if($Process) {
                $current_element = & $Process $current_element;
            }
            if($cache.Length -eq $ElementsPerChunk) {
                ,$cache;
                $cache = @($current_element);
                $index = 1;
            }
            else {
                $cache += $current_element;
                $index++;
            }
        }
    }
    End {
        if($cache) {
            ,$cache;
        }
    }
}
(Chunk-Object -InputObject (echo 1 2 3 4 5 6 7) -Process {$_ + 100} -ElementsPerChunk 3)
Write-Host "------------------------------------------------"
(echo 1 2 3 4 5 6 7 | Chunk-Object -Process {$_ + 100} -ElementsPerChunk 3)
The result is:
PS C:\Users\a> C:\Untitled5.ps1
100
100
100
100
100
100
100
------------------------------------------------
101
102
103
104
105
106
107
PS C:\Users\a>
As you can see, it works with piped-in objects, but does not work with values passed through the parameter. How can I modify the code to make it work in both cases?
The difference is that when you pipe the array into Chunk-Object, the function executes the process block once for each element in the array passed as a sequence of pipeline objects, whereas when you pass the array as an argument to the -InputObject parameter, the process block executes once for the entire array, which is assigned as a whole to $InputObject.
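To see this binding difference directly, here is a minimal sketch. Show-Binding is a hypothetical helper, not part of the original code; it only reports how many elements $InputObject holds each time the process block runs.

```powershell
# Show-Binding is a hypothetical helper used only for illustration.
function Show-Binding {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory = $true, ValueFromPipeline = $true)]
        [object[]] $InputObject
    )
    process {
        # Emit the number of elements bound to $InputObject on this invocation.
        $InputObject.Count
    }
}

# Pipeline input: the process block runs once per element.
$fromPipeline = @(1, 2, 3 | Show-Binding)

# Argument input: the process block runs once with the whole array.
$fromArgument = @(Show-Binding -InputObject 1, 2, 3)

"Pipeline: $($fromPipeline -join ', ')"   # Pipeline: 1, 1, 1
"Argument: $($fromArgument -join ', ')"   # Argument: 3
```

Three invocations with one element each versus one invocation with three elements is exactly why a loop over $InputObject is needed to support both calling styles.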
So, let's take a look at your pipeline version of the command:
echo 1 2 3 4 5 6 7 | Chunk-Object -Process {$_ + 100} -ElementsPerChunk 3
The reason this one works is that for each iteration of the pipeline, $_ is set to the value of the current array element in the pipeline, which is also assigned to the $InputObject variable (as a single-element array, due to the [object[]] typecast). The foreach loop is actually extraneous in this case, because the $InputObject array always has a single element for each invocation of the process block. You could actually remove the loop and change $current_element = $o to $current_element = $InputObject, and you'd get the exact same results.
Now, let's examine the version that passes an array argument to -InputObject:
Chunk-Object -InputObject (echo 1 2 3 4 5 6 7) -Process {$_ + 100} -ElementsPerChunk 3
The reason this doesn't work is that the scriptblock you're passing to the -Process parameter contains $_, but the foreach loop assigns each element to $o, and $_ isn't defined anywhere. All elements in the results are 100 because each iteration sets $current_element to the result of the scriptblock {$_ + 100}, which always evaluates to 100 when $_ is null. To prove this out, try changing $_ in the scriptblock to $o, and you'll get the expected results:
Chunk-Object -InputObject (echo 1 2 3 4 5 6 7) -Process {$o + 100} -ElementsPerChunk 3
If you want to be able to use $_ in the scriptblock, change the foreach loop to a pipeline by simply replacing foreach($o in $InputObject) { with $InputObject | %{. That way both versions will work, because the Chunk-Object function uses a pipeline internally, so $_ is set sequentially to each element of the array, regardless of whether the process block is invoked multiple times for a series of individual array elements passed in as pipeline input, or just once for a multiple-element array.
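The effect of swapping the foreach statement for a pipeline can be checked in isolation. The sketch below assumes it runs at script or console scope where $_ is not already set; the variable names are illustrative only.

```powershell
$sb = { $_ + 100 }

# foreach statement: $_ is never assigned, so the scriptblock computes $null + 100.
$viaForeach = foreach ($o in 1, 2, 3) { & $sb }

# ForEach-Object: $_ holds the current element while the scriptblock is invoked.
$viaPipeline = 1, 2, 3 | ForEach-Object { & $sb }

"foreach:        $($viaForeach -join ', ')"    # foreach:        100, 100, 100
"ForEach-Object: $($viaPipeline -join ', ')"   # ForEach-Object: 101, 102, 103
```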
UPDATE:
I looked at this again and noticed that in the line
$current_element = & $Process $current_element;
you appear to be trying to pass $current_element as an argument to the scriptblock in $Process. This doesn't work because parameters passed to a scriptblock work largely the same as in functions. If you invoke MyFunction 'foo', then 'foo' isn't automatically assigned to $_ within the function; likewise, & {$_ + 100} 'foo' doesn't set $_ to 'foo'. Change your scriptblock argument to {$args[0] + 100}, and you'll get the expected results with or without passing in pipeline input:
Chunk-Object -InputObject (echo 1 2 3 4 5 6 7) -Process {$args[0] + 100} -ElementsPerChunk 3
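Note that although this version of the scriptblock argument works even if you keep the foreach loop, I'd still recommend using ForEach-Object ($InputObject | %{), because it sets $_ for each element and so lets callers write the scriptblock in the familiar {$_ + 100} style. Performance is not the reason to switch: for an array that is already in memory, the foreach statement is typically the faster of the two.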
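The ways a value can reach a scriptblock are easy to compare side by side. This is a small sketch with made-up variable names; the param() form is an additional option not mentioned above.

```powershell
$usesUnderscore = { $_ + 100 }
$usesArgs       = { $args[0] + 100 }
$usesParam      = { param($Value) $Value + 100 }

# Invoking with & passes 5 as a positional argument; it does not set $_.
$r1 = & $usesUnderscore 5    # 100, because $_ is $null here
$r2 = & $usesArgs 5          # 105
$r3 = & $usesParam 5         # 105

# Piping through ForEach-Object is what binds the value to $_.
$r4 = 5 | ForEach-Object $usesUnderscore   # 105

"$r1, $r2, $r3, $r4"
```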
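The issue isn't technically the parameter attributes. It lies both in your arguments and in how you're processing them.
Problem: (echo 1 2 3 4 5 6 7) is an indirect way to build the input. Because echo is an alias for Write-Output, it does emit seven separate integers, but the quoted form (echo "1 2 3 4 5 6 7") would be a single string, and you appear to want to process an array.
Solution: use an explicit array: @(1, 2, 3, 4, 5, 6, 7)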
Problem: You are using a foreach statement. This does batch processing, not pipeline processing, so $_ is never set.
Solution: Use ForEach-Object
Process {
    $InputObject | ForEach-Object {
        ...
    }
}
foreach($foo in $bar) will gather all items, then iterate. $list | ForEach-Object { ... } processes each item separately, allowing the pipeline to continue.
Note: If the input is actually a string, you will also have to split the string and convert each element to an integer. Alternatively, change the argument type to an integer if that is what you expect.
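As a minimal sketch of the split-and-convert step described in the note (variable names are illustrative), compare what + 100 does to a split string element versus a converted integer:

```powershell
$text = '1 2 3 4 5 6 7'

# -split yields strings, so + 100 concatenates instead of adding.
$asStrings = $text -split ' '
$stringResult = $asStrings[0] + 100    # the string '1100'

# Casting the split result to [int[]] converts every element to an integer.
$asInts = [int[]]($text -split ' ')
$intResult = $asInts[0] + 100          # the integer 101

"String element: $stringResult"
"Int element:    $intResult"
```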
Final answer:
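function Chunk-Object()
{
    [CmdletBinding()]
    Param(
        [Parameter(Mandatory = $true,
                   ValueFromPipeline = $true,
                   ValueFromPipelineByPropertyName = $true)] [object[]] $InputObject,
        [Parameter()] [scriptblock] $Process,
        [Parameter()] [int] $ElementsPerChunk
    )
    Begin {
        $cache = @();
        $index = 0;
    }
    Process {
        # ForEach-Object sets $_ for each element, so a -Process scriptblock that uses $_
        # works for pipeline input and for an array passed to -InputObject.
        $InputObject | ForEach-Object {
            $current_element = $_;
            if($Process) {
                $current_element = & $Process $current_element;
            }
            if($cache.Length -eq $ElementsPerChunk) {
                # The unary comma emits the chunk as one array instead of unrolling it.
                ,$cache;
                $cache = @($current_element);
                $index = 1;
            }
            else {
                $cache += $current_element;
                $index++;
            }
        }
    }
    End {
        # Test the count rather than the array itself: a single-element array such as @(0)
        # converts to $false and would silently drop the last chunk.
        if($cache.Count -gt 0) {
            ,$cache;
        }
    }
}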
Set-PSDebug -Off
Write-Host "Input Object is array"
Chunk-Object -InputObject @(1, 2, 3, 4, 5, 6, 7) -Process {$_ + 100} -ElementsPerChunk 3
Write-Host "------------------------------------------------"
Write-Host "Input Object is on pipeline"
@(1, 2, 3, 4, 5, 6, 7) | Chunk-Object -Process {$_ + 100} -ElementsPerChunk 3
Write-Host "------------------------------------------------"
Write-Host "Input object is string"
(echo "1 2 3 4 5 6 7") | Chunk-Object -Process {$_ + 100} -ElementsPerChunk 3
Write-Host "------------------------------------------------"
Write-Host "Input object is split string"
(echo "1 2 3 4 5 6 7") -split ' ' | Chunk-Object -Process {$_ + 100} -ElementsPerChunk 3
Write-Host "------------------------------------------------"
Write-Host "Input object is int[] converted from split string"
([int[]]("1 2 3 4 5 6 7" -split ' ')) | Chunk-Object -Process {$_ + 100} -ElementsPerChunk 3
Write-Host "------------------------------------------------"
Write-Host "Input object is split and converted"
(echo "1 2 3 4 5 6 7") -split ' ' | Chunk-Object -Process {[int]$_ + 100} -ElementsPerChunk 3
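When these test calls run at the console, each chunk is enumerated for display, so the output appears as one number per line and the chunk boundaries are not visible. One way to make them visible is to join each chunk into a single line. The sketch below uses a hypothetical stand-in function, Get-SampleChunk, that emits arrays with the unary comma in the same way Chunk-Object does, so it can run on its own.

```powershell
# Get-SampleChunk is a hypothetical stand-in that emits three chunks as whole arrays.
function Get-SampleChunk {
    ,@(101, 102, 103)
    ,@(104, 105, 106)
    ,@(107)
}

# Collecting the output keeps each chunk as its own array.
$chunks = @(Get-SampleChunk)
"Chunk count: $($chunks.Count)"

# Join the elements of each chunk so every chunk prints on one line.
$lines = $chunks | ForEach-Object { $_ -join ', ' }
$lines
```

The same ForEach-Object { $_ -join ', ' } stage can be appended to any of the Chunk-Object calls above to check how the elements were grouped.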