I've understood that with begin/process/end, the process section runs multiple times, once for each object in the pipeline. So if I have a function like this:
function Test-BeginProcessEnd {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory=$true, ValueFromPipeline=$true)]
        [string]$myName
    )
    begin {}
    process {
        Write-Host $myName
    }
    end {}
}
I can pipe an array to it, like this, and it processes each object:
PS C:\> @('aaa','bbb') | Test-BeginProcessEnd
aaa
bbb
PS C:\>
But if I try to use the parameter on the command line, I can only pass it a single string, so I can do:
PS C:\> Test-BeginProcessEnd -myName 'aaa'
aaa
PS C:\>
But I can't do:
PS C:\> Test-BeginProcessEnd -myName @('aaa','bbb')
Test-BeginProcessEnd : Cannot process argument transformation on parameter 'myName'. Cannot convert value to type
System.String.
At line:1 char:30
+ Test-BeginProcessEnd -myName @('aaa','bbb')
+ ~~~~~~~~~~~~~~
+ CategoryInfo : InvalidData: (:) [Test-BeginProcessEnd], ParameterBindingArgumentTransformationException
+ FullyQualifiedErrorId : ParameterArgumentTransformationError,Test-BeginProcessEnd
PS C:\>
Obviously I want the parameter usage to be the same as via the pipeline, so I have to change the function to be:
function Test-BeginProcessEnd {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory=$true, ValueFromPipeline=$true)]
        [string[]]$myNames
    )
    begin {}
    process {
        foreach ($name in $myNames) {
            Write-Host $name
        }
    }
    end {}
}
So I've had to use foreach anyway, and the looping functionality of the Process section hasn't helped me.
Have I missed something? I can't see what it's good for! Thanks for any help.
tl;dr:

- Because of how binding pipeline input to parameters works in PowerShell (see below), defining a parameter that accepts pipeline input as well as direct parameter-value passing of arrays requires you to enumerate arrays yourself, in the process block.
- Defining your pipeline-binding parameter as a scalar avoids this awkwardness, but passing multiple inputs is then limited to the pipeline - you won't be able to pass arrays as a parameter argument.[1]
This asymmetry is perhaps surprising.
When you define a parameter that accepts pipeline input, you get implicit array logic for free:

- With pipeline input, PowerShell calls your process block once for each input object, with the current input object bound to the parameter variable.
- By contrast, passing input as a parameter value only ever enters the process block once, with the input as a whole bound to your parameter variable.

The above applies whether or not your parameter is array-valued: each pipeline input object individually is bound / coerced to the parameter's type exactly as declared.
To put this in concrete terms with your example function, which declares parameter [Parameter(Mandatory=$true, ValueFromPipeline=$true)] [string[]] $myNames:

Let's assume an input array (collection) of 'foo', 'bar' (note that the @() around array literals is normally not necessary).

- Parameter-value input, Test-BeginProcessEnd -myNames 'foo', 'bar': the process block is called once, with 'foo', 'bar' bound to $myNames as a whole.
- Pipeline input, 'foo', 'bar' | Test-BeginProcessEnd: the process block is called twice, with 'foo' and 'bar' each coerced to [string[]] - i.e., a single-element array.

To see it in action:
function Test-BeginProcessEnd {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [string[]]$myNames
    )
    begin {}
    process {
        Write-Verbose -Verbose "in process block: `$myNames element count: $($myNames.Count)"
        foreach ($name in $myNames) { $name }
    }
    end {}
}
# Input via parameter
> Test-BeginProcessEnd 'foo', 'bar'
VERBOSE: in process block: $myNames element count: 2
foo
bar
# Input via pipeline
> 'foo', 'bar' | Test-BeginProcessEnd
VERBOSE: in process block: $myNames element count: 1
foo
VERBOSE: in process block: $myNames element count: 1
bar
begin, process, and end blocks may be used in a function whether or not it is an advanced function (cmdlet-like - see below).

- If you do not use these blocks, you can still optionally access pipeline input via the automatic $Input variable; note, however, that your function then runs after ALL pipeline input has been collected in memory (not object by object, as with a process block) - see the sketch after this list.
- Generally, though, it pays to use a process block: the process block is an implicit loop over all pipeline input, and you can selectively perform initialization and cleanup tasks in the begin and end
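To illustrate the $Input behavior described above - a minimal sketch, using a hypothetical function name; the function body runs only once, after the pipeline has ended:

function Test-InputVariable {
    # No begin/process/end blocks: pipeline input is available only via $Input,
    # and only after ALL input objects have been collected in memory.
    foreach ($item in $Input) { "received: $item" }
}

'foo', 'bar' | Test-InputVariable
# received: foo
# received: bar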
blocks, respectively.
- Note that there is no direct way for your function to stop further process block invocations prematurely. The closest approximation is a downstream | Select-Object -First 1, which efficiently exits the pipeline after the desired number of objects have been received. Alternatively, you can omit the process block and use $Input | Select-Object 1 inside your function, but, as stated, that will collect all input in memory first; another - also imperfect - alternative can be found in this answer of mine.

It is easy to turn a function into an advanced function, however, which offers benefits such as support for common parameters (e.g., -ErrorAction and -OutVariable) as well as detection of unrecognized parameters: use a param() block to declare the parameters and decorate that block with the [CmdletBinding()] attribute, as shown above. (Also, decorating an individual parameter with a [Parameter()] attribute implicitly makes a function an advanced one, but for clarity it's better to use [CmdletBinding()] explicitly.)
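As a concrete illustration - a minimal sketch with a hypothetical function name - adding [CmdletBinding()] by itself is enough to enable the common parameters:

function Test-Advanced {
    [CmdletBinding()]   # turns this into an advanced function
    param([string] $Name)
    "Hello, $Name"
}

# Common parameters now work automatically, e.g. -OutVariable:
Test-Advanced -Name 'foo' -OutVariable captured
$captured   # the output was also captured in $captured

# Unrecognized parameters are now also detected:
# Test-Advanced -Bogus 'x'   # -> error: "A parameter cannot be found that matches parameter name 'Bogus'."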
explicitly).[1] Strictly speaking, you can, but only if you type your parameter [object]
(or don't specify a type at all, which is the same).
However, the input array/collection is then bound as a whole to the parameter variable, and the process
block is still only entered once, where you'd need to perform your own enumeration.
Some standard cmdlets, such as Export-Csv
, are defined this way, yet they do not enumerate a collection passed via the -InputObject
parameter, making direct use of that parameter effectively useless - see this GitHub issue.
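To demonstrate the footnote's point - a sketch with a hypothetical function name - an [object]-typed pipeline-binding parameter receives a directly passed array as a whole:

function Test-ObjectParam {
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [object] $InputObject   # [object]-typed: a directly passed array binds as a whole
    )
    process {
        "single process invocation; element count: $($InputObject.Count)"
    }
}

Test-ObjectParam -InputObject 'foo', 'bar'
# single process invocation; element count: 2
# -> the function would have to enumerate $InputObject itself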
The BEGIN-PROCESS-END structure is used for scripts/advanced functions where (a) you want to be able to pipe data to it, and (b) there is stuff that you want to do before (BEGIN) and/or after (END) processing the entire set of data (as opposed to before or after each individual item that comes through the pipe). If you pass a single value to an advanced function that uses foreach to be able to handle an array, it treats the single value as an array of one item; the pipe does this too, in effect - except that with the pipe, it doesn't need to reload the cmdlet for each item. This is, ultimately, why you can write scripts/advanced functions that can be used either in the pipeline or as 'standalone' processes. It is not that PROCESS causes the looping; it's that it enables the efficient processing of values coming in from the pipeline. If you want to handle multiple values passed to it other than via the pipeline, you need to manage the looping yourself - as you've discovered.
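To make the before/after-the-entire-set role of BEGIN and END concrete - a minimal sketch, with a hypothetical function name:

function Measure-Names {
    [CmdletBinding()]
    param(
        [Parameter(Mandatory, ValueFromPipeline)]
        [string] $Name
    )
    begin   { $count = 0 }          # runs once, before any pipeline input
    process { $count++; $Name }     # runs once per input object
    end     { Write-Verbose -Verbose "processed $count item(s)" }   # runs once, after all input
}

'aaa', 'bbb' | Measure-Names
# aaa
# bbb
# VERBOSE: processed 2 item(s)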