I found out today that an ArrayList I passed to a function gets changed when I remove a value from it within the function. The code below seems to imply that the ArrayList is being passed by reference. Why would that be? Is this by design or some kind of bug? (I am using PowerShell v4 on Windows 8.1.)
function myfunction {
    param (
        [System.Collections.ArrayList]$local
    )
    "`$local: " + $local.count
    "removing 1 from `$local"
    $local.RemoveAt(0)
    "`$local:" + $local.count
}
[System.Collections.ArrayList]$names=(Get-Content c:\temp\names.txt)
"`$names: " + $names.count
myfunction -local $names
"`$names: " + $names.count
RESULT:
$names: 16
$local: 16
removing 1 from $local
$local:15
$names: 15
PowerShell arguments may be passed by reference using the [ref] type accelerator. By default, PowerShell arguments are bound by position. Parameter names may be used to identify parameters explicitly, bypassing position. The $args automatic variable may be used to access a variable-length list of arguments.
Adding items to a large array with += can be quite slow: a PowerShell array has a fixed size, meaning that in the background PowerShell creates a whole new array that includes the new value and then discards the old array.
What is a PowerShell ArrayList? We can use an ArrayList to store a list of items in PowerShell. Unlike an array, an ArrayList's length is not fixed; it can change. Another difference is that an array can be declared strongly typed (for example [int[]]), in which case it stores only elements of that type, whereas an ArrayList always accepts elements of any type.
What is @() in a PowerShell script? In PowerShell, the array subexpression operator @() is used to create an array. It evaluates the statements within the parentheses and returns their output as an array, even when they produce only one object or none at all.
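The points above about fixed-size arrays, growable ArrayLists, and @() can be sketched as follows. This is only an illustration; the variable names are made up for the example and do not come from the question.

```powershell
# Arrays have a fixed size, so += has to build a new array behind the scenes.
$array = 1, 2, 3
$array.IsFixedSize            # True
$array += 4                   # $array now refers to a new 4-element array

# An ArrayList grows in place. Add() returns the new index, so discard it.
$list = New-Object System.Collections.ArrayList
[void] $list.Add('a')
[void] $list.Add(42)          # elements of different types are accepted
$list.Count                   # 2

# @() always yields an array, even for a single value or no value at all.
$single = @('only one')
$empty  = @()
$single.GetType().Name        # Object[]
$empty.Count                  # 0
```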
This is by design, and is not a bug. Arrays, collections and hash tables are passed by ref. The reason this behaves differently than adding to or removing from an array is that that operation creates a new array inside the function scope. Any time you create a new variable inside the function, it is scoped to the function. $local.RemoveAt(0) doesn't create a new $local, it just calls a method on the existing $local in the parent script. If you want the function to operate on its own $local, you need to explicitly create a new one inside the function.
Because it's by ref, this won't work:
$local = $local
You'll still be referencing $local in the parent scope. But you can use the Clone() method to create a new copy of it:
function testlocal {
    param ([collections.arraylist]$local)
    $local = $local.Clone()
    $local.RemoveAt(0)
    $local
}
$local = [collections.arraylist](1,2,3)
'Testing function arraylist'
testlocal $local
''
'Testing local arraylist'
$local
Testing function arraylist
2
3
Testing local arraylist
1
2
3
mjolinor's helpful answer provides the crucial pointer: To have the function operate on a copy of the input ArrayList, it must be cloned via .Clone() first.
Unfortunately, the explanation offered there for why this is required is not correct:[1]
No PowerShell-specific variable behavior comes into play; the behavior is fundamental to the .NET framework itself, which underlies PowerShell:
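Variables are technically passed by value (by default[2]), but what that means depends on the variable value's type: for value types, where the variable contains the data itself, a copy of the data is made; for reference types, where the variable contains only a reference to the data, a copy of the reference is made, so the caller and the callee end up pointing at the same object.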
Therefore, in the case at hand, because [System.Collections.ArrayList] is a reference type (verify with -not [System.Collections.ArrayList].IsValueType), parameter $local by design points to the very same ArrayList instance as variable $names in the calling scope.
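A minimal sketch of that difference, assuming a hypothetical function Test-Passing (the function and its parameter names are invented for illustration): the [int] argument is copied, while the ArrayList argument still refers to the caller's instance.

```powershell
function Test-Passing {
    param (
        [int] $Number,                          # value type: the function gets a copy of the data
        [System.Collections.ArrayList] $List    # reference type: the function gets a copy of the reference
    )
    $Number = $Number + 1        # changes only the function's own copy
    [void] $List.Add('added')    # modifies the single ArrayList instance shared with the caller
}

$n  = 1
$al = [System.Collections.ArrayList] @(1, 2, 3)
Test-Passing -Number $n -List $al

$n           # 1: the caller's value is unchanged
$al.Count    # 4: the caller sees the added element
```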
Unfortunately, PowerShell can obscure what's happening by cloning objects behind the scenes with certain operations:
Using += to append to an array ([System.Object[]]):
$a = 1, 2, 3 # creates an instance of reference type [Object[]]
$b = $a # $b and $a now point to the SAME array
$a += 4 # creates a NEW instance; $a now points to a DIFFERENT array.
Using += to append to a [System.Collections.ArrayList] instance:
While in the case of an array ([System.Object[]]) a new instance must be created - because arrays are by definition of fixed size - PowerShell unfortunately quietly converts a [System.Collections.ArrayList] instance to an array when using += and therefore obviously also creates a new object, even though [System.Collections.ArrayList] can be grown, namely with the .Add() method.
$al = [Collections.ArrayList] @(1, 2, 3) # creates an ArrayList
$b = $al # $b and $al now point to the SAME ArrayList
$al += 4 # !! creates a NEW object of type [Object[]]
# By contrast, this would NOT happen with: $al.Add(4)
Destructuring an array:
$a = 1, 2, 3 # creates an instance of reference type [Object[]]
$first, $a = $a # creates a NEW instance
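One way to observe the three cases above is to compare object identity with [object]::ReferenceEquals(). The following is a sketch with illustrative variable names, not part of the original answer.

```powershell
# Case 1: += on an array creates a new array.
$a = 1, 2, 3
$b = $a
[object]::ReferenceEquals($a, $b)    # True: both variables point to the same array
$a += 4
[object]::ReferenceEquals($a, $b)    # False: $a now points to a new array
$b.Count                             # 3: the original array is untouched

# Case 2: += on an ArrayList produces an [Object[]]; .Add() grows it in place.
$al = [System.Collections.ArrayList] @(1, 2, 3)
$c  = $al
$al += 4
$al.GetType().Name                   # Object[]
$c.GetType().Name                    # ArrayList
[void] $c.Add(4)                     # same instance as before, now with 4 elements

# Case 3: destructuring assigns the remaining elements as a new array.
$src = 1, 2, 3
$first, $rest = $src
[object]::ReferenceEquals($rest, $src)   # False
```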
[1] mjolinor's misconception is around inheriting / shadowing of variables from the parent (ancestral) scope: A parameter declaration is implicitly a local variable declaration. That is, on entering testlocal(), $local is already a local variable containing whatever was passed as the parameter - it never sees an ancestral variable of the same name. The following snippet demonstrates this: function foo([string] $local) { "`$local inside foo: $local" }; $local = 'hi'; "`$local in calling scope: $local"; foo; foo 'bar' - foo() never sees the calling scope's definition of $local.
[2] Note that some .NET languages (e.g., ref in C#) and even PowerShell itself ([ref]) also allow passing a variable by reference, so that the local parameter is effectively just an alias for the calling scope's variable, but this feature is unrelated to the value/reference-type dichotomy.
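To illustrate footnote [2], here is a sketch contrasting an ordinary parameter with a [ref] parameter. The function names Reset-Local and Reset-Caller are hypothetical and exist only for this example.

```powershell
# Ordinary parameter: assigning to it rebinds only the function's local variable.
function Reset-Local {
    param ([System.Collections.ArrayList] $List)
    $List = New-Object System.Collections.ArrayList
}

# [ref] parameter: assigning to .Value replaces the caller's variable itself.
function Reset-Caller {
    param ([ref] $Target)
    $Target.Value = New-Object System.Collections.ArrayList
}

$names = [System.Collections.ArrayList] @('a', 'b')

Reset-Local -List $names
$names.Count     # 2: the caller's variable still points to the original list

Reset-Caller -Target ([ref] $names)
$names.Count     # 0: the caller's variable now points to the new, empty list
```

Note that the [ref] argument has to be wrapped in parentheses at the call site, as in ([ref] $names).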