Is an ArrayList passed to functions by reference in PowerShell?

I found out today that an ArrayList I passed to a function is changed when I remove a value from it inside the function. The code below seems to imply that the ArrayList is passed by reference. Why would that be? Is this by design or some kind of bug? (I am using v4 on Windows 8.1.)

function myfunction {
    param (
        [System.Collections.ArrayList]$local
    )
    "`$local: " + $local.count
    "removing 1 from `$local"
    $local.RemoveAt(0)
    "`$local:" + $local.count
}

[System.Collections.ArrayList]$names=(Get-Content c:\temp\names.txt)

"`$names: " + $names.count
myfunction -local $names
"`$names: " + $names.count

RESULT:

$names: 16
$local: 16
removing 1 from $local
$local:15
$names: 15
Adil Hindistan asked Jan 25 '14 03:01



2 Answers

This is by design and is not a bug. Arrays, collections and hash tables are passed by ref. This behaves differently from adding to or removing from an array because that operation creates a new array inside the function scope. Any time you create a new variable inside the function, it is scoped to the function. $local.RemoveAt(0) doesn't create a new $local; it just calls a method on the existing $local in the parent script. If you want the function to operate on its own $local, you need to explicitly create a new one inside the function.

Because it's by ref, this won't work:

 $local = $local

You'll still be referencing $local in the parent scope. But you can use the Clone() method to create a new copy of it:

function testlocal {
    param ([collections.arraylist]$local)
    # Work on a copy so the caller's ArrayList is left untouched
    $local = $local.Clone()
    $local.RemoveAt(0)
    $local
}

$local = [collections.arraylist](1,2,3)

'Testing function arraylist'    
testlocal $local
''
'Testing local arraylist'
$local


Testing function arraylist
2
3

Testing local arraylist
1
2
3
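
One caveat worth keeping in mind: ArrayList.Clone() makes a shallow copy, so the copy gets its own list of elements, but any element that is itself a reference type is still shared. A minimal sketch, using hypothetical variable names ($inner, $outer, $copy) that are not part of the code above:

```powershell
# Illustrative names only; none of these variables appear in the original code.
$inner = [System.Collections.ArrayList] @(1, 2)
$outer = [System.Collections.ArrayList] @('a')
$null  = $outer.Add($inner)   # nest one list inside another; Add() returns an index
$copy  = $outer.Clone()       # shallow copy of $outer

$copy.RemoveAt(0)             # removes 'a' from the copy only
$outer.Count                  # 2: the original list still has both elements

$copy[0].RemoveAt(0)          # $copy[0] is the very same object as $inner
$inner.Count                  # 1: the nested list was changed through the copy
```

For a flat list of strings or numbers, as in the question, a shallow copy is enough.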
mjolinor answered Oct 01 '22 16:10


mjolinor's helpful answer provides the crucial pointer: To have the function operate on a copy of the input ArrayList, it must be cloned via .Clone() first.

Unfortunately, the explanation offered there for why this is required is not correct:[1]

No PowerShell-specific variable behavior comes into play; the behavior is fundamental to the .NET framework itself, which underlies PowerShell:

Variables are technically passed by value (by default[2]), but what that means depends on the variable value's type:

  • For value types, for which variables contain the data directly, a copy of the actual data is made.
  • For reference types, for which variables only contain a reference to the data, a copy of the reference is made, resulting in effective by-reference passing.

Therefore, in the case at hand, because [System.Collections.ArrayList] is a reference type (verify with -not [System.Collections.ArrayList].IsValueType), parameter $local by design points to the very same ArrayList instance as variable $names in the calling scope.
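
One way to check this directly is sketched below. It assumes a hypothetical helper, Test-SameInstance, that is not part of the original code; it receives the list twice, once through the typed parameter from the question and once as a plain object, and compares the two with [object]::ReferenceEquals():

```powershell
# Hypothetical helper for illustration only.
function Test-SameInstance {
    param (
        [System.Collections.ArrayList]$local,  # typed parameter, as in the question
        [object]$original                      # the caller's object, passed untyped
    )
    # True only if both parameters point to the very same object
    [object]::ReferenceEquals($local, $original)
}

$names = [System.Collections.ArrayList] @('a', 'b', 'c')

$isValueType = [System.Collections.ArrayList].IsValueType
$same        = Test-SameInstance -local $names -original $names

$isValueType   # False: ArrayList is a reference type
$same          # True: no copy of the list was made when binding $local
```

Because the argument already is an ArrayList, binding it to the typed parameter requires no conversion, so the function works with the caller's instance.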

Unfortunately, PowerShell can obscure what's happening by cloning objects behind the scenes with certain operations:

  • Using += to append to an array ([System.Object[]]):

     $a = 1, 2, 3  # creates an instance of reference type [Object[]]
     $b = $a       # $b and $a now point to the SAME array
     $a += 4       # creates a NEW instance; $a now points to a DIFFERENT array.
    
  • Using += to append to a [System.Collections.ArrayList] instance:

    • In the case of an array ([System.Object[]]), a new instance must be created, because arrays are by definition of fixed size. Unfortunately, PowerShell also quietly converts a [System.Collections.ArrayList] instance to an array when you use +=, and therefore also creates a new object, even though a [System.Collections.ArrayList] can be grown in place with the .Add() method.

      $al = [Collections.ArrayList] @(1, 2, 3)  # creates an ArrayList
      $b = $al       # $b and $al now point to the SAME ArrayList
      $al += 4       # !! creates a NEW object of type [Object[]]
      # By contrast, this would NOT happen with: $al.Add(4)
      
  • Destructuring an array:

     $a = 1, 2, 3     # creates an instance of reference type [Object[]]
     $first, $a = $a  # creates a NEW instance
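
The += cases above can be observed with a short sketch like the following. The extra variables ($b, $c, $d, $list, $arrayStillShared) are illustrative names that keep a second reference to each original object so the two can be compared afterwards:

```powershell
# Case 1: += on an array creates a new array; $b keeps pointing to the old one.
$a = 1, 2, 3
$b = $a
$a += 4
$arrayStillShared = [object]::ReferenceEquals($a, $b)
$arrayStillShared     # False
$b.Count              # 3

# Case 2: += on an ArrayList yields a new [Object[]]; $c still holds the ArrayList.
$al = [System.Collections.ArrayList] @(1, 2, 3)
$c = $al
$al += 4
$al.GetType().Name    # Object[]
$c.GetType().Name     # ArrayList
$c.Count              # 3

# Case 3: .Add() grows the existing ArrayList in place, so $d sees the change.
$list = [System.Collections.ArrayList] @(1, 2, 3)
$d = $list
$null = $list.Add(4)  # Add() returns the new element's index; discard it
$d.Count              # 4
```

Only the .Add() call modifies the object that other variables (or a calling scope) still reference.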
    

[1] mjolinor's misconception concerns the inheriting / shadowing of variables from the parent (ancestral) scope: a parameter declaration is implicitly a local variable declaration. That is, on entering testlocal(), $local is already a local variable containing whatever was passed as the argument; it never sees an ancestral variable of the same name. The following snippet demonstrates this:

    function foo([string] $local) { "`$local inside foo: $local" }
    $local = 'hi'
    "`$local in calling scope: $local"
    foo
    foo 'bar'

foo() never sees the calling scope's definition of $local.

[2] Note that some .NET languages (e.g., C#, with ref) and even PowerShell itself (with [ref]) also allow passing a variable by reference, so that the local parameter is effectively just an alias for the calling scope's variable, but this feature is unrelated to the value/reference-type dichotomy.
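
As a rough illustration of that difference, the sketch below uses a hypothetical function, Reset-List, that takes a [ref] parameter and assigns a whole new list to the caller's variable. The names Reset-List and $before are assumptions made for this example:

```powershell
# Hypothetical function: replaces the contents of the caller's VARIABLE,
# rather than modifying the object the variable points to.
function Reset-List {
    param ([ref]$list)
    $list.Value = [System.Collections.ArrayList] @('x')
}

$names  = [System.Collections.ArrayList] @('a', 'b', 'c')
$before = $names                  # second reference to the original list

Reset-List -list ([ref]$names)

$names.Count    # 1: the caller's variable now holds the new list
$before.Count   # 3: the original ArrayList object itself was not modified
```

Without [ref], assigning a new value to the parameter inside the function would only change the function's local variable.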

mklement0 answered Oct 01 '22 15:10