So, I'm trying to create a tree-type variable that I could use for data navigation. I've run into an issue while trying to use reference variables on hash tables in PowerShell. Consider the following code:
$Tree = @{ TextValue = "main"; Children = @() }
$Item = @{ TextValue = "sub"; Children = @() }
$Pointer = [ref] $Tree.Children
$Pointer.Value += $Item
$Tree
When I check the reference variable $Pointer, it shows the appropriate values, but the main variable $Tree is not affected. Is there no way to create a reference to a hash table element in PowerShell, so that I'll have to switch to a 2-dimensional array?
Edit with more info:
I've accepted Mathias' answer, as using List looks like exactly what I need, but I'd like a little more clarity on how arrays and references interact. Try this code:
$Tree1 = @()
$Pointer = $Tree1
$Pointer += 1
Write-Host "tree1 is " $Tree1
$Tree2 = @()
$Pointer = [ref] $Tree2
$Pointer.Value += 1
Write-Host "tree2 is " $Tree2
As you can see from the output, it is possible to get a reference to an array and then modify the size of the array via that reference. I thought it would also work if an array is an element of another array or a hash table, but it does not. PowerShell seems to handle those differently.
I suspect this is an unfortunate side effect of the way += works on arrays.
When you use += on a fixed-size array, PowerShell replaces the original array with a new (and bigger) array. We can verify with GetHashCode() that $Pointer.Value no longer references the same array:
PS C:\> $Tree = @{ Children = @() }
PS C:\> $Pointer = [ref]$Tree.Children
PS C:\> $Tree.Children.GetHashCode() -eq $Pointer.Value.GetHashCode()
True
PS C:\> $Pointer.Value += "Anything"
PS C:\> $Tree.Children.GetHashCode() -eq $Pointer.Value.GetHashCode()
False
One way of going about this is to avoid using @() and +=.
You could use a List type instead:
$Tree = @{ TextValue = "main"; Children = New-Object System.Collections.Generic.List[psobject] }
$Item = @{ TextValue = "sub"; Children = New-Object System.Collections.Generic.List[psobject] }
$Pointer = [ref] $Tree.Children
$Pointer.Value.Add($Item)
$Tree
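Since the goal in the question is navigating a tree, here is a minimal sketch of how the List-based approach might extend to more than one level. New-TreeNode is a hypothetical helper introduced only for this illustration, and the ::new() syntax assumes PowerShell 5 or later (New-Object works the same way on older versions):

```powershell
# Hypothetical helper (not from the original posts): builds a node whose
# Children entry is a resizable generic list instead of a fixed-size array.
function New-TreeNode {
    param([string] $TextValue)
    @{
        TextValue = $TextValue
        Children  = [System.Collections.Generic.List[psobject]]::new()
    }
}

$Tree = New-TreeNode 'main'
$Sub  = New-TreeNode 'sub'

# $Pointer refers to the same list object as $Tree.Children.
$Pointer = $Tree.Children
$Pointer.Add($Sub)

# Descend one level: re-point at the child's list and add a grandchild.
$Pointer = $Tree.Children[0].Children
$Pointer.Add((New-TreeNode 'leaf'))

$Tree.Children.Count                       # 1
$Tree.Children[0].Children[0].TextValue    # leaf
```

Because .Add() grows the list in place, every variable that refers to a given Children list keeps seeing the same data while you move $Pointer around the tree.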
To complement Mathias R. Jessen's helpful answer:
Indeed, any array is of fixed size and cannot be extended in place (@() creates an empty [object[]] array).
+= in PowerShell quietly creates a new array, with a copy of all the original elements plus the new one(s), and assigns that to the LHS.
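A small sketch of that behavior, using plain variables only (the names $Original and $Copy are made up for the illustration):

```powershell
$Original = @(1, 2)
$Copy     = $Original      # copies the reference; both now point at one array

$Original.IsFixedSize      # True: arrays cannot grow in place

$Copy += 3                 # builds a new 3-element array and assigns it to $Copy

$Original.Count                                 # 2
$Copy.Count                                     # 3
[object]::ReferenceEquals($Original, $Copy)     # False
```

After the +=, only $Copy refers to the new array; $Original still refers to the old two-element one.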
Your use of [ref] is pointless, because $Pointer = $Tree.Children alone is sufficient to copy the reference to the array stored in $Tree.Children.
See the bottom section for a discussion of appropriate uses of [ref].
Thus, both $Tree.Children and $Pointer would then contain a reference to the same array, just as $Pointer.Value does in your [ref]-based approach.
Because += creates a new array, however, whatever is on the LHS - be it $Pointer.Value or, without [ref], just $Pointer - simply receives a new reference to the new array, whereas $Tree.Children still points to the old one.
You can verify this by using the direct way to determine whether two variables or expressions "point" to the same instance of a reference type (which all collections are):
PS> [object]::ReferenceEquals($Pointer.Value, $Tree.Children)
False
Note that [object]::ReferenceEquals() is only applicable to reference types, not value types - variables containing the latter store values directly instead of referencing data stored elsewhere.
Mathias' approach solves your problem by using a [List`1] instance instead of an array, which can be extended in place with its .Add() method, so that the reference stored in $Pointer[.Value] never needs to change and continues to refer to the same list as $Tree.Children.
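To make that concrete, here is a short sketch without [ref] at all, assuming PowerShell 5 or later for the ::new() syntax:

```powershell
$Tree    = @{ Children = [System.Collections.Generic.List[psobject]]::new() }
$Pointer = $Tree.Children      # plain assignment copies the reference to the list

$Pointer.Add('Anything')       # grows the list in place; no new object is created

[object]::ReferenceEquals($Pointer, $Tree.Children)   # True
$Tree.Children.Count                                   # 1
```

Contrast this with the array case, where the same check returns False after +=.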
Regarding your follow-up question: appropriate uses of [ref]:
$Tree2 = @()
$Pointer = [ref] $Tree2
In this case, because [ref] is applied to a variable - as designed - it creates an effective variable alias: $Pointer.Value keeps pointing to whatever $Tree2 contains even if different data is assigned to $Tree2 later (irrespective of whether that data is a value-type or reference-type instance):
PS> $Tree2 = 'Now I am a string.'; $Pointer.Value
Now I am a string.
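The alias works in the other direction too, which is what the follow-up code in the question relies on. A minimal sketch:

```powershell
$Tree2   = @()
$Pointer = [ref] $Tree2        # [ref] applied to a variable, as designed

# Assigning through the reference replaces the content of $Tree2 itself.
$Pointer.Value = 'Set through the reference.'

$Tree2                         # Set through the reference.
```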
Also note that the typical [ref] use case is to pass variables to .NET API methods that have ref or out parameters; while you can also use it with PowerShell scripts and functions in order to pass by-reference parameters, as shown in the following example, this is best avoided:
# Works, but best avoided in PowerShell code.
PS> function foo { param([ref] $vRef) ++$vRef.Value }; $v=1; foo ([ref] $v); $v
2 # value of $v was incremented via $vRef.Value
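For comparison, here is a sketch of the typical case, calling a .NET method with an out parameter. [int]::TryParse() is a standard .NET method; the variable names are just for illustration:

```powershell
$Parsed = 0    # the variable must exist before it can be passed with [ref]

# TryParse returns a Boolean and writes the parsed number to its out parameter.
$Ok = [int]::TryParse('42', [ref] $Parsed)

$Ok        # True
$Parsed    # 42
```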
By contrast, you cannot use [ref] to create such a persistent indirect reference to data, such as the property of an object contained in a variable, and use of [ref] is essentially pointless there:
$Tree2 = @{ prop = 'initial val' }
$Pointer = [ref] $Tree2.prop # [ref] is pointless here
Later changing $Tree2.prop is not reflected in $Pointer.Value, because $Pointer.Value statically refers to the data originally stored in $Tree2.prop:
PS> $Tree2.prop = 'later val'; $Pointer.Value
initial val # $Pointer.Value still points to the *original* data
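If you do need to keep seeing later changes to the entry, one option is to hold a reference to the containing hashtable (a reference type) and read the entry through it. A sketch, with $Node as a made-up name:

```powershell
$Tree2 = @{ prop = 'initial val' }
$Node  = $Tree2                # both variables now refer to the same hashtable

$Tree2.prop = 'later val'      # modifies the shared hashtable in place

$Node.prop                     # later val
```

This only holds as long as the hashtable is modified in place; assigning a new hashtable to $Tree2 would leave $Node pointing at the old one.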
PowerShell should arguably prevent use of [ref] with anything that is not a variable. However, there is a legitimate - albeit exotic - "off-label" use for [ref], for facilitating updating values in the caller's scope from descendant scopes, as shown in the conceptual about_Ref help topic.
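A rough sketch of that "off-label" use, loosely modeled on the idea in about_Ref rather than copied from it (Step-Both and the variable names are made up for the illustration):

```powershell
$Plain   = 0
$Counter = [ref] 0     # [ref] applied to a value: a standalone holder object

function Step-Both {
    $Plain++             # creates a function-local $Plain; the caller's stays 0
    $Counter.Value++     # mutates the holder object that the caller also sees
}

Step-Both
Step-Both

$Plain            # 0
$Counter.Value    # 2
```

The function can read $Counter from the parent scope, and since it mutates the holder object rather than assigning to a variable, the change is visible to the caller.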