
Array Types In PowerShell - System.Object[] vs. arrays with specific types

Why does calling GetType().Name on an array of strings return Object[] and not String[]? This seems to happen with any element type; for example, Import-Csv gives you an Object[] even though each element is a PSCustomObject.

Here's an example with an array of String:

$x = @('a','b','c')

$x[0].GetType().Name #String
$x.GetType().Name #Object[]
David Klempfner asked Feb 20 '17 22:02


1 Answer

Tip of the hat to PetSerAl for all his help.

To complement Miroslav Adamec's helpful answer with why PowerShell creates System.Object[] arrays by default and additional background information:

PowerShell's default arrays are meant to be flexible:

  • they allow you to store objects of any type (including $null),
  • even allowing you to mix objects of different types in a single array.

To enable this, the array must be (implicitly) typed as [object[]] ([System.Object[]]), because System.Object is the single root of the entire .NET type hierarchy from which all other types derive.

For instance, the following creates an [object[]] array whose elements are of type [string], [int], [datetime], and $null, respectively.

$arr = 'hi', 42, (Get-Date), $null  # @(...) is not needed; `, <val>` for a 1-elem. arr.

When you:

  • create an array by using the array construction operator, ,

  • force command output into an array by using the array subexpression operator, @(...)

  • save to a variable the output from a command that emits a collection of objects with 2 or more elements, irrespective of the specific type of the original collection, or operate on it in the context of another command by enclosing it in (...)

you always get a System.Object[] array - even if all the elements happen to have the same type, as in your example.
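
The three cases listed above can be checked with a short sketch. The variable names are illustrative only, and Get-Date and Sort-Object merely stand in for any command:

```powershell
# 1. Array construction operator ","
$viaComma = 'a', 'b', 'c'

# 2. Array subexpression operator @(...) around a command that emits one object
$viaSubexpr = @(Get-Date)

# 3. Output of a command that emits 2 or more objects, saved to a variable
$viaCommand = 'b', 'a' | Sort-Object

$viaComma.GetType().Name      # -> 'Object[]'
$viaSubexpr.GetType().Name    # -> 'Object[]'
$viaCommand.GetType().Name    # -> 'Object[]'

# The elements keep their own types:
$viaComma[0].GetType().Name   # -> 'String'
$viaSubexpr[0].GetType().Name # -> 'DateTime'
```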


Optional Further Reading

PowerShell's default arrays are convenient, but have drawbacks:

  • They provide no type safety: if you want to ensure that all elements are of a specific type (or should be converted to it, if possible), a default array won't do; e.g.:

      $intArray = 1, 2      # An array of [int] values.
      $intArray[0] = 'one'  # !! Works, because a [System.Object[]] array can hold any type.
    
  • [System.Object[]] arrays are inefficient for value types such as [int], because boxing and unboxing must be performed - though that may often not matter in the real world.

Since PowerShell provides access to the .NET type system, you can avoid the drawbacks if you create an array that is restricted to the specific type of interest, using a cast or type-constrained variable:

[int[]] $intArray = 1, 2  # A type-constrained [int[]] variable.
$intArray[0] = 'one'      # BREAKS: 'one' can't be converted to an [int]

Note that using a cast to create the array - $intArray = [int[]] (1, 2) - would have worked too, but only the type-constrained variable ensures that you cannot later assign a value of a different type to the variable (e.g., $intArray = 'one', 'two' would fail).
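
To make the difference between a cast and a type-constrained variable concrete, here is a minimal sketch. The variable names ($castOnly, $constrained, $assignmentFailed) are made up for illustration:

```powershell
# Cast only: the array is an [int[]], but the variable itself is unconstrained.
$castOnly = [int[]] (1, 2)
$castOnly.GetType().Name      # -> 'Int32[]'
$castOnly = 'one', 'two'      # Allowed: the variable now holds a new [object[]] array.
$castOnly.GetType().Name      # -> 'Object[]'

# Type-constrained variable: every later assignment is converted to [int[]].
[int[]] $constrained = 1, 2
$constrained = '3', '4'       # The strings are converted to [int] values.
$constrained.GetType().Name   # -> 'Int32[]'

# An assignment that cannot be converted fails with a catchable error.
$assignmentFailed = $false
try {
    $constrained = 'one', 'two'
} catch {
    $assignmentFailed = $true
}
$assignmentFailed             # -> True
```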

Syntax pitfall with casts: [int[]] 1, 2 does not work as intended, because casts have high operator precedence, so the expression is evaluated as ([int[]] 1), 2, which creates a regular [object[]] array whose first element is a nested [int[]] array with single element 1.
When in doubt, use @(...) around your array elements[1], which is also required if you want to ensure that an expression that may return only a single item is always treated as an array.
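
The precedence pitfall is easy to verify. The following sketch uses illustrative variable names, and Where-Object only stands in for any command that may return a single item:

```powershell
$wrong = [int[]] 1, 2        # Parsed as ([int[]] 1), 2
$wrong.GetType().Name        # -> 'Object[]'
$wrong[0].GetType().Name     # -> 'Int32[]' (nested single-element array)
$wrong[1].GetType().Name     # -> 'Int32'

$right = [int[]] (1, 2)      # Parentheses: the cast applies to the whole array.
$right.GetType().Name        # -> 'Int32[]'

$alsoRight = [int[]] @(1, 2) # @(...) works the same way.
$alsoRight.GetType().Name    # -> 'Int32[]'

# @(...) also guarantees an array when only one item comes back.
$single = @(1 | Where-Object { $_ -eq 1 })
$single.Count                # -> 1
$single.GetType().Name       # -> 'Object[]'
```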


Pitfalls

PowerShell performs many type conversions behind the scenes, which are typically very helpful, but there are pitfalls:

  • PowerShell automatically tries to coerce a value to a target type, which you don't always want and may not notice:

      [string[]] $a = 'one', 'two'
      $a[0] = 1    # [int] 1 is quietly coerced to [string]
    
      # The coercion happens even if you use a cast:
      [string[]] $a = 'one', 'two'
      $a[0] = [int] 1    # Quiet coercion to [string] still happens.
    

    Note: It may or may not surprise you that even an explicitly cast value - [int] 1 - is quietly coerced. I was surprised because I had incorrectly assumed that in an auto-coercing language such as PowerShell a cast might be a way to bypass the coercion - which is not true.[2]

    Given that any type can be converted to a string, a [string[]] array is the trickiest case.
    You do get an error if (automatic) coercion cannot be performed, such as with
    [int[]] $arr = 1, 2; $arr[0] = 'one' # error

  • "Adding to" a specifically-typed array creates a new array of type [object[]]:

    PowerShell conveniently allows you to "add to" arrays with the + operator.
    In reality, a new array is created behind the scenes with the additional element(s) appended, but that new array is by default again of type [object[]], irrespective of the type of the input array:

      $intArray = [int[]] (1, 2)
      ($intArray + 4).GetType().Name # !! -> 'Object[]'
      $intArray += 3 # !! $intArray is now of type [object[]]
    
      # To avoid the problem...
      # ... use casting:
      ([int[]] ($intArray + 4)).GetType().Name # -> 'Int32[]'
      # ... or use a type-constrained variable:
      [int[]] $intArray = (1, 2) # a type-constrained variable
      $intArray += 3 # still of type [int[]], due to type constraint.
    
  • Outputting to the success stream converts any collection to [object[]]:

    Any collection with at least 2 elements that a command or pipeline outputs (to the success stream) is automatically converted to an array of type [object[]], which may be unexpected:

      # A specifically-typed array:
      # Note that whether or not `return` is used makes no difference.
      function foo { return [int[]] (1, 2) }
      # Important: foo inside (...) is a *command*, not an *expression*
      # and therefore a *pipeline* (of length 1)
      (foo).GetType().Name # !! -> 'Object[]'
    
      # A different collection type:
      function foo { return [System.Collections.ArrayList] (1, 2) }
      (foo).GetType().Name # !! -> 'Object[]'
    
      # Ditto with a multi-segment pipeline:
      ([System.Collections.ArrayList] (1, 2) | Write-Output).GetType().Name # !! -> 'Object[]'
    

    The reason for this behavior is that PowerShell is fundamentally collection-based: any command's output is sent item by item through the pipeline; note that even a single command is a pipeline (of length 1).

    That is, PowerShell always first unwraps collections, and then, if needed, reassembles them - for assignment to a variable, or as the intermediate result of a command nested inside (...) - and the reassembled collection is always of type [object[]].

    PowerShell considers an object a collection if its type implements the IEnumerable interface, except if it also implements the IDictionary interface.
    This exception means that PowerShell's hashtables ([hashtable]) and ordered hashtables (the PSv3+ literal variant with ordered keys, [ordered] @{...}, which is of type [System.Collections.Specialized.OrderedDictionary]) are sent through the pipeline as a whole, and to instead enumerate their entries (key-value pairs) individually, you must call their .GetEnumerator() method.

  • PowerShell by design always unwraps a single-element output collection to that single element:

    In other words: when a single-element collection is output, PowerShell doesn't return an array, but the array's single element itself.

      # The examples use single-element array ,1 
      # constructed with the unary form of array-construction operator ","
      # (Alternatively, @( 1 ) could be used in this case.)
    
      # Function call:
      function foo { ,1 }
      (foo).GetType().Name # -> 'Int32'; single-element array was *unwrapped*
    
      # Pipeline:
      ( ,1 | Write-Output ).GetType().Name # -> 'Int32'
    
      # To force an expression into an array, use @(...):
      @( (,1) | Write-Output ).GetType().Name # -> 'Object[]' - result is array
    

    Loosely speaking, the purpose of array subexpression operator @(...) is: Always treat the enclosed value as a collection, even if it contains (or would normally unwrap to) only a single item:
    If it is a single value, wrap it in an [object[]] array with 1 element.
    Values that already are collections remain collections, though they are converted to a new [object[]] array, even if the value itself already is an array:
    $a1 = 1, 2; $a2 = @( $a1 ); [object]::ReferenceEquals($a1, $a2)
    outputs $false, proving that arrays $a1 and $a2 are not the same.

    Contrast this with:

    • just (...), which does not per se change the value's type - its purpose is merely to clarify precedence or to force a new parsing context:

      • If the enclosed construct is an expression (something parsed in expression mode), the type is not changed; e.g., ([System.Collections.ArrayList] (1, 2)) -is [System.Collections.ArrayList] and ([int[]] (1,2)) -is [int[]] both return $true - the type is retained.

      • If the enclosed construct is a command (single- or multi-segment pipeline), then the default unwrapping behavior applies; e.g.:
        (& { , 1 }) -is [int] returns $true (the single-element array was unwrapped), and (& { [int[]] (1, 2) }) -is [object[]] returns $true as well (the [int[]] array was reassembled into an [object[]] array), because the use of call operator & made the enclosed construct a command.

    • (regular) subexpression operator $(...), typically used in expandable strings, which exhibits the default unwrapping behavior: $(,1) -is [int] and $([System.Collections.ArrayList] (1, 2)) -is [object[]] both return $true.

  • Returning a collection as a whole from a function or script:

    On occasion you may want to output a collection as a whole, i.e., to output it as a single item, retaining its original type.

    As we've seen above, outputting a collection as-is causes PowerShell to unwrap it and ultimately reassemble it into a regular [object[]] array.

    To prevent that, the unary form of array construction operator , can be used to wrap the collection in an outer array, which PowerShell then unwraps to the original collection:

      # Wrap array list in regular array with leading ","
      function foo { , [System.Collections.ArrayList] (1, 2) }
      # The call to foo unwraps the outer array and assigns the original
      # array list to $arrayList.
      $arrayList = foo
      # Test
      $arrayList.GetType().Name # -> 'ArrayList'
    

    In PSv4+, use Write-Output -NoEnumerate:

      function foo { write-output -NoEnumerate ([System.Collections.ArrayList] (1, 2)) }
      $arrayList = foo
      $arrayList.GetType().Name # -> 'ArrayList'
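
The IDictionary exception mentioned in the pitfalls above (hashtables travel through the pipeline as a whole) can be illustrated with a small sketch. The variable names are illustrative, and ForEach-Object is used only to record what arrives on the pipeline:

```powershell
$table = @{ a = 1; b = 2 }

# A hashtable is sent through the pipeline as a single object:
$whole = @($table | ForEach-Object { $_.GetType().Name })
$whole.Count     # -> 1
$whole[0]        # -> 'Hashtable'

# Its entries are enumerated only on request, via .GetEnumerator():
$entries = @($table.GetEnumerator() | ForEach-Object { $_.GetType().Name })
$entries.Count   # -> 2
$entries[0]      # -> 'DictionaryEntry'

# By contrast, an array list is enumerated automatically:
$list = [System.Collections.ArrayList] (1, 2)
$items = @($list | ForEach-Object { $_.GetType().Name })
$items.Count     # -> 2
$items[0]        # -> 'Int32'
```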
    

[1] Note that using @(...) to create array literals isn't necessary, because the array-construction operator , alone creates arrays.
On versions prior to PSv5.1, you also pay a (in most cases probably negligible) performance penalty, because the ,-constructed array inside @() is effectively cloned by @() - see this answer of mine for details.
That said, @(...) has advantages:

  • You can use the same syntax, whether your array literal contains a single element (@( 1 )) or multiple elements (@( 1, 2 )). Contrast this with just using ,: 1, 2 vs. , 1.
  • You needn't ,-separate the lines of a multiline @(...) statement (though note that each line then technically becomes its own statement).
  • There are no operator-precedence pitfalls, because $(...) and @(...) have the highest precedence.
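
A brief sketch of the first two advantages, with illustrative variable names:

```powershell
# Same syntax for one element or several:
$one  = @( 1 )
$many = @( 1, 2 )
$one.Count              # -> 1
$many.Count             # -> 2

# Multiline form: one element per line, no separating commas needed.
$lines = @(
    'first'
    'second'
    'third'
)
$lines.Count            # -> 3
$lines.GetType().Name   # -> 'Object[]'

# Without @(...), a one-element array needs the unary form of ",":
$unary = , 1
$unary.Count            # -> 1
$unary.GetType().Name   # -> 'Object[]'
```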

[2] PetSerAl provides this advanced code snippet to show the limited scenarios in which PowerShell does respect casts, namely in the context of overload resolution for .NET method calls:

# Define a simple type that implements an interface
# and a method that has 2 overloads.
Add-Type '
  public interface I { string M(); }
  public class C : I {
           string I.M()       { return "I.M()"; } // interface implementation
    public string M(int i)    { return "C.M(int)"; } 
    public string M(object o) { return "C.M(object)"; } 
  }
'
# Instantiate the type and use casts to distinguish between
# the type and its interface, and to target a specific overload.
$C = New-Object C
$C.M(1)           # default: argument type selects overload -> 'C.M(int)' 
([I]$C).M()       # interface cast is respected -> 'I.M()'
$C.M([object]1)   # argument cast is respected -> 'C.M(object)'
mklement0 answered Sep 21 '22 05:09