
Unquoted tokens in argument mode involving variable references and subexpressions: why are they sometimes split into multiple arguments?

Note: A summary of this question has since been posted at the PowerShell GitHub repository, since superseded by this more comprehensive issue.

Arguments passed to a command in PowerShell are parsed in argument mode (as opposed to expression mode - see Get-Help about_Parsing).

Conveniently, (double-)quoting arguments that do not contain whitespace or metacharacters is usually optional, even when these arguments involve variable references (e.g., $HOME\sub) or subexpressions (e.g., version=$($PsVersionTable.PsVersion)).

For the most part, such unquoted arguments are treated as if they were double-quoted strings, and the usual string-interpolation rules apply (except that metacharacters such as , need escaping).

I've tried to summarize the parsing rules for unquoted tokens in argument mode in this answer, but there are curious edge cases:

Specifically (as of Windows PowerShell v5.1), why is the unquoted argument token in each of the following commands NOT recognized as a single, expandable string, resulting instead in 2 arguments being passed (with the variable reference / subexpression retaining its type)?

  • $(...) at the start of a token:

    Write-Output $(Get-Date)/today # -> 2 arguments: [datetime] obj. and string '/today'
    
    • Note that the following work as expected:

      • Write-Output $HOME/sub - simple var. reference at the start
      • Write-Output today/$(Get-Date) - subexpression not at the start
  • .$ at the start of a token:

    Write-Output .$HOME  # -> 2 arguments: string '.' and value of $HOME
    
    • Note that the following work as expected:

      • Write-Output /$HOME - different initial char. preceding $
      • Write-Output .-$HOME - initial . not directly followed by $
      • Write-Output a.$HOME - . is not the initial char.

As an aside: As of PowerShell Core v6.0.0-alpha.15, a = following a simple var. reference at the start of a token also seems to break the token into 2 arguments, which does not happen in Windows PowerShell v5.1; e.g., Write-Output $HOME=dir.

Note:

  • I'm primarily looking for a design rationale for the described behavior, or, as the case may be, confirmation that it is a bug. If it's not a bug, I want something to help me conceptualize the behavior, so I can remember it and avoid its pitfalls.

  • All these edge cases can be avoided with explicit double-quoting, which, given the non-obvious behavior above, may be the safest choice to use routinely.
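
A minimal sketch of how the difference can be observed, assuming a hypothetical helper function, Get-ArgInfo (not a built-in command), that reports the type name and value of each argument it receives. The unquoted calls reproduce the edge cases described above; the double-quoted calls each pass a single string.

```powershell
# Hypothetical helper (not part of PowerShell): emits one line per argument
# received, showing the argument's type name and its value.
function Get-ArgInfo {
    foreach ($a in $args) { '[{0}] {1}' -f $a.GetType().Name, $a }
}

# Unquoted: per the question, Windows PowerShell v5.1 passes 2 arguments each.
Get-ArgInfo $(Get-Date)/today
Get-ArgInfo .$HOME

# Double-quoted: each call passes exactly one [string] argument.
Get-ArgInfo "$(Get-Date)/today"
Get-ArgInfo ".$HOME"
```

Counting the output lines of each call shows how many arguments the token was split into.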


Optional reading: The state of the documentation and design musings

As of this writing, the v5.1 Get-Help about_Parsing page:

  • incompletely describes the rules

  • uses terms that are neither defined in the topic nor generally in common use in the world of PowerShell ("expandable string", "value expression" - though one can guess their meaning)

From the linked page (emphasis added):

In argument mode, each value is treated as an expandable string unless it begins with one of the following special characters: dollar sign ($), at sign (@), single quotation mark ('), double quotation mark ("), or an opening parenthesis (().

If preceded by one of these characters, the value is treated as a value expression.

As an aside: A token that starts with " is, of course, by definition, also an expandable string (interpolating string).
Curiously, the conceptual help topic about quoting, Get-Help about_Quoting_Rules, manages to avoid both the terms "expand" and "interpolate".

Note how the passage does not state what happens when (non-meta)characters directly follow a token that starts with these special characters, notably $.

However, the page contains an example that shows that a token that starts with a variable reference is interpreted as an expandable string too:

  • With $a containing 4, Write-Output $a/H evaluates to (single string argument) 4/H.

Note that the passage does imply that variable references / subexpressions in the interior of an unquoted token (that doesn't start with a special char.) are expanded as if inside a double-quoted string ("treated as an expandable string").

If these work:

$a = 4
Write-Output $a/H         # -> '4/H'
Write-Output H/$a         # -> 'H/4'
Write-Output H/$(2 + 2)   # -> 'H/4'

why shouldn't Write-Output $(2 + 2)/H expand to '4/H' too (instead of being treated as 2 arguments)?
Why is a subexpression at the start treated differently than a variable reference?

Such subtle distinctions are hard to remember, especially in the absence of a justification.

A rule that would make more sense to me is to unconditionally treat a token that starts with $ and has additional characters following the variable reference / subexpression as an expandable string as well.
(By contrast, it makes sense for a standalone variable reference / subexpression to retain its type, as it does now.)


Note that the case of a token that starts with .$ getting split into 2 arguments is not covered in the help topic at all.


Even more optional reading: following a token that starts with one of the other special characters with additional characters.

Among the other special token-starting characters, the following unconditionally treat any characters that follow the end of the construct as a separate argument (which makes sense):
( ' "

Write-Output (2 + 2)/H   # -> 2 arguments: 4 and '/H'
Write-Output "2 + $a"/H  # -> 2 arguments: '2 + 4' and '/H', assuming $a equals 4
Write-Output '2 + 2'/H   # -> 2 arguments: '2 + 2' and '/H'

As an aside: This shows that bash-style string concatenation - placing any mix of quoted and unquoted tokens right next to each other - is not generally supported in PowerShell; it only works if the 1st substring / variable reference happens to be unquoted. E.g., Write-Output H/'2 + 2', unlike the substrings-reversed example above, produces only a single argument.
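
A minimal sketch of two ways to get a single argument regardless of which part comes first, using only built-in syntax: enclose the whole value in one double-quoted string, or build it with an expression in parentheses. The variable $a is assumed to equal 4, as in the examples above.

```powershell
$a = 4

# One double-quoted string: always passed as a single argument.
Write-Output "2 + $a/H"          # -> '2 + 4/H'

# Expression in parentheses: the + operator concatenates the two strings
# before the result is passed as a single argument.
Write-Output ('2 + 2' + '/H')    # -> '2 + 2/H'
```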

The exception is @: while @ does have special meaning (see Get-Help about_Splatting) when followed by just a syntactically valid variable name (e.g., @parms), anything else causes the token to be treated as an expandable string again:

Write-Output @parms    # splatting (results in no arguments if $parms is undefined)

Write-Output @parms$a  # *expandable string*: '@parms4', if $a equals 4
mklement0 asked Feb 07 '17 21:02


1 Answer

I think what you're hitting here is more the type "hinting" than anything else.

You're using Write-Output, which specifies in its Synopsis that it

Sends the specified objects to the next command in the pipeline.

This command is designed to take in an array. When it hits the first item as a string like today/ it treats it like a string. When the first item ends up being the result of a function call, that may or may not be a string, so it starts up an array.

It's telling that if you run the same command with Write-Host (which is designed to take in a string to output), it works as you'd expect it to:

 Write-Host $(Get-Date)/today

Outputs

7/25/2018 1:30:43 PM /today

So I think the edge cases you're running up against are less about the parsing and more about the typing that PowerShell uses (and tries to hide).
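
Whether a command receives one argument or two can also be checked directly. The following is a minimal sketch, assuming a hypothetical function, Measure-Argument (not a built-in), that collects all remaining positional arguments into a single parameter and returns how many it got; as far as I know, this is similar in spirit to how Write-Host declares its -Object parameter. The space before /today in the Write-Host output above is a reason to run such a check rather than rely on the display alone.

```powershell
# Hypothetical function: gathers all remaining positional arguments into
# $Object and returns how many were passed.
function Measure-Argument {
    param(
        [Parameter(ValueFromRemainingArguments)]
        [object[]] $Object
    )
    $Object.Count
}

Measure-Argument $(Get-Date)/today     # unquoted token from the question
Measure-Argument "$(Get-Date)/today"   # -> 1 (explicitly double-quoted)
```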

SamuelWarren answered Nov 20 '22 15:11