I'd like to print a string literal in AWK / gawk using the PowerShell command line (the specific program is unimportant). However, I think I misunderstand the quoting rules somewhere along the line -- PowerShell apparently removes double quotes inside single quotes when passing arguments to native commands, but not when passing them to cmdlets.
This works in Bash:
bash$ awk 'BEGIN {print "hello"}'
hello <-- GOOD
And this works in PowerShell -- but importantly I have no idea why the escaping is needed:
PS> awk 'BEGIN {print \"hello\"}'
hello <-- GOOD
This prints nothing in PowerShell:
PS> awk 'BEGIN {print "hello"}'
<-- NOTHING IS BAD
If this really is the only way of doing this in PowerShell, then I'd like to understand the chain of quoting rules that explains why. According to the PowerShell quoting rules at About Quoting Rules, this shouldn't be necessary.
BEGIN SOLUTION
The punchline, courtesy of Duncan below, is that you should add this function to your PowerShell profile:
filter Run-Native($command) { $_ | & $command ($args -replace'(\\*)"','$1$1\"') }
Or specifically for AWK:
filter awk { $_ | gawk.exe ($args -replace'(\\*)"','$1$1\"') }
END SOLUTION
The quotes are properly passed to PowerShell's echo:
PS> echo '"hello"'
"hello" <-- GOOD
But when calling out to an external "native" program, the quotes disappear:
PS> c:\cygwin\bin\echo.exe '"hello"'
hello <-- BAD, POWERSHELL REMOVED THE QUOTES
Here's an even cleaner example, in case you're concerned that Cygwin might have something to do with this:
echo @"
>>> // program guaranteed not to interfere with command line parsing
>>> public class Program
>>> {
>>> public static void Main(string[] args)
>>> {
>>> System.Console.WriteLine(args[0]);
>>> }
>>> }
>>> "@ > Program.cs
csc.exe Program.cs
.\Program.exe '"hello"'
hello <-- BAD, POWERSHELL REMOVED THE QUOTES
DEPRECATED EXAMPLE for passing to cmd, which does its own parsing (see Etan's comment below):
PS> cmd /c 'echo "hello"'
"hello" <-- GOOD
DEPRECATED EXAMPLE for passing to Bash, which does its own parsing (see Etan's comment below):
PS> bash -c 'echo "hello"'
hello <-- BAD, WHERE DID THE QUOTES GO
Any solutions, more elegant workarounds, or explanations?
The problem here is that the Windows standard C runtime strips unescaped double quotes out of arguments when parsing the command line. PowerShell passes arguments to native commands by putting double quotes around the arguments, but it doesn't escape any double quotes that are contained in the arguments.
Here's a test program that prints out the arguments it was given using the C stdlib, the 'raw' command line from Windows, and the Windows command line processing (which seems to behave identically to the stdlib):
C:\Temp> type t.c
#include <stdio.h>
#include <windows.h>
#include <ShellAPI.h>
int main(int argc,char **argv){
int i;
for(i=0; i < argc; i++) {
printf("Arg[%d]: %s\n", i, argv[i]);
}
LPWSTR *szArglist;
LPWSTR cmdLine = GetCommandLineW();
wprintf(L"Command Line: %s\n", cmdLine);
int nArgs;
szArglist = CommandLineToArgvW(GetCommandLineW(), &nArgs);
if( NULL == szArglist )
{
wprintf(L"CommandLineToArgvW failed\n");
return 0;
}
else for( i=0; i<nArgs; i++) printf("%d: %ws\n", i, szArglist[i]);
// Free memory allocated for CommandLineToArgvW arguments.
LocalFree(szArglist);
return 0;
}
C:\Temp>cl t.c "C:\Program Files (x86)\Windows Kits\8.1\lib\winv6.3\um\x86\shell32.lib"
Microsoft (R) C/C++ Optimizing Compiler Version 18.00.21005.1 for x86
Copyright (C) Microsoft Corporation. All rights reserved.
t.c
Microsoft (R) Incremental Linker Version 12.00.21005.1
Copyright (C) Microsoft Corporation. All rights reserved.
/out:t.exe
t.obj
"C:\Program Files (x86)\Windows Kits\8.1\lib\winv6.3\um\x86\shell32.lib"
Running this in cmd, we can see that all unescaped quotes are stripped, and spaces only separate arguments when there have been an even number of unescaped quotes:
C:\Temp>t "a"b" "\"escaped\""
Arg[0]: t
Arg[1]: ab "escaped"
Command Line: t "a"b" "\"escaped\""
0: t
1: ab "escaped"
C:\Temp>t "a"b c"d e"
Arg[0]: t
Arg[1]: ab
Arg[2]: cd e
Command Line: t "a"b c"d e"
0: t
1: ab
2: cd e
PowerShell behaves a bit differently:
C:\Temp>powershell
Windows PowerShell
Copyright (C) 2012 Microsoft Corporation. All rights reserved.
C:\Temp> .\t 'a"b'
Arg[0]: C:\Temp\t.exe
Arg[1]: ab
Command Line: "C:\Temp\t.exe" a"b
0: C:\Temp\t.exe
1: ab
C:\Temp> $a = "string with `"double quotes`""
C:\Temp> $a
string with "double quotes"
C:\Temp> .\t $a nospaces
Arg[0]: C:\Temp\t.exe
Arg[1]: string with double
Arg[2]: quotes
Arg[3]: nospaces
Command Line: "C:\Temp\t.exe" "string with "double quotes"" nospaces
0: C:\Temp\t.exe
1: string with double
2: quotes
3: nospaces
In PowerShell, any argument that contains spaces is enclosed in double quotes. The command itself also gets quotes, even when it contains no spaces. Other arguments aren't quoted, even if they include punctuation such as double quotes, and (I think this is a bug) PowerShell doesn't escape any double quotes that appear inside the arguments.
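The quoting behaviour described above can be modelled in a few lines. This is only a sketch of the observed behaviour, written in Python so it runs anywhere; `build_command_line` is a hypothetical name, not anything PowerShell exposes, and the rule "quote only when the argument contains a space" is an assumption taken from the transcripts above.

```python
def build_command_line(command, args):
    # Model of the observed behaviour (an assumption, not PowerShell's source).
    parts = ['"' + command + '"']          # the command itself is always quoted
    for arg in args:
        if " " in arg:
            # Wrapped in quotes, but embedded quotes are NOT escaped.
            parts.append('"' + arg + '"')
        else:
            # No spaces: passed through untouched, embedded quotes and all.
            parts.append(arg)
    return " ".join(parts)


print(build_command_line(r"C:\Temp\t.exe", ['a"b']))
print(build_command_line(r"C:\Temp\t.exe",
                         ['string with "double quotes"', "nospaces"]))
```

The two printed lines reproduce the "Command Line:" values shown in the transcripts above, which is what the C runtime then has to parse.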
In case you're wondering (I was), PowerShell doesn't even bother to quote arguments that contain newlines, but neither does the argument processing consider newlines as whitespace:
C:\Temp> $a = @"
>> a
>> b
>> "@
>>
C:\Temp> .\t $a
Arg[0]: C:\Temp\t.exe
Arg[1]: a
b
Command Line: "C:\Temp\t.exe" a
b
0: C:\Temp\t.exe
1: a
b
Since PowerShell doesn't escape the quotes for you, the only option seems to be to do it yourself:
C:\Temp> .\t 'BEGIN {print "hello"}'.replace('"','\"')
Arg[0]: C:\Temp\t.exe
Arg[1]: BEGIN {print "hello"}
Command Line: "C:\Temp\t.exe" "BEGIN {print \"hello\"}"
0: C:\Temp\t.exe
1: BEGIN {print "hello"}
To avoid doing that every time, you can define a simple function:
C:\Temp> function run-native($command) { & $command $args.replace('\','\\').replace('"','\"') }
C:\Temp> run-native .\t 'BEGIN {print "hello"}' 'And "another"'
Arg[0]: C:\Temp\t.exe
Arg[1]: BEGIN {print "hello"}
Arg[2]: And "another"
Command Line: "C:\Temp\t.exe" "BEGIN {print \"hello\"}" "And \"another\""
0: C:\Temp\t.exe
1: BEGIN {print "hello"}
2: And "another"
N.B. You have to escape backslashes as well as double quotes, otherwise an argument that already contains \" doesn't survive (in fact this version still isn't right; see the further edit below):
C:\Temp> run-native .\t 'BEGIN {print "hello"}' 'And \"another\"'
Arg[0]: C:\Temp\t.exe
Arg[1]: BEGIN {print "hello"}
Arg[2]: And \"another\"
Command Line: "C:\Temp\t.exe" "BEGIN {print \"hello\"}" "And \\\"another\\\""
0: C:\Temp\t.exe
1: BEGIN {print "hello"}
2: And \"another\"
Another edit: Backslash and quote handling in the Microsoft universe is even weirder than I realised. Eventually I had to go and read the C stdlib sources to find out how they interpret backslashes and quotes:
/* Rules: 2N backslashes + " ==> N backslashes and begin/end quote
2N+1 backslashes + " ==> N backslashes + literal "
N backslashes ==> N backslashes */
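To make those three rules concrete, here is a minimal Python sketch of a splitter that applies them. It is a simplification, not the real runtime code: `split_command_line` is a hypothetical name, the program name is treated like any other argument, and the special handling of `""` inside a quoted region is left out.

```python
def split_command_line(cmdline):
    args, current = [], []
    in_quotes = False
    have_arg = False          # True once the current argument has started
    i, n = 0, len(cmdline)
    while i < n:
        ch = cmdline[i]
        if ch == "\\":
            # Measure the whole run of backslashes.
            j = i
            while j < n and cmdline[j] == "\\":
                j += 1
            count = j - i
            if j < n and cmdline[j] == '"':
                current.append("\\" * (count // 2))
                if count % 2:
                    current.append('"')        # 2N+1 backslashes: literal quote
                else:
                    in_quotes = not in_quotes  # 2N backslashes: begin/end quote
                i = j + 1
            else:
                current.append("\\" * count)   # not before a quote: kept as-is
                i = j
            have_arg = True
        elif ch == '"':
            in_quotes = not in_quotes          # unescaped quote is stripped
            have_arg = True
            i += 1
        elif ch in " \t" and not in_quotes:
            if have_arg:                       # whitespace ends the argument
                args.append("".join(current))
                current, have_arg = [], False
            i += 1
        else:
            current.append(ch)
            have_arg = True
            i += 1
    if have_arg:
        args.append("".join(current))
    return args


print(split_command_line(r't "a"b" "\"escaped\""'))
print(split_command_line(r'"C:\Temp\t.exe" a"b'))
```

Fed the command lines from the transcripts above, this sketch gives the same argument lists that t.exe printed, for example `ab "escaped"` for the first call and `ab` for the second.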
So that means run-native should be:
function run-native($command) { & $command ($args -replace'(\\*)"','$1$1\"') }
and all backslashes and quotes will survive the command line processing. Or if you want to run a specific command:
filter awk() { $_ | awk.exe ($args -replace'(\\*)"','$1$1\"') }
(Updated following @jhclark's comment: it needs to be a filter to allow piping into stdin.)
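For anyone who wants to check what the `-replace` expression does without a Windows box, here is the same substitution as a small Python sketch. `escape_for_native` is a hypothetical helper name used only for illustration; the regular expression is the one from the PowerShell functions above.

```python
import re


def escape_for_native(arg):
    # Each run of k backslashes followed by a double quote becomes
    # 2k backslashes followed by an escaped quote, so the run and the
    # quote both survive the C runtime's parsing. Backslashes that are
    # not followed by a quote are left alone.
    return re.sub(r'(\\*)"', lambda m: m.group(1) * 2 + '\\"', arg)


print(escape_for_native('BEGIN {print "hello"}'))
print(escape_for_native(r'And \"another\"'))
print(escape_for_native(r'C:\Temp\file.txt'))
```

The first two results match the escaped arguments in the "Command Line:" output earlier (`\"hello\"` and `\\\"another\\\"`), and the path in the third call comes back unchanged because none of its backslashes precede a quote.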