
Quotes in Node.js spawn arguments

I'm using double quotes in Node.js spawn arguments because the paths can contain spaces:

const excludes = ['/foo/bar', '/foo/baz', '/foo/bar baz'];
const tar = spawn('tar', [
  '--create', '--gzip',
  // '--exclude="/foo/bar"', '--exclude="/foo/baz"', '--exclude="/foo/bar baz"'
  ...excludes.map(exclude => `--exclude="${exclude}"`),
  '/foo'
], { stdio: ['ignore', 'pipe', 'inherit'] });

For some reason, tar ignores --exclude arguments that are supplied this way. The result is the same whether spawn is require('child_process').spawn or require('cross-spawn').

--exclude works as expected when there are no double quotes for paths that don't require them.

And the same thing works as expected from shell, even with double quotes:

tar --create --gzip --exclude="/foo/bar" --exclude="/foo/baz" /foo > ./foo.tgz

I'm not sure what's going on here, or how spawn can be debugged to check whether it does some odd escaping of double quotes.

Estus Flask asked Dec 28 '17 22:12


3 Answers

This comes down to which layer interprets the quotes. Inside a single-quoted JavaScript string, the double quotes are ordinary characters, so the spawn call passes them to tar as part of the argument.

The system shell will strip off quotes around parameters, so the program gets the unquoted value at the end. Spawning a process bypasses this step as it bypasses the shell, so the program gets those literal quotes as part of the parameter and doesn't know how to handle them appropriately.
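One way to check what a spawned program actually receives is to spawn a tiny script that prints its own arguments. The following is only a sketch: receivedArgs is a hypothetical helper name, and it assumes the temp directory is writable. It uses Node.js itself as the child process, so tar is not needed.

```javascript
const { spawnSync } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: spawn a tiny Node.js script that reports its own argv,
// so we can see exactly what a child process receives when no shell is involved.
function receivedArgs(args) {
  const script = path.join(os.tmpdir(), `print-argv-${process.pid}.js`);
  fs.writeFileSync(script, 'console.log(JSON.stringify(process.argv.slice(2)));');
  try {
    const { stdout } = spawnSync(process.execPath, [script, ...args], { encoding: 'utf8' });
    return JSON.parse(stdout);
  } finally {
    fs.unlinkSync(script); // clean up the temporary script
  }
}

// The double quotes survive: the child sees them as part of the argument.
const quoted = receivedArgs(['--exclude="/foo/bar baz"']);
// Without inner quotes the child still gets ONE argument, space included.
const plain = receivedArgs(['--exclude=/foo/bar baz']);

console.log(quoted); // [ '--exclude="/foo/bar baz"' ]
console.log(plain);  // [ '--exclude=/foo/bar baz' ]
```

In both cases the child gets a single argument; the only difference is whether the quote characters are included in it.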

There are two real options to resolve this that I'm aware of:

  1. It's counter-intuitive, but switching the quote types around should resolve this issue. Switch your above code to:

    const tar = spawn("tar", [
      "--create", "--gzip",
      "--exclude='/foo/bar'", "--exclude='/foo/baz'", "/foo"
    ], { stdio: ["ignore", "pipe", "inherit"] });
    
  2. Alternatively, you can use { shell: true } and use your current formatting. This will pass the spawn request through the shell, so the parsing step that's currently being skipped will occur. See more about this here.

    const tar = spawn('tar', [
      '--create', '--gzip',
      '--exclude="/foo/bar"', '--exclude="/foo/baz"', '/foo'
    ], { stdio: ['ignore', 'pipe', 'inherit'], shell: true });
    
joshuhn answered Oct 23 '22 09:10


If I understand what you're asking, you just want to keep the default shell behavior of stripping the quotes and passing the argument as a single argument even if it has spaces.

In that case, you can do:

spawn(exe, args, { windowsVerbatimArguments: true });

See docs:

windowsVerbatimArguments <boolean> No quoting or escaping of arguments is done on Windows. Ignored on Unix. This is set to true automatically when shell is specified and is CMD. Default: false.

pushkin answered Oct 23 '22 09:10


You should understand how the shell handles spaces and quotation marks. I say "the shell" - there are different shells and I don't know the differences between them, so it's possible that what I'm about to write won't apply to you. Someone feel free to edit this so it's more precise.

There are all sorts of syntactic complications you can include in a shell command: piped commands, input and output files, interpolated variables, interpolated commands, environment variables, and at least 4 (yes, four) different ways of quoting strings. But for the purposes of this question, let's just say that a shell command is a command name followed by a (possibly empty) list of string arguments. The command name could be a built-in command (cd, export, alias, etc.), or it could be an executable file (ls, sudo, tar, etc.). Or, to put it another way, a shell command is a list of one or more strings (including the first string, which tells the shell what kind of command it is).

Because of the complications mentioned above, several characters are special, which means you might need to escape them using quotation marks. However, quotation marks introduce a lot of redundancy into the language. For example, the following commands are equivalent:

tar --create --exclude=/foo/bar /foo
tar --create --exclude='/foo/bar' /foo
tar --create --exclude="/foo/bar" /foo
tar --create '--exclude=/foo/bar' /foo
tar --create "--exclude=/foo/bar" /foo

In each case, the command is to run the executable tar with the argument list --create, --exclude=/foo/bar, /foo.

Note the behaviour of quotation marks, which differs from all other languages I know of. In most languages, a string literal is completely enclosed by a pair of quotation marks - that's how the compiler/interpreter knows where they start and end. But in shell commands, whitespace is what tells the shell where one argument ends and the next one begins. (Quoted/escaped whitespace doesn't count.) The only purpose of quotation marks is to change the way some characters are treated. Shell commands are very flexible about this, so the following commands are also equivalent to the ones above:

tar -"-"create --exc'lude=/fo'o/bar /foo
tar --cr'eate' --exclude"="/foo"/bar" /foo

And when I say these commands are equivalent, I mean the tar executable cannot know which one has been invoked. That is, it is impossible to write an executable file mycommand such that the commands mycommand foo and mycommand "foo" write different output to STDOUT or STDERR, or return different exit codes, or otherwise behave differently.
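The equivalence can be checked from Node.js by letting a shell parse each variant and printing the resulting arguments. This is a sketch under assumptions: it needs a POSIX sh with printf (so not plain Windows), argsSeenBy is a hypothetical helper name, and printf stands in for tar.

```javascript
const { execFileSync } = require('child_process');

// Assumption: a POSIX `sh` with `printf` is on the PATH.
// `printf '%s\n'` stands in for tar and prints each argument it receives on its own line.
function argsSeenBy(commandTail) {
  const out = execFileSync('sh', ['-c', `printf '%s\\n' ${commandTail}`], { encoding: 'utf8' });
  return out.split('\n').filter(line => line !== '');
}

// Differently quoted spellings of the same argument list.
const variants = [
  '--create --exclude=/foo/bar /foo',
  "--create --exclude='/foo/bar' /foo",
  '--create --exclude="/foo/bar" /foo',
  "--create '--exclude=/foo/bar' /foo",
  '-"-"create --exc\'lude=/fo\'o/bar /foo',
];

const results = variants.map(argsSeenBy);
for (const result of results) {
  console.log(JSON.stringify(result)); // ["--create","--exclude=/foo/bar","/foo"] every time
}
```

The shell removes the quotes before the program starts, so every variant produces the same three arguments.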

However, when running commands from Node.js, you don't need the shell features for piping, streaming to/from files, interpolating variables, etc., because JavaScript can handle all of that if you want. So by default spawn bypasses the shell entirely: it doesn't do anything with shell special characters, and each array element is handed to the program as one argument. So in the following example, one of the arguments will be --exclude=/foo/bar baz, which will cause tar to ignore the file/directory called bar baz in the /foo directory:

const tar = spawn('tar', [
  '--create', '--gzip',
  '--exclude=/foo/bar', '--exclude=/foo/baz', '--exclude=/foo/bar baz',
  '/foo'
], { stdio: ['ignore', 'pipe', 'inherit'] });

(Although obviously, if you're using javascript string literals, you might need to escape some characters at the javascript level.)
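Applied to the excludes array from the question, the same idea might look like the sketch below. tarArgs and createArchive are hypothetical helper names, and createArchive is only defined, not called, because it needs tar on the PATH and a real directory to archive.

```javascript
const { spawn } = require('child_process');
const fs = require('fs');

// Hypothetical helper: build tar's argument list with no shell-style quoting.
// Each array element reaches tar as exactly one argument, spaces and all.
function tarArgs(root, excludes) {
  return ['--create', '--gzip', ...excludes.map(exclude => `--exclude=${exclude}`), root];
}

// Hypothetical helper: spawn tar without a shell and stream the archive to outFile.
// Not called here; it needs tar on the PATH and an existing root directory.
function createArchive(root, excludes, outFile) {
  const tar = spawn('tar', tarArgs(root, excludes), { stdio: ['ignore', 'pipe', 'inherit'] });
  tar.stdout.pipe(fs.createWriteStream(outFile));
  return tar;
}

const args = tarArgs('/foo', ['/foo/bar', '/foo/baz', '/foo/bar baz']);
console.log(args);
```

A call such as createArchive('/foo', excludes, './foo.tgz') would then play the role of the shell redirection in the question's tar command.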

I don't like either of joshuhn's answers. (1) didn't even work for me, and I'm surprised it worked for him - if it did then I view it as a bug in nodejs (or possibly in tar). (I'm running nodejs v6.9.5 in Ubuntu 16.04.3 LTS, with GNU tar v1.28.) As for (2), it means unnecessarily introducing all the complexities of shell string processing into your javascript code. As the documentation says:

Note: If the shell option is enabled, do not pass unsanitised user input to this function. Any input containing shell metacharacters may be used to trigger arbitrary command execution.

I for one don't know all the intricacies of shell escaping, so I would not risk running spawn with the shell option with untrusted input.
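To make the risk concrete, here is a small sketch comparing the two modes with a harmless command. It assumes a POSIX-like system where echo exists both as an executable and in the shell; the untrusted value is made up for illustration.

```javascript
const { spawnSync } = require('child_process');

// Made-up "user input" containing a shell metacharacter.
const untrusted = 'x; echo INJECTED';

// With shell: true the command line is handed to the shell, which treats ';'
// as a command separator and runs the second echo as well.
const viaShell = spawnSync('echo', [untrusted], { shell: true, encoding: 'utf8' });

// Without a shell the whole string is a single argument to echo.
const direct = spawnSync('echo', [untrusted], { encoding: 'utf8' });

console.log(JSON.stringify(viaShell.stdout)); // "x\nINJECTED\n"
console.log(JSON.stringify(direct.stdout));   // "x; echo INJECTED\n"
```

With a real payload in place of echo INJECTED, the shell variant would run whatever the input asked for, which is exactly what the documentation note warns about.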

David Knipe answered Oct 23 '22 10:10