
PathGetArgs/PathRemoveArgs vs. CommandLineToArgvW - is there a difference?

I'm working on some path-parsing C++ code and I've been experimenting with a lot of the Windows APIs for this. Is there a difference between PathGetArgs/PathRemoveArgs and a slightly-massaged CommandLineToArgvW?

In other words, aside from length/cleanness, is this:

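std::wstring StripFileArguments(std::wstring filePath)
{
  WCHAR tempPath[MAX_PATH];

  if (filePath.length() >= MAX_PATH)
    return filePath; //too long for the fixed buffer; return unchanged rather than overflow it

  wcscpy(tempPath, filePath.c_str());
  PathRemoveArgs(tempPath);

  return tempPath;
}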

different from this:

std::wstring StripFileArguments(std::wstring filePath)
{
  LPWSTR* argList;
  int argCount = 0;
  std::wstring tempPath = filePath;

  argList = CommandLineToArgvW(filePath.c_str(), &argCount);

  if (argList == NULL) //parsing failed, so fall back to the input
    return filePath;

  if (argCount > 0)
    tempPath = argList[0]; //ignore any elements after the first because those are args, not the base app

  LocalFree(argList); //always free the array, even when argCount is 0

  return tempPath;
}

and is this

std::wstring GetFileArguments(std::wstring filePath)
{
  //PathGetArgs only reads its input and returns a pointer into it,
  //so no fixed-size buffer (or wcscpy between overlapping regions) is needed
  return PathGetArgs(filePath.c_str());
}

different from

std::wstring GetFileArguments(std::wstring filePath)
{
  LPWSTR* argList;
  int argCount = 0;
  std::wstring tempArgs;

  argList = CommandLineToArgvW(filePath.c_str(), &argCount);

  if (argList == NULL) //parsing failed, so there are no arguments to return
    return tempArgs;

  for (int counter = 1; counter < argCount; counter++) //ignore the first element (counter = 0) because that's the base app, not args
  {
    if (!tempArgs.empty())
      tempArgs += L" "; //wide literal (TEXT(" ") is narrow without UNICODE); no leading space
    tempArgs += argList[counter];
  }

  LocalFree(argList);

  return tempArgs;
}

? It looks to me like PathGetArgs/PathRemoveArgs just provide a cleaner, simpler special-case implementation of the CommandLineToArgvW parsing, but I'd like to know if there are any corner cases in which the APIs will behave differently.

cf stands with Monica, asked Nov 20 '13 18:11


2 Answers

The functions are similar but not identical; the differences mostly concern how quoted strings are handled.

PathGetArgs returns a pointer to the first character following the first space in the input string. If a quote character is encountered before the first space, another quote is required before the function will start looking for spaces again. If no space is found the function returns a pointer to the end of the string.

PathRemoveArgs calls PathGetArgs and then uses the returned pointer to terminate the string. It will also strip a trailing space if the first space encountered happened to be at the end of the line.
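
As a rough illustration of the two descriptions above, here is a portable sketch that mimics that behaviour on a std::wstring. It is not the actual Windows implementation; the names SketchGetArgsOffset and SketchRemoveArgs are hypothetical helpers invented for this example, and the logic is only an assumption based on the description given here.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Windows API): returns the index of the first
// character after the first space that is outside double quotes, or
// s.size() if no such space exists. Mirrors the PathGetArgs description.
std::size_t SketchGetArgsOffset(const std::wstring& s)
{
    bool inQuotes = false;
    for (std::size_t i = 0; i < s.size(); ++i)
    {
        if (s[i] == L'"')
            inQuotes = !inQuotes;      // spaces inside quotes are ignored
        else if (s[i] == L' ' && !inQuotes)
            return i + 1;              // first character after the space
    }
    return s.size();                   // no unquoted space: no arguments
}

// Hypothetical helper (not a Windows API): cuts the string where the
// arguments start and drops the separating or trailing space.
// Mirrors the PathRemoveArgs description.
std::wstring SketchRemoveArgs(const std::wstring& s)
{
    std::wstring path = s.substr(0, SketchGetArgsOffset(s));
    if (!path.empty() && path.back() == L' ')
        path.pop_back();
    return path;
}
```

Under these assumptions, "C:\Program Files\app.exe" -x (with the quotes) is reduced to "C:\Program Files\app.exe" with the quotes still in place, whereas the unquoted C:\Program Files\app.exe -x is cut to C:\Program. Neither result is unquoted, which is one place where the output differs from argv[0] as produced by CommandLineToArgvW.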

CommandLineToArgvW takes the supplied string and splits it into an array. It uses whitespace (spaces or tabs) to delimit each item in the array. The first item in the array can be quoted to allow spaces. The second and subsequent items can also be quoted, but they support slightly more complex processing: arguments can also include embedded quotes by preceding them with a backslash. For example:

 "c:\program files\my app\my app.exe" arg1 "argument 2" "arg \"number\" 3"

This would produce an array with four entries:

  • argv[0] - c:\program files\my app\my app.exe
  • argv[1] - arg1
  • argv[2] - argument 2
  • argv[3] - arg "number" 3

See the CommandLineToArgvW docs for a full description of the parsing rules, including how you can have embedded backslashes as well as quotes in the arguments.
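
To make the quote and backslash handling concrete, here is a portable sketch of the documented rules for the second and subsequent arguments: 2n backslashes before a quote give n backslashes and toggle quoting, 2n+1 backslashes before a quote give n backslashes and a literal quote, and backslashes not followed by a quote are kept as they are. SketchSplitArgs is a hypothetical name, not a Windows API; the sketch deliberately leaves out the separate handling of the first item (the program name) and any behaviour the docs do not describe.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper (not a Windows API): splits the argument part of a
// command line using the documented quote/backslash rules only.
std::vector<std::wstring> SketchSplitArgs(const std::wstring& text)
{
    std::vector<std::wstring> args;
    std::wstring current;
    bool inQuotes = false;
    bool haveArg = false;   // true once the current argument has started

    std::size_t i = 0;
    while (i < text.size())
    {
        const wchar_t c = text[i];
        if (c == L'\\')
        {
            // Count the run of backslashes; its meaning depends on what follows.
            std::size_t run = 0;
            while (i < text.size() && text[i] == L'\\') { ++run; ++i; }
            haveArg = true;
            if (i < text.size() && text[i] == L'"')
            {
                current.append(run / 2, L'\\');
                if (run % 2 == 1) { current += L'"'; ++i; }  // escaped, literal quote
                // even run: the quote is left for the next iteration to toggle quoting
            }
            else
            {
                current.append(run, L'\\');                  // plain backslashes
            }
        }
        else if (c == L'"')
        {
            inQuotes = !inQuotes;   // the quote itself is not copied
            haveArg = true;
            ++i;
        }
        else if ((c == L' ' || c == L'\t') && !inQuotes)
        {
            if (haveArg) { args.push_back(current); current.clear(); haveArg = false; }
            ++i;
        }
        else
        {
            current += c;
            haveArg = true;
            ++i;
        }
    }
    if (haveArg) args.push_back(current);
    return args;
}
```

Fed the argument part of the example above (everything after the quoted program path), this sketch yields the same three values listed as argv[1] through argv[3].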

Jonathan Potter, answered Oct 21 '22 15:10


Yes, I've observed different behaviour with the current SDK (VS2015 Update 3 + Windows 1607 Anniversary SDK with SDK version set to 8.1):

  1. Calling CommandLineToArgvW with an empty lpCmdLine (what you get from wWinMain when no arguments were passed) returns the program path and filename, split at every space. The path was not supplied in the parameter, so the function must have fetched it itself, but it does not allow for spaces within that path:

    lpCmdLine = ""
    argv[0] = C:\Program
    argv[1] = Files\Vendor\MyProgram.exe
    
  2. Calling CommandLineToArgvW with lpCmdLine containing parameters does not include the program path and name, so it works as expected (so long as there are no further spaces in the parameters...):

    lpCmdLine = "One=1 Two=\"2\""
    argv[0] = One=1
    argv[1] = Two=2
    

Note it also strips any other quotes inside the parameters when passed.

  3. CommandLineToArgvW doesn't like the first parameter in the format Text=\"Quoted spaces\", so if you try to pass lpCmdLine to it directly it incorrectly splits the key=value pairs if they have spaces:

    lpCmdLine = "One=\"Number One\" Two=\"Number Two\""
    argv[0] = One=\"Number
    argv[1] = One\"
    argv[2] = Two=\"Number
    argv[3] = Two\"
    

It's kind of documented here:

https://msdn.microsoft.com/en-us/library/windows/desktop/bb776391(v=vs.85).aspx

But this kind of behaviour with spaces in the program path was not expected. It seems like a bug to me. I'd prefer the same data to be processed in both situations, because if I really wanted the path to the executable I'd call GetCommandLineW() instead.

The only sensible consistent solution in my opinion is to totally ignore lpCmdLine and call GetCommandLineW(), pass the results to CommandLineToArgvW() then skip the first parameter if you are not interested in the program path. That way, all combinations are supported, i.e. path with and without spaces, parameters with nested quotes with and without spaces.

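int argumentCount = 0;
LPWSTR commandLine = GetCommandLineW();
LPWSTR *arguments = CommandLineToArgvW(commandLine, &argumentCount); // returns NULL on failure
// arguments[0] is the program path; the actual parameters start at index 1
if (arguments != NULL) LocalFree(arguments); // free the array once you are done with it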
Tony Wall, answered Oct 21 '22 14:10