
POSIX sh equivalent for Bash’s printf %q

Suppose I have a #!/bin/sh script which can take a variety of positional parameters, some of which may include spaces, either/both kinds of quotes, etc. I want to iterate "$@" and for each argument either process it immediately somehow, or save it for later. At the end of the script I want to launch (perhaps exec) another process, passing in some of these parameters with all special characters intact.

If I were doing no processing on the parameters, othercmd "$@" would work fine, but I need to pull out some parameters and process them a bit.

If I could assume Bash, then I could use printf %q to compute quoted versions of args that I could eval later, but this would not work on e.g. Ubuntu's Dash (/bin/sh).

Is there any equivalent to printf %q that can be written in a plain Bourne shell script, using only built-ins and POSIX-defined utilities, say as a function I could copy into a script?

For example, a script trying to ls its arguments in reverse order:

    #!/bin/sh
    args=
    for arg in "$@"
    do
        args="'$arg' $args"
    done
    eval "ls $args"

works for many cases:

    $ ./handle goodbye "cruel world"
    ls: cannot access cruel world: No such file or directory
    ls: cannot access goodbye: No such file or directory

but not when ' is used:

    $ ./handle goodbye "cruel'st world"
    ./handle: 1: eval: Syntax error: Unterminated quoted string

and the following works fine but relies on Bash:

    #!/bin/bash
    args=
    for arg in "$@"
    do
        printf -v argq '%q' "$arg"
        args="$argq $args"
    done
    eval "ls $args"
asked Aug 28 '12 at 14:08 by Jesse Glick

2 Answers

This is absolutely doable.

The answer you see by Jesse Glick is approximately there, but it has a couple of bugs, and I have a few more alternatives for your consideration, since this is a problem I ran into more than once.

First, as you might already know, echo is a poor choice when portability is the goal; use printf instead. POSIX does not pin down what echo does when its first argument is "-n" (or when any argument contains a backslash), and in practice some implementations of echo treat -n as a special option while others print it like any other argument. So that becomes this:

    esceval()
    {
        printf %s "$1" | sed "s/'/'\"'\"'/g"
    }

Alternatively, instead of escaping embedded single quotes by making them into:

'"'"' 

..instead you could turn them into:

'\'' 

..stylistic differences I guess (I imagine performance difference is negligible either way, though I've never tested). The resulting sed string looks like this:

    esceval()
    {
        printf %s "$1" | sed "s/'/'\\\\''/g"
    }

(There are four backslashes because the shell's double quotes reduce them to two, and sed then reduces those two to the single literal backslash that ends up in the replacement. Personally, I find this form more readable, so that's what I'll use in the rest of the examples that involve it, but both should be equivalent.)
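To make the two layers of unescaping concrete, here is a small sketch that first prints the sed script exactly as sed receives it, and then runs the substitution on a sample word. The sample input it's is only an illustration and not part of the original answer.

```shell
# Layer 1: the shell's double quotes turn \\\\ into \\ before sed sees it.
printf '%s\n' "s/'/'\\\\''/g"

# Layer 2: sed turns \\ in the replacement into one literal backslash,
# so each ' in the input becomes '\''.
printf '%s\n' "it's" | sed "s/'/'\\\\''/g"
```

The first command should show the script with two backslashes, and the second should show the word with its quote replaced by '\''.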

BUT we still have a bug: command substitution deletes trailing newlines from the command's output (at least one, and in most shells all of them; only newlines, not other whitespace). So the above solution works unless an argument ends in one or more newlines, in which case those newlines are lost. The fix is simple: output one more character after the actual value before returning from your quote/esceval function. Incidentally, we needed to do that anyway, because the escaped argument has to start and end with single quotes. You have two alternatives:

    esceval()
    {
        printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1 s/^/'/; $ s/$/'/"
    }

This will ensure the argument comes out already fully escaped, no need for adding more single quotes when building the final string. This is probably the closest thing you will get to a single, inline-able version. If you're okay with having a sed dependency, you can stop here.
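As a rough check of the trailing-newline claim, the sketch below pushes an argument that ends in a newline through a plain command substitution and through the sed-based esceval above, then restores it with eval. The variable names nl, arg, plain, quoted, and restored are made up for this illustration.

```shell
esceval()
{
    printf '%s\n' "$1" | sed "s/'/'\\\\''/g; 1 s/^/'/; $ s/$/'/"
}

# A literal newline, so the test argument can end with one.
nl='
'
arg="two lines$nl"

# A bare command substitution drops the trailing newline.
plain=$(printf '%s\n' "$arg")
[ "$plain" = "$arg" ] || echo "plain substitution lost the trailing newline"

# The quoted form ends in ', so nothing that matters gets stripped.
quoted=$(esceval "$arg")
eval "restored=$quoted"
[ "$restored" = "$arg" ] && echo "esceval round trip preserved it"
```

The closing single quote acts as the guard character: the only newline that command substitution can strip is the one sed prints after it.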

If you're not okay with the sed dependency, but you're fine with assuming that your shell is actually POSIX-compliant, you can do the same thing with shell built-ins alone. (A few non-compliant shells are still out there, notably the /bin/sh on Solaris 10 and below, which can't run this next variant, but almost all shells you need to care about will handle it just fine.)

    esceval()
    {
        printf \'
        unescaped=$1
        while :
        do
            case $unescaped in
            *\'*)
                printf %s "${unescaped%%\'*}""'\''"
                unescaped=${unescaped#*\'}
                ;;
            *)
                printf %s "$unescaped"
                break
            esac
        done
        printf \'
    }

You might notice seemingly redundant quoting here:

printf %s "${unescaped%%\'*}""'\''" 

..this could be replaced with:

printf %s "${unescaped%%\'*}'\''" 

The only reason I do the former is that once upon a time there were Bourne shells with bugs when substituting variables into quoted strings where the quotes around the variable didn't start and end exactly where the variable substitution did. Hence it's a paranoid portability habit of mine. In practice, you can do the latter, and it won't be a problem.

If you don't want to clobber the variable unescaped in the rest of your shell environment, then you can wrap the entire contents of that function in a subshell, like so:

    esceval()
    {
      (
        printf \'
        unescaped=$1
        while :
        do
            case $unescaped in
            *\'*)
                printf %s "${unescaped%%\'*}""'\''"
                unescaped=${unescaped#*\'}
                ;;
            *)
                printf %s "$unescaped"
                break
            esac
        done
        printf \'
      )
    }
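For context, here is one way the subshell version could be plugged back into the question's reverse-the-arguments script. This is only a sketch: reverse_args is a hypothetical wrapper, and printf stands in for ls so the result is visible without touching the filesystem.

```shell
esceval()
{
  (
    printf \'
    unescaped=$1
    while :
    do
        case $unescaped in
        *\'*)
            printf %s "${unescaped%%\'*}""'\''"
            unescaped=${unescaped#*\'}
            ;;
        *)
            printf %s "$unescaped"
            break
        esac
    done
    printf \'
  )
}

# Hypothetical wrapper: rebuild the argument list in reverse order,
# quoting each argument so that eval sees it as exactly one word.
reverse_args()
{
    args=
    for arg in "$@"
    do
        args="$(esceval "$arg") $args"
    done
    eval "printf '<%s>\n' $args"
}

reverse_args goodbye "cruel'st world"
```

With the question's failing input, this should print each argument on its own line in reverse order, with the embedded single quote intact.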

"But wait", you say: "What if I want to do this on MULTIPLE arguments in one command? And I want the output to still look kinda nice and legible for me as a user if I run it from the command line for whatever reason."

Never fear, I have you covered:

    esceval()
    {
        case $# in 0) return 0; esac
        while :
        do
            printf "'"
            printf %s "$1" | sed "s/'/'\\\\''/g"
            shift
            case $# in 0) break; esac
            printf "' "
        done
        printf "'\n"
    }

..or the same thing, but with the shell-only version:

    esceval()
    {
      case $# in 0) return 0; esac
      (
        while :
        do
            printf "'"
            unescaped=$1
            while :
            do
                case $unescaped in
                *\'*)
                    printf %s "${unescaped%%\'*}""'\''"
                    unescaped=${unescaped#*\'}
                    ;;
                *)
                    printf %s "$unescaped"
                    break
                esac
            done
            shift
            case $# in 0) break; esac
            printf "' "
        done
        printf "'\n"
      )
    }

In those last four, you could collapse some of the outer printf statements and roll their single quotes into a neighboring printf. I kept them separate because I feel the logic is clearer when the starting and ending single quotes each have their own printf statement.
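A possible way to use the multi-argument form is to save a whole argument list in one string and restore it later with eval and set. The sketch below repeats the shell-only version so that it runs on its own; the variable name saved and the sample arguments are invented for illustration.

```shell
esceval()
{
  case $# in 0) return 0; esac
  (
    while :
    do
        printf "'"
        unescaped=$1
        while :
        do
            case $unescaped in
            *\'*)
                printf %s "${unescaped%%\'*}""'\''"
                unescaped=${unescaped#*\'}
                ;;
            *)
                printf %s "$unescaped"
                break
            esac
        done
        shift
        case $# in 0) break; esac
        printf "' "
    done
    printf "'\n"
  )
}

# Save three arguments (one with a space, one with a quote, one empty).
saved=$(esceval "a b" "it's" "")
printf '%s\n' "$saved"

# Later: restore them as the positional parameters.
eval "set -- $saved"
printf '%s argument(s) restored\n' "$#"
```

The trailing newline that esceval prints is removed by the command substitution, which is harmless here because the saved string ends with a closing single quote.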

P.S. There's also this monstrosity I made, which is a polyfill which will select between the previous two versions depending on if your shell seems to be capable of supporting the necessary variable substitution syntax (it looks awful though, because the shell-only version has to be inside an eval-ed string to keep the incompatible shells from barfing when they see it): https://github.com/mentalisttraceur/esceval/blob/master/sh/esceval.sh

answered Sep 20 '22 at 07:09 by mtraceur


I think this is POSIX. It works by clearing $@ after expanding it for the for loop, but only once so that we can iteratively build it back up (in reverse) using set.

    flag=0
    for i in "$@"; do
        [ "$flag" -eq 0 ] && shift $#
        set -- "$i" "$@"
        flag=1
    done

    echo "$@"   # To see that "$@" has indeed been reversed
    ls "$@"

I realize reversing the arguments was just an example, but you may be able to use this trick of set -- "$arg" "$@" or set -- "$@" "$arg" in other situations.
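The set -- "$@" "$arg" form can also cover the question's "pull out some parameters and process them" case without any quoting or eval. The sketch below assumes a made-up --verbose flag and a hypothetical function name filter_verbose; it drops the flag from the list and keeps every other argument intact.

```shell
# Hypothetical example: remove --verbose from the arguments, keep the rest.
filter_verbose()
{
    verbose=0
    for arg in "$@"
    do
        # The loop list was expanded once up front, so it is safe to
        # drop the current argument from the front of "$@" here...
        shift
        case $arg in
        --verbose) verbose=1 ;;
        # ...and to append the ones we want to keep at the back.
        *) set -- "$@" "$arg" ;;
        esac
    done
    printf 'verbose=%s\n' "$verbose"
    printf '<%s>\n' "$@"
}

filter_verbose a --verbose "b c"
```

After the loop, "$@" holds only the kept arguments in their original order, so it can be passed straight to another command as othercmd "$@".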

And yes, I realize I may have just reimplemented (poorly) ormaaj's Push.

answered Sep 19 '22 at 07:09 by chepner
