
Why do Tclers suggest bracing your `expr`essions?

Tags:

tcl

We can evaluate an expression in two possible ways:

   set a 1
   set b 1
   puts [expr $a + $b ]
   puts [expr {$a + $b } ]

But why do experienced Tclers dislike the first form and consider it bad practice? Does the first usage of expr raise a security concern?

asked Jul 03 '13 15:07 by made_in_india



2 Answers

The "problem" with expr is that it implements its own "mini language", which includes, among other things, variable substitution (replacing those $a-s with their values) and command substitution (replacing those [command ...] things with the results of running commands), so basically the process of evaluating expr $a + $b goes like this:

  1. The Tcl interpreter parses four words out of the source string: expr, $a, + and $b. Since two of these words begin with $, variable substitution takes place, so with the values from the question the words actually become expr, 1, + and 1.
  2. As usual, the first word is taken to be the name of a command and the others are its arguments, so the Tcl interpreter looks up a command named expr and executes it, passing it the three arguments 1, + and 1.
  3. The implementation of expr then concatenates all the arguments passed to it (separated by spaces), interpreting them as strings, and obtains the string 1 + 1.
  4. This string is then parsed again, this time by the expr machinery, according to its own rules, which include variable and command substitution, as already mentioned.
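
The steps above can be made visible with a small sketch. It assumes the values from the question (a and b both 1), and show_args is a hypothetical helper introduced only for illustration; it is not part of Tcl and merely reports what a command receives after the Tcl interpreter has done its own parsing.

```tcl
# Values taken from the question.
set a 1
set b 1

# show_args is a hypothetical helper: it returns the number of arguments
# it received and those arguments joined with spaces, which mirrors the
# concatenation that expr performs on its own arguments.
proc show_args {args} {
    return [list [llength $args] [join $args " "]]
}

# Unbraced: the Tcl interpreter substitutes $a and $b first, so the
# command receives three separate words.
set unbraced [show_args $a + $b]

# Braced: the braces suppress substitution, so the command receives one
# word containing the literal text of the expression.
set braced [show_args {$a + $b}]

puts $unbraced   ;# 3 {1 + 1}
puts $braced     ;# 1 {$a + $b}
```

In the braced case expr is handed the expression text untouched and does the only round of substitution itself.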

What follows:

  • If you brace your expressions, as in expr {$a + $b}, the grouping provided by the curly braces keeps the Tcl interpreter[1] from interpreting the script that is meant to be parsed by expr itself. This means that in our toy example the expr command sees exactly one argument, $a + $b, and performs the substitutions itself.
  • "Double parsing" explained above might lead to security problems.

    For example, in the following code

    set a {[exec echo rm -rf $::env(HOME)]}
    set b 2
    expr $a + $b
    

    The expr command will itself parse the string [exec echo rm -rf $::env(HOME)] + 2. Its evaluation will fail, but by that time the contents of your home directory would supposedly be gone. (Note that a kind Tcler placed echo in front of rm in a later edit to my answer in an attempt to save the necks of random copy-pasters, so the command as written won't call rm, but if you remove echo from it, it will.)

  • Double parsing inhibits certain optimisations the Tcl engine can do when dealing with calls to expr.
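
The double-parsing risk can be reproduced without touching the file system. The following is a minimal sketch in which side_effect is a hypothetical, harmless stand-in for a dangerous command such as the exec call above; it only counts how often it is run.

```tcl
# side_effect is a hypothetical stand-in for a dangerous command.
set called 0
proc side_effect {} {
    incr ::called
    return 40
}

set a {[side_effect]}   ;# data that happens to look like a command
set b 2

# Unbraced: expr receives the text "[side_effect] + 2" and, during its
# own round of substitution, runs the command hidden in the data.
set unbraced [expr $a + $b]

# Braced: expr reads $a as a plain value. That value is not a number,
# so expr raises an error without running anything.
set failed [catch {expr {$a + $b}} message]

puts "unbraced result: $unbraced, side_effect calls: $called"
puts "braced form failed: $failed ($message)"
```

After both calls, side_effect has run exactly once, and only because of the unbraced form.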

[1] Well, almost: "backslash+newline" sequences are still processed even inside {...} blocks.

answered Nov 15 '22 14:11 by kostix


It most certainly has security issues. In particular, it will treat the variables' contents as expression fragments rather than values, and this lets all sorts of problems occur. If that's not enough, the same problems also totally slay performance because there is no way to generate reasonably optimal code for it: the bytecode generated will be far less efficient since all it can do is assemble the expression string and send it for a second round of parsing.

Let's drill down to the details:

% tcl::unsupported::disassemble lambda {{} {
    set a 1; set b 2
    puts [expr {$a + $b}]
    puts [expr $a + $b]
}}
ByteCode 0x0x50910, refCt 1, epoch 3, interp 0x0x31c10 (epoch 3)
  Source "\n    set a 1; set b 2\n    puts [expr {$a + $b}]\n    put"
  Cmds 6, src 72, inst 65, litObjs 5, aux 0, stkDepth 6, code/src 0.00
  Proc 0x0x6d750, refCt 1, args 0, compiled locals 2
      slot 0, scalar, "a"
      slot 1, scalar, "b"
  Commands 6:
      1: pc 0-4, src 5-11          2: pc 5-18, src 14-20
      3: pc 19-37, src 26-46       4: pc 21-34, src 32-45
      5: pc 38-63, src 52-70       6: pc 40-61, src 58-69
  Command 1: "set a 1"
    (0) push1 0     # "1"
    (2) storeScalar1 %v0    # var "a"
    (4) pop 
  Command 2: "set b 2"
    (5) startCommand +13 1  # next cmd at pc 18
    (14) push1 1    # "2"
    (16) storeScalar1 %v1   # var "b"
    (18) pop 
  Command 3: "puts [expr {$a + $b}]"
    (19) push1 2    # "puts"
  Command 4: "expr {$a + $b}"
    (21) startCommand +14 1     # next cmd at pc 35
    (30) loadScalar1 %v0    # var "a"
    (32) loadScalar1 %v1    # var "b"
    (34) add 
    (35) invokeStk1 2 
    (37) pop 
  Command 5: "puts [expr $a + $b]"
    (38) push1 2    # "puts"
  Command 6: "expr $a + $b"
    (40) startCommand +22 1     # next cmd at pc 62
    (49) loadScalar1 %v0    # var "a"
    (51) push1 3    # " "
    (53) push1 4    # "+"
    (55) push1 3    # " "
    (57) loadScalar1 %v1    # var "b"
    (59) concat1 5 
    (61) exprStk 
    (62) invokeStk1 2 
    (64) done 

In particular, look at the addresses 30–34 (the compilation of expr {$a + $b}) and compare with addresses 49–61 (the compilation of expr $a + $b). The optimal code reads the values out of the two variables and just adds them; the unbraced code has to read the variables and concatenate with the literal parts of the expression, and then fires the result into exprStk which is the “evaluate an expression string” operation. (The relative number of bytecodes isn't the problem; the problem is the runtime evaluation.)
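
A rough way to observe the runtime cost is the built-in time command. This is only a sketch: the procedure names add_braced and add_unbraced are made up for illustration, the iteration count is arbitrary, and the actual timings depend on the Tcl version and the machine, so no figures are claimed here.

```tcl
# Two hypothetical procedures that differ only in the bracing of expr.
proc add_braced {a b} {
    expr {$a + $b}
}
proc add_unbraced {a b} {
    expr $a + $b
}

# time runs the script the given number of times and reports the
# average cost per iteration.
puts "braced:   [time {add_braced 1 2} 100000]"
puts "unbraced: [time {add_unbraced 1 2} 100000]"
```

Both procedures return the same sum for plain numeric input; the difference lies in how the result is computed.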

For how fundamental these differences could be, consider setting a to 1 || 0 and b to [exit 1]. In the case of the precompiled version, Tcl will just try to treat both sides as numbers to add (neither of which is actually numeric; you'll get an error). In the case of the dynamic version… well, can you predict it by inspection?
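
One way to explore that question safely is to swap exit for a harmless command. In this sketch, mark is a hypothetical stand-in for exit that only records whether it was run. The reasoning in the comments relies on documented expr behaviour: + binds more tightly than ||, and || evaluates its right-hand side lazily.

```tcl
# mark is a hypothetical stand-in for [exit 1]; it records each call.
set marked 0
proc mark {} {
    incr ::marked
    return 1
}

set a {1 || 0}
set b {[mark]}

# Precompiled (braced) version: both values are treated as operands of +.
# Neither is numeric, so expr raises an error.
set bracedFailed [catch {expr {$a + $b}} bracedMessage]

# Dynamic (unbraced) version: expr receives the text "1 || 0 + [mark]",
# which it reads as 1 || (0 + [mark]). Because || is lazy and the left
# side is already true, the right side is never evaluated.
set dynamic [expr $a + $b]

puts "braced failed: $bracedFailed ($bracedMessage)"
puts "dynamic result: $dynamic, mark calls: $marked"
```

With these particular values the command is never run, but swapping the order of the operands or changing the operator would change that, which is exactly why the dynamic form is hard to reason about by inspection.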

So what do you do?

Optimal Tcl code should always limit the amount of runtime evaluation of expressions it performs; you can usually get it down to nothing at all unless you're doing something like evaluating an expression defined by the user. Where you have to have it, try to generate a single expression string in a variable and then just use expr $thatVar rather than anything more complex. If you want to add up a list of numbers (or, more generally, apply any operator to combine them), consider using this:

set sum [tcl::mathop::+ {*}$theList]

instead of:

set sum [expr [join $theList "+"]]
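
As a small sketch of the difference, with sample list contents chosen purely for illustration: tcl::mathop::+ treats every element as a value, so an element that looks like a command is rejected instead of being evaluated.

```tcl
set theList {1 2 3 4}

# Each list element becomes one numeric argument of the + command.
set sum [tcl::mathop::+ {*}$theList]

# With no arguments at all, + returns 0.
set emptySum [tcl::mathop::+ {*}{}]

# A hostile element is just a non-numeric string here, so the call
# fails with an error and nothing gets evaluated.
set hostile {1 2 {[exit 1]}}
set hostileFailed [catch {tcl::mathop::+ {*}$hostile} hostileMessage]

puts "sum: $sum, empty sum: $emptySum"
puts "hostile list rejected: $hostileFailed ($hostileMessage)"
```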

(Also, never use a dynamic expression with if, for or while as that will suppress a lot of compilation.)
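
To illustrate that last remark, here is a sketch of the same loop written both ways. The variable cond is a hypothetical condition that is only known at run time. Both loops count to the same value; the difference is that the second one hands while a string it has to evaluate dynamically.

```tcl
# Preferred: a braced, literal condition.
set i 0
set bracedCount 0
while {$i < 3} {
    incr bracedCount
    incr i
}

# Dynamic: the condition text comes from a variable. It still works,
# because while re-evaluates the string on every iteration, but this is
# the form the remark above advises against.
set cond {$i < 3}   ;# hypothetical run-time condition
set i 0
set dynamicCount 0
while $cond {
    incr dynamicCount
    incr i
}

puts "braced: $bracedCount, dynamic: $dynamicCount"
```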

Remember, with Tcl it's (usually) the case that safe code is fast code. You want fast and safe code, right?

answered Nov 15 '22 15:11 by Donal Fellows