Bash script size limitation?

I have a bash script that, when run on RHEL or OS X, gives the following error:

line 62484: syntax error near unexpected token `newline'

line 62484: ` -o_gz'

This is an auto-generated script that works around a limitation of the grid engine compute cluster used at my company. It consists entirely of a long chain of almost-identical if/elif clauses. I can't see anything special about the line the error points to, and when I run only the part of the script before that line, it works without problems. This makes me think there may be some limit on bash script length. The only reference I could find on the web was the comment by iAdjunct.

The part of the script around the error looks like this (with some simplifications):

.
.
.
.
elif [ $task_number -eq 2499 ]
then
    /some/tool/executable \
    -use_prephased_g \
    -m \
    /some/text/file \
    -h \
    /some/zipped/file \
    -l \
    -int \
     45063854 \
     46063853 \
    -Ne \
     20000 \
    -o \
    /some/output/file \
    -verbose \
    -o_gz #==============> ****THIS IS LINE 62484****
elif [ $task_number -eq 2500 ]
then
    /some/tool/executable \
    -use_prephased_g \
    -m \
    /some/other/text/file \
    -h \
    /some/other/zipped/file \
    -l \
    -int \
     98232182 \
     99232182 \
    -Ne \
     20000 \
    -o \
    /some/other/output/file \
    -verbose \
    -o_gz
elif [ $task_number -eq 2501 ]
.
.
.
.

Does this ring any bells for anyone?

laylaylom asked Jul 29 '16 15:07


1 Answer

Yes, this is a limitation with bash.

It's not a script size limit; rather it's a limit to the depth of the parser stack, which has the effect of restricting the complexity of certain constructs. In particular, it will restrict the number of elif clauses in an if statement to about 2500.
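One way to check this on a given machine is to generate chains of increasing length and ask bash to parse them without executing them. The following is only a sketch: make_chain and parses_ok are hypothetical helper names, the chain lengths tried are arbitrary, and the count at which parsing starts to fail may differ depending on the bash version and how it was built.

```shell
# make_chain N: print an if statement with N clauses (one if, N-1 elifs).
# Each branch body is the no-op ":" so the script is valid but does nothing.
make_chain() {
    n=$1
    i=1
    echo 'if [ "$task_number" -eq 0 ]; then :'
    while [ "$i" -lt "$n" ]; do
        echo "elif [ \"\$task_number\" -eq $i ]; then :"
        i=$((i + 1))
    done
    echo 'fi'
}

# parses_ok N: succeed if bash accepts an N-clause chain.
# "bash -n" reads the script from stdin and only checks its syntax.
parses_ok() {
    make_chain "$1" | bash -n 2>/dev/null
}

for n in 100 1000 2000 3000; do
    if parses_ok "$n"; then
        echo "$n clauses: parsed"
    else
        echo "$n clauses: syntax error"
    fi
done
```

Because nothing is executed, the check is safe to run even when the branches would normally launch real jobs.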

There is a longer analysis of this problem with respect to a different syntactic construct (iterated pipes) in my answer to a question on the Unix & Linux stackexchange site.

case statements don't have this limitation, and the sample you provide certainly looks like a good match for a case statement.
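For illustration, the two branches shown in the question might be expressed with case along these lines. This is a minimal sketch: the run_task wrapper and the echo placeholders are assumptions made so the example is self-contained; in the real script each branch would hold the corresponding tool invocation.

```shell
# Sketch: dispatch on task_number with a case statement instead of an
# if/elif chain. run_task and the echo placeholders are hypothetical.
run_task() {
    task_number=$1
    case "$task_number" in
        2499)
            echo "would run task 2499"   # the real command goes here
            ;;
        2500)
            echo "would run task 2500"
            ;;
        *)
            # Fallback for values with no matching branch.
            echo "unknown task number: $task_number" >&2
            return 1
            ;;
    esac
}
```

Note that case matches its argument as a string pattern, whereas -eq compares integers, so a value with leading zeros (02499, say) would fall through to the default branch here.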

(The difference with case statements is that the grammar for if conditional statements, like that of pipe constructs, is right-recursive, while the grammar for case statements is left-recursive. The reason the limit on if statements differs from the limit on pipes is that the grammatical construct for an elif clause has one more symbol, so each repetition uses four stack slots rather than three.)

If the case statement doesn't work for you -- or even if it does -- you could try building a precompiled binary search tree of if statements:

if (( task_number < 8 )); then
  if (( task_number < 4 )); then
    if (( task_number < 2 )); then
      if (( task_number < 1 )); then
        : # do task 0
      else
        : # do task 1
      fi
    elif (( task_number < 3 )); then
      : # do task 2
    else
      : # do task 3
    fi
  elif (( task_number < 6 )); then
    if (( task_number < 5 )); then
      : # do task 4
    else
      : # do task 5
    fi
  elif (( task_number < 7 )); then
    : # do task 6
  else
    : # do task 7
  fi
elif (( task_number < 12 )); then
  if (( task_number < 10 )); then
    if (( task_number < 9 )); then
      : # do task 8
    else
      : # do task 9
    fi
  elif (( task_number < 11 )); then
    : # do task 10
  else
    : # do task 11
  fi
elif (( task_number < 14 )); then
  if (( task_number < 13 )); then
    : # do task 12
  else
    : # do task 13
  fi
elif (( task_number < 15 )); then
  : # do task 14
else
  : # do task 15
fi

Because each complete if statement occupies only a single stack node once it has been recognized, the complexity limit applies to the nesting depth of the if statements rather than to the number of clauses. As an additional bonus, it will execute far fewer comparisons in the average case.
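Since the script in the question is generated anyway, a tree like this could be emitted by a small recursive generator. The sketch below is one possible approach, not the layout used above: it splits each range with a plain if/else rather than if/elif, and emit_tree and the task_N leaf calls are hypothetical names standing in for whatever the real generator would print.

```shell
# emit_tree LO HI INDENT: print bash code that dispatches on task_number
# over the half-open range [LO, HI). Each leaf is a call to task_N, a
# hypothetical function standing in for the real command.
# The function body is a subshell ( ... ), so lo/hi/mid/indent are
# private to each recursive call.
emit_tree() (
    lo=$1 hi=$2 indent=$3
    if [ $((hi - lo)) -eq 1 ]; then
        # Only one task left in the range: emit the leaf.
        printf '%stask_%s\n' "$indent" "$lo"
        return
    fi
    mid=$(( (lo + hi) / 2 ))
    printf '%sif (( task_number < %s )); then\n' "$indent" "$mid"
    emit_tree "$lo" "$mid" "$indent  "
    printf '%selse\n' "$indent"
    emit_tree "$mid" "$hi" "$indent  "
    printf '%sfi\n' "$indent"
)

# Example: print a dispatcher for tasks 0..15.
emit_tree 0 16 ""
```

With this kind of split the nesting depth grows with the base-2 logarithm of the number of tasks, so roughly 2500 tasks would need about 12 levels of nesting.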

If you have no alternative other than a sequential list of conditions, you can use separate if statements:

while :; do
  if condition1; then
    : # do something
  break; fi; if condition2; then
    : # do something
  break; fi; if condition3; then
    : # do something
  break; fi; if condition4; then
    : # do something
  break; fi
  # No alternative succeeded
  break
done

The unconventional indentation is meant to show how simple the program transformation is: replace every elif with break; fi; if and wrap the whole thing in a while loop, which provides the target for the breaks.

rici answered Sep 21 '22 01:09