
Weird behaviour of a bash script

Tags:

bash

Here's a snippet:

var=`ls | shuf | head -2 | xargs cat | sed -e 's/\(.\)/\1\n/g' | shuf | tr -d '\n'`

This selects two random files from the current directory, concatenates their contents, shuffles the characters, and assigns the result to var. It works fine most of the time, but about once in a thousand runs the output of ls ends up in var instead (it's not just the ls output, see EDIT II). What could be the explanation?
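
The character-shuffling stage can be tried in isolation. This is a minimal sketch, assuming GNU sed (which understands \n in the replacement) and GNU shuf; the function name shuffle_chars and the sample input are made up for illustration.

```bash
# Put every character on its own line, shuffle the lines, then join them again.
shuffle_chars() {
    sed -e 's/\(.\)/\1\n/g' | shuf | tr -d '\n'
}

input='hello world'                                   # hypothetical sample text
shuffled=$(printf '%s\n' "$input" | shuffle_chars)

# Same characters as the input, in random order.
printf '%s\n' "$shuffled"
```

The result should have the same length and the same characters as the input, only reordered, so any file names that show up in var must come from somewhere other than this stage.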

Some more potentially relevant facts:

  • the directory contains at least two files
  • there are only text files in the directory
  • file names don't contain spaces
  • the files are anywhere from 5 to about 1000 characters in length
  • the snippet is part of a larger script, two instances of which run in parallel
  • bash version: GNU bash, version 4.1.5(1)-release (i686-pc-linux-gnu)
  • uname: Linux 2.6.35-28-generic-pae #50-Ubuntu

EDIT: I ran the snippet by itself a couple of thousand times with no errors. Then I tried running it with various other parts of the whole script. Here's a configuration that produces errors:

cd dir_with_text_files
var=`ls | shuf | head -2 | xargs cat | sed -e 's/\(.\)/\1\n/g' | shuf | tr -d '\n'`
cd ..

There are several hundred lines of script between the two cds, but this is the minimal configuration that reproduces the error. Note that in the anomalous case var receives the ls output of the current directory, not of dir_with_text_files.

EDIT II: I've been looking at the outputs in more detail. The ls output doesn't appear alone; it comes along with the two shuffled files (between their contents, or before or after them, intact). But it gets better; let me set the stage to talk about particular directories.

[~/projects/upload] ls -1
checked // dir
lines   // dir, the files to shuffle are here
pages   // also dir
proxycheck
singlepost
uploader
indexrefresh
t
tester

So far, I had seen the output of ls run from upload, but now I saw the output of ls */* (also run from upload). It was in the form of "someMangledText ls moreMangledText ls */* finalBatchOfText". Is it possible that the sequence ls, which undoubtedly was generated, was somehow executed?

Vlad Vivdovitch asked May 20 '11 14:05


2 Answers

No problems here either. I would also rewrite the above as:

sed 's:\(.\):\1\n:g' < <(shuf -e * | head -2 | xargs cat) | shuf | tr -d '\n'

Do not use ls to list a directory's contents; use * instead.
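
A small sketch of the glob-based selection, assuming bash 4 (for mapfile) and GNU shuf. The temporary directory and file names are invented for the demo, and names containing newlines would still break the line-based mapfile step.

```bash
# Build a throwaway directory with three files, one with a space in its name.
tmpdir=$(mktemp -d)
printf 'one'   > "$tmpdir/a file.txt"
printf 'two'   > "$tmpdir/b.txt"
printf 'three' > "$tmpdir/c.txt"

# The glob expands to one word per file, so the space survives intact.
# shuf -e treats each argument as an input line; -n 2 keeps two of them.
mapfile -t picked < <(shuf -n 2 -e "$tmpdir"/*)

printf 'picked: %s\n' "${picked[@]}"
picked_count=${#picked[@]}

rm -rf "$tmpdir"
```

Unlike ls | shuf | head -2 | xargs cat, this never passes file names through word splitting.
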
Moreover, do some debugging. Use a shebang followed by:

set -e
set -o pipefail

and run the script like this:

/bin/bash -x /path/to/script

and inspect the output.
Instead of tracing the whole script, you can wrap just the part that seems problematic in set -x and set +x:

set -x
...code that may have problems...
set +x

so that the output focuses on that part of the code. Also, use the pipefail option.
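
As a rough sketch of such a traced region (the pipeline here is only a stand-in, not the asker's code), PS4 can be set so that every trace line carries its line number. The trace goes to stderr, so it does not end up in var.

```bash
# PS4 is expanded before each traced command; LINENO is the current line.
PS4='+ line ${LINENO}: '

set -x                                   # start tracing
var=$(printf 'abc' | tr -d 'b')          # stand-in for the suspicious pipeline
set +x                                   # stop tracing

printf 'var=%s\n' "$var"
```

With two instances of the script running in parallel, redirecting stderr of each instance to its own file keeps the traces apart.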

Some definitions:

  • -e : Exit immediately if a simple command exits with a non-zero status, unless the command that fails is part of the command list immediately following a while or until keyword, part of the test in an if statement, part of a && or || list, or if the command's return status is being inverted using !. A trap on ERR, if set, is executed before the shell exits
  • -x : Print a trace of simple commands, for commands, case commands, select commands, and arithmetic for commands and their arguments or associated word lists after they are expanded and before they are executed. The value of the PS4 variable is expanded and the resultant value is printed before the command and its expanded arguments
  • pipefail : If set, the return value of a pipeline is the value of the last (rightmost) command to exit with a non-zero status, or zero if all commands in the pipeline exit successfully
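
The effect of pipefail can be seen with a two-command pipeline whose first command fails. This is only an illustrative sketch; the variable names are made up.

```bash
# Without pipefail, only the last command of the pipeline decides the status.
set +o pipefail
status_without=0
false | true || status_without=$?

# With pipefail, the failing "false" makes the whole pipeline fail.
set -o pipefail
status_with=0
false | true || status_with=$?
set +o pipefail

printf 'without pipefail: %s\nwith pipefail: %s\n' "$status_without" "$status_with"
```

This prints 0 for the first case and 1 for the second. In a long pipeline like the one in the question, a failure in an early stage such as cat would otherwise go unnoticed.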

c00kiemon5ter answered Oct 03 '22 23:10


For debugging purposes you may also clear the environment using env -i and filter out non-printable characters:

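#!/usr/bin/env -i /bin/bash --
# Note: on Linux everything after the interpreter path is passed to env as a
# single argument, so this shebang may fail there. Starting the script by hand
# with "env -i /bin/bash -- /path/to/script" has the same intent.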

set -ef
set -o pipefail

unset IFS PATH LC_ALL
IFS=$' \t\n'
PATH="$(PATH=/bin:/usr/bin getconf PATH)"
LC_ALL=C
export IFS PATH LC_ALL

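# -maxdepth goes before -type (GNU find warns otherwise), and "$( (" is written
# with a space so it cannot be mistaken for the start of an arithmetic expansion.
#var="$( (find . -maxdepth 1 -type f -print0 | shuf -z -n 2 | xargs -0 cat) | sed -e 's/\(.\)/\1\n/g' | shuf | tr -d '\n')"

var="$( (find . -maxdepth 1 -type f -print0 | shuf -z -n 2 | xargs -0 cat) | tr -cd '[:print:]' | grep -o '.' | shuf | tr -d '\n')"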
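
The -f in set -ef disables pathname expansion. The sketch below shows why that can matter here. It assumes something the question does not show, namely that var is later expanded without quotes while it happens to contain a lone *. The directory and its file names are created only for the demo.

```bash
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch proxycheck singlepost uploader     # stand-ins for the files in upload

var='abc * def'          # hypothetical shuffled text containing a lone *

unquoted=$(echo $var)    # unquoted: the * is expanded to the file names
quoted=$(echo "$var")    # quoted: the * stays literal

set -f                   # disable pathname expansion
noglob=$(echo $var)      # unquoted, but the * is no longer expanded
set +f

printf '%s\n' "unquoted: $unquoted" "quoted:   $quoted" "set -f:   $noglob"

cd - >/dev/null && rm -rf "$workdir"
```

The unquoted case prints the file names of whatever directory is current at that moment, which looks like ls output embedded in the text. A pattern such as */* would be expanded the same way.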

Before running the script you may also disable the GNU readline library and ! style history expansion:

bash --noediting
set +H

narum answered Oct 04 '22 00:10