I want to process all *.bin files inside a given directory. Initially I was working with a for loop:
    var=0
    for i in *ls *bin
    do
        perform computations on $i ....
        var+=1
    done
    echo $var
However, some directories contain so many files that this fails with the error "Argument list too long". Therefore, I tried a piped while loop instead:
    var=0
    ls *.bin | while read i; do
        perform computations on $i
        var+=1
    done
    echo $var
The problem now is that using the pipe creates a subshell, so echo $var prints 0.
How can I deal with this problem?
The original code:
    #!/bin/bash

    function entropyImpl {
        if [[ -n "$1" ]]
        then
            if [[ -e "$1" ]]
            then
                echo "scale = 4; $(gzip -c ${1} | wc -c) / $(cat ${1} | wc -c)" | bc
            else
                echo "file ($1) not found"
            fi
        else
            datafile="$(mktemp entropy.XXXXX)"
            cat - > "$datafile"
            entropy "$datafile"
            rm "$datafile"
        fi
        return 1
    }

    declare acc_entropy=0
    declare count=0

    ls *.bin | while read i ; do
        echo "Computing $i" | tee -a entropy.txt
        curr_entropy=`entropyImpl $i`
        curr_entropy=`echo $curr_entropy | bc`
        echo -e "\tEntropy: $curr_entropy" | tee -a entropy.txt
        acc_entropy=`echo $acc_entropy + $curr_entropy | bc`
        let count+=1
    done

    echo "Out of function: $count | $acc_entropy"
    acc_entropy=`echo "scale=4; $acc_entropy / $count" | bc`
    echo -e "===================================================\n" | tee -a entropy.txt
    echo -e "Accumulated Entropy:\t$acc_entropy ($count files processed)\n" | tee -a entropy.txt
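$1 is the first positional argument, and -z tests whether a string is empty, which is also the case when the variable is unset. With -z "$1" you are testing whether an argument was passed when the script was run. (-n, used in the script above, is the opposite test: it succeeds when the string is non-empty.)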
The while loop performs a given set of commands an unknown number of times, for as long as the given condition evaluates to true. The while statement starts with the while keyword, followed by the conditional expression. The condition is evaluated before the commands are executed.
There is no do-while loop in bash. To execute a command first then run the loop, you must either execute the command once before the loop or use an infinite loop with a break condition.
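The second option can be sketched as follows. This is only an illustration: the counter name `attempts` and the limit of 3 are made up for the example. The body runs once before the condition is checked, which is what a do-while loop would do.

```bash
attempts=0
while true; do
    # Body: always runs at least once.
    attempts=$((attempts + 1))
    # Condition is tested after the body; leave the loop when it fails.
    [ "$attempts" -lt 3 ] || break
done
echo "attempts=$attempts"   # prints: attempts=3
```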
Command substitution is one way for a programmer to capture (usually with the intent of processing) the output of a program or script. The commands are enclosed in single parentheses preceded by a dollar sign, and they run in a subshell:

    DIRCONTENTS=$(ls -l)
    echo ${DIRCONTENTS}
The problem is that the while loop is executed in a subshell. After the while loop terminates, the subshell's copy of var is discarded, and the original var of the parent (whose value is unchanged) is echoed.
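The effect is easy to reproduce in isolation. The sketch below assumes bash with its default settings (the `lastpipe` option off); the variable names `piped` and `redirected` are made up for the illustration. The first loop sits on the right-hand side of a pipe and loses its updates, while the second reads from a redirection and keeps them.

```bash
# Loop fed by a pipe: runs in a subshell, so the parent's counter is untouched.
piped=0
printf '%s\n' a b c | while read -r line; do
    piped=$((piped + 1))
done

# Loop fed by a redirection: runs in the current shell, so the counter survives.
redirected=0
while read -r line; do
    redirected=$((redirected + 1))
done <<'EOF'
a
b
c
EOF

echo "piped=$piped redirected=$redirected"   # prints: piped=0 redirected=3
```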
One way to fix this is by using Process Substitution as shown below:
    var=0
    while read i; do
        # perform computations on $i
        ((var++))
    done < <(find . -maxdepth 1 -type f -name "*.bin")
Take a look at BashFAQ/024 for other workarounds.
Notice that I have also replaced ls with find because it is not good practice to parse ls.
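One reason parsing file lists line by line is fragile is that file names may contain spaces or even newlines. A minimal sketch of a more defensive variant, assuming bash and a find that supports `-maxdepth` and `-print0` (GNU and BSD find both do): names are separated by NUL bytes instead of newlines. The temporary directory and the file names are made up for the demonstration.

```bash
#!/bin/bash
# Hypothetical test data: two .bin files (one with a space) and one other file.
workdir=$(mktemp -d)
touch "$workdir/a.bin" "$workdir/b c.bin" "$workdir/notes.txt"

count=0
# IFS= and -r keep each name intact; -d '' makes read split on NUL bytes.
while IFS= read -r -d '' file; do
    # perform computations on "$file"
    count=$((count + 1))
done < <(find "$workdir" -maxdepth 1 -type f -name '*.bin' -print0)

echo "count=$count"   # prints: count=2
rm -rf "$workdir"
```

Because the loop reads from process substitution rather than a pipe, `count` is still set after the loop ends.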
A POSIX-compliant solution would be to use a named pipe (a FIFO, shown with file type p by ls -l). This solution is portable, but it creates an entry in the filesystem.
    mkfifo mypipe
    find . -maxdepth 1 -type f -name "*.bin" > mypipe &
    while read line
    do
        : # action (the ":" no-op keeps the empty loop body valid)
    done < mypipe
    rm mypipe
Your pipe is a file on your hard disk. If you want to avoid having useless files, do not forget to remove it.
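To make sure the pipe is removed even if the script is interrupted, the cleanup can be put in an exit trap. The following is a sketch under a few assumptions: `mktemp -d` is available (it is widespread but not required by POSIX), and the directory, file names, and the `count` variable are made up for the demonstration.

```bash
#!/bin/bash
# Put the FIFO in a private temporary directory and remove it on exit.
workdir=$(mktemp -d)
fifo="$workdir/mypipe"
trap 'rm -rf "$workdir"' EXIT
mkfifo "$fifo"

# Hypothetical test data.
touch "$workdir/a.bin" "$workdir/b.bin"

# The writer runs in the background; the loop reads in the current shell.
find "$workdir" -type f -name '*.bin' > "$fifo" &

count=0
while IFS= read -r line; do
    count=$((count + 1))
done < "$fifo"
wait   # reap the background find

echo "count=$count"   # prints: count=2
```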