I'm writing scripts that will run in parallel and will get their input data from the same file. These scripts will open the input file, read the first line, store it for further treatment and finally erase this read line from the input file.
Now the problem is that multiple scripts accessing the file can lead to the situation where two scripts access the input file simultaneously and read the same line, which produces the unacceptable result of the line being processed twice.
Now one solution is to write a lock file (.lock_input) before accessing the input file, and then erase it when releasing the input file, but this solution is not appealing in my case because NFS sometimes slows down network communication randomly and may not provide reliable locking.
Another solution is to use a process lock instead of writing a file: the first script to access the input file launches a process called lock_input, and the other scripts run ps -elf | grep lock_input. If it is present in the process list, they wait. This may be faster than writing to NFS, but it is still not a perfect solution...
So my question is: Is there any bash command (or other script interpreter) or a service I can use that will behave like semaphore or mutex locks used for synchronization in thread programming?
Thank you.
Small rough example:
Let's say we have input_file as following:
Monday
Tuesday
Wednesday
Thursday
Friday
Saturday
Sunday
Treatment script : TrScript.sh
#!/bin/bash
NbLines=$(cat input_file | wc -l)
while [ ! $NbLines = 0 ]
do
    FirstLine=$(head -1 input_file)
    echo "Hello World today is $FirstLine"
    RemainingLines=$(expr $NbLines - 1 )
    tail -n $RemainingLines input_file > tmp
    mv tmp input_file
    NbLines=$(cat input_file | wc -l)
done
Main script:
#! /bin/bash
./TrScript.sh &
./TrScript.sh &
./TrScript.sh &
wait
The result should be:
Hello World today is Monday
Hello World today is Tuesday
Hello World today is Wednesday
Hello World today is Thursday
Hello World today is Friday
Hello World today is Saturday
Hello World today is Sunday
Near the top of the script, add something like:
trap "[ -f /var/run/my.lock ] && /bin/rm -f /var/run/my.lock" 0 1 2 3 13 15
You can search /usr/bin/* for more examples.
Basically holding a lock on the file until the shell closes. Since scripts run in a sub-shell, the file will be closed (or unlocked) when the script exits. The "|| exit 1" instructs the script to exit if a lock cannot be obtained. That's it, we just created a file lock using flock.
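The code this answer describes is not shown here, so the following is only a sketch of the usual flock idiom it seems to refer to. The lock file path, the descriptor number 200 and the function name acquire_script_lock are my own illustrative choices, not something taken from the answer.

```shell
#!/bin/bash
# Sketch: hold an exclusive lock for as long as this shell stays alive.
# LOCKFILE and descriptor 200 are arbitrary, illustrative choices.
LOCKFILE="${LOCKFILE:-/tmp/my_script.lock}"

acquire_script_lock() {
    # Open (creating it if needed) the lock file on descriptor 200.
    exec 200>"$LOCKFILE" || return 1
    # -n: fail immediately instead of waiting when another process holds the lock.
    flock -n 200
}

# Typical use near the top of a script:
#   acquire_script_lock || exit 1
```

The lock goes away when descriptor 200 is closed, which happens automatically when the script exits, so no cleanup step is needed.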
File locking is not mandatory locking — it is advisory locking. That means that if a program such as cat does not look to see whether a file is locked, it doesn't matter whether some other program locks it or not — cat will still read the file.
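A small sketch of what "advisory" means in practice, assuming the flock utility from util-linux is installed. The function name advisory_demo and the temporary file are only for illustration.

```shell
#!/bin/bash
# Sketch: a held flock does not stop a reader that never asks for the lock.
advisory_demo() {
    local f
    f=$(mktemp) || return 1
    echo "still readable" > "$f"
    (
        flock 9                  # hold an exclusive lock on descriptor 9
        cat "$f"                 # cat does not check for locks, so it reads anyway
        # A cooperating process that does ask for the lock is turned away:
        flock -n "$f" -c true || echo "second flock refused"
    ) 9<"$f"
    rm -f "$f"
}
```

So the lock only protects the input file if every script that touches it goes through flock first.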
Use
line=`flock $lockfile -c "(gawk 'NR==1' < $infile ; gawk 'NR>1' < $infile > $infile.tmp ; mv $infile.tmp $infile)"`
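to access the file you want to read from. This relies on file locks, though.
gawk 'NR==1' < ... prints the first line of the input, and gawk 'NR>1' writes the remaining lines to a temporary file, which then replaces the input file.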
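To show how that one-liner could fit into the treatment script from the question, here is a rough sketch of a worker loop. It uses head and tail instead of gawk, and the names pop_first_line, process_all, INFILE and LOCKFILE are made up for this example.

```shell
#!/bin/bash
# Sketch: each worker pops one line at a time while holding an exclusive flock.
# INFILE and LOCKFILE are illustrative defaults.
INFILE="${INFILE:-input_file}"
LOCKFILE="${LOCKFILE:-/tmp/input_file.lock}"

pop_first_line() {
    # Everything in the subshell runs while descriptor 9 holds the lock;
    # the lock is released when the subshell ends and the descriptor closes.
    (
        flock 9 || exit 1
        [ -s "$INFILE" ] || exit 1      # nothing left to hand out
        head -n 1 "$INFILE"             # print the first line
        tail -n +2 "$INFILE" > "$INFILE.tmp" && mv "$INFILE.tmp" "$INFILE"
    ) 9>"$LOCKFILE"
}

process_all() {
    local line
    # Stop as soon as pop_first_line reports that the file is empty.
    while line=$(pop_first_line); do
        echo "Hello World today is $line"
    done
}
```

Each TrScript.sh instance would then just call process_all. Every line is handled exactly once, although the order of the output across workers is not guaranteed.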
I have always liked the lockfile program from the procmail set of tools (it should be available on most systems, though it might not be installed by default).
It was designed to lock mail spool files, which are (were?) commonly mounted via NFS, so it does work properly over NFS (as much as anything can).
Also, as long as you are making the assumption that all your 'workers' are on the same machine (by assuming you can check for PIDs, which may not work properly when PIDs eventually wrap), you could put your lock file in some other, local directory (e.g. /tmp) while processing files hosted on an NFS server. As long as all the workers use the same lock file location (and a one-to-one mapping of lockfile filenames to locked pathnames), it will work fine.
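A minimal sketch of how lockfile could wrap the read-and-remove step, assuming procmail's lockfile is installed. The lock path under /tmp, the retry settings and the function name pop_with_lockfile are illustrative choices.

```shell
#!/bin/bash
# Sketch: guard the read-and-remove step with procmail's lockfile.
# INFILE and LOCK are illustrative defaults; LOCK lives on a local filesystem.
INFILE="${INFILE:-input_file}"
LOCK="${LOCK:-/tmp/input_file.lock}"

pop_with_lockfile() {
    # Retry every 1 second, at most 30 times, then give up.
    lockfile -1 -r 30 "$LOCK" || return 1
    local status=1
    if [ -s "$INFILE" ]; then
        head -n 1 "$INFILE"             # print the first line
        tail -n +2 "$INFILE" > "$INFILE.tmp" && mv "$INFILE.tmp" "$INFILE"
        status=$?
    fi
    rm -f "$LOCK"                       # release: the lock is just a file
    return "$status"
}
```

Unlike flock, the lock here is an ordinary file, so a worker that gets killed leaves it behind. A trap that removes the lock on exit would cover that case.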
Using the FLOM (Free LOck Manager) tool, your main script can become as simple as:
#!/bin/bash
flom -- ./TrScript.sh &
flom -- ./TrScript.sh &
flom -- ./TrScript.sh &
wait
if you are running the scripts on a single host, or something like:
flom -A 224.0.0.1 -- ./TrScript.sh &
if you want to distribute your scripts across many hosts. Some usage examples are available at this URL: http://sourceforge.net/p/flom/wiki/FLOM%20by%20examples/