I have the following situation:
There is a windows folder that has been mounted on a Linux machine. There could be multiple folders (setup before hand) in this windows mount. I have to do something (preferably a script to start with) to watch these folders.
These are the steps: Watch for any incoming file(s). Make sure they are transferred completely. Move it to another folder. I do not have any control over the file transfer program on the windows machine. It is a secure FTP I believe. So I cannot ask that process to send me a trailer file to ensure the completion of file transfer.
I have written a bash script. I would like to know about any potential pitfalls with this approach. Reason is, there is a possibility of mulitple copies of this script running for multiple directories like this.
At the moment, there could be upto 100 directories that may have to be monitored.
Following is the script. I'm sorry for pasting a very long one here. Please take your time to review it and comment / criticize it. :-)
It takes 3 parameters, the folder that has to be watched, the folder where the file has to be moved, and a time interval, which has been explained below.
I'm sorry there seems to be a problem with the alignment. Markdown doesn't seem to like it. I tried to organize it properly, but not able to do so.
Linux servername 2.6.9-42.ELsmp #1 SMP Wed Jul 12 23:27:17 EDT 2006 i686 i686 i386
GNU/Linux
#!/bin/bash
log_this()
{
message="$1"
now=`date "+%D-%T"`
echo $$": "$now ": " $message
}
usage()
{
cat << EOF
Usage: $0 <Directory to be watched> <Directory to transfer> <time interval>
Time interval is the amount of time after which the modification time of a
file will be monitored.
EOF
`exit 1`
}
if [ $# -lt 2 ]
then
usage
fi
WATCH_DIR=$1
APP_DIR=$2
if [ ! -d "$WATCH_DIR" ]
then
log_this "FATAL: WATCH_DIR, $WATCH_DIR does not exist. Exiting"
exit 1
fi
if [ ! -d "$APP_DIR" ]
then
log_this "APP_DIR: $APP_DIR does not exist. Exiting"
exit 1
fi
# This needs to be set after considering the rate of file transfer.
# Represents the seconds elapsed after the last modification to the file.
# If not supplied as parameter, defaults to 3.
seconds_between_mods=$3
if ! [[ "$seconds_between_mods" =~ ^[0-9]+$ ]]; then
if [ ${#seconds_between_mods} -eq 0 ]; then
log_this "No value supplied for elapse time. Defaulting to 3."
seconds_between_mods=3
else
log_this "Invalid value provided for elapse time"
exit 1
fi
fi
log_this "Start Monitor."
while true
do
ls -1 $WATCH_DIR | while read file_name
do
log_this "Start Monitoring for $file_name"
# Refer only the modification with reference to the mount folder.
# If there is a diff in time between servers, we are in trouble.
token_file=$WATCH_DIR/foo.$$
current_time=`touch $token_file && stat -c "%Y" $token_file`
rm -f $token_file 2>/dev/null
log_this "Current Time: $current_time"
last_mod_time=`stat -c "%Y" $WATCH_DIR/$file_name`
elapsed_time=`expr $current_time - $last_mod_time`
log_this "Elapsed time ==> $elapsed_time"
if [ $elapsed_time -ge $seconds_between_mods ]
then
log_this "Moving $file_name to $APP_DIR"
# In case if there is no space left on the target mount, hide the file
# in the mount itself and remove the incomplete file from APP_DIR.
mv $WATCH_DIR/$file_name $APP_DIR
if [ $? -ne 0 ]
then
log_this "FATAL: mv failed!! Hiding $file_name"
rm $APP_DIR/$file_name
mv $WATCH_DIR/$file_name $WATCH_DIR/.$file_name
log_this "Removed $APP_DIR/$file_name. Look for $WATCH_DIR/.$file_name and submit later."
fi
log_this "End Monitoring for $file_name"
else
log_this "$file_name: Transfer seems to be in progress"
fi
done
log_this "Nothing more to monitor."
echo
sleep 5
done
This isn't going to work for any length of time. In production, you will have network problems and other errors which can leave a partial file in the upload directory. I also don't like the idea of a "trailer" file. The usual approach is to upload the file under a temporary name and then rename it after the upload completes.
This way, you just have to list the directory, filter the temporary names out and and if there is anything left, use it.
If you can't make this change, then ask your boss for a written permission to implement something which can lead to arbitrary data corruption. This is for two purposes: 1) To make them understand that this is a real problem and not something which you make up and 2) to protect yourself when it breaks ... because it will and guess who'll get all the blame?
I believe a much saner approach would be the use of a kernel-level filesystem notify item. Such as inotify. Get also the tools here.
If you love us? You can donate to us via Paypal or buy me a coffee so we can maintain and grow! Thank you!
Donate Us With