Using file locks with rsync

Tags:

rsync

From the rsync manual I see that the --rsync-path option lets you specify which program is run on the remote machine to start up rsync. In particular, that program could be a wrapper script which calls the real rsync command in the middle, but which performs some actions before and/or after the rsync invocation. One interesting use would be to acquire and release a lock (e.g., a flock), so that the operations of rsync at the remote end can be co-ordinated with another process at the far end which is contending for write access to the same files. Multiple rsync processes could hold the shared lock simultaneously (I am aware of the potential for starvation, but am not concerned about that right now). The 'writer' process I'm dealing with would just be changing a few hard links, so it would not block the rsync process for any significant length of time.
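For concreteness, the shared/exclusive scheme described here could be sketched roughly as follows. This assumes the `flock` utility from util-linux is installed; the lock file path and the function names `with_read_lock` and `with_write_lock` are made up for illustration, and the `echo` commands stand in for the real rsync and the real writer.

```shell
#!/bin/sh
# Hypothetical lock file; the rsync wrapper and the writer must agree on it.
LOCK_FILE=${LOCK_FILE:-/tmp/rsync-demo.lock}

# Readers (the rsync wrappers) take a shared lock: many may hold it at once.
with_read_lock() {
  flock -s "$LOCK_FILE" "$@"
}

# The writer takes an exclusive lock: it waits until no reader holds the lock.
with_write_lock() {
  flock -x "$LOCK_FILE" "$@"
}

# Stand-ins: a reader would run the real rsync, the writer would swap hard links.
with_read_lock echo "reader: would run rsync here"
with_write_lock echo "writer: would update hard links here"
```

In both cases `flock` releases the lock as soon as the wrapped command exits, so the writer only ever waits for transfers that are already in progress.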

I have looked at other co-ordination approaches, e.g., implementing a custom remote locking protocol between the client and server, but they all involve more development work and/or are unsatisfactory for other reasons, which is why I am interested in the wrapper/(f)lock approach.

My questions are:

1) Is this a reasonable way to solve the problem of co-ordinating rsync 'readers' with another, 'writer' process accessing the same directory?

2) Can you also put a wrapper around rsync when using the inetd (or xinetd) daemon approach to running rsync, by adding a line something like the following to /etc/inetd.conf (as per the rsyncd.conf man page):

rsync stream tcp nowait root /usr/bin/rsync rsyncd --daemon

but replacing /usr/bin/rsync with the path to an rsync-lookalike wrapper, which in this case would be a C/C++ program that acquires a lock, forks off rsync, waits for rsync to complete, then releases the lock?
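As a point of comparison, the same acquire/run/release sequence could probably be done by a small shell script instead of a compiled program. The sketch below is only an assumption about how such a wrapper might look: `RSYNC_BIN`, `LOCK_FILE` and `run_locked_rsync` are hypothetical names, and it relies on the util-linux `flock` utility.

```shell
#!/bin/sh
# Hypothetical stand-in for the C/C++ wrapper: inetd would be pointed at this
# script instead of /usr/bin/rsync. Both paths below are assumptions.
RSYNC_BIN=${RSYNC_BIN:-/usr/bin/rsync}
LOCK_FILE=${LOCK_FILE:-/var/lock/rsync-readers.lock}

run_locked_rsync() {
  # flock takes the shared lock, runs the command, and releases the lock
  # when the command exits; flock's exit status is that of the command.
  flock -s "$LOCK_FILE" "$RSYNC_BIN" "$@"
}

# Under inetd the arguments from inetd.conf (e.g. --daemon) arrive as "$@".
if [ "$#" -gt 0 ]; then
  run_locked_rsync "$@"
fi
```

With this shape the lock is held for the whole lifetime of each connection that inetd hands over, which matches the "seize, fork, wait, release" sequence described above.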

Thanks, Tom

273

asked Nov 22 '13 08:11

TPJ

2 Answers

One potential catch with the wrapper approach: the remote program is invoked with extra arguments, which rsync appends to whatever command line you specify with --rsync-path. So if the wrapper needs arguments of its own, something in the following style is needed:

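#!/bin/sh

# The first argument is ours (passed via --rsync-path); the rest belong to rsync.
lock_target=$1
shift

# lockfile(1) ships with procmail.
if ! lockfile "${lock_target}.lock" ; then exit 1 ; fi

# Remove the lock on exit; turn signals into a normal exit so the EXIT trap runs.
trap 'rm -f "${lock_target}.lock"' EXIT
trap 'exit 1' HUP TERM INT

/usr/bin/rsync "$@"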
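To see what such a wrapper receives, here is a small dry run. The wrapper path and lock target in the comment are hypothetical, `show_wrapper_args` is an illustrative name, and the `--server --sender ...` words only illustrate the kind of arguments rsync appends when pulling, not its exact output.

```shell
#!/bin/sh
# Client side (hypothetical paths); the lock target is the wrapper's own first argument:
#   rsync -a --rsync-path="/usr/local/bin/rsync-locked /srv/data" host:/srv/data/ /backup/data/

# Dry-run stand-in for the wrapper: peel off our own argument, then show
# what would be handed on to the real rsync.
show_wrapper_args() {
  lock_target=$1
  shift
  echo "lock file: ${lock_target}.lock"
  echo "rsync gets: $*"
}

# Illustrative only: rsync appends its own server-side arguments after ours.
show_wrapper_args /srv/data --server --sender . /srv/data/
```

The important part is the `shift`: without it the lock target would be passed on to rsync as if it were one of rsync's own arguments.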

115

answered Oct 30 '22 00:10

BehemothTheCat

Thanks for the question and the comments. Armed with your ideas, I solved it (for me) using --rsync-path but without any wrapper scripts on the remote host, simply by putting the whole payload script into --rsync-path, with a few tricks.

This particular example uses rsync to pull data from a remote host while holding a flock on that host. For example, the remote host dumps data periodically while also holding the flock, so the dump and the pull are never interleaved.
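The dump side is not shown in this answer, but it might look roughly like the sketch below. Everything in it is an assumption made for illustration: the lock file path, the timeout, and the `dump_data` stand-in; it also assumes the util-linux `flock` utility.

```shell
#!/bin/sh
# Hypothetical lock file and timeout; they must match what the pulling side uses.
LOCK_FILE=${LOCK_FILE:-/tmp/dump-demo.lock}
LOCK_TIMEOUT=${LOCK_TIMEOUT:-600}   # seconds

# Stand-in for the real dump job (an assumption for this demo).
dump_data() {
  echo "dump written"
}

run_dump_locked() {
  : >>"$LOCK_FILE"                  # make sure the lock file exists
  (
    # Wait up to LOCK_TIMEOUT seconds for an exclusive lock on fd 9.
    flock -x -w "$LOCK_TIMEOUT" 9 || {
      echo "dump: could not get lock on $LOCK_FILE" 1>&2
      exit 1
    }
    dump_data
    # The lock is released when the subshell exits and fd 9 is closed.
  ) 9<"$LOCK_FILE"
}

run_dump_locked
```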

Points to note:

1) rsync appends its arguments to the end of whatever command you specify in --rsync-path, so the command needs to cope with that. For this I rely on bash features on both the pulling and the remote host.
2) Any pre- and post-processing on the remote host must not write to STDOUT, because that corrupts the rsync protocol stream and rsync will bail out. Any error output should go to STDERR; it turns up on the pulling host as rsync's STDERR output. This is why '1>&2' appears in all the error handling.
3) This probably relies on the remote command spawned by rsync being run by bash, because I think plain old sh does not support arrays. It works for me between RHEL7 boxes. A possible workaround is proposed at the end.
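The STDOUT point can be illustrated with a small stand-alone sketch. The names `quiet_hook`, `fake_rsync` and `wrapped` are made up for this demo, and `fake_rsync` merely stands in for the real rsync, whose STDOUT carries the protocol stream.

```shell
#!/bin/sh
# Run a pre/post-processing command with its stdout sent to stderr, so that
# nothing but the real rsync ever writes to the protocol stream on stdout.
quiet_hook() {
  "$@" 1>&2
}

# Stand-in for the real rsync (an assumption for this demo): prints one line.
fake_rsync() {
  echo "protocol-data"
}

wrapped() {
  quiet_hook echo "pre-processing done"    # goes to stderr
  fake_rsync                               # the only stdout writer
  status=$?
  quiet_hook echo "post-processing done"   # goes to stderr
  return "$status"
}

# With stderr discarded, only the stand-in's output is left on stdout.
wrapped 2>/dev/null
```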

With that in mind, here is my simplified, concept-only rehash (I have not run this particular script; my full solution has extra layers that distract from the main point).

The script on the pulling host:

#!/bin/bash

function rsync_wrap() {
  {
    flock --exclusive --timeout "${LOCK_TIMEOUT}" 100 || {
      echo "Failed to acquire lock on ${LOCK_FILE} within ${LOCK_TIMEOUT} seconds" 1>&2
      return 1
    }

    # call real rsync with original arguments
    rsync "$@"

    exit_code=$?

    if [ ${exit_code} -eq 0 ]; then
      # Do clean up when success
      # rm -f "${LOCK_FILE}"
      # rm -rf /eg/purge/data
      :  # no-op: bash rejects a branch that contains only comments
    else
      # Do clean up when failed
      :
    fi

    # Note, return is important, do not let it fall out
    return ${exit_code}

  } 100<"${LOCK_FILE}"

  # Reached only if the redirection above failed, so the block never ran
  echo "Failed to open lock file: ${LOCK_FILE}" 1>&2
  return 1
}

# Define vars
LOCK_FILE=/var/somedir/name.lock; # or /dev/shm/name.lock
LOCK_TIMEOUT=600; #in seconds

# Build remote command, define vars and functions inside the command
remote_cmd="
  # this approach deals with crazy chars in variables and function code
  $( declare -p LOCK_FILE )
  $( declare -p LOCK_TIMEOUT )
  $( declare -f rsync_wrap )

  rsync_wrap "

local_cmd=(
  rsync
  -a
  --rsync-path="${remote_cmd}"
  # I want to handle network timeouts in SSH, not in rsync,
  # because rsync does not know that waiting for lock is expected
  -e "ssh -o BatchMode=yes -o ServerAliveCountMax=3 -o ServerAliveInterval=30 ${IDENTITY_FILE:+ -i '${IDENTITY_FILE}'}"
  /remote/source/path
  /local/destination/path/
)

# Do it
"${local_cmd[@]}"

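The `declare -p` / `declare -f` trick used to build the remote command can be tried locally. In this sketch a fresh `bash -c` stands in for the remote side; the variable value and the `greet` function are made up for the demo, with the value deliberately containing quotes, spaces and a dollar sign.

```shell
#!/bin/bash
# Hypothetical value chosen to contain quotes, spaces and a dollar sign.
LOCK_FILE="/tmp/it's a \"lock\" \$file"

greet() {
  echo "lock is: ${LOCK_FILE}"
}

# declare -p / declare -f print bash source that recreates the variable and
# the function, already quoted, so it can be handed to another bash as text.
payload="
  $( declare -p LOCK_FILE )
  $( declare -f greet )
  greet"

# A fresh bash stands in for the remote side here; LOCK_FILE is not exported,
# so the child only knows it from the payload text.
bash -c "${payload}"
```

Note that `declare -f` prints the function as bash parsed it, so comments inside the function body are not carried over to the remote side.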
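If the remote side executes --rsync-path in something other than bash, then maybe the whole remote command could be wrapped in something like the following, applied to remote_cmd before local_cmd is built. Note that the arguments rsync appends then land after the closing quote and become positional parameters of bash -c, so the remote command would have to end in rsync_wrap \"\$@\" rather than a bare rsync_wrap, and a placeholder word (here _) is needed to fill \$0:

remote_cmd="bash -c '${remote_cmd//\'/\'\\\'\'}' _"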
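The quoting involved in that wrapping idea is easy to get wrong, so here is a self-contained sketch of it. It is not the author's code: `sq_quote` is a hypothetical helper that uses sed instead of bash parameter expansion, `sh -c` is used in place of `bash -c` so the demo runs anywhere, and the appended `--server --sender` words merely play the role of the arguments rsync adds.

```shell
#!/bin/sh
# Wrap a string in single quotes for a POSIX shell, turning each embedded
# single quote into '\'' (close quote, escaped quote, reopen quote).
# Caveat: trailing newlines in the input are dropped by $( ... ).
sq_quote() {
  printf "'%s'" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Hypothetical payload; $* picks up whatever follows the placeholder word.
script='echo "it'\''s got: $*"'
wrapped="sh -c $(sq_quote "$script") placeholder"

# The first word after the quoted script becomes $0, the rest become $1, $2, ...
eval "$wrapped --server --sender"
```

The placeholder matters: without it the first appended argument would be swallowed as `$0` and never reach the payload.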

answered Oct 30 '22 00:10

AnyDev
