
Monitoring Rsync Progress

I'm trying to write a Python script which will monitor an rsync transfer, and provide a (rough) estimate of percentage progress. For my first attempt, I looked at an rsync --progress command and saw that it prints messages such as:

1614 100%    1.54MB/s    0:00:00 (xfer#5, to-check=4/10)

I wrote a parser for such messages and used the to-check part to produce a percentage. Here, 4 of 10 files remain to be checked, so the transfer would be 60% complete.
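For concreteness, a parser along those lines could be sketched as follows. This is only a minimal illustration, not the exact script in question: the function name `percent_from_to_check` is hypothetical, and the pattern also accepts the shorter `to-chk=` spelling that newer rsync releases appear to print.

```python
import re

# Matches the "to-check=4/10" counter at the end of an rsync --progress line.
# Assumption: newer rsync versions abbreviate it to "to-chk=", so accept both.
TO_CHECK = re.compile(r"to-ch(?:ec)?k=(\d+)/(\d+)")

def percent_from_to_check(line):
    """Return the percentage of files already checked, or None if no counter is present."""
    match = TO_CHECK.search(line)
    if match is None:
        return None
    remaining, total = int(match.group(1)), int(match.group(2))
    if total == 0:
        return None
    # "remaining" files are still to be checked, so the rest are done.
    return 100.0 * (total - remaining) / total

print(percent_from_to_check("1614 100%    1.54MB/s    0:00:00 (xfer#5, to-check=4/10)"))  # 60.0
```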

However, there are two flaws in this:

  • In large transfers, the "numerator" of the to-check fraction doesn't seem to monotonically decrease, so the percentage completeness can jump backwards.
  • Such a message is not printed for all files, meaning that the progress can jump forwards.

I've looked for other messages I could use instead, but haven't managed to find anything. Does anyone have any ideas?

Thanks in advance!

paulmdavies asked Aug 23 '11 08:08



3 Answers

The current version of rsync (3.1.2 at the time of editing) has an --info=progress2 option, which shows the progress of the entire transfer instead of individual files.

From the man page:

There is also a --info=progress2 option that outputs statistics based on the whole transfer, rather than individual files. Use this flag without outputting a filename (e.g. avoid -v or specify --info=name0) if you want to see how the transfer is doing without scrolling the screen with a lot of names. (You don't need to specify the --progress option in order to use --info=progress2.)

So, if possible on your system, you could upgrade rsync to a current version that includes this option.
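From Python, reading that whole-transfer percentage could look roughly like the sketch below. It assumes an rsync new enough to support --info=progress2 and that each progress update contains a field such as `45%`; the helper names `parse_percent` and `watch_rsync` are made up for illustration. rsync redraws the progress line using carriage returns, which is why the pipe is opened in text mode: Python's universal newline handling then splits the updates into separate lines.

```python
import re
import subprocess

# Assumption: every progress update carries a "<number>%" field.
PERCENT = re.compile(r"(\d+)%")

def parse_percent(update):
    """Return the percentage found in one progress update, or None."""
    match = PERCENT.search(update)
    return int(match.group(1)) if match else None

def watch_rsync(src, dest):
    """Run rsync and yield whole-transfer percentages as they are reported."""
    cmd = ["rsync", "-a", "--info=progress2", src, dest]
    # text=True enables universal newlines, so "\r"-separated updates
    # arrive as individual lines when iterating over stdout.
    with subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True) as proc:
        for update in proc.stdout:
            percent = parse_percent(update)
            if percent is not None:
                yield percent
```

A caller would then loop over `watch_rsync("src/", "dest/")` and display each value. Depending on the rsync version, output written to a pipe may arrive in bursts rather than continuously.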

cnelson answered Oct 01 '22 14:10


You can disable incremental recursion with the --no-inc-recursive argument. rsync will then pre-scan the entire directory structure, so it knows the total number of files it has to check.

This is actually how rsync used to recurse. Incremental recursion, the current default, was added for speed.

Izkata answered Oct 01 '22 16:10


Note the caveat that even --info=progress2 is not entirely reliable, because the percentage is based on the number of files rsync knows about at the moment the progress is displayed. This is not necessarily the total number of files that need to be synced (for instance, rsync may later discover many large files in a deeply nested directory).

One way to ensure that the --info=progress2 percentage doesn't jump backwards is to force rsync to scan all the directories recursively before starting the sync (instead of its default behavior of doing an incrementally recursive scan), by also providing the --no-inc-recursive option. Note, however, that this option will also increase rsync's memory usage and run time.
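As a rough sketch of how a monitoring script might combine the two options, the helper below builds the rsync argument list with the full scan switched on or off. The second function is an addition of mine rather than anything rsync provides: if the full scan turns out to be too expensive, the script can simply never display a lower value than one it has already shown. Both function names are hypothetical.

```python
def build_rsync_command(src, dest, full_scan=True):
    """Build an rsync argument list that reports whole-transfer progress."""
    cmd = ["rsync", "-a", "--info=progress2"]
    if full_scan:
        # Scan the whole tree up front so the file total is known from the start.
        cmd.append("--no-inc-recursive")
    cmd += [src, dest]
    return cmd

def monotonic(percentages):
    """Yield a running maximum, so displayed progress never moves backwards."""
    highest = 0
    for value in percentages:
        highest = max(highest, value)
        yield highest

print(build_rsync_command("src/", "dest/"))
# ['rsync', '-a', '--info=progress2', '--no-inc-recursive', 'src/', 'dest/']
print(list(monotonic([10, 40, 25, 60])))
# [10, 40, 40, 60]
```

The clamp only hides the backwards jumps; the displayed figure can still stall while rsync discovers more files.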

lonetwin answered Oct 01 '22 14:10