
How can I use rsync to back up files changed within a recent period?

Is it possible to specify a time range so that rsync only operates on recently changed files?

I'm writing a script to back up recently added files over SSH, and rsync seems like an efficient solution. My problem is that my source directories contain a huge backlog of older files which I have no interest in backing up.

The only solution I've come across so far is doing a find with ctime to generate a --files-from file. This works, but I have to deal with some old installations with versions of rsync that don't support --files-from. I'm considering generating --include-from patterns in the same way but would love to find something more elegant.
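For reference, the two approaches mentioned above could be sketched roughly as follows. This is only an illustration under stated assumptions: SRC, DEST_A, DEST_B, the demo file, and the one-day window are placeholders rather than details from my setup, and local temporary directories stand in for the remote host.

```shell
#!/bin/sh
# Sketch only: SRC, DEST_A, DEST_B and the one-day window are placeholders.
set -eu

SRC=$(mktemp -d)      # stands in for the real source directory
DEST_A=$(mktemp -d)   # stands in for the backup target (could be user@host:/path)
DEST_B=$(mktemp -d)
LIST=$(mktemp)
PATTERNS=$(mktemp)

# Demo data so the script runs on its own.
mkdir -p "$SRC/logs"
echo "fresh entry" > "$SRC/logs/today.log"

# 1. Paths relative to SRC, one per line, as --files-from expects.
(cd "$SRC" && find . -type f -ctime -1 -print) | sed 's|^\./||' > "$LIST"

# 2. The same list as anchored include patterns: "logs/x" becomes "/logs/x".
#    Names containing *, ? or [ would be read as wildcards here.
sed 's|^|/|' "$LIST" > "$PATTERNS"

if command -v rsync >/dev/null 2>&1; then
    # rsync with --files-from support: copy exactly the listed files.
    rsync -a --files-from="$LIST" "$SRC/" "$DEST_A/"

    # Without --files-from: walk every directory, keep listed files, drop the rest.
    rsync -a --include='*/' --include-from="$PATTERNS" --exclude='*' "$SRC/" "$DEST_B/"
fi
```

The include-pattern variant recreates the whole directory tree on the destination, including directories that end up empty, which is part of why it feels less elegant.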

asked Jun 03 '09 15:06 by Ken



2 Answers

It looks like you can specify shell commands in the arguments to rsync (see "Remote rsync executes arbitrary shell commands"), so I have been able to limit the files that rsync looks at by using:

rsync -av remote_host:'$(find logs -type f -ctime -1)' local_dir

The remote shell runs the find, which lists any files changed in the last day (-ctime -1), and rsync then copies those files into local_dir.

I'm not sure if this feature is by design but I'm still digging into the documentation.
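To make the mechanism a little more concrete, here is a small local illustration. It is a sketch, not rsync itself: `sh -c` plays the part of the remote login shell, and the logs directory and file names are made up for the demo.

```shell
#!/bin/sh
# Local illustration: "sh -c" plays the part of the remote login shell.
set -eu

cd "$(mktemp -d)"
mkdir logs
touch logs/a.log logs/b.log

# Single quotes keep the local shell from touching the string,
# just as in the rsync command above.
remote_arg='$(find logs -type f -ctime -1)'

# The "remote" shell runs find and substitutes its output, split on whitespace.
expanded=$(sh -c "printf '%s\n' $remote_arg" | sort)
echo "$expanded"
```

Because the substituted output is split on whitespace, file names containing spaces would be broken into separate arguments. Also, as far as I can tell, rsync 3.2.4 and later escape such arguments by default, so on newer versions the trick likely needs the --old-args option to behave as shown.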

answered Nov 15 '22 17:11 by Ken


Why not just take the hit of backing up the whole directory once, and then take advantage of the incremental backups provided by rsync, rdiff, and their cousins? You won't waste disk space on the backup destination, because the old files will stay perpetually unchanged and never need to be copied again.

Backing up the whole thing is simpler and carries substantially less risk of error. Trying to selectively back up some files and not others is a recipe for missing something you need without realizing it, then getting burned when you can't restore a critical file.

Otherwise you should reorganize your source directory so there is less 'decision making' in your backup script.
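A minimal sketch of the "copy everything once, then let rsync skip what has not changed" idea, assuming local temporary directories in place of the real source and backup target (SRC, DEST, and the demo file are placeholders):

```shell
#!/bin/sh
# Sketch only: SRC and DEST are placeholders for the real source and backup target.
set -eu

SRC=$(mktemp -d)
DEST=$(mktemp -d)

# Demo data standing in for the backlog of old files.
mkdir -p "$SRC/old"
echo "backlog" > "$SRC/old/archive.log"

first_run=""
second_run=""
if command -v rsync >/dev/null 2>&1; then
    # First run: everything is copied once. -i itemizes each change.
    first_run=$(rsync -ai "$SRC/" "$DEST/")
    # Second run: sizes and modification times match, so nothing is transferred.
    second_run=$(rsync -ai "$SRC/" "$DEST/")
fi

echo "first run:";  echo "$first_run"
echo "second run:"; echo "$second_run"
```

The first run lists every file it copies, while the second run should list nothing, which is the point: the backlog costs time and bandwidth only once.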

answered Nov 15 '22 18:11 by whatsisname