I have some VM images that need to be synced every day. The VM files are sparse.
To save network traffic, I only want to transfer the real data in the images. I tried the --sparse option of rsync, but the network traffic shows that the whole file size gets transferred, not just the data actually in use.
If I use rsync -zv --sparse, then only the real size gets transmitted over the network and everything is OK. But I don't want to compress the file because of the CPU usage.
Shouldn't the --sparse option transfer only the real data and create the "null data" locally to save network traffic?
Is there a workaround without compression?
Thanks!
Take a look at this discussion, specifically this answer.
It seems that the solution is to run rsync --sparse followed by rsync --inplace.
On the first call (--sparse), also use --ignore-existing to prevent already transferred sparse files from being overwritten, and -z to save network traffic.
The second call (--inplace) should update only the modified chunks. Here, compression is optional.
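As a rough sketch, the two calls could be scripted like this. The source and destination paths are hypothetical placeholders, and the helper names are mine, not part of rsync; only the flags named above are used.

```python
import subprocess


def build_two_pass_commands(src, dest, compress_second=False):
    """Return the argv lists for the two rsync passes described above.

    src and dest are hypothetical placeholders, e.g. a local image file
    and a remote target such as "backup:/vm/".
    """
    # Pass 1: create files that are missing on the destination as sparse
    # files. --ignore-existing skips files already present there, and -z
    # compresses the stream so long runs of zeros cost little bandwidth.
    first = ["rsync", "--sparse", "--ignore-existing", "-z", src, dest]

    # Pass 2: update files that already exist by writing into them directly.
    second = ["rsync", "--inplace"]
    if compress_second:
        second.append("-z")  # compression is optional on this pass
    second += [src, dest]
    return [first, second]


def sync(src, dest, compress_second=False):
    """Run both passes in order, stopping if either one fails."""
    for argv in build_two_pass_commands(src, dest, compress_second):
        subprocess.run(argv, check=True)
```

Calling sync("disk.img", "backup:/vm/") would run the --sparse pass first and the --inplace pass second. Keeping the command construction separate from the execution makes it easy to print and inspect the commands before running them.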
Also see this post.
Update
I believe the suggestions above won't solve your problem. I also believe that rsync is not the right tool for the task. You should look for other tools that give you a good balance between network and disk I/O efficiency.
Rsync was designed for efficient usage of a single resource, the network. It assumes that reading from and writing to the network is much more expensive than reading and writing the source and destination files.
"We assume that the two machines are connected by a low-bandwidth high-latency bi-directional communications link." (The rsync algorithm, abstract)
The algorithm, summarized in four steps.
Notice that rsync normally reconstructs the destination file B as a temporary file T, then replaces B with T. In this case it must write the whole file.
The --inplace option does not relieve rsync from writing blocks matched by α (the sender, which holds the source file A), as one might imagine. They can match at different offsets. Scanning B a second time to compute new checksums would be prohibitive in terms of performance. A block that matches at the same offset it was read from in step one could be skipped, but rsync does not do that. In the case of a sparse file, a null block of B would match every null block of A, and would have to be rewritten.
The --inplace option just causes rsync to write directly to B, instead of T. It will rewrite the whole file.
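To make the block matching concrete, here is a toy sketch of how blocks of A get matched against blocks of B. It is not rsync's actual implementation: it uses a tiny block size, a single MD5 hash per block, and only checks A at block-aligned offsets, whereas rsync uses a rolling weak checksum to test every byte offset. The function names are my own.

```python
import hashlib

BLOCK = 4  # toy block size in bytes; real rsync uses much larger blocks


def block_signatures(data, block=BLOCK):
    """Map the hash of each block of B to the first offset it occurs at."""
    sigs = {}
    for off in range(0, len(data), block):
        digest = hashlib.md5(data[off:off + block]).digest()
        sigs.setdefault(digest, off)
    return sigs


def delta(a, b, block=BLOCK):
    """Describe A as a list of ("copy", offset_in_B) or ("literal", bytes) ops."""
    sigs = block_signatures(b, block)
    ops = []
    for off in range(0, len(a), block):
        chunk = a[off:off + block]
        digest = hashlib.md5(chunk).digest()
        if digest in sigs:
            ops.append(("copy", sigs[digest]))  # block already exists in B
        else:
            ops.append(("literal", chunk))      # block must be sent over the wire
    return ops


# A: two null blocks, one data block, one null block
a = b"\0" * 8 + b"DATA" + b"\0" * 4
# B: one null block, one stale data block, two null blocks
b = b"\0" * 4 + b"OLD!" + b"\0" * 8

print(delta(a, b))
# [('copy', 0), ('copy', 0), ('literal', b'DATA'), ('copy', 0)]
```

Only the 4 bytes of DATA travel as literal data, which is the network saving the algorithm is built for. Every null block of A is matched to the null block at offset 0 of B, regardless of its own offset, so the match list alone does not tell the receiver which blocks are already in the right place on disk.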