
How to convert jar to rsyncable jar?

I have a fat/uber JAR generated by the Gradle Shadow plugin. I often need to send the fat JAR over the network, so it would be convenient to send only the delta of the file instead of about 40 MB of data. rsync is a great tool for this purpose. However, a small change in my source code leads to a large change in the final fat JAR, and consequently rsync does not help as much as it could.

Can I convert the fat JAR to an rsync-friendly JAR?

My ideas of a solution/workarounds:

  • Leave the heavy lifting to rsync and somehow tell it that it is working with a compressed file (I didn't find any way to do this).
  • Convert the non-rsyncable jar to an rsyncable jar
  • Tell Gradle Shadow to generate rsyncable jar (not possible at the moment)

Possibly related questions:

  • Creating '--rsyncable' maven assembly
  • https://superuser.com/questions/482758/rsync-friendly-gzip
Martin Vseticka asked Apr 04 '16 06:04



1 Answer

There are two ways to do this, both of which involve turning compression off: first in Gradle, then with the jar command-line tool.

You can do this using Gradle (this part of the answer actually came from the OP):

shadowJar {
    zip64 true
    entryCompression = org.gradle.api.tasks.bundling.ZipEntryCompression.STORED
    exclude 'META-INF/*.RSA', 'META-INF/*.SF','META-INF/*.DSA'
    manifest {
        attributes 'Main-Class': 'com.my.project.Main'
    }
}

with

jar {
    manifest {
        attributes(
                'Main-Class': 'com.my.project.Main',
        )
    }
}

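task fatJar(type: Jar) {
    manifest.from jar.manifest
    classifier = 'all'
    // Store entries uncompressed so that rsync can match unchanged blocks
    entryCompression = org.gradle.api.tasks.bundling.ZipEntryCompression.STORED
    from {
        // Unpack every runtime dependency into the fat jar
        configurations.runtime.collect { it.isDirectory() ? it : zipTree(it) }
    } {
        // Drop signature files, which no longer match once jars are merged
        exclude "META-INF/*.SF"
        exclude "META-INF/*.DSA"
        exclude "META-INF/*.RSA"
    }
    with jar
}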

The key thing here is that compression has been turned off, i.e.

org.gradle.api.tasks.bundling.ZipEntryCompression.STORED

You can find the docs here:

https://docs.gradle.org/current/javadoc/org/gradle/api/tasks/bundling/ZipEntryCompression.html#STORED
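If the jar has already been built and the build script cannot be changed, the same effect can probably be had by repacking it. The following is a minimal Python sketch, not something from the original answer: the function name repack_stored and the file names are hypothetical, and it assumes the jar is an ordinary zip archive whose signature files have already been dealt with. It copies every entry in its original order (so META-INF/MANIFEST.MF stays where it was) and writes it with the STORED method.

```python
import zipfile


def repack_stored(src_path, dst_path):
    """Copy every entry of src_path into dst_path without compression.

    Hypothetical helper for illustration; entry order, names, timestamps
    and file attributes are preserved, only the compression method changes.
    """
    with zipfile.ZipFile(src_path) as src, zipfile.ZipFile(
        dst_path, "w", compression=zipfile.ZIP_STORED, allowZip64=True
    ) as dst:
        for info in src.infolist():
            data = src.read(info)  # decompressed bytes of this entry
            stored = zipfile.ZipInfo(info.filename, date_time=info.date_time)
            stored.compress_type = zipfile.ZIP_STORED
            stored.external_attr = info.external_attr
            dst.writestr(stored, data)


# Usage (file names are placeholders):
# repack_stored("app-all.jar", "app-all-stored.jar")
```

The repacked jar will be larger on disk, which is the price paid for giving rsync unchanged byte ranges to match.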

Yes, you can speed the transfer up by about 40% for a new archive and by more than 200% for a jar you have already rsync'd. The trick is not to compress the jar, so that you can take advantage of rsync's chunking algorithm.

I used the following commands to package a directory containing a lot of class files:

jar cf0 uncompressed.jar .
jar cf  compressed.jar   .

This created the following two jars:

-rw-r--r--  1 rsync jar    28331212 Apr 13 14:11 ./compressed.jar
-rw-r--r--  1 rsync jar    38746054 Apr 13 14:10 ./uncompressed.jar

Note that the uncompressed jar is about 10 MB larger.

I then rsync'd these files and timed the transfers using the following commands. (Note that the uncompressed jar is sent with -z, so rsync compresses it in transit. Turning on -z for the already compressed jar had little effect, as I explain later.)

Compressed Jar

time rsync -av -e ssh compressed.jar [email protected]:/tmp/

building file list ... done
compressed.jar

sent 28334806 bytes  received 42 bytes  2982615.58 bytes/sec
total size is 28331212  speedup is 1.00

real  0m9.208s
user  0m0.248s
sys 0m0.483s

Uncompressed Jar

time rsync -avz -e ssh uncompressed.jar [email protected]:/tmp/

building file list ... done
uncompressed.jar

sent 11751973 bytes  received 42 bytes  2136730.00 bytes/sec
total size is 38746054  speedup is 3.30

real  0m5.145s
user  0m1.444s
sys 0m0.219s

The transfer time dropped from 9.2 s to 5.1 s, a saving of about 44%. So the initial rsync already gets a good boost, but what about subsequent rsyncs after a small change has been made?

I removed one 170-byte class file from the directory and recreated the jars. They are now the following sizes:

-rw-r--r--  1 rsync jar  28330943 Apr 13 14:30 compressed.jar
-rw-r--r--  1 rsync jar  38745784 Apr 13 14:30 uncompressed.jar

Now the timings are very different.

Compressed Jar

building file list ... done
compressed.jar

sent 12166657 bytes  received 31998 bytes  2217937.27 bytes/sec
total size is 28330943  speedup is 2.32

real  0m5.435s
user  0m0.378s
sys 0m0.335s

Uncompressed Jar

building file list ... done
uncompressed.jar

sent 220163 bytes  received 43624 bytes  175858.00 bytes/sec
total size is 38745784  speedup is 146.88

real  0m1.533s
user  0m0.363s
sys 0m0.047s
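The speedup figures in the rsync output above appear to be the total file size divided by the bytes sent plus the bytes received. The short Python sketch below checks that assumption against the four runs; the function name rsync_speedup is made up for illustration, and the numbers are copied from the transcripts above.

```python
def rsync_speedup(total_size, sent, received):
    """Total file size divided by the bytes that actually crossed the wire."""
    return total_size / (sent + received)


# (total size, bytes sent, bytes received), copied from the rsync output above
runs = {
    "compressed, first sync": (28331212, 28334806, 42),
    "uncompressed, first sync": (38746054, 11751973, 42),
    "compressed, after change": (28330943, 12166657, 31998),
    "uncompressed, after change": (38745784, 220163, 43624),
}

for label, (total, sent, received) in runs.items():
    print(f"{label}: speedup {rsync_speedup(total, sent, received):.2f}")
```

After the small change, the uncompressed jar needs roughly 264 KB on the wire while the compressed jar needs about 12 MB, even though the compressed jar is the smaller file.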

So we can speed up rsyncing large jar files a lot using this method. The reason is related to information theory. Compressing data in effect removes everything that is redundant, so what is left looks very much like random data; the better the compressor, the more redundancy it removes. With most compression algorithms, a small change to any of the input has a dramatic effect on the compressed output.

The Zip algorithm effectively makes it harder for rsync to find blocks whose checksums match between the server and the client, which means it has to transfer more data. By leaving the jar uncompressed you let rsync do what it is good at: sending less data to sync the two files.
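The effect can be illustrated with a toy experiment. The Python sketch below is a simplified stand-in for rsync's block matching, not its real algorithm: it uses a fixed, hypothetical block size and compares whole blocks instead of rolling checksums. It also compresses the payload as a single zlib stream, whereas a jar compresses each entry separately, which is probably why the compressed jar above still reached a speedup of 2.32. The payload, vocabulary and function names are invented for the demonstration.

```python
import random
import zlib

BLOCK = 512  # hypothetical block size; real rsync chooses one per file


def matched_fraction(old, new, block=BLOCK):
    """Fraction of `new` that can be rebuilt from block-aligned chunks of `old`."""
    known = {old[i:i + block] for i in range(0, len(old) - block + 1, block)}
    matched = 0
    i = 0
    while i + block <= len(new):
        if new[i:i + block] in known:
            matched += block  # whole block already exists on the other side
            i += block
        else:
            i += 1  # slide by one byte, as a rolling checksum would
    return matched / len(new)


def make_payload(seed=0, words=40000):
    """Deterministic, compressible pseudo-text standing in for class files."""
    rng = random.Random(seed)
    vocab = [b"class", b"public", b"static", b"void", b"return", b"import",
             b"final", b"String", b"Object", b"private", b"int", b"new"]
    return b" ".join(rng.choice(vocab) for _ in range(words))


old_raw = make_payload()
new_raw = old_raw[:1000] + old_raw[1010:]  # drop 10 bytes near the start

raw_ratio = matched_fraction(old_raw, new_raw)
packed_ratio = matched_fraction(zlib.compress(old_raw), zlib.compress(new_raw))

print(f"uncompressed: {raw_ratio:.1%} of the new file matches old blocks")
print(f"compressed:   {packed_ratio:.1%} of the new file matches old blocks")
```

In the uncompressed case almost every block after the deleted bytes is found again at a shifted offset, while the compressed stream usually shares far fewer blocks with its predecessor.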

Harry answered Oct 02 '22 05:10
