Here is my benchmark program:
extern crate zip;
use std::fs::File;
use std::io::copy;
use zip::write::FileOptions;
use zip::ZipWriter;
fn main() {
    let mut src = File::open("/tmp/src.mxf").unwrap(); // 624 MB file.
    let dest = File::create("/tmp/test.zip").unwrap();
    let mut zip_writer = ZipWriter::new(dest);
    zip_writer
        .start_file("src.mxf", FileOptions::default())
        .unwrap();
    copy(&mut src, &mut zip_writer).unwrap();
    zip_writer.finish().unwrap();
}
With the program compiled in release mode:
time ./zip_bench
./zip_bench  62,68s user 146,21s system 99% cpu 3:28,91 total
The same file compressed using the system zip binary:
time zip /tmp/test2.zip /tmp/src.mxf
zip /tmp/test2.zip /tmp/src.mxf  13,77s user 0,19s system 99% cpu 13,965 total
The Rust program takes around 15x as long as the system zip (the output files are similar, with an insignificant size difference).
Am I doing something wrong in the code that could explain the Rust program's poor performance? How can I improve it to approach the system zip's performance?
I don't have your test data, so I'm operating on a 3.7 GB Debian DVD ISO. I'm also assuming whatever you're calling the "system zip" is roughly the same as the Arch zip package.
Starting with your original code, updates to the zip crate, such as its move from deflate to flate2, already help:
time ./zipbench 
real    2m29.285s
user    2m23.396s
sys     0m4.066s
time zip test2.zip  debian-10.4.0-amd64-DVD-1.iso 
  adding: debian-10.4.0-amd64-DVD-1.iso (deflated 1%)
real    1m42.709s
user    1m38.066s
sys     0m3.386s
The zip utility is now only about 1.5 times as fast, and we haven't changed our code yet; we've just updated our crates and Rust itself by about a year.
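As a quick sanity check on the ratios quoted in this thread, the wall-clock times reported by `time` can be converted to seconds and divided. This is only a sketch of the arithmetic; the helper names `to_seconds` and `slowdown` are made up for illustration, and the two measurements come from different machines and different input files, so they are not directly comparable to each other.

```rust
/// Convert a wall-clock time given as minutes and seconds into seconds.
fn to_seconds(minutes: u32, seconds: f64) -> f64 {
    f64::from(minutes) * 60.0 + seconds
}

/// How many times longer `slow` took than `fast`.
fn slowdown(slow: f64, fast: f64) -> f64 {
    slow / fast
}

fn main() {
    // Question: 3:28,91 total for the Rust program vs 13,965 s for zip.
    let question = slowdown(to_seconds(3, 28.91), to_seconds(0, 13.965));
    // Answer: 2m29.285s real for the Rust program vs 1m42.709s for zip.
    let answer = slowdown(to_seconds(2, 29.285), to_seconds(1, 42.709));
    println!("question: {:.2}x, answer: {:.2}x", question, answer);
}
```

With the numbers above, the gap works out to roughly 15x in the question and roughly 1.5x after updating the crates.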
We can add buffered IO to our Rust with BufReader and BufWriter:
use std::fs::File;
use std::io::{self, BufReader, BufWriter};
use zip::write::FileOptions;
use zip::ZipWriter;

fn main() -> io::Result<()> {
    let mut src = BufReader::new(File::open("./debian-10.4.0-amd64-DVD-1.iso")?);
    let dest = BufWriter::new(File::create("./test.zip")?);
    let mut zip_writer = ZipWriter::new(dest);
    zip_writer.start_file("src.mxf", FileOptions::default())?;
    // Buffer writes into the ZIP entry. This is only workable because we're
    // only writing one file to our ZIP.
    let mut zip_writer = BufWriter::new(zip_writer);
    io::copy(&mut src, &mut zip_writer)?;
    Ok(())
}
This earns us a small performance bump, but not a huge amount:
time ./zipbench
real    2m25.348s
user    2m20.105s
sys     0m3.894s
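If you want to experiment with the buffering on its own, the same pattern can be pulled out of the zip code into a small std-only helper. This is a minimal sketch, not something I benchmarked: `copy_buffered` is a hypothetical name, the 64 KiB capacity is an arbitrary assumption, and the in-memory `Cursor` and `Vec` merely stand in for the source file and the ZIP writer.

```rust
use std::io::{self, BufReader, BufWriter, Cursor, Read, Write};

/// Copy everything from `src` to `dest` through buffers of `capacity` bytes.
/// Returns the number of bytes copied.
fn copy_buffered<R: Read, W: Write>(src: R, dest: W, capacity: usize) -> io::Result<u64> {
    let mut reader = BufReader::with_capacity(capacity, src);
    let mut writer = BufWriter::with_capacity(capacity, dest);
    let copied = io::copy(&mut reader, &mut writer)?;
    // Flush explicitly so a write error is reported here instead of being
    // silently dropped when the BufWriter goes out of scope.
    writer.flush()?;
    Ok(copied)
}

fn main() -> io::Result<()> {
    // In-memory stand-ins for the source file and the ZIP writer.
    let input = vec![7u8; 1_000_000];
    let mut output = Vec::new();
    let copied = copy_buffered(Cursor::new(&input), &mut output, 64 * 1024)?;
    assert_eq!(copied, input.len() as u64);
    assert_eq!(output, input);
    Ok(())
}
```

Given how small the gain from buffering was above, I would not expect the buffer size to matter much here; most of the time is presumably spent in compression rather than in IO.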
You might get a bit more speed by using flate2 directly, especially if you can use Cloudflare's zlib fork. However, I haven't tested this.