In the PHP manual there is a comment on gzdeflate saying:
gzcompress produces longer data because it embeds information about the encoding onto the string. If you are compressing data that will only ever be handled on one machine, then you don't need to worry about which of these functions you use. However, if you are passing data compressed with these functions to a different machine you should use gzcompress.
and then
running 50000 repetitions on various content, i found that gzdeflate() and gzcompress() both performed equally fast regardless content and compression level, but gzinflate() was always about twice as fast as gzuncompress().
For my purpose I am archiving data on a machine for future use. The data is read often, but written only once. In theory it will one day be moved onto another machine, if I change servers at some point, but that is a few years down the road.
Is it safe for me to use gzdeflate and gzinflate as opposed to gzcompress and gzuncompress?
My thinking is as follows: gzinflate is faster, and this will help the server a lot since there will be lots of read requests. If at some point in the future I can't read a file, I should be able to figure out how to decompress it and recompress it, right? It's not as if gzinflate will magically stop working one day, as the first comment appears to be saying. Even without a 6-byte header, I'm sure the data will be expandable somehow.
Thoughts?
UPDATE -- Benchmark
10,000 iterations each:
gzdeflate took 19.158888816833 seconds and size 18521
gzinflate took 1.4803981781006 seconds
gzcompress took 19.376484870911 seconds and size 18527
gzuncompress took 1.6339199542999 seconds
gzencode took 20.015944004059 seconds and size 18539
gzdecode took 1.8822891712189 seconds
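Figures like these can be gathered with a simple timing loop. The following is only a minimal sketch, not the script that produced the numbers above: the helper name `timeIt`, the sample payload, and the reduced iteration count are all assumptions made for illustration.

```php
<?php
// Hypothetical helper: runs $fn $iterations times and returns elapsed seconds.
function timeIt(callable $fn, int $iterations): float
{
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        $fn();
    }
    return microtime(true) - $start;
}

$iterations = 200; // assumed; the figures above used 10,000
$data = str_repeat("Sample archive record, one line of text.\n", 2000); // assumed payload

// Each compressor is paired with its matching decompressor.
$pairs = [
    'gzdeflate'  => 'gzinflate',
    'gzcompress' => 'gzuncompress',
    'gzencode'   => 'gzdecode',
];

foreach ($pairs as $compress => $decompress) {
    $packed = $compress($data);

    $seconds = timeIt(function () use ($compress, $data) {
        $compress($data);
    }, $iterations);
    printf("%s took %.4f seconds and size %d\n", $compress, $seconds, strlen($packed));

    $seconds = timeIt(function () use ($decompress, $packed) {
        $decompress($packed);
    }, $iterations);
    printf("%s took %.4f seconds\n", $decompress, $seconds);
}
```

Absolute timings depend heavily on the payload and the machine, so only the relative differences between the three pairs are worth comparing.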
The comment is nonsense. You can use any of gzcompress, gzdeflate, or gzencode to produce compressed data that can be portably decompressed anywhere. Those functions only differ in the wrapper around the deflate data (RFC 1951). gzcompress has a zlib wrapper (RFC 1950), gzdeflate has no wrapper, and gzencode has a gzip wrapper (RFC 1952).
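The wrapper difference can be checked directly. The sketch below uses an arbitrary sample payload (an assumption) and relies on the default zlib header being 2 bytes with a 4-byte Adler-32 trailer, and the default gzip header being 10 bytes with an 8-byte trailer. Those overheads account for the 6-byte and 18-byte size gaps in the benchmark figures in the question.

```php
<?php
// Arbitrary sample payload, chosen only for illustration.
$data = str_repeat("The quick brown fox jumps over the lazy dog.\n", 200);

$raw  = gzdeflate($data);  // raw deflate stream (RFC 1951), no wrapper
$zlib = gzcompress($data); // deflate inside a zlib wrapper (RFC 1950)
$gzip = gzencode($data);   // deflate inside a gzip wrapper (RFC 1952)

printf("raw: %d bytes\n", strlen($raw));
printf("zlib: %d bytes (overhead %d)\n", strlen($zlib), strlen($zlib) - strlen($raw));
printf("gzip: %d bytes (overhead %d)\n", strlen($gzip), strlen($gzip) - strlen($raw));

// Strip the zlib wrapper: 2-byte header in front, 4-byte Adler-32 at the end.
$fromZlib = substr($zlib, 2, -4);

// Strip the gzip wrapper: 10-byte header in front, 8-byte trailer
// (CRC-32 plus uncompressed length) at the end.
$fromGzip = substr($gzip, 10, -8);

// What remains is a raw deflate stream that gzinflate can read.
var_dump(gzinflate($fromZlib) === $data);
var_dump(gzinflate($fromGzip) === $data);

// The gzip format starts with the magic bytes 1f 8b.
echo bin2hex(substr($gzip, 0, 2)), "\n";
```

This also shows why data written with gzdeflate is not stranded: any tool that can inflate raw deflate data can read it.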
I would recommend not using gzdeflate, since no wrapper means no integrity check. gzdeflate should only be used when some other wrapper is being generated outside of that, e.g. for zip files, which also use the deflate format. The comment about speed is almost certainly false. The integrity check of gzuncompress() takes very little time compared to the decompression. You should do your own tests.
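As a rough illustration of what the integrity check buys you, the sketch below damages only the trailing checksum of a zlib stream. The payload is an assumption, and the `@` operator is used just to silence the warning PHP emits on a failed decompression.

```php
<?php
// Arbitrary sample payload, chosen only for illustration.
$data = str_repeat("archived record\n", 500);
$zlib = gzcompress($data);

// Flip one bit in the last byte, which belongs to the Adler-32 checksum.
$damaged = $zlib;
$last = strlen($damaged) - 1;
$damaged[$last] = chr(ord($damaged[$last]) ^ 0x01);

// gzuncompress verifies the checksum and returns false on a mismatch.
$checked = @gzuncompress($damaged);
var_dump($checked === false);

// The deflate data between the 2-byte header and the 4-byte checksum is
// untouched, so inflating it raw still succeeds. Raw deflate has no
// checksum of its own that could have caught a problem.
$unchecked = gzinflate(substr($damaged, 2, -4));
var_dump($unchecked === $data);
```

If you do use gzdeflate for archival data, storing your own checksum next to the compressed data (for example from crc32() or hash()) would restore that safety net.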
From this one example I might be overgeneralizing, but I would say that you should completely ignore the comments in the PHP documentation. They are, to be generous, uninformed.
By the way, these functions are named in a horribly confusing way. Only gzencode should have "gz" in the name, since that is the only one of those that actually deals in the .gz format. gzcompress sounds like it compresses to the gzip format, but in fact it compresses to the zlib format.