I have a big gzip file and I would like to read only parts of it using seek.
About the use of seek on gzip files, this page says:
The seek() position is relative to the uncompressed data, so the caller does not even need to know that the data file is compressed.
Does this imply that seek has to read and decompress the data from the beginning of the file to the target position?
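Yes. A backward seek rewinds to the start of the file and decompresses everything up to the target; a forward seek decompresses everything between the current position and the target. This is the code: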
    elif self.mode == READ:
        if offset < self.offset:
            # for negative seek, rewind and do positive seek
            self.rewind()
        count = offset - self.offset
        for i in range(count // 1024):
            self.read(1024)
        self.read(count % 1024)
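The read-and-discard strategy in that snippet can be tried out on its own. The following is only a rough sketch, not the library's code: `skip_forward` is a hypothetical helper name, and the in-memory payload is made up for the demonstration.

```python
import gzip
import io


def skip_forward(stream, count, chunk_size=1024):
    """Advance `stream` by `count` bytes by reading and discarding them.

    Hypothetical helper that mimics the loop in the snippet above.
    Returns the number of bytes actually skipped.
    """
    remaining = count
    while remaining > 0:
        data = stream.read(min(chunk_size, remaining))
        if not data:  # end of stream reached before `count` bytes
            break
        remaining -= len(data)
    return count - remaining


# Build a small gzip stream in memory (made-up test data).
payload = bytes(range(256)) * 40  # 10240 bytes of uncompressed data
compressed = gzip.compress(payload)

with gzip.GzipFile(fileobj=io.BytesIO(compressed)) as f:
    skipped = skip_forward(f, 5000)  # decompresses and throws away 5000 bytes
    next_bytes = f.read(4)           # data starting at uncompressed offset 5000
```

Every skipped byte still has to be decompressed, so the cost of the skip grows with the distance covered.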
Alternatives are discussed here. The problem is inherent to the gzip format.
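From the caller's side, this looks like ordinary seeking with offsets in the uncompressed data. Here is a small sketch using the standard `gzip` module; the payload and the offsets are invented for illustration.

```python
import gzip
import io

# Made-up data: 2000 lines of exactly 11 bytes each ("line 00000\n").
payload = b"".join(b"line %05d\n" % i for i in range(2000))
compressed = gzip.compress(payload)

with gzip.GzipFile(fileobj=io.BytesIO(compressed)) as f:
    # Offsets refer to the uncompressed data, not to bytes of the .gz stream.
    f.seek(11 * 1500)
    forward = f.readline()
    position = f.tell()  # also reported in uncompressed bytes

    # A backward seek works too, but the stream is decompressed again
    # from the start up to the target.
    f.seek(11 * 3)
    backward = f.readline()
```

The calls are as simple as for an uncompressed file, but each seek still costs time proportional to the amount of data that has to be decompressed to reach the target.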