I'm looking for a way to read a file from a tar.gz archive using the Nim programming language (version 0.11.2). Say I have an archive
/my/path/to/archive.tar.gz
and a file in that archive
my/path/to/archive/file.txt
My goal is to be able to read the contents of the file line by line in Nim. In Python I can do this with the tarfile module. In Nim there are the libzip and zlib modules, but the documentation is minimal and there are no examples. There's also the zipfiles module, but I'm not sure if this is capable of working with tar.gz archives.
In a project at my company, we've been using the following module, exposing gzip files as streams:
import
  zlib, streams

type
  GZipStream* = object of StreamObj
    f: GzFile

  GZipStreamRef* = ref GZipStream

proc fsClose(s: Stream) =
  discard gzclose(GZipStreamRef(s).f)

proc fsReadData(s: Stream, buffer: pointer, bufLen: int): int =
  return gzread(GZipStreamRef(s).f, buffer, bufLen)

proc fsAtEnd(s: Stream): bool =
  return gzeof(GZipStreamRef(s).f) != 0

proc newGZipStream*(f: GzFile): GZipStreamRef =
  new result
  result.f = f
  result.closeImpl = fsClose
  result.readDataImpl = fsReadData
  result.atEndImpl = fsAtEnd
  # other methods are nil!

proc newGZipStream*(filename: cstring): GZipStreamRef =
  var gz = gzopen(filename, "r")
  if gz != nil: return newGZipStream(gz)
But you also need to be able to read the tar header in order to find the correct location of the desired file in the uncompressed gzip stream. You could wrap an existing C library such as libtar to do this, or you could roll your own implementation.
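If you roll your own, the core of it is small. A tar archive is a sequence of 512-byte blocks: each entry starts with a header block holding the file name (bytes 0..99) and the size as octal text (bytes 124..135), followed by the file data padded up to a whole block. Below is a rough sketch of a header scan over an uncompressed tar image that is already in memory. The names `headerField` and `findTarEntry` are made up for illustration, and the sketch ignores the ustar `prefix` field, GNU long-name entries and very large files, so treat it as a starting point rather than a complete reader.

```nim
import strutils

const blockSize = 512  # tar stores headers and data in 512-byte blocks

proc headerField(data: string, start, length: int): string =
  ## Returns a header field as a string, cut at the first NUL byte.
  result = ""
  for i in start .. start + length - 1:
    if data[i] == '\0': break
    result.add(data[i])

proc findTarEntry(data: string, wanted: string): tuple[offset, size: int] =
  ## Scans an uncompressed tar image and returns the data offset and size
  ## of the entry named `wanted`, or (-1, 0) if there is no such entry.
  var pos = 0
  while pos + blockSize <= data.len:
    let name = headerField(data, pos, 100)      # name: bytes 0..99
    if name.len == 0: break                     # empty name: end-of-archive block
    # size: bytes 124..135, octal digits padded with spaces and/or NUL
    let sizeText = headerField(data, pos + 124, 12).strip()
    let size = if sizeText.len > 0: parseOctInt(sizeText) else: 0
    if name == wanted:
      return (pos + blockSize, size)
    # Skip the header plus the data, rounded up to a whole block.
    pos += blockSize + ((size + blockSize - 1) div blockSize) * blockSize
  result = (-1, 0)
```

With a gzip stream like the one above you would not hold the whole archive in memory; the same arithmetic applies if you read one 512-byte header at a time and skip `size` bytes (rounded up to a block) until the name matches.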
To my knowledge, libzip and zlib cannot be used to read tar files (afaik they only support zip archives and/or raw string compression, while a tar.gz requires gzip + tar). Unfortunately it looks like there are no Nim libraries yet which read tar.gz archives.
If you are okay with a quick-and-dirty tar-based solution, you can do this:
import os, osproc

proc extractFromTarGz(archive: string, filename: string): string =
  # -x extracts
  # -z runs through gzip
  # -O prints to STDOUT
  # -f specifies the archive file
  # quoteShell keeps paths containing spaces or shell metacharacters intact
  result = execProcess("tar -xzOf " & quoteShell(archive) & " " & quoteShell(filename))

let content = extractFromTarGz("test.tar.gz", "some/subpath.txt")
If you want a clean and flexible solution, this would be a good opportunity to write a wrapper for the libarchive library ;).
I created a basic untar package that may help with this: https://github.com/dom96/untar