
curl to tar to zip with pipes

Tags: bash, curl, zip, tar

I want to download a tar.gz archive, extract it, and compress the contents into a zip file with a single command in a bash script. The reason is that I want to avoid temporary files.

The code I use:

curl -L "someURL" | tar xOz --strip-components=1 | zip -@ test.zip

This writes a lot of output to STDOUT, so I guess zip is not accepting the pipe.

Maybe I am missing something here, but the zip man page doesn't tell me anything beyond using -@ or -, and neither does the internet.

asked Sep 15 '14 at 15:09 by Flatron

2 Answers

The manpage for zip says (at least on my system):

If a file list is specified as -@ [Not on MacOS], zip takes the list of input files from standard input instead of from the command line. For example,
zip -@ foo
will store the files listed one per line on stdin in foo.zip.

The manpage for tar says:

-O, --to-stdout
             extract files to standard output.

So, in short:

tar -O writes the contents of the files (but not their names) to stdout as one long stream, whereas zip -@ expects a list of filenames on stdin. So that's not going to work. It's also hard to see how it could be made to work: a pipe carries an unstructured stream of bytes, but to pass the information from tar to zip you would need some structure, even if it is minimal:

[filename][filedata][filename][filedata]...

And the sender (tar) and receiver (zip) would have to agree on the format of that structure. Which is not going to happen.
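To make the mismatch concrete, here is a small Python sketch (not from the original question). It builds a tiny tar archive in memory and then mimics what tar -xO does, using the standard tarfile module. The helper names build_tar and extract_to_stdout and the sample file names are made up for illustration.

```python
import io
import tarfile


def build_tar(files):
    """Return the bytes of an uncompressed tar archive built from {name: data}."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w") as tar:
        for name, data in files.items():
            info = tarfile.TarInfo(name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buf.getvalue()


def extract_to_stdout(tar_bytes):
    """Mimic `tar -xO`: concatenate the contents of every regular file."""
    out = io.BytesIO()
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:") as tar:
        for member in tar:
            if member.isreg():
                out.write(tar.extractfile(member).read())
    return out.getvalue()


# Hypothetical sample data, only for the demonstration.
archive = build_tar({"dir/a.txt": b"hello\n", "dir/b.txt": b"world\n"})
with tarfile.open(fileobj=io.BytesIO(archive), mode="r:") as tar:
    names = tar.getnames()
stream = extract_to_stdout(archive)

print(names)   # the names live in the tar headers ...
print(stream)  # ... but the -O style stream carries only the file contents
```

The stream contains the bytes of both files back to back, with nothing to mark where one file ends or what it was called, which is why zip has nothing useful to read from it.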

However, you can use interfaces to tar and zip other than the command-line utilities. For example, if you have python installed, the following should work:

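#!/usr/bin/python
import sys
import tarfile
import zipfile

# usage: <this script> input.tar output.zip
tarf = tarfile.open(sys.argv[1], "r:*")  # "r:*" auto-detects the compression
zipf = zipfile.ZipFile(sys.argv[2], "w", zipfile.ZIP_DEFLATED)
for m in tarf:
  if m.isreg():  # regular files only; skip directories, links, etc.
    zipf.writestr(m.path, tarf.extractfile(m).read())
zipf.close()  # writes the zip central directory
tarf.close()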

(Needs lots of error checking. As written, it just crashes on any error.)
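As one possible step toward that error checking, here is a sketch of the same idea wrapped in a function that uses context managers, so both archives are closed even if something fails midway. The strip_components parameter is an addition meant to approximate tar's --strip-components option from the question; it and the helper names are assumptions for illustration, and the path handling is deliberately simplistic (it only splits on "/").

```python
import tarfile
import zipfile


def strip_leading(path, count):
    """Drop the first `count` path components; return None if nothing is left."""
    parts = path.split("/")[count:]
    return "/".join(parts) if parts else None


def tar_to_zip(tar_path, zip_path, strip_components=0):
    """Copy every regular file from a tar archive into a new zip archive."""
    with tarfile.open(tar_path, "r:*") as tarf, \
            zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
        for member in tarf:
            if not member.isreg():
                continue  # skip directories, symlinks, devices, ...
            name = strip_leading(member.name, strip_components)
            if name is None:
                continue  # the whole path was stripped away
            # Each file is read fully into memory, as in the script above.
            zipf.writestr(name, tarf.extractfile(member).read())
```

A call such as tar_to_zip("input.tar.gz", "output.zip", strip_components=1) would then drop the top-level directory from every entry. Errors still surface as Python exceptions; the difference is only that the files are closed cleanly.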

You can make that into a shell "one-very-long-liner" although personally I'd just use the python script above.

 python -c "$(printf %s \
   'import sys;import tarfile;import zipfile;' \
   'T=tarfile.open(sys.argv[1],"r:*");' \
   'Z=zipfile.ZipFile(sys.argv[2],"w",zipfile.ZIP_DEFLATED);' \
   '[Z.writestr(m.path,T.extractfile(m).read()) for m in T if m.isreg()];' \
   'Z.close()')" \
   input.tar output.zip

(If you want to pipe from curl into that, use /dev/stdin as the input file. I think that will avoid Python trying to interpret stdin as a UTF-8 stream.)
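An alternative for the piped case, sketched here as an assumption rather than something tested against the original URL, is tarfile's stream mode "r|*", which is designed for non-seekable input such as a pipe. The function name stream_tar_to_zip is made up for illustration. Note that the zip file itself is still written to a regular file.

```python
import shutil
import tarfile
import zipfile


def stream_tar_to_zip(fileobj, zip_path):
    """Read a (possibly compressed) tar stream and write its files to a zip.

    `fileobj` only needs a read() method, so a pipe such as
    sys.stdin.buffer is acceptable. Returns the number of files written.
    """
    count = 0
    # "r|*" reads the archive strictly front to back and detects the
    # compression from the first block.
    with tarfile.open(fileobj=fileobj, mode="r|*") as tarf, \
            zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zipf:
        for member in tarf:
            if not member.isreg():
                continue  # skip directories, symlinks, ...
            # In stream mode each member must be consumed before moving on.
            with tarf.extractfile(member) as src, \
                    zipf.open(member.name, "w") as dst:
                shutil.copyfileobj(src, dst)  # copy in chunks
            count += 1
    return count
```

Saved in a hypothetical tar2zip.py that ends with a call like stream_tar_to_zip(sys.stdin.buffer, sys.argv[1]) (plus import sys), this could be used as curl -L "someURL" | python3 tar2zip.py test.zip.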

answered Oct 02 '22 at 18:10 by rici

tar is going to send all the file data to stdout (but no file names).

zip can't possibly do much of anything sane with that (barring creating a giant zip blob of doom of all the file contents in a single zip file and I can't imagine you want that).

You need to extract the files to disk if you want to create a zip archive of them.

I was going to say that you might be able to loop over the entries in the tarball by name and extract each one to the pipe, although that would be very costly because the tarball has to be scanned once per entry. But I don't actually see a way to get zip to compress data given to it via standard input, at least not in the man page for zip I have here. It only seems to accept file names that way.

answered Oct 02 '22 at 17:10 by Etan Reisner