g++: Use ZIP files as input

We have the Boost library on our side. It consists of a huge number of files which never change, and only a tiny portion of it is used. We swap out the whole Boost directory when changing versions. Currently we keep the Boost sources in our SVN file by file, which makes checkout operations very slow, especially on Windows.

It would be nice if there were a notation / plugin to address C++ files inside ZIP files, something like:

// @ZIPFS ASSIGN 'boost' 'boost.zip/boost'
#include <boost/smart_ptr/shared_ptr.hpp>

Is there any support for compiler hooks in g++? Is there any effort regarding ZIP support? Any other ideas?

Notinlist asked Jun 15 '12



5 Answers

I assume that make or a similar build system is involved in building your software. I'd put the zip file in the repository and add a rule to the Makefile that extracts it before the actual build starts.

For example, suppose your zip file is in the source tree at "external/boost.zip", and it shall be extracted to "external/boost", and it contains at its toplevel a file "boost_version.h".

# external/Makefile
unpack_boost: boost/boost_version.h

boost/boost_version.h: boost.zip
    unzip $<

I don't know the exact syntax of the unzip call off-hand; consult the unzip manpage (its -d option controls the extraction directory).

Then in other Makefiles, you can let your source files depend on the unpack_boost target in order to have make unpack Boost before a source file is compiled.

# src/Makefile (excerpt)
unpack_boost:
    make -C ../external unpack_boost

source_file.cpp: unpack_boost

If you're using a Makefile generator (or an entirely different buildsystem), please check the documentation for these programs for how to create something like the custom target unpack_boost. For example, in CMake, you can use the add_custom_command directive.
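For illustration only, an untested sketch of the CMake variant, with the same paths as the Makefile example above (external/boost.zip, boost/boost_version.h):

```cmake
# Unpack external/boost.zip before anything that needs the headers is built.
# cmake -E tar can extract zip archives, so no external unzip tool is needed.
add_custom_command(
  OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/boost/boost_version.h
  COMMAND ${CMAKE_COMMAND} -E tar xf ${CMAKE_CURRENT_SOURCE_DIR}/external/boost.zip
  WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
  DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/external/boost.zip
  COMMENT "Unpacking boost.zip")

add_custom_target(unpack_boost
  DEPENDS ${CMAKE_CURRENT_BINARY_DIR}/boost/boost_version.h)

# Then make your targets depend on it:
#   add_dependencies(your_target unpack_boost)
```

As in the Makefile version, the OUTPUT file keeps the command from re-running on every build.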

The fine print: the boost/boost_version.h file is not strictly necessary for the Makefile to work. You could just put the unzip command into the unpack_boost target, but then the target would effectively be phony, that is: it would be executed during each build. The file in between (which you of course need to replace with a file that is actually present in the zip archive) ensures that unzip only runs when necessary.

Stefan Majewsky answered Oct 25 '22


A year ago I was in the same position as you. We kept our source in SVN and, even worse, included boost in the same repository (same branch) as our own code. Trying to work on multiple branches was impossible, as it would take most of a day to check out a fresh working copy. Moving boost into a separate vendor repository helped, but it would still take hours to check out.

I switched the team over to git. To give you an idea of how much better it is than SVN, I have just created a repository containing the boost 1.45.0 release, then cloned it over the network. (Cloning copies all of the repository history, which in this case is a single commit, and creates a working copy.)

That clone took six minutes.

In the first six seconds a compressed copy of the repository was copied to my machine. The rest of the time was spent writing all of those tiny files.

I heartily recommend that you try git. The learning curve is steep, but I doubt you'll get much pre-compiler hacking done in the time it would take to clone a copy of boost.

RobH answered Oct 25 '22


We've been facing similar issues in our company. Managing boost versions in build environments is never going to be easy. With 10+ developers, all coding on their own system(s), you will need some kind of automation.

First, I don't think it's a good idea to store copies of big libraries like boost in SVN or any SCM system for that matter; that's not what those systems are designed for, unless you plan to modify the boost code yourself. But let's assume you're not doing that.

Here's how we manage it now; after trying lots of different methods, this works best for us.

For every version of boost that we use, we put the whole tree (unzipped) on a file server and we add extra subdirectories, one for each architecture/compiler-combination, where we put the compiled libraries. We keep copies of these trees on every build system and in the global system environment we add variables like:

BOOST_1_48=C:\boost\1.48 # Windows environment var

or

BOOST_1_48=/usr/local/boost/1.48 # Linux environment var, e.g. in /etc/profile.d/boost.sh

This directory contains the boost tree (boost/*.hpp) and the added precompiled libs (e.g. lib/win/x64/msvc2010/libboost_system*.lib, ...)

All build configurations (VS solutions, VS property files, GNU makefiles, ...) define an internal variable, importing the environment vars, like:

BOOSTROOT=$(BOOST_1_48) # e.g. in a Makefile, or an included Makefile

and further build rules all use the BOOSTROOT setting for defining include paths and library search paths, e.g.

CXXFLAGS += -I$(BOOSTROOT)
LFLAGS   += -L$(BOOSTROOT)/lib/linux/x64/ubuntu/precise
LFLAGS   += -lboost_date_time

The reason for keeping local copies of boost is compilation speed. It takes up quite a bit of disk space, especially the compiled libs, but storage is cheap and a developer losing lots of time compiling code is not. Plus, this only needs to be copied once.

The reason for using global environment vars is that build configurations are transferable from one system to another and can thus be safely checked in to your SCM system.

To smooth things out a bit, we've developed a little tool that takes care of the copying and of setting the global environment variables. With a CLI, it can even be included in the build process.

Different working environments mean different rules and cultures, but believe me, we've tried lots of things and finally, we decided to define some kind of convention. Maybe ours can inspire you...

Pat answered Oct 25 '22


This is something you would not do in g++, because any other application that wants to do it would also have to be modified.

Store the files on a compressed filesystem. Then every application gets the benefit automatically.

stark answered Oct 25 '22


It should be possible for an OS to allow transparent access to files inside a ZIP file. I know that I put it in the design of my own OS a long time ago (2004 or so) but never got it to a usable state. The downside is that seeking backwards in a file inside a ZIP is slow because the data is compressed: you can't rewind the compressor state, so you have to restart decompression from the beginning of the file. This also makes a zip-inside-a-zip slow for rewinding and reading. Fortunately, most cases just read a file sequentially.

It should also be retrofittable to current OSes, at least in user space. You can hook the filesystem access functions (fopen, open, ...) and add a set of virtual file descriptors that your own software returns for a given filename. If it's a real file, just pass the call through; if it's not, open the underlying zip archive (possibly again via this very function) and return a virtual handle. When the file contents are accessed, read directly from the zip file without caching.

On Linux you would use an LD_PRELOAD to inject it into existing software (at usage time), on Windows you can hook the system calls or inject a DLL into the space of software to hook the same functions.

Does anybody know if this already exists? I can't see any clear reason it wouldn't...

dascandy answered Oct 25 '22