
How do I process extremely long lists of files in a make recipe?

Tags:

makefile

Because GNU make allows variables to be as large as memory allows, it has no problem building massive dependency lists. However, if you want to actually use these lists of files in a recipe (sequence of shell commands for building a target), you run into a problem: the command might exceed the shell's command line length limit, producing an error such as "Argument list too long".

For example, suppose I want to concatenate several files contained in the list $(INPUTS) to produce a file combined.txt. Ordinarily, I could use:

combined.txt: $(INPUTS)
        cat $^ > $@

But if $(INPUTS) contains many thousands of files, as it does in my case, the call to cat is too long and fails. Is there a way to get around this problem in general? It's safe to assume that there exists some sequence of commands that have identical behaviour to the one enormous command -- in this case, a series of cat commands, one per input file, that use >> to append to combined.txt would work. But how can make be persuaded to generate those commands?

j_random_hacker asked Aug 12 '11 12:08



1 Answer

In looking for the answer, about the best suggestion I could find was to break up the list into a series of smaller lists and process them using shell for loops. But you can't always do that, and even when you can it's a messy hack: for example, it's not obvious how to get the usual make behaviour of stopping as soon as a command fails. Luckily, after much searching and experimentation, it turns out that a general solution does exist.

Subshells and newlines

make recipes invoke a separate subshell for each line in the recipe. This behaviour can be annoying and counterintuitive: for example, a cd command on one line will not affect subsequent commands because they are run in separate subshells. Nevertheless it's actually what we need to get make to perform actions on very long lists of files.

Ordinarily, if you build a "multiline" list of files with a regular variable assignment that uses backslashes to break the statement over multiple lines, make removes all newlines:

# The following two statements are equivalent
FILES := a b c

FILES := \
a \
b \
c

However, using the define directive, it's possible to build variable values that contain newlines. What's more, if you substitute such a variable into a recipe, each line will indeed be run using a separate subshell, so that for example running make test from /home/jbloggs with the makefile below (and assuming no file called test exists) will produce the output /home/jbloggs, because the effect of the cd .. command is lost when its subshell ends:

define CMDS
cd ..
pwd
endef

test:
        $(CMDS)

If we create a variable that contains newlines using define, it can be concatenated with other text as usual, and processed using all the usual make functions. This, combined with the $(foreach) function, allows us to get what we want:

# Just a single newline!  Note 2 blank lines are needed.
define NL


endef

combined.txt: $(INPUTS)
        rm -f $@
        $(foreach f,$(INPUTS),cat $(f) >> $@$(NL))

We ask $(foreach) to convert each filename into a newline-terminated command, which will be executed in its own subshell. For more complicated needs, you could instead write out the list of filenames to a file with a series of echo commands and then use xargs.
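
The xargs variant mentioned above might look roughly like the sketch below. It is only an illustration under some assumptions: the three-file INPUTS list and the scratch file name $@.list are hypothetical, filenames are assumed to contain no whitespace or quote characters (xargs would split or misparse them), and the recipe lines here start with literal tabs.

```makefile
# Hypothetical input list; in practice this would hold thousands of names.
INPUTS := a.txt b.txt c.txt

# A single newline (the two blank lines inside define are required).
define NL


endef

combined.txt: $(INPUTS)
	rm -f $@ $@.list
	$(foreach f,$(INPUTS),echo $(f) >> $@.list$(NL))
	xargs cat < $@.list > $@
	rm -f $@.list
```

Each echo still runs in its own short shell command, so no single command line grows with the size of the list. xargs then batches the names into as many cat invocations as the system limit requires, and because the redirection is attached to xargs itself, all of those invocations write to the same combined.txt in order.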

Notes

  • The define directive is described as optionally taking a =, := or += token on the end of the first line to determine which variable flavour is to be created -- but note that that only works on versions of GNU make 3.82 and up! You may well be running the popular version 3.81, as I was, which silently assigns nothing to the variable if you add one of these tokens, leading to much frustration. See here for more.
  • All recipe lines must begin with a literal tab character, not the 8 spaces I have used here.

j_random_hacker answered Oct 04 '22 14:10