Assume 1000 files with extension .xhtml are in directory input, and that a certain subset of those files (with output paths in $(FILES), say) need to be transformed via xslt to files with the same name in directory output. A simple make rule would be:
$(FILES): output/%.xhtml : input/%.xhtml
	saxon -s:$< -o:$@ foo.xslt
This works, of course, but it transforms one file at a time. I want to use Saxon's batch processing to handle all the files in a single run, which should be much faster given the number of files and the overhead of starting Java and Saxon for each one. Saxon allows the -s: (source) option to name a directory; it then processes every file in that directory and writes each result under the same name in the directory given by the -o: option.
I'm aware of the well-known technique to get GNU make to do a single command to update multiple files by using pattern rules:
output/%.xhtml: input/%.xhtml
	saxon -s:input -o:output foo.xslt
But in my case this suffers from two problems. First, it will run the transform on all files in the input directory, not just the ones that have changed; and second, it will not limit the transform to the subset of files specified in $(FILES). The GNU make feature of running a recipe given in a pattern rule only once for all matched targets does not work in the case of so-called "static pattern rules" (see [here]), as the rule given at the top of the post is known.
To use Saxon's batching feature, I need to create a temporary directory, copy into it only the files to be processed, and then run the transform with that temporary directory as the input directory. I tried creating a temporary directory and remembering its name in a target-specific variable for later use:
$(FILES): TMPDIR:=$(shell mktemp -d)
but this creates a new temporary directory for every single target that is out-of-date. In any case, I'm not sure how to structure the rule that would then copy the necessary files into that directory. I don't want to create the temporary directory at the time the makefile is parsed, since I have a non-recursive make system that will parse all make files, even those not related to the current top-level target, and don't want to create the temporary directory for situations in which it is not necessary/will not be used.
I'm well aware that many questions have been asked on SO in the past about creating multiple files from a single input; one solution is (non-static) pattern rules; other solutions involve phony targets. However, in this case I'm stuck as to how to put all this together.
I can identify the files that changed and copy them using the static pattern rule
# Each recipe line runs in its own shell, so the directory is created and used on one line.
$(FILES): output/%.xhtml : input/%.xhtml
	tmpdir=`mktemp -d` && cp $< "$$tmpdir"
but actually I would prefer to copy the files with a single cp command, whereas this copies them one by one. Perhaps there is some application here of cp -u?
I also considered using an ad-hoc extension for those files needing updating but could not see how to get this to work either. I'm about to give up and just run the saxon transform on all files when any of them have changed, but is there any better way?
Personally, I wouldn't try to do this from the command line. That's partly because I'm not a shell scripting wizard. I'm not an Ant wizard either, but because the requirement is to process only the files that have changed, this seems to fall very much into Ant territory. On the other hand, Ant will recompile the stylesheet for each transformation, which is an overhead you might want to avoid; if that's the case, your best bet is probably to write a little Java application. It's probably 100 lines or fewer.
A final possibility is to do the processing within Saxon: that is, a single transformation that reads multiple input files using the collection() function and generates multiple result files using xsl:result-document. Saxon (commercial editions) offers an extension function last-modified that allows you to filter the files to be processed. With 1000 files you might also want the extension function saxon:discard-document() to keep the heap from filling up.
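For readers who do want to stay with make and the shell, the staging idea from the question (copy only the out-of-date files into a temporary directory with a single cp, then run one batch transform) could be sketched roughly as below. This is only a sketch under assumptions: the function names stage_stale and run_batch and the SAXON override variable are hypothetical, the saxon wrapper and foo.xslt are taken from the question, and the -nt file test is a widely supported shell extension rather than something every POSIX shell guarantees.

```shell
#!/bin/sh
# Sketch only: stage the stale inputs, then run ONE batch transform.
# stage_stale, run_batch and SAXON are hypothetical names, not from the post.

# stage_stale IN_DIR OUT_DIR STAGE_DIR NAME...
# Copies, with a single cp, every IN_DIR/NAME that has no OUT_DIR/NAME yet
# or is newer than it. The ( ) body runs in a subshell, so variables stay local.
stage_stale() (
    in_dir=$1 out_dir=$2 stage_dir=$3
    shift 3
    for name in "$@"; do
        shift    # drop the name just read from the positional parameters
        if [ ! -e "$out_dir/$name" ] || [ "$in_dir/$name" -nt "$out_dir/$name" ]; then
            set -- "$@" "$in_dir/$name"    # append the stale input path
        fi
    done
    # After the loop, "$@" holds only the stale input paths.
    [ "$#" -eq 0 ] || cp -- "$@" "$stage_dir/"
)

# run_batch IN_DIR OUT_DIR NAME...
# Creates a temporary directory, stages the stale files, runs the transform
# once over that directory, and removes the directory again.
run_batch() (
    in_dir=$1 out_dir=$2
    shift 2
    stage_dir=$(mktemp -d) || exit 1
    stage_stale "$in_dir" "$out_dir" "$stage_dir" "$@" || { rm -rf -- "$stage_dir"; exit 1; }
    status=0
    # Start Java only when something was actually staged.
    if [ -n "$(ls -A "$stage_dir")" ]; then
        ${SAXON:-saxon} "-s:$stage_dir" "-o:$out_dir" foo.xslt || status=$?
    fi
    rm -rf -- "$stage_dir"
    exit "$status"
)
```

A call such as run_batch input output a.xhtml b.xhtml would then start Saxon at most once. Inside a makefile, a single rule whose target is a stamp file and whose prerequisites are the input files could do the same job; GNU make's $? automatic variable already expands to the prerequisites that are newer than the target, which would make the -nt test unnecessary.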