I'm experimenting with using Shake to build Java code, and am a bit stuck because of the unusual nature of the javac compiler. In general for each module of a large project, the compiler is invoked with all of the source files for that module as input, and produces all of the output files in one pass. Subsequently we typically take the .class files produced by the compiler and assemble them into a JAR (basically just a ZIP).
For example, a typical Java module project is arranged as follows:
src: a directory that contains multiple .java files, some of them nested many levels deep in a tree.
bin: a directory that contains the output from the compiler. Typically this output follows the same directory structure and filenames, with .class substituted for each .java file, but the mapping is not necessarily one-to-one: a single .java file can produce zero to many .class files!
The rules I would like to define in Shake are therefore as follows:
1) If any file under src is newer than any file under bin, then erase all contents of bin and recreate with:
    javac -d bin <recursive list of .java files under src>
I know this rule seems excessive, but without invoking the compiler we cannot know the extent of changes in output resulting from even a small change in a single input file.
2) If any file under bin is newer than module.jar, then recreate module.jar with:
    jar cf module.jar -C bin .
Many thanks!
PS Responses in the vein of "just use Ant/Maven/Gradle" will not be appreciated! I know those tools offer Java compilation out of the box, but they are much harder to compose and aggregate. This is why I want to experiment with a Haskell/Shake-based tool.
Writing rules which produce multiple outputs whose names cannot be statically determined can be a bit tricky. The usual approach is to find an output whose name is statically known and always need that, or if none exists, create a fake file to use as the static output (as per ghc-make, the .result file). In your case you have module.jar as the ultimate output, so I would write:
"module.jar" *> \out -> do
    javas <- getDirectoryFiles "" ["src//*.java"]
    need javas
    liftIO $ removeFiles "" ["bin//*"]
    liftIO $ createDirectory "bin"
    () <- cmd "javac -d bin" javas
    classes <- getDirectoryFiles "" ["bin//*.class"]
    need classes
    cmd "jar cf" [out] "-C bin ."
There is no advantage to splitting it up into two rules, since you never depend on the .class files (and can't really, since they are unpredictable in name), and if any source file changes then you will always rebuild module.jar anyway. This rule has all the dependencies you mention, plus if you add/rename/delete any .java or .class file then it will automatically recompile, as the getDirectoryFiles call is tracked.
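For the other case mentioned above, where no statically named output exists, the fake-file approach could look roughly like the sketch below. Treat it as an illustration under assumptions rather than a tested build file: the stamp path _build/classes.stamp is a hypothetical name, %> is the spelling that newer Shake releases use in place of *>, and createDirectoryIfMissing replaces createDirectory on the assumption that bin may still exist after the cleanup step. It also relies on Shake's default change detection by modification time, so rewriting the empty stamp is enough to trigger the jar rule.

```haskell
import Development.Shake
import System.Directory (createDirectoryIfMissing)

main :: IO ()
main = shakeArgs shakeOptions $ do
    want ["module.jar"]

    -- Hypothetical stamp file: a statically named stand-in for the
    -- unpredictable set of .class files that javac writes into bin.
    "_build/classes.stamp" %> \stamp -> do
        javas <- getDirectoryFiles "" ["src//*.java"]
        need javas
        -- Start from an empty bin so stale .class files cannot survive.
        liftIO $ removeFiles "" ["bin//*"]
        liftIO $ createDirectoryIfMissing True "bin"
        () <- cmd "javac -d bin" javas
        -- Touch the stamp only after the compiler has succeeded.
        writeFile' stamp ""

    -- Other rules depend on the stamp instead of on individual .class files.
    "module.jar" %> \out -> do
        need ["_build/classes.stamp"]
        cmd "jar cf" [out] "-C bin ."
```

With only one consumer of the class files this buys nothing over the single rule, but it becomes useful once several rules need the compiled output, because each of them can need the stamp instead of guessing .class file names.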