How to re-use compiled sources in different machines

Tags:

scala

sbt

To speed up our development workflow, we split the tests and run each part on multiple agents in parallel. However, compiling the test sources seems to take most of the time in the testing steps.

To avoid this, we pre-compile the tests using sbt test:compile and build a docker image with compiled targets.

Later, this image is used on each agent to run the tests. However, sbt seems to recompile the tests and application sources even though the compiled classes exist.

Is there a way to make sbt use existing compiled targets?

Update: To give more context

The question strictly relates to scala and sbt (hence the sbt tag).

Our CI process is broken down into multiple phases. It's roughly something like this:

  • Stage 1: Use sbt to compile the Scala project into Java bytecode with sbt compile. We compile the test sources in the same stage with sbt test:compile. The targets are bundled into a Docker image and pushed to the remote repository.

  • Stage 2: We use multiple agents to split the tests and run them in parallel. The tests run from the built Docker image, so the environment is the same. However, running sbt test causes the project to recompile even though the compiled bytecode exists.

To make this clear: I basically want to compile on one machine and run the compiled test sources on another without recompiling.

Update

I don't think https://stackoverflow.com/a/37440714/8261 is the same problem because, unlike in that question, I don't mount volumes or build on the host machine. Everything is compiled and run within Docker, but in two build stages. Because of this, the file modification times and paths stay the same.

The debug output has something like this

Initial source changes: 
    removed:Set()
    added: Set()
    modified: Set()
Invalidated products: Set(/app/target/scala-2.12/classes/Class1.class, /app/target/scala-2.12/classes/graph/Class2.class, ...)
External API changes: API Changes: Set()
Modified binary dependencies: Set()
Initial directly invalidated classes: Set()

Sources indirectly invalidated by:
    product: Set(/app/Class4.scala, /app/Class5.scala, ...)
    binary dep: Set()
    external source: Set()
All initially invalidated classes: Set()
All initially invalidated sources:Set(/app/Class4.scala, /app/Class5.scala, ...)
Recompiling all 304 sources: invalidated sources (266) exceeded 50.0% of all sources
Compiling 302 Scala sources and 2 Java sources to /app/target/scala-2.12/classes ...

It has no Initial source changes, but products are invalidated.

Update: Minimal project to reproduce

I created a minimal sbt project to reproduce the issue. https://github.com/pulasthibandara/sbt-docker-recomplile

As you can see, nothing changes between the build stages, other than the second stage running in a new step (a new container).

Pulasthi Bandara Avatar asked Jan 07 '19 03:01



2 Answers

While https://stackoverflow.com/a/37440714/8261 pointed in the right direction, the underlying issue and its solution were different.

Issue

SBT seems to recompile everything when it is run in different stages of a Docker build. This is because Docker compresses the images created in each stage, which strips out the millisecond portion of the lastModifiedDate from sources.

SBT depends on lastModifiedDate when determining whether sources have changed, and since it differs (in the milliseconds part), the build triggers a full recompilation.
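To make the effect concrete, here is a minimal sketch of the comparison described above. It is a simplified model and not sbt's actual implementation: the timestamp value and the is_unchanged helper are made up for illustration.

```shell
#!/bin/sh
# Simplified model of the stale check; NOT sbt's real code.
# Hypothetical modification time (ms since epoch) recorded at compile time.
recorded_ms=1546830062123

# The same file after the image layer dropped the sub-second part.
restored_ms=$(( recorded_ms / 1000 * 1000 ))

# Succeeds only when both timestamps are exactly equal.
is_unchanged() {
  [ "$1" -eq "$2" ]
}

if is_unchanged "$recorded_ms" "$restored_ms"; then
  echo "up to date"
else
  echo "stale: recorded=$recorded_ms restored=$restored_ms"
fi

# Comparing both sides at whole-second precision hides the difference.
if is_unchanged "$(( recorded_ms / 1000 ))" "$(( restored_ms / 1000 ))"; then
  echo "up to date at second precision"
fi
```

With these values the exact comparison reports the file as stale, while the second-precision comparison treats it as up to date.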

Solution

  • Java 8: Set -Dsbt.io.jdktimestamps=true when running SBT, as recommended in https://github.com/sbt/sbt/issues/4168#issuecomment-417655678, to work around this issue.

  • Newer: Follow the recommendation in https://github.com/sbt/sbt/issues/4168#issuecomment-417658294

I solved the issue by setting the SBT_OPTS environment variable in the Dockerfile like this:

ENV SBT_OPTS="${SBT_OPTS} -Dsbt.io.jdktimestamps=true"

The test project has been updated with this workaround.
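If an agent starts sbt from a shell script rather than relying on the Dockerfile, the same flag can be added there. The following is only a sketch: add_sbt_opt is a hypothetical helper name, and it simply avoids appending the flag twice.

```shell
#!/bin/sh
# Hypothetical helper: appends a JVM flag to SBT_OPTS unless it is already there.
add_sbt_opt() {
  case " ${SBT_OPTS:-} " in
    *" $1 "*) ;;                                  # already present, leave as is
    *) SBT_OPTS="${SBT_OPTS:+$SBT_OPTS }$1" ;;    # append, keeping existing options
  esac
  export SBT_OPTS
}

add_sbt_opt "-Dsbt.io.jdktimestamps=true"
echo "SBT_OPTS=$SBT_OPTS"

# sbt test   # then run the tests on the agent as usual
```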

Pulasthi Bandara Avatar answered Oct 13 '22 23:10



Using SBT:

I think there is already an answer to this here: https://stackoverflow.com/a/37440714/8261

It looks tricky to get exactly right. Good luck!

Avoiding SBT:

If the above approach is too difficult (i.e. getting sbt test to accept that your test classes do not need recompiling), you could avoid sbt altogether and run your test suite using java directly.

If you can get sbt to log the java command that it is using to run your test suite (e.g. using debug logging), then you could run that command on your test runner agents directly, which would completely preclude sbt re-compiling things.

(You might need to write the java command into a script file, if the classpath is too long to pass as a command-line argument in your shell. I have previously had to do that for a large project.)

This would be a much hackier approach than the one above, but it might be quicker to get working.
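As a rough sketch of the script-file idea, the helper below writes a small launcher that reads the classpath from a file and starts a test runner with plain java. Everything named here is an assumption for illustration: write_test_launcher, test-classpath.txt and run-tests.sh are hypothetical names, and the usage comments assume the project uses ScalaTest and sbt's default test-classes directory, which the question does not state.

```shell
#!/bin/sh
# Hypothetical helper: writes a launcher script that runs a test runner with java.
#   $1 = file holding the test classpath on a single line
#   $2 = main class of the test runner
#   $3 = path of the launcher script to create
write_test_launcher() {
  cp_file=$1
  main_class=$2
  out=$3
  {
    echo '#!/bin/sh'
    # The classpath is read from the file when the launcher runs,
    # so it never has to be typed or pasted on the command line.
    echo "exec java -cp \"\$(cat '$cp_file')\" $main_class \"\$@\""
  } > "$out"
  chmod +x "$out"
}

# Possible usage (assumptions, adjust to your build):
#   sbt "export test:fullClasspath" | tail -n 1 > test-classpath.txt   # in stage 1
#   write_test_launcher test-classpath.txt org.scalatest.tools.Runner run-tests.sh
#   ./run-tests.sh -R target/scala-2.12/test-classes -o                # on each agent
```

Because sbt is not involved on the agents, nothing can trigger a recompilation there; the trade-off is that the classpath file must be regenerated whenever dependencies change.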

Rich Avatar answered Oct 13 '22 22:10
