
Are there technical reasons to avoid creating highly tangled package dependencies in large Java projects?

I'm new to modern Java compilers and virtual machines, so I'm curious: what technical issues do large Java projects (5000+ sizable classes) encounter, during compilation and at runtime, as the Gordian knot of package dependencies grows?

In large C++ projects, you can get yourself into technical trouble (all maintainability concerns aside) if you stray far from an acyclic library (or package) dependency graph.

Some examples:

  • compilation can run out of memory if most of a source tree is included
  • linking can run out of memory too if too many object archives are included (object archives generally correlate with packages in C++ projects)

The problem is considerably exacerbated by inline template instantiation. Modern workstations aren't equipped to compile and link a project that pulls most of 5000 sizable classes together in either phase of the build.

The Java developers I've asked do not believe technical limitations are a reason to avoid circular package dependencies (though other motivations apply). Are there any such limitations?

Chris Betti asked Jan 11 '12 21:01


1 Answer

  1. The Java compiler (javac) does not compile all the classes at the same time. It works through them one by one, dynamically discovering classes whose .class files are missing or stale.

  2. There is no linking step. Instead, once compiled, all the .class files are packaged together in a jar file. A jar is basically a ZIP archive, and even this packaging step isn't required.

  3. The Java compiler is fairly simple because the language's syntax and semantics are simple. There isn't much metaprogramming, type inference, etc. The Scala compiler, for example, is much slower because the language itself is much more complicated.
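To illustrate the second point, here is a minimal sketch that writes a small jar in memory with `java.util.jar` and then reads it back with the plain `java.util.zip` API. The class name `JarIsZip`, its helper methods, and the entry names are hypothetical, and the entry contents are placeholder bytes rather than real compiled classes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;

// Hypothetical demo class: shows that a jar can be read as an ordinary ZIP archive.
class JarIsZip {

    // Builds a jar in memory; each entry holds placeholder bytes, not real bytecode.
    static byte[] buildJar(String... entryNames) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (JarOutputStream jar = new JarOutputStream(bytes)) {
            for (String name : entryNames) {
                jar.putNextEntry(new JarEntry(name));
                jar.write("placeholder".getBytes(StandardCharsets.UTF_8));
                jar.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Lists the entry names using only the generic ZIP reader.
    static List<String> listWithZipApi(byte[] jarBytes) {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(new ByteArrayInputStream(jarBytes))) {
            for (ZipEntry e = zip.getNextEntry(); e != null; e = zip.getNextEntry()) {
                names.add(e.getName());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    public static void main(String[] args) {
        byte[] jar = buildJar("com/example/a/A.class", "com/example/b/B.class");
        System.out.println(listWithZipApi(jar));
    }
}
```

Nothing in the archive records how the packages depend on each other, so packaging is no harder for a tangled project than for a cleanly layered one.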

That being said, I can't find any technical limitations on compiling large, tangled projects. Obviously the build time grows, and once it exceeds 10 minutes it becomes a pain, but that isn't a hard limit.

The real problem with tangled, circular cross-references is source code maintainability. Mainly, the code becomes much harder to refactor. Once the project reaches a certain size (5000+ classes is probably around half a million LOC), developers will try to split it into pieces by extracting libraries, modules and layers. If the dependencies are that tightly tangled, this process is close to impossible.
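Whether a package graph has become circular can be checked mechanically. The following is a minimal sketch, assuming the dependencies have already been collected into a map from each package name to the packages it imports. The class `PackageCycles`, the map layout, and the package names in the usage note are all hypothetical. It uses a plain depth-first search, not any particular tool's algorithm.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper: detects a cycle in a package dependency graph.
class PackageCycles {

    // deps maps a package name to the packages it depends on.
    // Returns true if following dependencies can lead back to a package already on the path.
    static boolean hasCycle(Map<String, List<String>> deps) {
        Set<String> done = new HashSet<>();    // packages fully explored, known cycle-free
        Set<String> onPath = new HashSet<>();  // packages on the current DFS path
        for (String pkg : deps.keySet()) {
            if (visit(pkg, deps, done, onPath)) {
                return true;
            }
        }
        return false;
    }

    private static boolean visit(String pkg, Map<String, List<String>> deps,
                                 Set<String> done, Set<String> onPath) {
        if (onPath.contains(pkg)) {
            return true;   // reached a package that is still being explored: a cycle
        }
        if (done.contains(pkg)) {
            return false;  // already explored from here without finding a cycle
        }
        onPath.add(pkg);
        for (String dep : deps.getOrDefault(pkg, List.of())) {
            if (visit(dep, deps, done, onPath)) {
                return true;
            }
        }
        onPath.remove(pkg);
        done.add(pkg);
        return false;
    }
}
```

For example, `hasCycle(Map.of("ui", List.of("service"), "service", List.of("ui")))` returns true, while a layered graph such as `Map.of("ui", List.of("service"), "service", List.of("model"))` returns false.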

Tomasz Nurkiewicz answered Sep 28 '22 00:09