
Identical Java sources compile to binary differing classes

Can anyone explain how identical Java sources can end up compiling to binary differing class files?

The question arises from the following situation:

We have a fairly large application (800+ classes) which has been branched, restructured then reintegrated back into the trunk. Prior to reintegration, we merged the trunk into the branch, which is standard procedure.

The end result was a set of directories with the branch sources and a set of directories with the trunk sources. Using Beyond Compare we were able to determine that both sets of sources were identical. However, on compiling (same JDK, using Maven hosted in IntelliJ v11) we noticed that about a dozen of the class files were different.

When we decompiled each pair of apparently different class files we ended up with the same Java source, so in terms of the end result it doesn't seem to matter. But why is it that just a few of the files are different?

Thanks.


Additional thought:

If maven/javac compiles files in a different sequence, might that affect the end result?

Vicki asked Oct 01 '12 11:10




2 Answers

Assuming that the JDK versions, build tool versions, and build / compilation options are identical, I can still think of a number of possible sources of differences:

  1. Timestamps - class files may[1] contain compilation timestamps. Unless you run the compilations at exactly the same time, different compilations of the same file would result in different timestamps.

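  2. Source filename paths - each class file records the name of its source file (the SourceFile attribute). javac normally stores just the simple file name there, but if your compiler or tooling embeds a longer pathname, compiling two trees with different pathnames will give class files that contain different source pathnames.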

  3. Values of imported compile-time constants - when a class A uses a compile-time constant defined in another class B (see the JLS for the definition of a "compile-time constant"), the value of the constant is incorporated into A's class file. So if you compile A against different versions of B (with different values for the constants), the code of A is likely to be different.

  4. Hash-based ordering - if the compiler uses HashMap keys whose hash code comes from System.identityHashCode, map iteration order could differ between runs at some step. This could affect .class file generation in a way that is not significant, but still shows up as a .class file difference. For example, constant pool entries could end up in a different order.

  5. Differences in signatures of external classes / methods; e.g. if you changed a dependency version in one of your POM files.

  6. Differences in the effective build classpaths might result in differences in the order in which imported classes are found. This might in turn result in non-significant differences in the order of entries in the class file's Constant Pool. This could happen due to things such as:

    • files appearing in different order in the directories of external JAR files,
    • files being compiled in different order due to the source files being in different order when your build tool iterates them[2], or
    • parallelism in the build (if you have that enabled).
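Item 3 in the list above can be seen in a small sketch. The class and field names here (ConstantInliningDemo, Limits, MAX_USERS, MAX_SESSIONS) are hypothetical and are not taken from the question. The sketch assumes standard javac behaviour: a static final field with a constant initializer is copied into the class that uses it, while a static final field initialized by a method call is read at run time.

```java
// Hypothetical classes for illustration only; not from the original application.
class ConstantInliningDemo {
    static String describe() {
        // Limits.MAX_USERS is a compile-time constant, so the compiler copies
        // the value 100 into this class's bytecode. Limits.MAX_SESSIONS is not
        // a compile-time constant, so it is read from Limits at run time.
        return "users=" + Limits.MAX_USERS + ", sessions=" + Limits.MAX_SESSIONS;
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}

class Limits {
    // Compile-time constant: static final, primitive type, constant initializer.
    static final int MAX_USERS = 100;

    // Not a compile-time constant: the initializer is a method call.
    static final int MAX_SESSIONS = Integer.parseInt("100");
}
```

If MAX_USERS were changed to 200 and only Limits were recompiled, the old class file for ConstantInliningDemo would still carry 100. Equally, compiling identical source for the using class against two versions of Limits would give two different class files.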

There is a possible workaround for the problem with file ordering: use the undocumented -XDsortfiles option as described in JDK-7003006. (Kudos to @Holger for knowing about that.)

Note that you don't normally see the actual order of files in file system directories. Command-line tools like ls and dir, and file browsers, will typically sort the entries (in name or timestamp order) before displaying them.
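As a rough illustration of the point about directory order, the sketch below lists the .java files in a directory and sorts the names explicitly, which is what a build step would need to do to get the same order on every machine. The class and method names (SortedSourceListing, sortedJavaFiles) are made up for this example, and this is not a description of how Maven or javac actually enumerate sources.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only.
class SortedSourceListing {
    /** Returns the names of the .java files directly inside dir, sorted by name. */
    static List<String> sortedJavaFiles(File dir) {
        // File.list() makes no promise about the order of the returned names.
        String[] names = dir.list((d, name) -> name.endsWith(".java"));
        if (names == null) {
            return new ArrayList<>(); // not a directory, or an I/O error occurred
        }
        Arrays.sort(names); // impose the same order regardless of the file system
        return new ArrayList<>(Arrays.asList(names));
    }

    public static void main(String[] args) {
        File dir = new File(args.length > 0 ? args[0] : ".");
        sortedJavaFiles(dir).forEach(System.out::println);
    }
}
```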


[1] - This is compiler dependent. Also, it is not guaranteed that javap will show the timestamps ... if they are present.

[2] - The OS gives no guarantees that listing a directory (at the syscall level) will return the file system objects in a deterministic order ... or the same order, if you have removed and re-added files.
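Going back to item 4 in the list, the general mechanism can be shown in miniature. This is a sketch of how identity hash codes affect HashMap iteration order, not a claim about what javac does internally. Symbol and the other names are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical demo for illustration only.
class IterationOrderDemo {
    /** A key type that does not override hashCode(), so it uses the identity hash. */
    static final class Symbol {
        final String name;
        Symbol(String name) { this.name = name; }
    }

    /** Inserts four symbols, always in the same order, and returns the map. */
    static Map<Symbol, Integer> fill(Map<Symbol, Integer> map) {
        int index = 0;
        for (String n : new String[] {"alpha", "beta", "gamma", "delta"}) {
            map.put(new Symbol(n), index++);
        }
        return map;
    }

    /** Collects the key names in whatever order the map iterates them. */
    static List<String> namesInIterationOrder(Map<Symbol, Integer> map) {
        List<String> out = new ArrayList<>();
        for (Symbol s : map.keySet()) {
            out.add(s.name);
        }
        return out;
    }

    public static void main(String[] args) {
        // The order here depends on identity hash codes and may change between JVM runs.
        System.out.println("HashMap:       " + namesInIterationOrder(fill(new HashMap<>())));
        // LinkedHashMap iterates in insertion order, independent of hash codes.
        System.out.println("LinkedHashMap: " + namesInIterationOrder(fill(new LinkedHashMap<>())));
    }
}
```

Any output that is written while iterating the first kind of map can come out in a different order from one run to the next, even though the input is identical.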


I should add that the first step to identifying the cause of the differences is to work out exactly what they are. You probably need (needed) to do that the hard way - by manually decoding a pair of class files to identify the places where they actually differ ... and what the differences actually mean.
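A possible starting point for that manual decoding is to find the byte offset where two class files first diverge, and then work out which structure (header, constant pool, attributes, ...) that offset falls in. The sketch below is only that first step. ClassFileDiff and its methods are hypothetical names, and the two paths are expected as command-line arguments. The version lookup relies on the standard class file layout, where bytes 0-3 are the magic number, 4-5 the minor version and 6-7 the major version.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical helper for illustration only.
class ClassFileDiff {
    /** Returns the index of the first differing byte, or -1 if the arrays are identical. */
    static int firstDifference(byte[] a, byte[] b) {
        int common = Math.min(a.length, b.length);
        for (int i = 0; i < common; i++) {
            if (a[i] != b[i]) {
                return i;
            }
        }
        // Equal up to the shorter length: identical only if the lengths match too.
        return a.length == b.length ? -1 : common;
    }

    /** Reads the big-endian major version stored in bytes 6-7 of a class file. */
    static int majorVersion(byte[] classFile) {
        return ((classFile[6] & 0xFF) << 8) | (classFile[7] & 0xFF);
    }

    public static void main(String[] args) throws IOException {
        byte[] left = Files.readAllBytes(Paths.get(args[0]));
        byte[] right = Files.readAllBytes(Paths.get(args[1]));
        System.out.println("major versions: " + majorVersion(left) + " / " + majorVersion(right));
        int offset = firstDifference(left, right);
        System.out.println(offset < 0 ? "identical" : "first difference at byte offset " + offset);
    }
}
```

Running javap -v on both files and comparing the text output is usually an easier way to interpret what the differing bytes mean.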

Stephen C answered Nov 15 '22 18:11


When you compare using Beyond Compare, the comparison is based on the contents of the files. But in the build process only the timestamps of the source files are checked for changes. So if your source file's last-modified date changes, it will be recompiled.
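A timestamp-only check of that kind might look roughly like the sketch below. It is a simplified assumption about how such a rule works, not the actual logic of Maven or IntelliJ, and StaleCheck and needsRecompile are hypothetical names.

```java
import java.io.File;

// Hypothetical helper for illustration only.
class StaleCheck {
    /**
     * Timestamp-only rule: recompile when the class file is missing or is
     * older than the source file. File contents are never compared.
     */
    static boolean needsRecompile(long sourceLastModified, boolean classExists, long classLastModified) {
        return !classExists || sourceLastModified > classLastModified;
    }

    /** Applies the same rule to real files on disk. */
    static boolean needsRecompile(File source, File classFile) {
        return needsRecompile(source.lastModified(), classFile.exists(), classFile.lastModified());
    }

    public static void main(String[] args) {
        // Usage: java StaleCheck.java path/to/Foo.java path/to/Foo.class
        System.out.println(needsRecompile(new File(args[0]), new File(args[1])));
    }
}
```

Under this rule a source file whose content is byte-for-byte unchanged, but whose last-modified time was bumped by a merge or checkout, is still treated as stale.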

basiljames answered Nov 15 '22 17:11