Ponderings on Jardeps


Jardeps is a library for GNU Make for compiling Java. Here are some related topics that don't qualify as part of the Jardeps manual.

Assessment of Ant

It is said that Ant is the proper tool for building Java projects. It is certainly intended to be more portable, and maybe integrates well with IDEs. And it is said that Make works poorly with Java because Java dependencies are too complex for it to handle. One could infer from this that Ant is also somehow better than Make specifically regarding dependencies. If all this is true, why am I using Make?

Let me first deal with the reasons that are specific to my situation:

  • I rarely develop on (say) Windows. Occasionally, I have developed for Windows, but only by cross-compiling. This makes build portability less of an issue for me.

  • Many of my Java projects also have C components, such as JNI, Linux kernel modules and launchers, that allow me to program beneath the Java abstraction. Ant didn't seem to buy me anything in these situations that Make didn't already offer.

What I actually want to say about Ant, specifically its <javac> task, and without using <depend>, is that it doesn't really handle Java dependencies very well either.

Suppose you have a class Foo, which references class Bar:

// Foo.java
public class Foo {
    public static void main(String[] args) throws Exception {
        Bar.test();
    }
}

// Bar.java
public class Bar {
    public static String test() {
        System.out.println("Test");
        return null;
    }
}

These classes compile fine, until the return type of Bar.test() changes. Foo.java does not need to be changed, but Foo must be recompiled in order to discover the new signature of the method, because a class file refers to a method by its name and its full descriptor, return type included. Foo.class originally contains a reference to Bar.test()Ljava/lang/String;, but if the return type changes to Object, Foo.class must be re-created so that it references Bar.test()Ljava/lang/Object; instead.
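If you want to see that reference for yourself, you could disassemble Foo.class with javap. The following is only a sketch: it assumes a JDK (javac and javap) is on the PATH, and the helper name show_bar_refs and the temporary directory layout are made up for illustration.

```shell
# Sketch only: assumes javac and javap from a JDK are on the PATH.
# show_bar_refs is a hypothetical helper name, not part of Jardeps.
show_bar_refs() {
    work=$(mktemp -d) || return 1
    mkdir -p "$work/src" "$work/classes"
    cat > "$work/src/Foo.java" <<'EOF'
public class Foo {
    public static void main(String[] args) throws Exception {
        Bar.test();
    }
}
EOF
    cat > "$work/src/Bar.java" <<'EOF'
public class Bar {
    public static String test() {
        System.out.println("Test");
        return null;
    }
}
EOF
    javac -d "$work/classes" "$work/src/Foo.java" "$work/src/Bar.java" || return 1
    # Disassemble Foo and keep only the lines that mention Bar.test.
    javap -c -cp "$work/classes" Foo | grep -F 'Bar.test'
    status=$?
    rm -rf "$work"
    return $status
}
```

Calling show_bar_refs should print the invokestatic instruction from Foo.main, whose trailing comment names the method together with its descriptor. The return type is part of that descriptor, which is why a change to it in Bar.java leaves an unrebuilt Foo.class pointing at a method that no longer exists.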

What does Ant do in this situation? It detects that the source file for Bar has changed, and passes it, along with any other changed source files it has detected — but not Foo.java — to a single invocation of javac.

This is certainly more efficient than trying to get Make to pick out changed files and compile them separately. However, both approaches are incorrect anyway, as they miss the need to re-compile Foo too.

Even if Ant is integrated with an IDE that searches for compile-time errors as they arise, this oversight is not detected, because there is no compile-time error. Even in more likely cases, where there would be an error, I would expect the build system to do enough work to detect it, i.e., it should submit everything necessary to the compiler, even if it's just to find that there is a compile-time error, and Ant does not do that. This requirement is important if, for whatever reason, you intend to program without an IDE.

There is a <depend> task which handles this problem, but it fails to detect classes that depend on changes to inlined constants if you're using Java 7 or earlier. However, I've just discovered that Java 8 puts a runtime-redundant reference to the class defining the constant into the referencing class; <depend> picks this up and produces the correct behaviour. Nevertheless, <depend> feels like a hacky way to fix the oversimple <javac> task, when really there should be a single, integrated task whose aim is to do things correctly from the start.

<depend> has a closure flag that makes it detect more kinds of change, but it wouldn't need it if it didn't miss any. And the flag wouldn't be optional if it didn't sometimes trigger unnecessary compilations or incur a significant overhead. Under what circumstances would you not turn it on to get a more internally consistent build? At this point, is it not better to just compile the whole tree on the condition that any source file changed?

I find that it's simpler to have a clean build within a source tree, a single invocation of javac that compiles everything at once. It guarantees internal consistency among the class files of the tree. Make is perfectly capable of handling that, and I wrote Jardeps to deal with three specific problems that arise from it:

  • It's easy to run javac unconditionally on every invocation of Make, but the Java code might not be the only component of the project. Changes to the other parts should not incidentally trigger recompilation of the Java component, so something is needed to detect an internal change to a source tree.

  • For sufficiently large projects, a complete rebuild could be too time-consuming. It should be possible to break the Java code up into elements that are consciously free of dependency cycles between each other, and build only the elements that change.

  • The programmer will also be conscious of dependencies between those elements, especially between one element and the interface of another. It should be possible to express those dependencies to avoid manually working out when to force re-compilation of elements that have not changed internally.
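The first of these problems amounts to asking whether anything inside the tree is newer than the last successful build. As a rough sketch of the idea (this is not how Jardeps itself is implemented; the helper name tree_changed, the stamp file and the directory names are all hypothetical), a shell function could compare the source files against a stamp file:

```shell
# Hypothetical helper: succeed if any .java file under directory $1 is
# newer than the stamp file $2, or if the stamp does not exist yet.
tree_changed() {
    src_dir=$1
    stamp=$2
    # No stamp means the tree has never been built.
    [ -e "$stamp" ] || return 0
    # find prints every source file modified after the stamp;
    # any output at all means the tree has changed internally.
    [ -n "$(find "$src_dir" -name '*.java' -newer "$stamp" -print)" ]
}

# Usage sketch (paths are assumptions):
#   if tree_changed src classes/.stamp; then
#       javac -d classes -sourcepath src $(find src -name '*.java') &&
#           touch classes/.stamp
#   fi
```

Everything is compiled in one javac invocation, but only when something inside the tree has changed. Note that this sketch compares timestamps only, so it won't notice a source file that has been deleted.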

I'll mention a couple of other points in Ant's favour:

  • Its approach does mean that all classes get compiled with full annotation processing.

    Wait a minute. Is that actually true? Won't some classes get compiled implicitly when referenced by a class that has changed? You could turn implicit compilation off, but then even more work will be missed.

    Perhaps it doesn't matter, as every class will be processed on the first build (if one knows the full set of source files beforehand), and it only needs to be reprocessed if the source has changed. However, maybe the generated source files will need recompiling even if they haven't changed, just as Foo does.

    Anyway, it's no longer an advantage over Jardeps, which now can ensure that every discovered source file is annotation-processed.

  • The <javac> task could be improved or replaced, to perform the same kind of dependency checks that Jardeps does. (But does Ant support any kind of order-only dependency?)

Annotation processing and implicit classes

What is the problem with annotation processing and implicit classes? Why can't an implicit class be processed? Here is my understanding of what happens:

  1. The source files of the root classes are parsed. These become an initial set of syntax trees to be processed.

  2. The current set of syntax trees is submitted to annotation processors. These may generate new source files.

  3. The created source files are parsed, yielding a new set of syntax trees. If this set is empty, move to the next step. Otherwise, make this the current set, and go back to step 2.

  4. Complete the translation of all source files encountered so far. This also means that the source files for the implicit classes should be found, analyzed and translated.

Implicit classes are determined only from step 4, but annotation processing finishes before step 4.

Can anything be done about this? The wording of the javac manual suggests that the behaviour may change in the future (note the word "currently"):

To compile a set of source files, the compiler may need to implicitly load additional source files. … Such files are currently not subject to annotation processing.

Perhaps it is possible to get the names of implicit classes without first fully translating the referring class, but the subroutine that does that in the current implementation of the compiler might be so tightly coupled to prior stages that it can't be moved without considerable effort. A new version of the compiler would have behaviour similar to this:

  1. The source files of the root classes are parsed. These become an initial set of syntax trees to be processed.

  2. Names in the current set of syntax trees are resolved to identify implicit classes. Each implicit class is parsed and added to the current set (unless it has been added in this or a previous round). This step repeats until there are no unresolved trees in the set.

  3. The current set of syntax trees is submitted to annotation processors. These may generate new source files.

  4. The created source files are parsed, yielding a new set of syntax trees. If this set is empty, move to the next step. Otherwise, make this the current set, and go back to step 2.

  5. Complete the translation of all source files encountered so far. This set of source files already includes the implicit classes.

It's also possible that the new capabilities of new versions of the compiler have increased the coupling, perhaps changing it from incidental to fundamental, making annotation processing of implicit classes impossible.

Until this is fixed, the work-around in Jardeps is simply to include any classes that must be annotation-processed in the tree's list of root classes, roots_foo. And now you can do this with roots_foo=$(found_foo), which scans for source files, so you don't have to keep adding new files to the list manually.

Using the built-in change detection of javac

javac has the ability to detect which source files need to be compiled, which you enable by including the output directory in the classpath, e.g., by javac -d classes -cp classes. Why don't I use that? Let's see how well it works by setting up a small example with three classes:

mkdir -p src classes
cat > src/Foo.java <<EOF
public class Foo {
    public static void main(String[] args) throws Exception {
        Bar.test();
    }
}
EOF
cat > src/Bar.java <<EOF
public class Bar {
    public static void test() {
        Baz.test();
    }
}
EOF
cat > src/Baz.java <<EOF
public class Baz {
    public static String test() {
        System.out.println("Test 1");
        return null;
    }
}
EOF

Note that Baz.test() returns a String, and Bar.test() calls it, but ignores the return value. Let's compile using the technique in question. We'll only provide the application class, and let the compiler figure the rest out:

$ javac -d classes -cp classes -sourcepath src src/Foo.java
$ ls -l classes
total 12
-rw-rw-r-- 1 simpsons simpsons 276 Aug 22 12:35 Bar.class
-rw-rw-r-- 1 simpsons simpsons 405 Aug 22 12:35 Baz.class
-rw-rw-r-- 1 simpsons simpsons 332 Aug 22 12:35 Foo.class
$ java -cp classes Foo
Test 1

That worked fine. Now make a small change to Baz, to reproduce the problem we see with Ant:

public class Baz {
    public static void test() {
        System.out.println("Test 2");
    }
}

We've also changed the message.

Recompile using the same technique, and test:

$ javac -d classes -cp classes -sourcepath src src/Foo.java
$ ls -l classes
total 12
-rw-rw-r-- 1 simpsons simpsons 276 Aug 22 12:35 Bar.class
-rw-rw-r-- 1 simpsons simpsons 405 Aug 22 12:35 Baz.class
-rw-rw-r-- 1 simpsons simpsons 332 Aug 22 12:36 Foo.class
$ java -cp classes Foo
Test 1

Only Foo was recompiled (as it always is, because it is listed as a root class). Bar was not recompiled, because Bar.class is newer than Bar.java, and that's fine. However, Baz was also not recompiled, even though its source file is newer than its class file, and that is clearly a mistake, as the output of the test is still the old message.

Let's force it to recompile Baz too, by listing it as a root class:

$ javac -d classes -cp classes -sourcepath src src/Foo.java src/Baz.java
$ ls -l classes
total 12
-rw-rw-r-- 1 simpsons simpsons 276 Aug 22 12:35 Bar.class
-rw-rw-r-- 1 simpsons simpsons 381 Aug 22 12:45 Baz.class
-rw-rw-r-- 1 simpsons simpsons 332 Aug 22 12:45 Foo.class
$ java -cp classes Foo
Exception in thread "main" java.lang.NoSuchMethodError: Baz.test()Ljava/lang/String;
        at Bar.test(Bar.java:3)
        at Foo.main(Foo.java:3)

Okay, now Baz is recompiled, but that hasn't helped javac to detect that Bar needs to be recompiled too.

This technique, by itself at least, isn't going to work. Compiling unconditionally from the root avoids these problems:

$ javac -d classes -sourcepath src src/Foo.java
$ ls -l classes
total 12
-rw-rw-r-- 1 simpsons simpsons 252 Aug 22 12:51 Bar.class
-rw-rw-r-- 1 simpsons simpsons 381 Aug 22 12:51 Baz.class
-rw-rw-r-- 1 simpsons simpsons 332 Aug 22 12:51 Foo.class
$ java -cp classes Foo
Test 2
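The failure with Bar is inherent in any check that compares each source file only with its own class file. Here is a sketch of such a check, to make the limitation concrete. The helper name stale_sources is made up, and the sketch assumes one top-level class per source file, a class directory that mirrors the source directory, and a shell whose test command supports -nt (bash does):

```shell
# Hypothetical helper: print each source file under $1 whose class file
# under $2 is missing or older than the source.
stale_sources() {
    src_dir=$1
    class_dir=$2
    find "$src_dir" -name '*.java' -print | while IFS= read -r src; do
        # Map src/Some/Name.java to classes/Some/Name.class.
        rel=${src#"$src_dir"/}
        cls=$class_dir/${rel%.java}.class
        if [ ! -e "$cls" ] || [ "$src" -nt "$cls" ]; then
            printf '%s\n' "$src"
        fi
    done
}

# Usage sketch: stale_sources src classes
```

Run against the example just after Baz.java was edited, a check like this would list only src/Baz.java. Nothing in the timestamps records that Bar.class was built against the old Baz, so Bar is never offered to the compiler.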

What about $??

make can tell you which files have changed when you build a target. $? expands to the list of changed prerequisites:

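# The source list must be defined before it is used as a prerequisite
# list, because make expands prerequisites as soon as it reads the rule.
mod1_sources += src/mod1/Foo.java
mod1_sources += src/mod1/Bar.java
mod1_sources += src/mod1/Baz.java

%.compiled:
	mkdir -p classes/$*
	javac -d classes/$* -sourcepath src/$* $?
	touch $@

mod1.compiled: $(mod1_sources)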

However, this has exactly the same problems as Ant, as it's basically doing the same thing:

  • It won't detect the need to recompile some classes whose source files have not changed.

  • Files will sometimes be compiled with annotation processing, and sometimes without.

Requirements for adaptation to Windows

In principle, Jardeps could run on Windows, but it uses many Unix commands, which would have to be replaced with Windows equivalents.

Unix                                Windows
GNU Make 3.81+                      GNU Make 3.81+
BASH for loops                      $(foreach)
BASH if statements                  $(if)
printf                              Probably use less sophisticated echo
find
cmp -s '$1' '$2' || cp '$1' '$2'
cp '$1' '$2'                        copy /y/b $1 $2
rmdir                               rmdir /s
mkdir -p                            mkdir
| sort |                            | sort |?
| sort -u |                         | sort | uniq | (GnuWin32)
rm -f                               $(RM)
rm -rf                              rmdir /s/q
mv '$1' '$2'                        rename $1 $2
cd                                  cd
cat                                 type
grep -E                             grep -E (GnuWin32)
grep -F                             grep -F (GnuWin32)
true
touch '$1'                          type nul >>$1 & copy $1 +,,

It also assumes the forward slash / as the directory element separator, and the colon : as the path element separator. It's likely that these have to be replaced, too.

This will apparently do some of the work:

jardeps_colon:=:

ifeq ($(OS),Windows_NT)
# $(:) is the path element separator, and $/ the directory separator.
$(jardeps_colon):=;
/:=\\
# F must be recursively expanded so that $1 is set by $(call F,...).
F=$(subst /,\,$1)
else
$(jardeps_colon):=:
/:=/
F=$1
endif

JARDEPS_SRCPATH=java$/source
JARDEPS_MERGEPATH=java$/merge

GnuWin32 could fill many gaps. It's probably easier just to use Cygwin.