Wednesday, April 18, 2018

Creating a lightweight Java microservice

In my recent post Creating a recommender microservice in Java 9, I lamented that if you build your Java app using Maven, the build pulls in every Jar of every project that your app uses, typically hundreds of Jars, making your microservice upwards of a GB in size, which is not very “micro”. To solve that problem, one needs a tool that scans the code and removes the unused parts. There is a Maven plugin called shade that tries to do that, but shade does not provide the control that is needed, especially if your app uses reflection somewhere within some library, which most do.
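To illustrate why reflection complicates dead-code removal, here is a minimal sketch. The class name ReflectionDemo and the method loadByName are hypothetical and not part of any tool mentioned here. The class to load is named only by a string, so a tool that follows compiled class references would typically find no dependency from ReflectionDemo to the loaded class, and might strip it.

```java
class ReflectionDemo {
    // The target class is identified only by a string, so the compiled
    // bytecode of this class holds no class reference to it. A static
    // dependency scan would typically not see this dependency.
    static Object loadByName(String className) {
        try {
            return Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            // This is the failure you would see at runtime if the class
            // had been removed from the Jar as "unused".
            throw new IllegalStateException("Class missing at runtime: " + className, e);
        }
    }

    public static void main(String[] args) {
        // In a real app the name would usually come from a config file
        // or a service descriptor rather than a literal.
        String name = args.length > 0 ? args[0] : "java.util.ArrayList";
        Object instance = loadByName(name);
        System.out.println(instance.getClass().getName());
    }
}
```

Classes that are reached only this way have to be named explicitly to whatever tool does the trimming, which is the kind of control referred to above.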

In this article I am going to show how to solve this for Java 8 and earlier. Java 9 and later support Java modules, including for the Java runtime, and that is a very different build process.

To reduce the Jar file footprint, I created a tool that I call jarcon, for Jar Consolidator. Jarcon uses a library known as the Class Dependency Analyzer (CDA) to perform the actual class dependency analysis. My tool merely wraps that functionality in a set of functions that let us call CDA from a command line, assemble just the needed classes, and write them all to a single Jar file, which we can then deploy.
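The last step, writing a chosen set of classes to one Jar, can be sketched with the standard java.util.jar API. This is not jarcon's actual code: the class SingleJarWriter, the method writeJar, and the mapping of the version and name arguments to the Implementation-Version and Implementation-Title manifest attributes are all assumptions made for illustration. The list of class names stands in for the output of a dependency analysis, which is not shown.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

class SingleJarWriter {
    // Copies the class files for the given class names from the classpath
    // into one Jar and returns how many classes were written.
    static int writeJar(Path outputJar, List<String> classNames, String version, String name) {
        Manifest manifest = new Manifest();
        Attributes attrs = manifest.getMainAttributes();
        // Manifest-Version must be set, or the other attributes are not written.
        attrs.put(Attributes.Name.MANIFEST_VERSION, "1.0");
        // Assumption: version and name map to these two standard attributes.
        attrs.put(Attributes.Name.IMPLEMENTATION_VERSION, version);
        attrs.put(Attributes.Name.IMPLEMENTATION_TITLE, name);

        int written = 0;
        try (OutputStream out = Files.newOutputStream(outputJar);
             JarOutputStream jar = new JarOutputStream(out, manifest)) {
            byte[] buffer = new byte[8192];
            for (String className : classNames) {
                String entryName = className.replace('.', '/') + ".class";
                try (InputStream in = ClassLoader.getSystemResourceAsStream(entryName)) {
                    if (in == null) {
                        throw new IOException("Not on classpath: " + className);
                    }
                    jar.putNextEntry(new JarEntry(entryName));
                    int n;
                    while ((n = in.read(buffer)) != -1) {
                        jar.write(buffer, 0, n);
                    }
                    jar.closeEntry();
                    written++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return written;
    }

    public static void main(String[] args) {
        // Placeholder list: JDK classes are used only so the sketch runs
        // without any application Jars on the classpath.
        List<String> needed = Arrays.asList("java.lang.String", "java.util.ArrayList");
        int count = writeJar(Paths.get("consolidated-sketch.jar"), needed, "1.0.0", "Example");
        System.out.println("Wrote " + count + " classes");
    }
}
```

Reading class files through the classpath rather than opening each source Jar keeps the sketch short. A real tool would also need to carry over non-class resources that the retained classes load.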

Note that while Maven is the standard build tool for Java apps, I always wrap my Maven calls in a makefile. I do that because when creating a microservice, I need to call command-line tools such as Docker, Compose, Kubernetes, and many others, and make is the most general-purpose tool on Unix/Linux systems. It is also language agnostic, and many of my projects are multi-language, especially when I use machine learning components.

To call jarcon to consolidate your Jar files, this is the basic syntax:
java -cp tools-classpath \
    com.cliffberg.jarcon.JarConsolidator \
    your-app-classpath \
    output-jar-name \
    jar-manifest-version \
    jar-manifest-name


Here is a sample makefile snippet that calls jarcon to consolidate all of my project's Jar files into a single Jar file containing only the classes that are actually used:

consolidate:
    java -cp "$(JARCON_ROOT):$(CDA_ROOT)/lib/*" \
        com.cliffberg.jarcon.JarConsolidator \
        --verbose \
        "$(IMAGEBUILDDIR)/$(APP_JAR_NAME):$(IMAGEBUILDDIR)/jars/*" \
        scaledmarkets.recommenders.mahout.UserSimilarityRecommender \
        $(ALL_JARS_NAME) \
        "1.0.0" "Cliff Berg"


The above example produces a 6.5 MB Jar file containing 3837 classes. This is in contrast to the 97.7 MB collection of 114 Jar files that would be included in the container image if jarcon were not used.
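If you want to check figures like these for your own build, a small helper can count the class entries in a Jar. This is a sketch using the standard java.util.jar.JarFile API. The names JarClassCount and countClasses are made up for illustration and are not part of jarcon.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.jar.JarFile;

class JarClassCount {
    // Counts the entries in the Jar whose names end in ".class".
    static long countClasses(Path jarPath) {
        try (JarFile jar = new JarFile(jarPath.toFile())) {
            return jar.stream()
                      .filter(entry -> entry.getName().endsWith(".class"))
                      .count();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 1) {
            System.err.println("usage: java JarClassCount <path-to-jar>");
            return;
        }
        Path jarPath = Paths.get(args[0]);
        // Report both the class count and the file size in bytes.
        System.out.println(countClasses(jarPath) + " classes, "
                + jarPath.toFile().length() + " bytes");
    }
}
```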

The components of the microservice container image are:
  1. Our application Jars, and the Jars used by our application.
  2. Java runtime.
  3. Base OS.
In the example above, we have compressed #1 from 97.7 MB down to 6.5 MB, but the Java runtime still consumes many tens of MB. The OS can vary a great deal: if we use, say, CentOS, we are talking about 300 MB just for the OS. If instead we use Alpine Linux, then #3 is only about 20 MB. That leaves the Java runtime. To shrink that, we need the Java module system, which requires Java 9 or later. Java 9 also requires some different considerations for Maven. I will leave that for a future article.