Monday, May 27, 2013

Let's go uber!

How do you run your non-webapp, Maven-based Java program from the command line? One option is the exec:java route, specifying the main class. The only sticking point is that the classpath will be filled with references to the user’s local repository (usually NFS mounted). I would be much more comfortable if all the JARs I depend on were packaged WAR-style and ready at my disposal. In that sense, the application would be much more self-contained. It also becomes that much easier for someone to test-drive your application. Hence the concept of an “uber” JAR: a unified JAR that houses all classes/resources of the project along with the classes/resources of its transitive dependencies.
There are two ways to achieve this:
  1. Using maven assembly plugin
    1. Add the following config to your POM, and you will have a JAR named moduleName-jar-with-dependencies.jar as part of the package phase
    2.     <build>
              <plugins>
                  <plugin>
                      <artifactId>maven-assembly-plugin</artifactId>
                      <version>2.4</version>
                      <configuration>
                          <descriptorRefs>
                              <descriptorRef>jar-with-dependencies</descriptorRef>
                          </descriptorRefs>
                          <archive>
                              <manifest>
                                  <mainClass>com.kilo.DriverCLI</mainClass>
                              </manifest>
                          </archive>
                      </configuration>
                      <executions>
                          <execution>
                              <id>make-assembly</id>
                              <phase>package</phase>
                              <goals>
                                  <goal>single</goal>
                              </goals>
                          </execution>
                      </executions>
                  </plugin>
              </plugins>
          </build>
    3. Now all you do is run java -jar moduleName-jar-with-dependencies.jar mainClassArguments and you are done!
    4. Notes:
      1. The resultant JAR contains the exploded directory structure of the project as well as that of its transitive dependencies; it is not really a JAR of JARs (as Java itself can’t work with a JAR of JARs)
      2. If two different files in two different JARs share the same path, like META-INF/foo/bar, then one will overwrite the other. Empirically, the winner seems to follow the alphabetical order of the dependencies.
      3. This may seem bad, but it is the same resolution that happens in a web app among the WEB-INF/lib JARs. The only difference is that there is no distinction between target/classes and target/WEB-INF/lib here (in a web app, the classes location is given preference over WEB-INF/lib).
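      4. SPIs defined in different JARs get truly messed up: provider files under META-INF/services that share a name overwrite one another, as in note 2, instead of being merged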
      5. Licences from the different JARs get overwritten too. We usually don’t care very much about that, except maybe from a legal point of view?
  2. Using maven shade plugin
    1. Add the following config to your POM and you will have a JAR named moduleName-jar-with-dependencies.jar as part of the package phase
    2.     <build>
              <plugins>
                  <plugin>
                      <groupId>org.apache.maven.plugins</groupId>
                      <artifactId>maven-shade-plugin</artifactId>
                      <version>2.0</version>
                      <executions>
                          <execution>
                              <phase>package</phase>
                              <goals>
                                  <goal>shade</goal>
                              </goals>
                              <configuration>
                                  <filters>
                                      <filter>
                                          <artifact>*:*</artifact>
                                          <excludes>
                                              <exclude>META-INF/*.SF</exclude>
                                              <exclude>META-INF/*.DSA</exclude>
                                              <exclude>META-INF/*.RSA</exclude>
                                          </excludes>
                                      </filter>
                                  </filters>
                                  <transformers>
                                      <transformer
                                          implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
                                          <mainClass>com.kilo.DriverCLI</mainClass>
                                      </transformer>
                                  </transformers>
                                  <shadedArtifactAttached>true</shadedArtifactAttached>
                                  <shadedClassifierName>jar-with-dependencies</shadedClassifierName>
                              </configuration>
                          </execution>
                      </executions>
                  </plugin>
              </plugins>
          </build>
    3. Now all you do is run java -jar moduleName-jar-with-dependencies.jar mainClassArguments and you are done!
    4. Notes:
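      1. There were issues with signed JARs when working with the shade plugin: once their contents are merged into the uber JAR, the old signatures no longer match and the JVM refuses to load it. Hence the signature-related files (*.SF, *.DSA, *.RSA) are filtered out. Not really sure why this doesn’t affect the assembly approach. Any experts to clear the air?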
    5. Bonus Features:
      1. Customization of Manifest files to include more information
      2. First-class support to include/exclude resources from artifacts using wildcard patterns
      3. Ability to create a minimized JAR: the plugin analyses the Java byte code and keeps only the classes that are actually referenced in the uber JAR. The savings seem to be huge, with the resultant artifact shrinking to almost 15% of the original size. However, it can’t be used for Spring-based applications, where classes are instantiated at runtime from fully qualified names that the shade plugin cannot discover through static analysis. The same goes for other frameworks that load classes on the fly, hence something to stay away from.
      4. Ability to relocate classes to avoid duplicate classes on the classpath, though one should not have such artifacts in the first place. The shade plugin will point out if it encounters the same class in multiple artifacts.
      5. Ability to dictate that the uber JAR be the only artifact produced (and hence deployed)
      6. Resource transformers to merge licence information, notices, SPI endpoints, etc.
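A few of the bonus features above can be sketched as a shade configuration fragment. This is only an illustration, not a tested setup: the relocated package names (org.example.shared, com.kilo.shaded) are made up, and the elements are meant to be merged into the execution's configuration shown earlier rather than pasted in as-is.

```xml
<!-- Sketch only: merge into the <configuration> of the shade execution. -->
<configuration>
    <!-- Bonus 3: set to true to drop classes that static analysis finds
         unreferenced. Left off here, since frameworks that load classes
         by name at runtime (Spring, for instance) would break. -->
    <minimizeJar>false</minimizeJar>

    <!-- Bonus 4: move a clashing package under a private prefix.
         Both package names are hypothetical placeholders. -->
    <relocations>
        <relocation>
            <pattern>org.example.shared</pattern>
            <shadedPattern>com.kilo.shaded.org.example.shared</shadedPattern>
        </relocation>
    </relocations>

    <!-- Bonus 5: false (the default) makes the shaded JAR replace the
         main artifact instead of attaching it under a classifier. -->
    <shadedArtifactAttached>false</shadedArtifactAttached>

    <!-- Bonus 6: resource transformers. -->
    <transformers>
        <!-- Concatenates META-INF/services provider files so SPI
             entries from different JARs are kept together. -->
        <transformer
            implementation="org.apache.maven.plugins.shade.resource.ServicesResourceTransformer"/>
        <!-- Handle duplicate Apache LICENSE and NOTICE files. -->
        <transformer
            implementation="org.apache.maven.plugins.shade.resource.ApacheLicenseResourceTransformer"/>
        <transformer
            implementation="org.apache.maven.plugins.shade.resource.ApacheNoticeResourceTransformer"/>
        <!-- Same main class entry as in the config above. -->
        <transformer
            implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer">
            <mainClass>com.kilo.DriverCLI</mainClass>
        </transformer>
    </transformers>
</configuration>
```

Note that with shadedArtifactAttached set to false, the shadedClassifierName setting no longer applies, so the uber JAR takes the place of the regular project JAR.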
Based on this analysis, the shade plugin sounds like the easier route. It seeks to tackle the exact problem of creating an uber JAR and hence may be the better-maintained solution of the two. Hope this helps!