Review of Maven: The Definitive Guide
June 19, 2009 in blog by Duchess
I had heard a lot about Maven, and it sounded like a nice addition to a developer’s toolbox, so I decided to dive into the subject. I quickly found the book “Maven: The Definitive Guide” by Tim O’Brien, John Casey, Brian Fox, Bruce Snyder, Jason Van Zyl, and Eric Redmond. This review/summary is based on version 0.22. It is a living book, so it is still under construction; on their website (http://www.sonatype.com/books/maven-book/pdf/maven-definitive-guide.pdf) there is already a version 0.5, which is apparently a later version. In the next few pages I have attempted to summarize the book so that you can decide for yourself whether it is worth your time. I must note, however, that in some places I have left material out, specifically where the book goes into great detail, as the summary would otherwise get too long. So if you want to know anything specific about Maven, have a look at the website and check the index to see whether your subject is covered after all.
Introducing Apache Maven
The first chapter describes Maven as a project management tool that relies heavily on convention over configuration. This means that a lot of things about your project, such as the location of the source code, do not have to be specified. However, most of these defaults can be customized. Together, these defaults offer a common interface for building software, where a user can install a certain piece of software with a two-word command (mvn install). And while the core of Maven doesn’t do very much besides parsing a few XML documents and keeping track of a lifecycle and a few plugins, it can be extended very easily with a lot of plugins. Each of these plugins can be downloaded from a remote repository, where they are maintained centrally, so you don’t have to continually upgrade Maven to get new functionality. Everything about your project is described in the project object model: the coordinates, the license, the developers, etcetera. This model enables features like dependency management, remote repositories, reuse of build logic, tool portability, and easy searching of artifacts.
Installing and Running Maven
The next chapter gives detailed instructions on how to install Maven and verify that it works. The paragraphs here are all pretty short, as installing Maven is relatively easy. There are instructions for different operating systems, and also for upgrading and even uninstalling Maven.
Part I. Maven by Example
After these two introductory chapters Maven is “introduced by doing”. The chapters in this part are a long string of examples showing exactly how to use Maven.
A Simple Maven Project
The first example shows how to create a simple project from scratch using the Maven Archetype plugin. Guided by this example they show the default Maven file structure and explain that the project object model (aka POM) is stored in the file named pom.xml. They also give the commands that you can use to build, package and install the software. And finally they explain some of the core concepts of Maven, like plugins and goals, the lifecycle, coordinates, repositories, dependency management and site generation and reporting.
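To give an idea of what such a generated project looks like, here is a rough sketch of a pom.xml of the kind the Archetype plugin produces. It is my own illustration rather than a listing from the book; the groupId, artifactId and version are placeholder values.

```xml
<!-- pom.xml: a sketch; the coordinates below are placeholder values -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>

  <!-- Coordinates: together these identify the project in a repository -->
  <groupId>com.example</groupId>
  <artifactId>simple</artifactId>
  <version>1.0-SNAPSHOT</version>

  <!-- jar is the default packaging, so this line could be left out -->
  <packaging>jar</packaging>

  <dependencies>
    <!-- A single test dependency, only on the classpath while testing -->
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>3.8.1</version>
      <scope>test</scope>
    </dependency>
  </dependencies>
</project>
```

With a file like this in the project root, mvn install compiles, tests and packages the code and copies the resulting JAR into the local repository.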
Customizing a Maven Project
The second example is a weather project, which is used to show customization. Contrary to the first example, this example depends on some other projects, and so these dependencies are added to the POM. The project also has some resources that need to be added to the classpath. By putting these files in the right directory (src/main/resources), no further configuration is needed, as Maven picks them up automatically. The application should then be tested, so unit tests are written and again put in the right directory (src/test/java). No configuration is needed for the tests to be found, as testing is part of the lifecycle. The tests do have some dependencies of their own, and so test-scoped dependencies are explained. Tests can also have resources (src/test/resources). Packaging and installing both run the tests, but the lifecycle can also be made to stop after the test phase by calling mvn test. Finally the Assembly plugin is demonstrated to put together a distribution that can be deployed on a server.
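The difference between a normal and a test-scoped dependency is easiest to see in the POM itself. The fragment below is a minimal sketch of a dependencies section; the two libraries and their versions are only illustrative choices on my part, not necessarily the ones the weather project uses.

```xml
<!-- Fragment of pom.xml; libraries and versions are illustrative assumptions -->
<dependencies>
  <!-- No scope given, so this defaults to compile scope:
       available when compiling, testing and running -->
  <dependency>
    <groupId>log4j</groupId>
    <artifactId>log4j</artifactId>
    <version>1.2.14</version>
  </dependency>

  <!-- Test scope: only on the classpath for code under src/test/java -->
  <dependency>
    <groupId>commons-io</groupId>
    <artifactId>commons-io</artifactId>
    <version>1.3.2</version>
    <scope>test</scope>
  </dependency>
</dependencies>
```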
A Simple Web Application
The third example shows how to configure a web application. First of all, the packaging element needs to contain the value war instead of the default jar. Then they add the Jetty plugin, so the application doesn’t have to be deployed manually. Thirdly, a servlet is added and configured (in web.xml). To get the servlet to compile, the Servlet API is added as a dependency.
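Put together, these steps lead to a POM along the following lines. This is a sketch under assumptions: the project coordinates are placeholders, the Servlet API version is just an example, and the Jetty plugin coordinates are the ones I know from the Maven 2 era (newer Jetty releases ship the plugin under different coordinates).

```xml
<!-- pom.xml for a web application: a sketch with placeholder coordinates -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>simple-webapp</artifactId>
  <version>1.0-SNAPSHOT</version>

  <!-- war instead of the default jar -->
  <packaging>war</packaging>

  <dependencies>
    <!-- Needed to compile the servlet; provided scope because the
         servlet container supplies it at runtime (version is an example) -->
    <dependency>
      <groupId>javax.servlet</groupId>
      <artifactId>servlet-api</artifactId>
      <version>2.4</version>
      <scope>provided</scope>
    </dependency>
  </dependencies>

  <build>
    <plugins>
      <!-- Maven 2 era Jetty plugin coordinates (an assumption, see above) -->
      <plugin>
        <groupId>org.mortbay.jetty</groupId>
        <artifactId>maven-jetty-plugin</artifactId>
      </plugin>
    </plugins>
  </build>
</project>
```

With the plugin declared, mvn jetty:run starts the application in an embedded Jetty container, so no separate deployment step is needed during development.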
A Multi-module Project
A multimodule project consists of a parent module and several submodules. The parent module mainly contains the parent POM that references the submodules. For the fourth example they combine the weather project and the web application of the previous two chapters. The parent POM states that the packaging type is pom, defines the modules that the project consists of, and defines some settings which will be inherited by all submodules. Each submodule gets an extra element that refers to the parent module. While building a multi-module project, Maven keeps the order in which the submodules are listed in the parent POM unless it has to change it, for instance when one module depends on another that is listed later.
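A parent POM of this kind could look roughly like the sketch below. The coordinates and the two module names are placeholders of my own; each module name is the directory of a submodule relative to the parent.

```xml
<!-- Parent pom.xml: a sketch; coordinates and module names are placeholders -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>simple-parent</artifactId>
  <version>1.0-SNAPSHOT</version>

  <!-- A parent produces no JAR or WAR of its own, only the POM -->
  <packaging>pom</packaging>

  <!-- Listed order is the build order, unless dependencies
       between the modules force Maven to reorder them -->
  <modules>
    <module>simple-weather</module>
    <module>simple-webapp</module>
  </modules>
</project>
```

Each submodule then points back to this POM with a parent element that repeats the parent’s groupId, artifactId and version.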
Multi-module Enterprise Project
In this chapter they give an example of a small multi-module enterprise project that shows what a larger real life project might look like. They integrate Hibernate and Spring to show how this might be done. A lot of the chapter is in fact not spent on Maven at all, which shows the simplicity of Maven.
Optimizing and Refactoring POMs
In the final chapter of part I, the authors have a look at what they are left with after the last example. The first step in cleaning up a multimodule project’s POM is looking for duplication. The first duplication that is encountered is in the dependencies. This is cleared up by moving these dependencies to the dependencyManagement section in the parent POM. Then the versions of these dependencies in the submodules are removed, as they would otherwise override the dependencyManagement. The next step is to put the version shared by several related dependencies into a property, so that they will never differ. After that the sibling dependencies inside the project itself are resolved by using ${project.groupId} and ${project.version} to denote the common groupId and version.
After the dependencies, the plugins are optimized. Most complex Maven multimodule projects tend to define all versions in the top-level POM. The pluginManagement section plays the same role for plugins: plugin versions and configuration that are duplicated across submodules can be defined there once. To find used dependencies that are undeclared, the Maven Dependency plugin can be used (mvn dependency:analyze). It can also be used to find unused, declared dependencies, but this is a little trickier due to, for example, scoped dependencies.
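The dependency clean-up described above can be sketched as the following parent POM. The coordinates, the property name, the sibling artifactId simple-model and the version value are illustrative assumptions on my part.

```xml
<!-- Parent pom.xml: a sketch; names and versions are illustrative -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>simple-parent</artifactId>
  <version>1.0-SNAPSHOT</version>
  <packaging>pom</packaging>

  <properties>
    <!-- One property keeps the related artifacts on the same version -->
    <hibernate.annotations.version>3.3.0.ga</hibernate.annotations.version>
  </properties>

  <dependencyManagement>
    <dependencies>
      <!-- Sibling module: follows the project's own groupId and version -->
      <dependency>
        <groupId>${project.groupId}</groupId>
        <artifactId>simple-model</artifactId>
        <version>${project.version}</version>
      </dependency>
      <dependency>
        <groupId>org.hibernate</groupId>
        <artifactId>hibernate-annotations</artifactId>
        <version>${hibernate.annotations.version}</version>
      </dependency>
      <dependency>
        <groupId>org.hibernate</groupId>
        <artifactId>hibernate-commons-annotations</artifactId>
        <version>${hibernate.annotations.version}</version>
      </dependency>
    </dependencies>
  </dependencyManagement>
</project>
```

A submodule then declares only the groupId and artifactId of a dependency it needs, and the version is taken from the parent.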
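For plugins, the same idea could look like the fragment below, which would live in the parent POM. It is a sketch: the plugin version and the Java level are example values, not recommendations from the book.

```xml
<!-- Fragment of the parent pom.xml; version and Java level are examples -->
<build>
  <pluginManagement>
    <plugins>
      <!-- Defined once here, inherited by every submodule
           that uses the compiler plugin -->
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <version>2.0.2</version>
        <configuration>
          <source>1.5</source>
          <target>1.5</target>
        </configuration>
      </plugin>
    </plugins>
  </pluginManagement>
</build>
```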
Part II. Maven Reference
In this part of the book the blanks are filled in and the authors really dig into the details.
The POM
A Maven project is defined by the presence of a pom.xml file. While most of the examples in the book are geared towards Java applications, there is nothing Java-specific in the definition of a Maven POM. A POM contains four categories of description and configuration: general project information (name, URL, developers, etc.), build settings (the behavior of the default Maven build), build environment (profiles that can be activated for use in different environments), and POM relationships (as a project rarely stands alone). All Maven POMs implicitly extend the Super POM, which defines a set of defaults shared by all projects, including the central Maven repository, the default plugin repository, the directories in the Maven Standard Directory Layout, and the default versions of the core plugins. As all these defaults are specified (and therefore inherited by every POM), the simplest POM contains little more than the coordinates of the project (groupId, artifactId and version), alongside the mandatory modelVersion element. Since POMs can inherit configuration from other POMs, you must always think of a Maven POM as the combination of the Super POM, plus any parent POMs, and finally the current project’s POM, where the lower level overrides the higher level (the Super POM being the highest level).
The dependencies described in the POM can be both internal (your own projects) and external (third-party libraries). Each dependency has one of five scopes: compile (default), provided (when the JDK or container provides them), runtime (required to execute and test, but not to compile), test (required for testing only) and system (native jars). Dependencies can also be declared optional, when you don’t want them showing up as transitive dependencies in the projects depending on this project. The consequence is that a project depending on this one has to declare them explicitly if it needs them. Furthermore, it is also possible to specify a range of versions that would satisfy a given dependency. This is very convenient when a certain functionality you need was added in a specific version but discontinued in a later version. Transitive dependencies are dependencies of dependencies. In Maven 1 you had to specify all of these explicitly, but Maven 2 is able to sort these out on its own, by building a graph of dependencies and dealing with any conflicts and overlaps that might occur. When a conflict does occur, it can be solved by, for example, excluding certain transitive dependencies (explicitly) and replacing them with another dependency (implicitly). Finally, the versions of the dependencies can be managed in the parent POM with dependencyManagement.
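To illustrate how a lower level overrides a higher one, here is a minimal sketch of a project POM that replaces one default inherited from the Super POM, namely the source directory. The coordinates and the directory name src/java are placeholders I made up for the example.

```xml
<!-- pom.xml: a sketch; coordinates and directory name are placeholders -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example</groupId>
  <artifactId>legacy-layout</artifactId>
  <version>1.0-SNAPSHOT</version>

  <build>
    <!-- Overrides the Super POM default of src/main/java;
         everything not mentioned here keeps its inherited value -->
    <sourceDirectory>src/java</sourceDirectory>
  </build>
</project>
```

Running mvn help:effective-pom prints the merged result of the Super POM, any parent POMs and this POM, which is a convenient way to check what a project really inherits.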
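The fragment below sketches how a version range, an exclusion and an optional dependency are written down. Apart from JUnit, all artifacts (project-a, project-b, project-c) are hypothetical names used only for illustration.

```xml
<!-- Fragment of pom.xml; project-a, -b and -c are hypothetical artifacts -->
<dependencies>
  <!-- Version range: 3.8 or later, but strictly below 4.0 -->
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>[3.8,4.0)</version>
    <scope>test</scope>
  </dependency>

  <!-- Exclusion: use project-a, but drop its transitive
       dependency on project-b -->
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>project-a</artifactId>
    <version>1.0</version>
    <exclusions>
      <exclusion>
        <groupId>com.example</groupId>
        <artifactId>project-b</artifactId>
      </exclusion>
    </exclusions>
  </dependency>

  <!-- Optional: needed to build this project, but not passed on
       to projects that depend on this one -->
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>project-c</artifactId>
    <version>1.0</version>
    <optional>true</optional>
  </dependency>
</dependencies>
```

In a range, square brackets include the boundary and parentheses exclude it.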
The Build Lifecycle
The Maven lifecycle is the mechanism that allows Maven to act upon the objects in the POM. Such a lifecycle consists of a sequence of named phases. There are three standard lifecycles in Maven: clean, default and site. The clean lifecycle is the simplest: it deletes the output of a build by deleting the build directory. It consists of three phases: pre-clean, clean and post-clean. The default lifecycle is also called the build lifecycle. It consists of 21 phases, starting with validate, running through compile, test and package, and ending with deploy. The third lifecycle is the site lifecycle, which consists of the four phases pre-site, site, post-site and site-deploy. This lifecycle produces the project documentation and generates a site for it.
The goals bound to each phase default to a set of goals specific to a project’s packaging. A project with packaging jar has a different set of default goals from a project with a packaging of war, pom, maven-plugin, ejb or ear. But there are a lot of similar goals in many of the packaging lifecycles. Most of the lifecycles have goals for managing resources, running tests, and compiling source code.
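Besides these default bindings, a project can bind extra goals to a phase itself. As a sketch of what that looks like (my own example, not one from the book), the fragment below binds the jar goal of the Maven Source plugin to the verify phase; the execution id is an arbitrary name.

```xml
<!-- Fragment of pom.xml; the execution id is an arbitrary name -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-source-plugin</artifactId>
      <executions>
        <execution>
          <id>attach-sources</id>
          <!-- Run the goal whenever the build reaches this phase -->
          <phase>verify</phase>
          <goals>
            <goal>jar</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```

Because phases run in sequence, calling mvn install would pass through verify and therefore also produce the sources JAR.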
Build Profiles
Profiles make it possible to customize a build for a particular environment, and so enable portability between different build environments. Portability is a measure of how easy it is to take a particular project and build it in different environments. A Maven profile is an alternative set of configuration that sets or overrides default values. You can even activate a profile conditionally, if the decision whether to use it depends on some property of the system, like the presence of a Java 6 JDK. As Maven profiles can be defined in a variety of sources, there is no good way to keep track of which profiles are available. Therefore the Maven Help plugin defines a goal, active-profiles, which lists all the active profiles and where they have been defined.
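A conditionally activated profile could be written as in the sketch below. The profile id and the property compiler.target are names I invented for the example; only the activation element is the point here.

```xml
<!-- Fragment of pom.xml; profile id and property name are made up -->
<profiles>
  <profile>
    <id>jdk16</id>
    <!-- Activated automatically when the build runs on a 1.6 JDK -->
    <activation>
      <jdk>1.6</jdk>
    </activation>
    <!-- Values here override the project defaults while the profile is active -->
    <properties>
      <compiler.target>1.6</compiler.target>
    </properties>
  </profile>
</profiles>
```

A profile can also be switched on by hand with the -P option, for example mvn install -P jdk16, and mvn help:active-profiles shows which profiles ended up active.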
Maven Assemblies
Sometimes you’ll need to create an archive or directory with a custom layout. Such custom archives are called Maven Assemblies. They can be made with the Assembly plugin. With this plugin you can create your own archive recipe, called an assembly descriptor, but there are also several built-in descriptors, such as bin, jar-with-dependencies, project, and src.
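Using one of the built-in descriptors takes only a few lines of configuration. The fragment below is a sketch that binds the plugin’s single goal to the package phase; the execution id is an arbitrary name of mine.

```xml
<!-- Fragment of pom.xml; the execution id is an arbitrary name -->
<build>
  <plugins>
    <plugin>
      <groupId>org.apache.maven.plugins</groupId>
      <artifactId>maven-assembly-plugin</artifactId>
      <configuration>
        <descriptorRefs>
          <!-- Built-in descriptor: one JAR containing the project
               classes plus all of its dependencies -->
          <descriptorRef>jar-with-dependencies</descriptorRef>
        </descriptorRefs>
      </configuration>
      <executions>
        <execution>
          <id>make-assembly</id>
          <phase>package</phase>
          <goals>
            <goal>single</goal>
          </goals>
        </execution>
      </executions>
    </plugin>
  </plugins>
</build>
```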
Properties and Resource Filtering
Properties are a convenient method of keeping your projects consistent. Maven has some implicit properties that are available in any project: project.* (used to reference values in a POM), settings.* (used to reference values from your Maven Settings), env.* (used to reference environment variables), and System Properties (used to reference any property that can be retrieved from the System.getProperty() method). In addition to these implicit properties there is also a set of arbitrary, user-defined properties.
In addition to these definitions, you can also use resource filtering to perform variable replacement on project resources. This feature has to be enabled first. It combines nicely with profiles, for example to reference an external resource such as a database that differs between systems.
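Enabling filtering is a matter of one extra element on the resource directory. In the sketch below, the property jdbc.url and its value are assumptions made up for the example.

```xml
<!-- Fragment of pom.xml; the property name and value are made up -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <!-- modelVersion, coordinates and the rest of the POM omitted -->

  <properties>
    <jdbc.url>jdbc:hsqldb:mem:devdb</jdbc.url>
  </properties>

  <build>
    <resources>
      <resource>
        <directory>src/main/resources</directory>
        <!-- Off by default; true switches on variable replacement -->
        <filtering>true</filtering>
      </resource>
    </resources>
  </build>
</project>
```

A line such as jdbc.url=${jdbc.url} in a properties file under src/main/resources would then be copied to the output directory with the reference replaced by the property value. A profile can redefine the property to point at a different database per environment.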
Maven and Eclipse: m2eclipse
This chapter is replaced by an entire book: http://www.sonatype.com/books/m2eclipse-book/reference/
Site Generation
Maven can be used to create a project web site that captures information relevant to both the end-user and the developer audience. The simplest POM would generate a website that is mostly empty and not very useful. To fix this, the site descriptor has to be customized. You can easily alter, for example, the default menu and the logo of the site. All the files for the site should be placed under src/site, where there are different directories for different file formats like APT, FML and XDoc. Maven uses a documentation-processing engine called Doxia, which reads multiple source formats into a common document model. To write documentation for your project, you will need to write your content in a format which can be parsed by Doxia, like the aforementioned APT, FML and XDoc. To deploy your site you’ll use the Maven Site plugin, which can take care of deploying your project’s site to a remote server using a number of methods including FTP, SCP, and DAV.
The easiest way to affect the look and feel of your project’s web site is through the project’s site.css file (under src/site/resources). To change the page structure that is rendered by default, we can configure the site plugin in our POM to use a custom page template.
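A customized site descriptor with its own logo and menu could look like the sketch below. The project name, image path, link and menu item are placeholders of my own.

```xml
<!-- src/site/site.xml: a sketch; names, paths and links are placeholders -->
<project name="Sample Project">
  <!-- Logo shown at the top left of every page -->
  <bannerLeft>
    <name>Sample Project</name>
    <src>images/logo.png</src>
    <href>http://www.example.com</href>
  </bannerLeft>

  <body>
    <!-- A hand-written menu -->
    <menu name="Sample Project">
      <item name="Overview" href="index.html"/>
    </menu>

    <!-- Pulls in the menu of generated project reports -->
    <menu ref="reports"/>
  </body>
</project>
```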
Repository Management with Nexus
This chapter is replaced by an entire book: http://www.sonatype.com/books/nexus-book/reference/
Writing Plugins
99 out of 100 Maven users will never need to write a custom plugin to customize Maven; there is an abundance of configurable plugins, and unless your project has particularly unique requirements, you will have to work to find a reason to write a new plugin. A Maven Plugin is a Maven artifact which contains a plugin descriptor and one or more Mojos. A Mojo can be thought of as a goal in Maven, and every goal corresponds to a Mojo. A Maven plugin contains a road-map for Maven that tells Maven about the various Mojos and plugin configuration, called a plugin descriptor. When you are writing custom Maven plugins, you will almost never need to think about writing a plugin descriptor. The lifecycle goals bound to the maven-plugin packaging type show that the plugin:descriptor goal is bound to the generate-resources phase. This goal generates a plugin descriptor off of the annotations present in a plugin’s source code. There are three parts to a plugin descriptor: the top-level configuration of the plugin which contains elements like groupId and artifactId, the declaration of mojos, and the declaration of dependencies.
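The Mojos themselves are written in code, but the POM of a plugin project is ordinary Maven configuration. A minimal sketch, with placeholder coordinates and an example version of the plugin API, might be:

```xml
<!-- pom.xml of a plugin project: a sketch; coordinates are placeholders -->
<project xmlns="http://maven.apache.org/POM/4.0.0">
  <modelVersion>4.0.0</modelVersion>
  <groupId>com.example.plugins</groupId>
  <artifactId>first-maven-plugin</artifactId>
  <version>1.0-SNAPSHOT</version>

  <!-- This packaging type brings in the lifecycle bindings that
       generate the plugin descriptor during the build -->
  <packaging>maven-plugin</packaging>

  <dependencies>
    <!-- API that Mojos are written against (version is an example) -->
    <dependency>
      <groupId>org.apache.maven</groupId>
      <artifactId>maven-plugin-api</artifactId>
      <version>2.0</version>
    </dependency>
  </dependencies>
</project>
```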
Writing Plugins in Alternative Languages
Mojos can be written in Java, or in an alternative language. Maven has support for a number of implementation languages, and this chapter shows how to create plugins in three languages: Groovy, Ant, and Ruby.
Using Maven Archetypes
An archetype is a template for a Maven project which is used by the Maven Archetype plugin to create new projects. Archetypes are useful for open source projects such as Apache Wicket or Apache Cocoon which want to present end-users with a set of baseline projects that can be used as a foundation for new applications. Archetypes can also be useful within an organization that wants to encourage standards across a series of similar and related projects. Archetypes can be used by invoking the generate goal of the Archetype plugin via the command-line or with m2eclipse. There are already many archetypes out there, including the default Maven archetypes for creating a simple project with JAR packaging and a single dependency on JUnit, a web application, and a mojo. Then there are a lot of third-party archetypes, like AppFuse, Confluence and JIRA, and Wicket.
Conclusion
All in all, it is a very comprehensive book that is well worth your time. The first part is perfect for beginners, as it guides you gently into the Maven domain. The second part is perfect for the more advanced user who wants to know the details of some subject.
