Maven 2, Or: How I Learned To Stop Worrying And Love The .pom

February 3, 2006

We've had a tough time - we've got... actually, we had... a build system which is hard to beat, in some ways. Maven was up against a strong contender.

Our old build system knows where it's being built - each developer can set up a set of properties that lets them control what's used on their own system to build the product. It also knows where it's going - each target host has a set of properties that describe the structure and layout of the target system. Build variables in configuration files were found by regex and replaced as the files went into their targethost-specific build directories, allowing for tailored packages specific to their runtime environment. The build system tracked which dependencies needed to be installed into /common/lib, which things needed to go into /common/classes, and which things needed to go into the WAR.

It did all this with a minimum of fuss and with bags of flexibility, and it was written over a very short time. It did all this in Ant.
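A minimal sketch of that kind of per-host substitution in Ant, for readers who haven't seen the trick. It is not the build file described here: the directory layout, the hosts/*.properties files, and the target name are assumptions made for illustration, and it uses Ant's stock @token@ filtering rather than hand-rolled regexes.

```xml
<!-- Hypothetical build.xml fragment: every path, file and target name here is an assumption -->
<project name="deploy-sketch" default="stage-config" basedir=".">

  <!-- Default host; override on the command line with: ant -Dtargethost=staging -->
  <property name="targethost" value="localhost"/>
  <property name="stage.dir" location="build/${targethost}"/>

  <target name="stage-config">
    <mkdir dir="${stage.dir}/conf"/>
    <!-- Copy the config templates, replacing @key@ tokens with the values
         defined in the properties file for the chosen host -->
    <copy todir="${stage.dir}/conf" overwrite="true">
      <fileset dir="conf" includes="**/*.xml,**/*.properties"/>
      <filterset>
        <filtersfile file="hosts/${targethost}.properties"/>
      </filterset>
    </copy>
  </target>

</project>
```

With a line such as db.host=... in hosts/staging.properties, every @db.host@ in the templates would be replaced on the way into build/staging/conf.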

In comes Maven 2, the new contender for king of the build system. In its fundamental form, Maven 2 attempts to be a 'packaged' build system: Slot your tasks into the build environment, and you're off. Maven can decipher and locate dependencies, build your software, test it, run code coverage, build reports, and do a host of other endlessly useful things that would take lots of time and energy to build into your own build system. What I wasn't really prepared for was the cost.

In my honest opinion? That cost is the fast-and-loose flexibility you've grown dependent on.

Now, I'm sure it's fine for lots of people, and with simple apps. Hell, I'm sure it's fine with complicated apps that never need to 'bend the rules'. But there's no scope for rule bending in Maven: You work the way it wants you to. That's not always a bad thing - it's often a good thing to impose structure on an unstructured world. The one-size-fits-all approach does grate, though, and doesn't fit every situation. However, we're a determined bunch, have a big shoehorn to match the size of the chip on our shoulders, and aren't afraid to get our elbows dirty.

Here are just a few of the problems we've had to date in our first 24 hours of moving to Maven 2.

Maven 2: Not an application, a collective

The state of dependency management for Maven's own infrastructure is all over the place. Functionality mentioned in Maven 2's documentation, like support for Ant-based plugins (which, if it worked, could provide a great deal of the missing flexibility), is simply non-functional. It was listed as a feature in 2.0.1 of Maven 2 - however, it's broken in 2.0.1, 2.0.2, and probably every other version until the actual project responsible for that plugin releases its “2.1” plugins.

The total separation of Maven from its functionality isn't the problem - if well documented, it would even be comprehensible. However, as a result of the lack of structure within Maven's own shipping processes, it's impossible to know which of the things you're dependent on are a part of Maven itself, which belong to a “third party” project, and so on, or what versions are actually needed to get something working. API changes to the underlying Maven system do huge amounts of damage to plugin infrastructure; in the case of the “2.1” plugins in question for doing Ant plugins in Maven, they prevent them from working - even though Maven's own documentation claims they work. The documentation for the product doesn't know where to draw the line; nor does the bug system, where this bug is filed under the Maven banner, but actually belongs to a project that technically doesn't even ship as a part of Maven itself.

Even the developers aren't really sure who's responsible for fixing it; developers claimed that Maven 2.0.2 would fix the problem - but of course it couldn't do so, because the ant plugin, already version 2.0.2 when Maven was 2.0.1, won't be fixed until rebuilt on top of 2.0.2 - when it itself becomes version 2.1.

This is the most fundamental of flaws in Maven's design: Fragility. You can look at this problem alone, and know, without question, that this is one of those highly interdependent systems that just isn't going to be safe to upgrade. Upgrading the Maven core can and probably will break dependent plugins - and since each plugin is on its own release cycle (or can be, or shares one with a group of others - it's arbitrary), and the lines of responsibility are at best unclear and at worst unknown, each breaking API change will unleash long periods of havoc and unusability - just like the current mess, which has resulted in a headline feature of 2.0.1 never, it seems, having worked in either 2.0.1 or 2.0.2 of Maven 2.

Contrast this with Ant, which, while it also has the benefit of being simpler and more mature (read “old”, in comparison), is pretty much bulletproof. Building reliable software needs a reliable build system. Having said that, time is always under pressure, and it's often easier for us to find ways to change the way we think about our product and change our behaviour than it is for us to come up with the time to integrate all of the tools we'd like to integrate into the build system. Standing on the shoulders of these giants would be of huge benefit to us.

Poor Eclipse Integration

It works. That's about the nicest thing one can say. Any attempt, on either our MacOS-based dev machines or our Linux-based ones, to change a pom results in endless reams of messages like these flooding the console:

    2/3/06 8:59:16 AM GMT: [WARN] Unable to get resource from repository whatsonwhen-repos (http://maven.whatsonwhen.com/repository/)
    2/3/06 8:59:17 AM GMT: [WARN] Unable to get resource from repository central (http://repo1.maven.org/maven2)
    2/3/06 8:59:17 AM GMT: Unable to download the artifact from any repository

It takes a long time to fail - and no amount of twiddling will make it go faster. It's slow, prone to hanging, painful to use at the best of times, and ultimately offers very little in the way of integration.

Eclipse integration is limited, at the moment, to the two critical requirements of any integration of Maven into Eclipse: 1) It allows you to use the .pom's calculated dependencies. Maven works out, from your own .pom descriptors and those of the libraries you depend on, what it believes is the complete list of dependencies - both compile-time and runtime - and feeds the compile-time requirements into the Eclipse project automatically. Change the pom and the dependencies, and your Eclipse project is instantly up to date.

Of course, instantly is relative. On the smallest of our projects - our framework APIs, which provide no implementation, just the interfaces, core utilities, and in some cases value objects - it takes over 30 seconds to scan the pom. Pray you don't have to do it often. And, of course, the result of this scan is little more than the aforementioned list of libs in Eclipse, because your “maven repository” - your local copy of all of the libraries you're dependent on (I don't want this to turn into a tutorial, so I'm glossing) - doesn't actually get properly updated. So you've still got to run the commandline to actually get the dependencies.

And this isn't like the Ant plugin. Don't expect to get a nice window listing all of the possible targets and letting you decide what to do. It has one other feature: It lets Eclipse build your source code. That's it. Technically, that's accomplished by the former setup of the dependencies, but it'll also set up where your source code files are located. That's not technically hard, no - but necessary nonetheless.

Don't expect to choose targets. Don't expect to 'install' anything from inside Eclipse. Expect to become very, very comfortable with using Maven's commandline, because that's ultimately where you're going to end up.

Add this to the performance issues and installation headaches of the Subclipse Subversion-Eclipse integration - which, by the way, is also less wonderful than its CVS counterpart - and one might wonder why one would bother.

On the other hand, the dependency management does work. By forcing us to think about the structure of our project, and the dependencies, and more importantly which parts of our project were dependent on which libraries, we were able to capture a clear understanding, in code, of which parts of the system have dependencies, and which don't.

It's tempting to even take it further, and like Maven 2, start separating all of our own service implementations into separate maven poms - improving the granularity of our capture. On the other hand, doing so would make our builds just as fragile as Maven's own, with dependencies flooding through the system and having to step back through the build to figure out which parts of the system created dependencies on it, and how to get rid of them. Which brings us to the next problem with Maven 2: automatic dependency management.

Automatic Dependencies: How To Ruin Your Deploy In 8 Easy Steps.

The first and most important thing to know is that you shouldn't expect to reference a remote library on ibiblio and have it work. It's nice when it does, but you can't expect it. Here are just a few examples of why:

  • Xerces: Ok, so it's always been a mess, hasn't it? Which versions of what libraries go where and do what with whom. Castor's package lists a dependency on xerces/xerces 2.4.0. Unfortunately, most other poms list their dependencies on xerces/xercesImpl - left unchecked, you'll copy both 2.4.0 and 2.6.2+ into your test environment, and land in an un-nice place. Solution? Put an exclusion on xerces/xerces on anything that forcibly includes it. That, of course, means trying to figure out who the hell is doing so - and you'll wish you never had to know this in the first place. Ultimately, a project's naming problems in its early (or its old) days will haunt you for the rest of your days.
  • XML APIs? That bugbear? That's even worse. Version 2.0.2 references 1.0.b1 as the actual version to use; there's also a 1.3 in the repository. Neither of them, of course, is the version of the XML APIs that ships with Xerces 2.7.1. And it's all too easy to end up with more than one xml-apis being pushed, despite you wishing it wouldn't, because inevitably you're going to have to store the “right” version of xml-apis for 2.7.1 in a local repository; having a later version in a local repository seems to throw it off. Solution? Exclusions. Worse yet, there are the differences between xml-apis and xmlParserAPIs. Solution? More exclusions decorated across anything that lists it as a dependency.
  • And what happened to Xerces' resolver? Oh, yeah - it's not called that. It's xml-resolver/xml-resolver. Could take you a while to figure out that the thing that always shipped with Xerces can't actually be found as a part of Xerces' own repository area. Finding out what things are called is almost as much fun as resolving their dependencies' dependencies.
  • Which brings us neatly to jaxen. There's a flag that allows libraries to list other libraries as optional dependencies - so you'd think that a system that includes plugins for jDOM, dom4j, and xom would list them as being optional, so you don't end up with all three. Fat chance. Worse yet, jdom's got the xmlapi/xerces problem. So in the end, Maven will go to download jDOM, dom4j, and xom - and all of their dependent libraries. Solution? More exclusions to go with your exclusions, forcing them all to be excluded so you can manually choose what you end up deploying.
  • Then there's all the missing stuff: our own “repository of stuff there's no repository for” includes Drew Noakes' metadata extractors for EXIF parsing (just not present), JXTA (ages old on ibiblio, and with problems of its own that require a local build to make work - it's built with conflicts with the released version of jDOM because jDOM 1.0beta8 changed API contracts with the 1.0 release. There's rocket science for ya), the ostermiller-utils library, slide, trove (1.1b3), whalin's memcached library in Java, and, as a workaround for the xerces mess, our own xml-apis/xml-apis whose version matches our Xerces dependency, 2.7.1 - so that we can call reset() on an XMLParser and reuse the instance in our parser pools.
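As a rough sketch, this is what those exclusions look like in a pom. The xerces/xerces, jdom, dom4j and xom coordinates are the ones discussed above; the castor and jaxen version numbers are placeholders chosen for illustration, not necessarily the versions in play here.

```xml
<dependencies>
  <!-- Version is an illustrative assumption -->
  <dependency>
    <groupId>castor</groupId>
    <artifactId>castor</artifactId>
    <version>0.9.7</version>
    <exclusions>
      <!-- Drop the old xerces/xerces jar so that only xercesImpl,
           declared elsewhere, ends up in the deploy -->
      <exclusion>
        <groupId>xerces</groupId>
        <artifactId>xerces</artifactId>
      </exclusion>
    </exclusions>
  </dependency>

  <!-- Version is an illustrative assumption -->
  <dependency>
    <groupId>jaxen</groupId>
    <artifactId>jaxen</artifactId>
    <version>1.1-beta-8</version>
    <exclusions>
      <!-- Exclude all three object models, then add back the one
           you actually use as an explicit dependency of your own -->
      <exclusion>
        <groupId>jdom</groupId>
        <artifactId>jdom</artifactId>
      </exclusion>
      <exclusion>
        <groupId>dom4j</groupId>
        <artifactId>dom4j</artifactId>
      </exclusion>
      <exclusion>
        <groupId>xom</groupId>
        <artifactId>xom</artifactId>
      </exclusion>
    </exclusions>
  </dependency>
</dependencies>
```

An exclusion only applies beneath the dependency that carries it, which is why the same few lines end up repeated on every library that drags in the unwanted jar.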

In short, it's that fundamental tradeoff that hits you at every corner with Maven2: To gain automation, you lose control. You can regain it - but only by micromanaging the automation. And since each project is responsible for its own poms, the chain of responsibility can fail at any point in the chain, leaving you with dependency soup.

The benefit you get for this headache? It gives you a framework for solving the problem. Admittedly, it's more complicated than setting a property in Ant. On the other hand, doing it this way... well, I'm sure there's some kind of benefit to be had from this, if only to get the benefit of the rest of Maven's tools. The problem is, when you do it yourself, you don't have to see, normally, just how ugly the versioning/dependency problem is. You grab the latest version of the libraries you know you need, chuck them in a directory, add them to the build system, and manage it all yourself. When we ship our library dependencies, we've got almost 50 of them - not including our own code - and so there are a lot of dependencies to track and manage. When you use Maven, you see the true ugliness of dependency - and the contents of your .war's WEB-INF/lib directory are what pay the price for all this 'ease of use'.

Fundamentally? It's only “easy to use” when you can leave it on automatic. Underneath, it's still a stick shift manual - and the transmission needs work.

Deploying

And then we hit the fugly part: Actually getting this stuff onto your servers.

Previously? We'd just tell it what the target host was going to be, run deploy-framework, deploy-services, deploy-webapp, and deploy-libs, and that's a complete deploy; if the targethost says it's remote, then it's built locally and copied over via SSH; if it's local, it's built locally, jars are deployed, but the war is kept exploded instead of packaged, and it's all put into the places you told it to put stuff on your machine. Got a dependency that needs to be in common/lib? No problem, deploy-libs will deploy the libraries which need to be in common/lib to that point - no need to worry they'll end up in a war file.

But how do you tell Maven to put stuff into Tomcat's common/lib? How do you make it do different things on different hosts? How do you build any kind of remotely complicated WAR, where you may not want anything and everything to end up in the war's WEB-INF/lib directory?

The answer would appear, as of this morning, to be that you don't. We're 24 hours later, just about - and we're not yet at the stage where we've got an application deploy that “just works” without also being “just totally busted”. And to get this stuff into all the right places on the server? Well, we're still using Ant - only, because there's no way for us to build a Maven plugin in Ant (see the first problem we ran into), we're doing it all from a standalone Ant script. Even that won't save us from the fact that there's little control, if any, over the details of the building of the war.
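For the narrower question of keeping a jar out of the WAR, Maven 2's dependency scopes are one partial answer. The sketch below assumes a hypothetical artifact (com.example / example-driver, not a real library) that the container supplies from common/lib. Marking it as provided keeps it on the compile classpath but out of WEB-INF/lib. It does nothing to copy the jar into Tomcat's common/lib, so that step would presumably still fall to the standalone Ant script.

```xml
<dependencies>
  <!-- Hypothetical coordinates: stands in for any jar the container already provides -->
  <dependency>
    <groupId>com.example</groupId>
    <artifactId>example-driver</artifactId>
    <version>1.0</version>
    <!-- provided: available at compile time, not packaged into the war -->
    <scope>provided</scope>
  </dependency>
</dependencies>
```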

Summary

In short - our problems have just begun. Once we're through the teething problems, we can start looking at all of the things that made us decide on Maven in the first place: its integrated support for documentation building, clover and reporting, changelog integration, testing, the Continuum continuous integration system, etc.

Those things will be the things that make this worthwhile. This, however, is not Maven's finest hour - and at this point, 24 hours in, we're still struggling to get a buildable product, even after spending weeks' worth of dev time preparing for this day; at some point, we just had to take the punt and scramble the rest of the way up the mountain.

We just weren't expecting the rest of the mountain to be quite this high - and only at its summit will we be able to look down and know whether it was all worth it.

Afterword: The Next Two Weeks

Some of this stuff has become a bit clearer over the last week or so, as we've continued our implementation and tried to get stuff working inside Maven's build environment.

Things we found out:

xml-apis versioning
This is an odd one. xml-apis seems to use more than one versioning scheme - one corresponding to its original versioning as a standalone jar (very old), and one representing the version of the JAXP API it's meant to support. For Xerces 2.7.1, it seems the one we want is xml-apis 1.3.02 - updating to that version, and dropping our own repository entry for xml-apis, seems to have worked for us.
surefire
Surefire, in pertest mode, does odd stuff. Just avoid it, really - it causes more problems than it solves at this point. Specifically, it seems that providing JVM args immediately wipes out the automatic stuff that should be added to the commandline; moreover, the output ends up not going to the console, so you end up operating your test suites blind.
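In pom terms, the xml-apis pairing described above comes down to something like the following sketch. The versions are the ones named in the text, and the coordinates are the xerces/xercesImpl and xml-apis/xml-apis names mentioned earlier; any exclusions still needed on other libraries are left out.

```xml
<dependencies>
  <dependency>
    <groupId>xerces</groupId>
    <artifactId>xercesImpl</artifactId>
    <version>2.7.1</version>
  </dependency>
  <!-- Declared explicitly so this version is the one that gets deployed -->
  <dependency>
    <groupId>xml-apis</groupId>
    <artifactId>xml-apis</artifactId>
    <version>1.3.02</version>
  </dependency>
</dependencies>
```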

Comments (1)

cresny:

After having just experienced my own Maven2 24 I do empathize with your take. There were definitely a few too many bumps and low hanging branches on the way to really declare this thing ready for the masses. Still, I did reach the end of the first iteration (successful build and deploy of a fairly complex multi-module app), and though I find most of your criticisms generally accurate, I see it not so much as a mess (though it is a bit) but more a diamond in the rough.

For starters, I did come to appreciate the imposed lifecycle structure, but only after I got the hang of scoping dependencies and creating profiles. I think what's lost in quick-and-loose flexibility is more than made up for in its more o-o style elegance and extensibility. And unlike your experience, the antrunner did work for me (2.02), though my beef with it is that it loses the Maven properties when it shells out to Ant - can't the runner reinitialize these? - and now some of my custom tasks are not portable, though this is where the 'write your own plugin' part chimes in. I did have to compile the mojo-sandbox aspectj and weblogic plugins, but this was pleasantly pain-free (watching it eat its own dogfood).

I think the EclipseM2 plugin has great potential and I'm very much looking forward to it getting past the settings.xml/proxy server issue. My workaround is to first run mvn as an external task then use the plugin (0.0.5) for subsequent runs.

One major issue I found seems to stem from some intractability in the XDoclet camp, where apparently subtasks are statically hashed, causing name based collisions when it's embedded by maven in a process. Seems like it should be easy to fix, but it could be a big deal if you have numerous ejb and web sub-projects generating deployment descriptors. btw, multi-projects are now much, much cleaner.

Finally(!), here's the best part -- it is FAST! Our project uses compile-time aspects, and with old maven this was web-surfing time. Now it's not much slower than a simple ear and deploy. And once the above cited M2eclipse issue is solved it should really rock since there won't be the M2 startup lag.

So I'd say to anyone else who has experienced the above, just hang in there, there IS a light at the end of the tunnel (and no it's not the blinding white kind!)

About This Article

This page contains an article posted on February 3, 2006 10:50 AM.

This weblog is licensed under a Creative Commons License.
