Why maven drives me absolutely batty

May 13, 2009

Although my maven bitching has been mostly snarky, I have come to truly believe it is the wrong tool for a growing enterprise and, like centralized version control, will lead to a situation where tools dictate process (and design).

But, what is maven actually good at?

  • Maven is great for getting started -- you don't have to author an ant file (or copy one from an existing project)
  • Maven is great for enforcing a standard project structure -- if you always use maven, your projects always look the same
This is about where it ends for me; everything else maven does (managing dependencies, automating process, etc.) is done much better and much more quickly by other technology. It's pretty amazing that someone can make a tool worse than ant, but maven is surely it.

Dependency management is not a build step

Maven is the equivalent of doing a sudo gem update every time you call rake, or doing a sudo yum update before running make. That's just insane. While automated dependency management is a key feature of a sophisticated development process, this is a separate process from developing my application.

Maven's configuration is incredibly verbose

It requires 36 lines of human-readable XML to have my webapp run during integration tests. Thirty Six! It requires six lines just to state a dependency. Examining a maven file and trying to figure out where you are in its insane hierarchy is quite difficult. It's been pretty well-established outside the Java community that XML is a horrible configuration file format; formats like YAML have a higher signal-to-noise ratio, and using (gasp) actual scripting language code can be even more compact (and readable and maintainable).

The jars you use are at the mercy of Maven

If you want to use a third-party library, and maven doesn't provide it (or doesn't provide the version you need), you have to set up your own maven repo. You then have to include that repo in your pom file, or in every single developer's local maven settings. If you secure your repo? More XML configuration (and, until the most recent version, you had to have your password in cleartext...in a version 2 application). The fallout here is that you will tend to stick with the versions available publicly, and we see how well that worked out for Debian.

Modifying default behavior is very difficult

Since maven is essentially a very, very high-level abstraction, you are at the mercy of the plugin developers as to what you can do. For example, it is not possible to run your integration tests through Cobertura. The plugin developers didn't provide this and there's no way to do it without some major hacking of your test code organization and pom file. This is bending your process to fit a tool's shortcoming. This is a limitation designed into maven. This is fundamentally different than "opinionated software" like Rails; Rails doesn't punish you so harshly for wanting to tweak things; maven makes it very difficult (or impossible). There was no thought given in Maven's design to using non-default behavior.

Extending Maven requires learning a plugin API

While you can throw random Ant code into maven, the only way to create re-usable functionality is to learn a complex plugin API. Granted, this isn't complex like J2EE is complex, but for scripting a build, it's rather ludicrous.

Maven is hard to understand

I would be willing to bet that every one of my gripes is addressed through some crazy incantation. But that's not good enough. The combined experience of the 7 developers at my company is about 70 years and not one of us can explain maven's phases, identify the available targets, or successfully add new functionality for a pom without at least an hour on the net and maven's documentation.

A great example is the release plugin. All five developers here that have used it go through the same cycle of having no idea what it's doing, having it fail with a baffling error message, starting over and finally figuring out the one environment tweak that makes it work. At the end of this journey each one (myself included) has realized all this is a HUGE wrapper around scp and a few svn commands. Running two commands to do a source code tag and artifact copy shouldn't be this difficult.

Maven's command line output is a straight-up lie

[INFO] ------------------------------------------------------------------------
[ERROR] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Compilation failure
"Compilation failure", by its own definition, is a failure and therefore an error (not an informational message). Further, most build failures do not exit with a nonzero status. This makes maven completely unscriptable.

Maven doesn't solve the problems of make

Ant's whole reason for being is "tabs are evil", and that tells you something. While maven's description of itself is a complete fabrication, it at least has its heart in the right place. However, it STILL fails to solve make's shortcomings with respect to Java:

  • Maven doesn't recompile the java classes that are truly out-of-date
  • Maven recompiles java classes that are not out-of-date
  • Maven doesn't allow for sophisticated behavior through scripting
  • Maven replaces arcane magic symbols with arcane magic nested XML (e.g. pom files aren't more readable than a Makefile)

Maven is slow

My test/debug cycle is around a minute. It should be 5 seconds (and it shouldn't require an IDE).

Conclusion

Apache's Ivy + Ant is probably a better environment than maven for getting things done; a bit of up-front work is required, but it's not an ongoing cost, and maintenance is much simpler and more straightforward. Tools like Buildr and Raven seem promising, but it might be like discussing the best braking system for a horse-drawn carriage; utterly futile and irrelevant.

Git Workflow with SVN

April 28, 2009

The best way to get started with Git and have a better experience at work if you have to use SVN is to use git svn as a client to Subversion. You can take advantage of Git's awesomeness while not requiring your team or infrastructure to change immediately.

Setup

git svn clone -t tags -T trunk -b branches svn+ssh://your.svn.com/path/to/svn/root

(This may take a while for a large or old svn repo.)

Working on Trunk

The initial clone should leave you on git's master branch, which is connected to svn's trunk.
  1. git svn rebase # Optional: only if you want to get work from svn; you don't have to
  2. Hack some code
  3. git add any new files you created.txt
  4. git commit -a
  5. Repeat from step 2 until done

Sharing Your Changes

You will rebase your changes against SVN's HEAD (this means git will pretend you made all your changes from SVN's current HEAD, not the HEAD you started with; you do this to avoid conflicts and merges, which SVN cannot handle).
  1. git svn rebase
  2. git svn dcommit

If you got Conflicts

  1. Git will tell you about them, so go and resolve them
  2. For each file you had to resolve, git add the_filename
  3. git rebase --continue
  4. Repeat until done

Working with SVN's branches

Suppose you need to do some work on a branch called 1.3.x in your SVN repo:
  1. git svn fetch # This updates your local copy of remote branches
  2. git checkout 1.3.x # This checks out a remote branch, which you shouldn't work directly on
  3. git checkout -b 1.3.x-branch # This creates a local branch you can work on, based on the remote 1.3.x branch
  4. Hack some code
  5. git add and git commit -a as needed
  6. Follow same procedure as above for Sharing Your Changes. Git will send your changes to the 1.3.x branch in SVN and not the trunk

Merging the Changes You Made

Due to the way git interacts with SVN, you shouldn't automatically just merge your branch work onto the trunk. This may create strange histories in SVN.

So What?

So, this isn't buying you much more than you get with SVN. Yes, when you git checkout 1.3.x-branch it's lightning fast, and you can work offline. Here's a few things that happen to me all the time that would be difficult or impossible to do without Git.

Gotta Fix a Bug Real Quicklike

You are in the middle of working on a new feature and you need to push out a bugfix in production code. Your in-development code can't be checked into trunk:
  1. git stash
  2. git checkout production-branch-name
  3. git checkout -b bugfix-xyz
  4. Fix bugs
  5. git commit -a
  6. git svn dcommit
  7. git checkout master
  8. git stash apply
You are now back where you started, without a fake revision just to hold your code and you didn't have to go checkout the branch elsewhere.

Can't commit to SVN due to a release

Often, teams restrict commit access to SVN while a release is being prepared. If the team is releasing version 1.5 and I'm working on 1.6 features, there can be some period of time where I'm not supposed to commit, because the 1.5 release is being prepared and under feature freeze.
  1. git commit -a
  2. Continue working
When feature freeze is over, then I'll git svn dcommit to send my changes to the SVN server.

Blocked on Feature X, Want to work on Feature Y

This happens to me quite frequently: I'm slated to work on a few features that aren't interdependent. I start hacking away on Feature X and hit a roadblock and can't continue working. I've got a half-implemented feature and I can't make any forward motion until a meeting next week. Feature Y, on the other hand, is ready to go. This requires some planning ahead:
  1. git checkout master
  2. git checkout -b feature-X
  3. Work on Feature X
  4. git commit -a etc. as I work
  5. Get blocked; meeting next week. D'oh!
  6. git checkout master
  7. git checkout -b feature-Y
  8. Work on Feature Y
At this point, X and Y are on two local branches and I can switch back and forth as needed. Don't underestimate how powerful this is, especially when you have certain features that are priorities, but can become blocked frequently. I can now easily put aside Feature Y once I have my meeting and start back up on Feature X. When I'm done, I git merge everything back to master and dcommit to SVN.

Type your log message, save it, realize you forgot to reference a bug ticket #

You have a bug tracker set up that links tickets and revisions; all you have to do is put the ticket # in your log message. It's a nice feature, but I forget to do it frequently. As long as you haven't done git svn dcommit, you can fix this:
  1. git commit --amend
Your editor will pop up and you can change the log message! Awesome.

Advanced Stuff

Once you get used to this, you will feel more comfortable doing some more advanced things.

Topic Branches

The most obviously beneficial was touched on above, but it boils down to: make every new feature on its own branch. This means you never work on master and you never work on an SVN branch. Those are only for assembling what you will send to SVN. This gives incredible flexibility to work on code when it's convenient and not worry about checking in bad things. Git calls these topic branches.

Save your Experiments

If you do everything on a branch, you don't have to delete your work, ever. You can go back and revisit experiments, or work on low-priority features over a long period of time with all the advantages of version control, but without the baggage of remote branches you have to share with the world.

Cherry Pick

With Git, you typically commit frequently and you restrict the scope of each revision. A commit in git is more like a checkpoint, and a push in Git is more like a commit in SVN. So, commit in git like crazy. What this lets you do is move diffs around. On several occasions, I've had some code on a branch that I needed to use, but didn't want to pull in the entire branch. git cherry-pick lets me do that.

Mindlessly Backup Your Repo

  1. ssh your_user@some.other.box.com
  2. mkdir awesome_project
  3. cd awesome_project
  4. git init --bare # A bare repo, so that pushes to it are accepted
  5. exit
  6. git remote add other-box your_user@some.other.box.com:/home/your_user/awesome_project
  7. git push --all other-box
  8. echo "git push --force --all other-box" > .git/hooks/post-commit && chmod +x .git/hooks/post-commit
You now will back up your repository on every commit to the other box. Or, use GitHub!

REST Compliance Officer

March 17, 2009

With regard to this blog on REST compliance

Me: The Gliffy API is RESTful
REST Compliance Officer: Does a "PUT" update the data at the given URL?
Me: Yes.
RCO: Trick Question! It's "URI". Is the only way to create a new resource done with a "POST"?
Me: Yes.
RCO: Is there exactly one endpoint, from which any and all resource locators are discoverable?
Me: Um, no, that puts undue burden on the client libraries, and over-complicates what we were trying to accomp....
RCO: YOU ARE NOT RESTFUL! READ FIELDING'S DISSERTATION, THE HTTP SPEC AND IMPLEMENT AN RFC-COMPLIANT URI PARSER IN THREE DIFFERENT LANGUAGES. NEXT!

Thank GODS that REST doesn't have a spec. If it did, it would still be in development.


P.S. If you are going to coin a term and you want to bitch about it being misused, maybe calling it a "style" isn't the best idea.

Java Annotations - Java's love of configuration over convention

March 11, 2009

In the beginning, EJB was a bloated mess of XML configuration files that allowed some sort of ultimate flexibility that absolutely no one needed nor cared about. And it sucked. So developers started using conventions to keep track of the four classes required to make a remote method call, and XDoclet was created to automate the creation of the XML configuration files. And it sucked less. Following in EJB's footsteps, Hibernate did the same thing. And XDoclet followed. And it still sucked.

So, annotations were created to essentially formalize what XDoclet was doing, instead of considering how horribly broken the implementation of J2EE or Hibernate was. And now that we have annotations, the "implementation pattern" of "ultimate flexibility through annotations" has made its way into numerous Java frameworks, such as JAX-RS and JPA.

Regarding JPA:

@Id
@GeneratedValue
@Column(name="person_id")
public int getPersonId() { return personId; }
This is not a significant improvement over XDoclet; the only benefit is if you mistype "GeneratedValue", the compiler will catch it. I shouldn't have to type "GeneratedValue" in the first place. Unless I'm doing something non-standard. Which I almost never do.

I have a Person class with a getPersonId method. Can't JPA just assume that they map to the PERSON table and the PERSON_ID column, respectively? Further, couldn't it figure out that it's the auto-generated primary key, since the schema says primary key auto increment? All the information is there and available to the framework to figure this out.
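To make the naming half of that idea concrete, here is a minimal sketch of deriving table and column names by convention alone. ConventionMapper, tableFor, columnFor and toUpperSnake are hypothetical names invented for illustration; none of this is part of JPA, and it does not attempt the schema lookup for the primary key.

```java
// Hypothetical helper, not part of JPA: derives default table and column
// names from class and getter names by convention alone.
class ConventionMapper {

    // Person -> PERSON
    static String tableFor(Class<?> type) {
        return toUpperSnake(type.getSimpleName());
    }

    // "getPersonId" -> PERSON_ID
    static String columnFor(String getterName) {
        if (!getterName.startsWith("get") || getterName.length() == 3) {
            throw new IllegalArgumentException("not a getter: " + getterName);
        }
        return toUpperSnake(getterName.substring(3));
    }

    // Puts an underscore before each interior capital letter, then upper-cases.
    static String toUpperSnake(String camel) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < camel.length(); i++) {
            char c = camel.charAt(i);
            if (i > 0 && Character.isUpperCase(c)) {
                out.append('_');
            }
            out.append(Character.toUpperCase(c));
        }
        return out.toString();
    }
}

// Example entity with no annotations at all.
class Person {
    private int personId;

    public int getPersonId() { return personId; }
}
```

With this, ConventionMapper.tableFor(Person.class) gives PERSON and ConventionMapper.columnFor("getPersonId") gives PERSON_ID; an annotation would only be needed when the names don't follow the convention.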

The same goes for EJB. I have a class named FooStatelessBean. How about we assume it's a stateless session bean, and its interface is defined by its public methods? It can then provide FooRemote and FooLocal for me, and I don't need to configure anything or keep three classes in sync.

Just because Java doesn't have all the Ruby dynamic magic doesn't mean we can't make things easy. In reading Surya Suravarapu’s blog post about CRUD via JAX-RS I can't help wondering why it takes so much code to call a few methods on a class?

Did the designers of JAX-RS not even look at how Rails does things? I get a PUT to the url /customers/45. We should default to calling put(45) on the class CustomersResource. Only if I want to obfuscate what's going on (e.g. by having FooBar.frobnosticate() handle the request) should I be required to provide configuration.
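As a rough sketch of what that default could look like (this is not JAX-RS, and not the tiny framework mentioned in this post), the dispatcher below routes a verb and a /name/id path to NameResource.verb(id) using nothing but naming convention and reflection. ConventionDispatcher, register and dispatch are hypothetical names, and the two-segment path shape is an assumption made to keep the example short.

```java
import java.lang.reflect.Method;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical sketch, not JAX-RS: routes "VERB /name/id" to
// NameResource.verb(id) purely by naming convention.
class ConventionDispatcher {
    private final Map<String, Object> resources = new HashMap<>();

    // The route is derived from the class name; nothing else is configured.
    void register(Object resource) {
        resources.put(resource.getClass().getSimpleName(), resource);
    }

    Object dispatch(String verb, String path) {
        // "/customers/45" splits into ["", "customers", "45"]
        String[] parts = path.split("/");
        if (parts.length != 3 || parts[1].isEmpty()) {
            throw new IllegalArgumentException("expected /name/id, got " + path);
        }
        String className = Character.toUpperCase(parts[1].charAt(0))
                + parts[1].substring(1) + "Resource";
        Object target = resources.get(className);
        if (target == null) {
            throw new IllegalArgumentException("no resource named " + className);
        }
        try {
            // PUT -> put(int), looked up by name on the resource class
            Method method = target.getClass()
                    .getMethod(verb.toLowerCase(Locale.ROOT), int.class);
            return method.invoke(target, Integer.parseInt(parts[2]));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(className + " cannot handle " + verb, e);
        }
    }
}

// A resource class with no annotations; its name and method name are the routing.
class CustomersResource {
    public String put(int id) { return "updated customer " + id; }
}
```

After register(new CustomersResource()), a call to dispatch("PUT", "/customers/45") invokes put(45) on that instance. Configuration would only come into play for a class that deliberately breaks the naming pattern.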

Even in Surya's example code, he's following some conventions: His resource class is suffixed with Resource and his POST method is prefixed add. This should be baked into the spec. It's like EJB all over again with the common conventions that aren't supported by the framework because of too much useless flexibility.

Supporting convention over configuration is easy in Java. In just a few hours, I had a tiny web framework that proves it [1]. It wouldn't take much more effort to allow the default behavior to be overridden, but, unlike JAX-RS, EJB, or even the Servlet spec itself, it doesn't punish developers who follow conventions. It makes their lives easier and thus encourages good behavior.

So, the point of all this is that annotations encourage bad framework design; unnecessary configuration is a major part of many Java frameworks and specs. And I have no idea why.


[1] It unfortunately breaks down at the UI layer, due to a statically typed and compiled language not being a great choice for implementing web UIs, but that's another issue.

Git, GitHub, forking: the new hotness

February 05, 2009

While working on my Gliffy Ruby Client, I decided I wanted a better way to describe the command line interface. Finding nothing that was any good, I whipped up GLI and refactored my Gliffy command line client to use it. While doing that, I finally got annoyed at technoweenie's version of rest-client, and also noticed that the original author's version had totally changed interfaces. So, I clicked the nice "Fork" button on GitHub to get my own copy and fixed the issues. But that's not the cool part. The cool part is that I can change my Gliffy gem to depend on my rest-client implementation and, voilà! No special instructions, no hacks, no nothing. This is a really cool thing that would be difficult with Subversion, impossible without RubyGems, and downright painful without GitHub.

Execute on your ideas now; forget secrecy, forget tweaking

January 22, 2009

A couple interesting things happened yesterday. I attended my company's annual meeting and watched the season premiere of Lost. At my company's annual meeting, we went over lots of exciting things, but there was some concern over our use of Google Apps for our email. Mainly, that they could glean our IP from reading our email and, should they choose to enter our market, gain an unfair advantage. Meanwhile on Lost, the writers actually gave us some insight into the time-travel elements of the show, describing several aspects of time travel that are not typically used in your average time-travel story.

So, what have these two things to do with each other? I'd been noodling with a short story centered around time travel, and the type of time travel I was going to explore is very similar to what was described on Lost. Close enough that my story would come off as a bit less original than it would have 3 months ago. Even if my idea isn't that original (which ones really are?) it's a bit frustrating to see your idea developed (and deployed) by someone else independently.

So, again, what have these to do with each other? They demonstrate the reality of (and difference between) coming up with an idea and actually doing something with it. Essentially, an idea, in and of itself, is not particularly valuable. It's what you do with it that really counts. If Google were to steal my company's IP by sniffing our email, I doubt it would have much effect on our ultimate success. Outside of stealing our code or data outright, our idea isn't something that's hard to come up with. We just happened to come up with it and execute on it first. Anyone getting into the game now is necessarily behind us. Could someone lap us? Certainly. Is their ability to do so in any way dependent on knowing our secret ideas? I seriously doubt it.

So, sitting on ideas is a waste of time. Trying to hide an idea either for security or for fear of "unleashing" it in an underdeveloped state is counter-productive. Someone else has your idea. Guaranteed. And it's likely they are developing it. So, you should be developing it too, and hopefully releasing it to the world, rather than worry about who's stealing it, or who came up with it first. The first to market reaps the rewards.

Command line interface for Gliffy

January 14, 2009

My command line interface for Gliffy is relatively complete. It works pretty well, though the error handling isn't very clean. It's written in Ruby (RDoc here) and can be used as a Ruby client for Gliffy.

I decided on Ruby since that would be the most fun and didn't require learning a new programming language. I initially tried to make an ActiveRecord-style domain-based interface, but it was just too difficult and it was hard to see the real benefit. At the end of the day, I think integrating Gliffy into another application is a relatively simple thing, and a procedural interface would probably be the easiest way to do that. So, I modeled it after the PHP client library, more or less.

The command line interface uses the Ruby client library and provides just the basic functions I need:

> gliffy ls
321654 Some Diagram
987654 Some Other Diagram
> gliffy edit 321654
# Takes you to your browser to edit the diagram
I live on the command line, so this is much more expedient than logging into Gliffy and navigating the UI to edit a diagram.

I'm already feeling like providing access to the folders via the command line would be helpful (they are exposed in the Ruby client of course). Not sure how much the API will ultimately change (it's in private beta now), but hopefully not too much.

GitHub does it again; another killer feature

December 18, 2008

GitHub Pages (explained here) is yet another awesome feature of GitHub. You can publish, via git, arbitrary web content (even piping it through Jekyll for Textile markup and syntax highlighting). They have been keeping a tremendous momentum of late; introducing new features on a regular basis. I hope they keep it up. GitHub is, IMO, crushing SourceForge and Google Code in terms of simplicity, ease-of-use, and overall functionality.

Gliffy API private beta: what should I do?

December 12, 2008

Gliffy hooked me up with access to the private beta of their API (which I helped design and implement). I created a PHP client and experimental MediaWiki plugin to validate the API while working for them, and now I want to get something else going in my spare time.

My first thought was to make a Ruby client, because I think it would be fun and relatively easy. But, I have to admit that a WordPress plugin would be more useful to me personally. That being said, a Trac extension would be useful at work, since we are using Trac (which is Python based, and I can't say I'm too interested in Python at the moment). I think if GitHub allowed git access to project wikis, it would be cool to allow easier integration of Gliffy diagrams to GitHub project wikis.

At any rate, I don't have tons of time outside of work, so I want it to be something easily achievable, and also something Chris and Clint are not likely to work on themselves....

Why underscores might be better than camel case

December 10, 2008

So, the "Ruby way" is to use underscores to delimit most identifiers, e.g. "add_months_to_date", as opposed to the Java camel-case way of "addMonthsToDate". This was initially something that irked me about Ruby, mostly because typing an underscore is kind of a pain (shift with the left hand and pinky with the other).

Now that I've started working, I've been reading a lot of code and realizing that code is more often read than written. Ultimately, camel case is just a lot harder to read (especially if you create meaningful method names like myself and my co-workers seem to do).

It's pretty hard to defend:

Date calculatePersonDataUsageHistoryStartDate() {}
as more readable than:
def calculate_person_data_usage_history_start_date()
end
The underscores are like spaces, making the identifier a lot more readable. Of course, both are more readable than:
// Calculates the start date of the
// person's data usage history
time_t prsn_dt_uhst_st_dt(){}

This would never fly with Java (and, honestly, looks a bit weird), but I'm no longer gonna curse the Ruby convention.