Monday, May 14, 2012

Version Numbers Are Evil

Version numbers are a perfect example of bikeshedding. Everybody gets them, and everybody has something to say about them. Most importantly, though: they hardly matter.

Some fairly famous companies like Microsoft and Apple have toyed with ideas to de-emphasize them. Hence we had code names (Longhorn, Lion...) and time stamps (Windows 95), but they are truly hard to kill (IE 9)...

Unfortunately they've been around for quite some time and are ingrained in our software engineering lore.

Back in the days when you released once a year or so, or maybe once a quarter, tracking version numbers was a minor hassle compared to the huge size of the changes and the possible impact of a new release. That impact promptly discouraged anyone from upgrading, which in turn made fixing bugs even harder: not only did you have to fix the current version, but all the "supported" ones too, including perhaps some unsupported ones if the customer was important enough...

Those days are thankfully fading. Instead, we have software as a service. Ever asked what version of Gmail you're using? Or Facebook? Doesn't make sense, does it? Not like you have a choice...

Still, version numbers haunt many of the modern build tools and dependency management tools.

I guess there is some satisfaction in exercising positive control by explicitly updating each consumer's dependency to your latest version, but in practice it's a nightmare. Not only is there a lot of error-prone labor involved, but you also encourage "mix and match" and general procrastination on bugs: "Oh, the latest version breaks my app, so I'm going to stick to the older version." "Oh, the latest fixes a security hole? Well, I hope I won't get hacked"...

The smart folks who wrote the Advanced Packaging Tool (better known through its front ends, aptitude and apt-get) realized quickly that direct dependency management could not possibly function for such a complex beast as a Linux distro. They strongly discourage explicit versions in dependencies, as shown in their many examples.

The Maven build system makes a very slight concession to the idea of version numbers being fluid via its -SNAPSHOT construct. Unfortunately it is quite inadequate, since once something is built there is no good way to tell exactly which version it was built against.

Ivy fares slightly better: you can specify wildcards that will be resolved at build time. Still, you are stuck with a linear version space, when you really need a true build chain:


In most artifact repository systems, you can emulate this by creating branch-specific or build-specific channels or instances of a repository.

"But are you nuts? How can you know what you built against?"
You use a build number instead of a version number. The point is that it's automatically generated. You are using some continuous build system, aren't you? If not, get one. Jenkins is OK; TeamCity is really good, but costs money. Don't even consider testing or deploying manually built stuff.

As a compromise, append the build number to a manually maintained version number if you must, until you realize that you never really want to change the manually maintained portion unless someone prods you...
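A minimal sketch of that compromise in Python, assuming the continuous build server exposes its counter in a BUILD_NUMBER environment variable (Jenkins does; other systems may use a different name). BASE_VERSION and artifact_version are hypothetical names made up for this illustration.

```python
import os

# Manually maintained portion (hypothetical value for illustration).
BASE_VERSION = "1.0"


def artifact_version(base=BASE_VERSION, environ=os.environ):
    """Return the base version with the CI build number appended.

    Raises RuntimeError when no numeric build number is available,
    so manually built artifacts never receive a version label.
    """
    build = environ.get("BUILD_NUMBER")
    if build is None or not build.isdigit():
        raise RuntimeError("no CI build number; refusing to label a manual build")
    return f"{base}.{int(build)}"


if __name__ == "__main__":
    # Simulate the environment a build server would provide.
    print(artifact_version(environ={"BUILD_NUMBER": "17"}))
```

The only part a human ever edits is BASE_VERSION; everything that identifies what was actually built comes from the build server.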

"But are you nuts? How is your app going to co-exist with my app if we depend on different base libraries?"
Build assemblies - or if you're in C/C++ land, use static linking, or, if you must, package your application so that it looks up its shared libraries in a private location.

"But are you nuts? Are you really going to force me to make compatible changes in shared libraries?"
Yes, I will. It's shared for a reason. If the interface is so crummy that you cannot derive the right functionality from it, create a new one that does the job, but don't break the existing one. Modern languages have plenty of ways to extend interfaces without breaking existing code:
  • Optional arguments with default values
  • New methods
  • Subclassing
  • Traits
  •  ...
Folks should remember that one big purpose of having a shared library is that you can affect all consumers of the library with a single change, and don't have to go edit every application. The only way to make good on that contract is if every application maintainer stays up to date.
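To make the first three items of that list concrete, here is a small Python sketch. Greeter, greet_all and LoudGreeter are hypothetical names invented for the example; assume the library originally shipped only greet(name). Traits are left out since Python has no direct equivalent.

```python
class Greeter:
    """Hypothetical shared-library class."""

    # Optional argument with a default value: 'greeting' was added later,
    # so existing calls of the form greet(name) behave exactly as before.
    def greet(self, name, greeting="Hello"):
        return f"{greeting}, {name}!"

    # New method: existing consumers never call it, so nothing breaks.
    def greet_all(self, names, greeting="Hello"):
        return [self.greet(name, greeting) for name in names]


class LoudGreeter(Greeter):
    """Subclassing: the new behaviour lives here, Greeter stays untouched."""

    def greet(self, name, greeting="Hello"):
        return super().greet(name, greeting).upper()


if __name__ == "__main__":
    print(Greeter().greet("Ann"))                 # old-style call still works
    print(Greeter().greet("Ann", greeting="Hi"))  # new optional argument
    print(LoudGreeter().greet("Ann"))             # extended behaviour
```

An application written against the original greet(name) keeps working against every one of these changes, which is the whole point.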


2 comments:

  1. Nice post. I've always struggled with people who want to spend useless hours arguing over a complex version numbering scheme, and then waste more time agonizing over whether a feature set warrants a 1.0, .5 or .1 increment, or insisting that 0.9 means you provide less support than 1.0. The idea that a version number communicates any kind of scope or warranty information is ludicrous, and as you mention, the perpetuation of that thinking creates upgrade aversion in customers.
    Versioning exists for one purpose and one purpose only: change management. That means identifying the versions deployed so you can tie back to the right code set for fixing issues, and managing inter-dependencies. As you suggested, automated linear build numbering accomplishes this quite adequately until you need to patch, so some sort of release number + build number becomes inevitable.
    The mistake not to make is to use the release number for versioning artifact builds - that's what the build number is for. The release version should be the target.
    So for example, R1 B1 - 9 were dev builds, B10 went to QA, B12 was RC1, B16 RC2 and finally B17 was gold release. B18 was thrown away because it was less stable than B17.
    The idea here is that the target stays the same and each build is a shot at the target. You simply promote builds to release status based on how close they were to the bulls-eye.

  2. Thanks.

    The big challenge is to set up your artifact publishing rules to match the branching, and to organize the promotion of artifacts to avoid rebuilds.
