Fortified Bikesheds: July 2013

Sunday, July 21, 2013

Major Version Numbers can Die in Peace

Here's a nice bikeshed: should the major version number go into the package name or into the version string?

A very nice explanation of how this evolved: http://unix.stackexchange.com/questions/44271/why-do-package-names-contain-version-numbers

Even though it is theoretically possible to install multiple versions of the same package on the same host, in practice this adds complications. So even though rpm itself will happily install multiple versions of the same package, most higher level tools (like yum, or apt), will not behave nicely. Their upgrade procedures will actively remove all older versions whenever a newer one is installed.

The common solution is to move the major version into the package name itself. Even though this is arguably a hack, it does express the inherent quality of incompatibility when doing a major version change to a piece of software.

According to the semantic versioning spec, a major version change:

Major version X (X.y.z | X > 0) MUST be incremented if any backwards incompatible changes are introduced to the public API

A new major version will break non-updated consumers of the package. That's a big deal. In practice, you will not be able to perform a major version update without some form of transition plan, during which both the old and the new version must be available.

Note that you can make different choices. A very common choice is to maintain a policy of always preserving compatibility to the previous version. This gives every consumer the time to upgrade.

In either case, there is no need to keep the major version number in the version string. If you're a good citizen and always maintain backwards compatibility, your major version is stuck at "1" forever, not adding anything useful - and if you change the package name whenever you do break backwards compatibility, you don't need it anyways...

So, how about finding a nice funeral home for the major version number?

Friday, July 5, 2013

What's a Release?

I recently stumbled over the ITIL (Information Technology Infrastructure Library), a commercial IT certification outfit which charges 4 figure fees for a variety of "training modules", with a dozen or so required to earn you a "master practitioner" title...

Their definition of a release is:

A Release consists of the new or changed software and/or hardware required to implement approved changes. Release categories include:

Major software releases and major hardware upgrades, normally containing large amounts of new functionality, some of which may make intervening fixes to problems redundant. A major upgrade or release usually supersedes all preceding minor upgrades, releases and emergency fixes.

Minor software releases and hardware upgrades, normally containing small enhancements and fixes, some of which may have already been issued as emergency fixes. A minor upgrade or release usually supersedes all preceding emergency fixes.

Emergency software and hardware fixes, normally containing the corrections to a small number of known problems.

Releases can be divided based on the release unit into:

Delta release: a release of only that part of the software which has been changed. For example, security patches.

Full release: the entire software program is deployed—for example, a new version of an existing application.

Packaged release: a combination of many changes—for example, an operating system image which also contains specific applications.

Even though this definition does pretty much represent a consensus impression of the nature of a release, I think the definition, besides being too focused on implementation details, places too much emphasizes on "change".

Obviously, when viewing a succession of releases, it makes a lot of sense to talk about deltas and changes, but that is a consequence of producing many releases and not part of the definition of a single release.

A release is a snapshot, not a process. The question should be: "what is it that distinguishes the specific snapshot we call a release from any other snapshot taken from our software development history?"

The main point is that the release snapshot has been validated. A person or a set of people made the call and said it was good to go. That decision is recorded, and the responsibility for both the input into the decision and the decision itself can be audited.

In theory, every release should be considered as a whole. In practice, performing complete regression testing on every release is likely to be way too expensive for most shops. This is the reason for emphasizing the incremental approach in many release processes.

There is nothing wrong with making this optimization as long as everyone does it in full awareness that it really is a shortcut. Release managers need to consider the following risk factors when making this tradeoff between speed and precision:

Do I really know all the dependencies in order to evaluate the effect of a single change?
Do I really understand all the side effects?
Do all the stakeholders understand and agree with the risk assessment?

The ITIL discussion of the difference between major/minor/emergency really reflects the different ways stakeholders go about validating a release.

The ITIL discussion about packaging really means that a release is published, released out of the direct control of the stakeholders.

Incorporating these two facets results in my version of the definition of a release as:

A validated snapshot published to a point of production with a commitment by the stakeholders to not roll back.

I think this captures the essence better:

A release is a state, not a change.
A release represents a commitment by the stakeholders.
A release is published, which really means it has escaped the from the control of the stakeholders. Outsiders will see what is, and there is nothing the stakeholders can do about it.

To be fair, the ITIL definition isn't really that far off: replace "approved set of changes" with "approved set of features", and we're pretty much in agreement. But the tone makes the music, and the emphasis on state instead of change forces the discussion to always be about the whole product, and separates the "what we want" from the "how do we get it".

Story So Far

The goal of the first batch of posts is to describe a software build and release process which assumes that builds are not necessarily reproducible and expensive to perform, and also assumes a large number of independent development teams all working on some grand piece of software.

Good release management starts at the source, so the first few postings deal about source code control and change management, and how to mine the change data correctly.

Once we can reliably build stuff, we need to manage the build products, the artifacts, so they can be reused, tested and released.