Friday, June 5, 2015

Positive and Negative Build Avoidance

I'm going to define "positive" build avoidance as skipping a rebuild of something that has already been built successfully from the same sources. "Negative" build avoidance is skipping a rebuild of something that has already failed to build from the same sources.
Positive build avoidance has been around for quite some time, and is usually easy to implement: simply check whether the target artifact exists and was created from your current source set. The check can be as naive as make's timestamp comparison (if the target is newer than its sources, assume it was built from them) or as sophisticated as comparing checksums or hash signatures.
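To make the two flavours of check concrete, here is a minimal sketch in Python. The function names and the idea of a "stamp" file holding the recorded signature are assumptions made for illustration; they are not taken from make or from any particular build tool.

```python
import hashlib
import os


def is_fresh_by_mtime(target, sources):
    """Timestamp check in the style of make: the target exists and no
    source file is newer than it."""
    if not os.path.exists(target):
        return False
    target_mtime = os.path.getmtime(target)
    return all(os.path.getmtime(src) <= target_mtime for src in sources)


def source_signature(sources):
    """SHA-256 over the names and contents of all sources, in sorted order."""
    digest = hashlib.sha256()
    for path in sorted(sources):
        digest.update(path.encode("utf-8"))
        digest.update(b"\0")
        with open(path, "rb") as f:
            digest.update(f.read())
        digest.update(b"\0")
    return digest.hexdigest()


def is_fresh_by_signature(target, sources, stamp_path):
    """Hash check: the target exists and the signature recorded at build
    time (in the hypothetical stamp file) matches the current sources."""
    if not (os.path.exists(target) and os.path.exists(stamp_path)):
        return False
    with open(stamp_path, "r", encoding="utf-8") as f:
        recorded = f.read().strip()
    return recorded == source_signature(sources)
```

The timestamp check is cheap but can be fooled by clock skew or by a checkout that rewrites modification times; the signature check only depends on file names and contents.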
I don't know of any system that does negative build avoidance, so I'm building one.
I happen to already have a system which does positive build avoidance by storing build artifacts using a version computed from the git tree hashes of all its source components. I recently added an artifact that gets published on every build, no matter whether it failed or succeeded: the build log. Besides impressing auditors, this actually allows me to implement negative build avoidance:
  • If all artifacts are present, positive build avoidance as usual, no need to rebuild;
  • If artifacts are missing, but the build log artifact is present, negative build avoidance occurs, and I do not even bother to attempt to redo a build that is known to fail;
  • If artifacts are missing and the build log is missing, then the build either didn't occur or crashed in the middle. In that case, schedule the build.
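The three cases above can be sketched as a small decision function. This is only an illustration: `Action`, `decide`, and the representation of the artifact store as a set of published names are hypothetical, not the actual system.

```python
from enum import Enum


class Action(Enum):
    SKIP_BUILT = "positive avoidance: already built"
    SKIP_FAILED = "negative avoidance: known to fail"
    BUILD = "schedule the build"


def decide(published, expected_artifacts, build_log):
    """Decide what to do for one version of one component.

    published          -- names already in the artifact store for this version
    expected_artifacts -- names a successful build would publish
    build_log          -- name of the log artifact, published on every build
    """
    if all(name in published for name in expected_artifacts):
        return Action.SKIP_BUILT
    if build_log in published:
        # Artifacts missing but a log exists: the build ran and failed.
        return Action.SKIP_FAILED
    # No artifacts and no log: the build never ran or crashed midway.
    return Action.BUILD
```

For example, `decide({"build.log"}, ["app.jar"], "build.log")` returns `Action.SKIP_FAILED`, while an empty store yields `Action.BUILD`.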
The nice thing about implementing negative build avoidance is that it allows me to deal with unstable build farms (and any Jenkins build farm of substantial size is bound to be unstable). Simply keep recomputing the build schedule after every pass until every build either succeeded or failed with a build log. Makes the build team look good, since all build failures are now clearly the developer's fault.
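The "keep recomputing the schedule" loop might look roughly like this. It is a sketch under assumptions: the store is modelled as a dict from version to the set of published artifact names, `attempt_build` is a hypothetical callback that stands in for dispatching a job to the farm, and the `max_passes` guard is an addition so the sketch cannot loop forever.

```python
def needs_build(published, expected, log_name="build.log"):
    """True when neither the full artifact set nor a build log is present."""
    have_all = all(name in published for name in expected)
    return not have_all and log_name not in published


def run_until_settled(versions, store, expected, attempt_build, max_passes=10):
    """Re-plan after every pass until each version has either succeeded
    or failed with a build log. Returns the number of passes that ran."""
    passes = 0
    while True:
        pending = [v for v in versions
                   if needs_build(store.get(v, set()), expected)]
        if not pending:
            return passes
        if passes >= max_passes:
            raise RuntimeError(f"still unsettled after {passes} passes: {pending}")
        for version in pending:
            # May publish artifacts, publish only a log, or crash and
            # publish nothing; in the last case it is retried next pass.
            attempt_build(version, store)
        passes += 1
```

A build that crashes on the farm leaves no log behind, so it is simply picked up again on the next pass, while a build that genuinely fails publishes its log and is never retried.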