Branching and merging generate a revision graph, which looks a little like the diagram on the right. Generating release notes is essentially a matter of locating all the revisions that contribute to the new release but not to the old one. This is done by first traversing the revision graph from the revision of the old release, marking every revision as you recurse through its ancestors. You then perform the same traversal starting from the new release, but stop whenever you hit a marked revision. The unmarked revisions visited in this second pass are the content of the release notes.
The Mercurial distributed version control system implements this elegantly in a single command:
hg log --prune OldRelease --follow --rev NewRelease:0
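To make the traverse-and-mark idea concrete, here is a minimal Python sketch of it. It is not how Mercurial implements the command; it only assumes that the revision graph is available as a dictionary mapping each revision to its list of parents. The names `ancestors`, `release_notes` and the sample `PARENTS` graph are made up for illustration.

```python
def ancestors(parents, start):
    """Return start plus every revision reachable from it via parent links."""
    marked = set()
    stack = [start]
    while stack:
        rev = stack.pop()
        if rev in marked:
            continue
        marked.add(rev)
        stack.extend(parents.get(rev, ()))
    return marked


def release_notes(parents, old_release, new_release):
    """Revisions that contribute to new_release but not to old_release."""
    # Pass 1: mark everything that went into the old release.
    marked = ancestors(parents, old_release)

    # Pass 2: walk back from the new release, stopping at marked revisions.
    found = set()
    stack = [new_release]
    while stack:
        rev = stack.pop()
        if rev in marked or rev in found:
            continue
        found.add(rev)
        stack.extend(parents.get(rev, ()))
    return sorted(found)


# Hypothetical graph: revision -> list of parent revisions.
PARENTS = {
    1: [],
    2: [1],
    3: [2],     # the old release
    4: [2],     # work on a side branch
    5: [3, 4],  # merge of the side branch
    6: [5],     # the new release
}

print(release_notes(PARENTS, 3, 6))  # [4, 5, 6]
```

Revision 4 shows up even though it sits on a side branch, because the merge in revision 5 makes it an ancestor of the new release without it being an ancestor of the old one.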
Basically the same method is used for revision control systems that use explicit branches instead of an implicit revision graph. Even though every branch has a linear revision history, you need to traverse all the merge arrows stemming from other branches and mark the revisions, much the same way as you would in a revision graph.
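A rough sketch of the same process for explicit branches might look like the following. It assumes, purely for illustration, that revisions are numbered from 1 on each branch and that merge arrows (including the arrow that created a branch) are recorded in a dictionary from the destination `(branch, revision)` to its sources. The names `contributing`, `branch_release_notes` and the `MERGES` data are hypothetical.

```python
def contributing(merges, branch, rev):
    """Mark every (branch, revision) pair that feeds into branch@rev."""
    marked = set()
    stack = [(branch, rev)]
    while stack:
        b, r = stack.pop()
        # Walk the branch's linear history backwards. Once we meet a marked
        # revision, everything older on that branch is already marked too.
        while r >= 1 and (b, r) not in marked:
            marked.add((b, r))
            # Follow any merge arrows that land on this revision.
            stack.extend(merges.get((b, r), ()))
            r -= 1
    return marked


def branch_release_notes(merges, old, new):
    """Pairs contributing to the new release but not to the old one."""
    return sorted(contributing(merges, *new) - contributing(merges, *old))


# Hypothetical merge arrows: destination -> list of sources.
MERGES = {
    ("feature", 1): [("main", 1)],  # feature was branched from main@1
    ("main", 3): [("feature", 2)],  # first merge back into main
    ("main", 5): [("feature", 3)],  # second merge back into main
}

print(branch_release_notes(MERGES, ("main", 3), ("main", 5)))
# [('feature', 3), ('main', 4), ('main', 5)]
```

The sketch is only as good as the recorded arrows: a merge that is missing from `MERGES` silently drops its source revisions from the result.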
The problem is that some very popular revision control systems don't even classify merges as something special. Subversion, for example, has no first-class representation of a merge. This is partly because Subversion allows you great liberty in selecting revision ranges to merge, and also allows you to squash multiple merges into the same revision. You therefore need to rely on rules and conventions to draw the merge arcs in your revision graph. As people make mistakes or bypass the conventions on occasion, those arcs may be incorrect, resulting in incorrect traversals and therefore incorrect release notes.
Perforce, which for a long time was the darling of the "simple is best" design school, doesn't have anything built in for generating release notes either. At least it does represent merges as first-class objects, and its knowledge base recommends implementing the "traverse and mark" process using this script:
p4 changes -l -i //depot/main/p4/...@OldRelease > FILE1
p4 changes -l -i //depot/main/p4/...@NewRelease > FILE2
diff FILE1 FILE2 > CHANGES
Note that on a large repository with a lot of history, this can run for an hour or so. Note also that it's the "-i" flag which causes the recursion through the "integrated" changes, and that the following naive invocation would fail:
p4 changes -l -i //depot/main/p4/...@OldRelease,NewRelease \
    > CHANGES
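The diff step of the two-listing script boils down to a set difference over changelist numbers. The Python sketch below shows that step on its own, assuming each change in the listing starts with a header line of the form "Change <number> on <date> by <user>@<client>". The sample listings, user names and the helpers `change_numbers` and `new_changes` are invented for illustration.

```python
import re

# Header line of one entry in the listing, e.g. "Change 105 on 2009/03/04 by bob@ws".
CHANGE_HEADER = re.compile(r"^Change (\d+) on ")


def change_numbers(listing):
    """Collect the changelist numbers found in a changes listing."""
    numbers = set()
    for line in listing.splitlines():
        match = CHANGE_HEADER.match(line)
        if match:
            numbers.add(int(match.group(1)))
    return numbers


def new_changes(old_listing, new_listing):
    """Changelists present in the new listing but absent from the old one."""
    return sorted(change_numbers(new_listing) - change_numbers(old_listing))


# Hypothetical listings standing in for FILE1 and FILE2.
OLD_LISTING = (
    "Change 105 on 2009/03/04 by bob@ws\n\n\tFix build\n\n"
    "Change 101 on 2009/03/01 by alice@ws\n\n\tInitial import\n\n"
)
NEW_LISTING = (
    "Change 112 on 2009/03/09 by alice@ws\n\n\tIntegrate feature branch\n\n"
    "Change 110 on 2009/03/07 by carol@ws\n\n\tAdd feature\n\n"
) + OLD_LISTING

print(new_changes(OLD_LISTING, NEW_LISTING))  # [110, 112]
```

Comparing by changelist number rather than by raw text lines keeps the result independent of the order in which the changes happen to be listed.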
Another popular revision control system creates a different problem: the actual revision graph may change between releases. Yep, "git" is a very powerful system with a lot of "cool" attached to it, but from a release management perspective, it is downright scary, as you can use the "interactive rebase" feature to re-arrange the revision history in many ways.
There is nothing wrong with developers streamlining and cleaning up their revision history prior to getting their repos pulled into an upstream location, but retroactively editing the change history in an authoritative, shared repository is something to be avoided, even if it means the occasional exposure of embarrassing mistakes.
This is why many shops actually do not use the revision control system to generate release notes. Instead, they will rely on a combination of spreadsheets and issue or work tracking systems. That's quite sad, since this obscures what code changes are effectively included in a build.
As I'll be expanding on the use of build artifacts as precious objects to be tracked, it will become quite important to always know exactly what went into them. Subsequent posts will explore some techniques to incrementally build this information and attach it to the artifact metadata.
Git does have by far the simplest way to compare two revisions:
% git log old..new
It will actually do all the tree pruning described for the Mercurial case right there.