What’s in a Good Commit?

The post is in response to a post by the same name which I read this weekend. While I agree with many of the points the author made there are a few areas where our opinions differ. I’m going to go through the different points the article makes and discuss each in turn.

Only make one change per commit

I agree with this sentiment wholeheartedly. Defining a single ‘change’ is often more difficult than it would seem, but this is good starting point in any case. I usually try to commit as often as possible – meaning when I get to a point where I have made a change that doesn’t leave the application in a broken state, I will try and commit my change. For example, I might add a new model or view, but unless there is a controller actually using these new objects we can commit the change without leaving the application in a broken state.

Make the whole change in one commit

I agree with this as well. When I’m working in Git I will often make local commits as simple placeholders and then simply combine all the temporary commits into a proper commit. The Git term for this is rebasing – which I’ve blogged about before.

If you’re using Git you can also use the –patch parameter to be able to select only certain parts of a file to add to the commit. This is useful when you have made two unrelated changes to a single file.

Document what you have changed

While I agree with the sentiment (documenting your changes in the commit message), I don’t agree with the example commit messages being used. I don’t think it’s good practice to mention the class being changed and certainly not the method! (How will this approach scale, for example?)

I usually try to shoot for language that’s somewhere between the business requirement behind the change and the actual technical implementation. If you use pure business terms it’s too difficult to figure out what part of the codebase was affected and if you only describe the technical implementation it becomes too verbose and obviously doesn’t scale beyond very small changes.

I think the middleground to shoot for is to use technical terms to describe the business requirements behind the change. For example, the following would tell me what part of the codebase changed and the thinking behind it.

Changing user model and repository to allow caching of recent conversations.

(By the way, it’s insanely difficult to think up the perfect commit message) What I’m trying to illustrate is that using technical terms such as ‘model’, ‘view’, ‘service’ and ‘repository’ in combination with business terms such as ‘user’ allows developers to easily understand which parts of the codebase are being changed without getting stuck in the details.

I think it’s important to keep in mind the audience for your message – other developers. When I’m looking at commit messages it’s usually to try and isolate why something is going wrong or to understand why a certain change was implemented. Keeping with my example commit above, if users can no longer see any conversations a quick look at the revision history would show this commit as a likely culprit. Alternatively, if I’m trying to understand why the repository class isn’t simply doing a straightforward lookup the commit would tell me the business requirement behind the code.

You could argue that some of the details are missing here – is a new caching mechanism being used? Is cache invalidation being done? Are we implementing the repository pattern for the first time? Here I would argue that it’s important to realize that commit messages are a simple and brief recording of changes being made to the codebase. Commit messages are not an alternative to code reviews and does not alleviate the need for developers to communicate with each other. If a new caching mechanism is being introduced I definitely don’t want to learn about it by looking at the revision history! Commit messages should simply allow developers to understand – in broad terms – the technical changes being made and the business requirements behind them.

Document why you made the change

Again, I agree with the sentiment, but I disagree with the details. The author argues that the commit should document why a change was made and highlights the following two risks:

The other developers do not understand why the code was written the way it was. When they change the code, they introduce problems that the original author had identified and avoided.

The other developers assume that the code was written the way it was for a Good Reason™ so it’s best left untouched. They treat it as a black box and add complex workarounds to avoid changing it. As a consequence the codebase becomes bloated and hard to understand.

The author then suggests that the reason behind the change be put into code comments or into the commit message. While these risks are real, I don’t think agree with the way the author is trying to mitigate these risks.

I always want developers to feel comfortable with changing any part of the codebase. The way to mitigate these risks are not to use comments or commit messages – rather, use tests! Automated tests are the best form of documentation – they show how the code is intended to be used and allow the implementation to be changed without fear. Even when I’m looking over one of my own commits I prefer to look at the tests to understand the change properly.

Never commit code that’s been commented out

I cannot agree more! (Althought there seems to be some disagreement about this – look at the comments on the original story)

Happy coding.