Hi developers,
I’m learning Git and GitHub, and I’m wondering about best practices for writing commit messages. I often write things like “I did XYZ, or I added image of cow with changes to xyz” but in a real production or work environment, what’s the recommended way to write clear, professional commit messages?
Specifically I'm asking about personal projects at first, as I assume companies will have their own rules, but how often should you be committing? Every function change? Every function written? Every x new lines of code?
And how long/descriptive should the commit be? Are there any good practices for a commit message?
I tend to commit as you propose: a commit is a logically connected change set. My commits can be anything from a one-liner to a change in all files (for example add/change a copyright notice in the source files). The reason for change need not be a full task that I am implementing, but it is usually a milestone in the task.
If I have modified something that is not related to my current commit, I tend to do an interactive add to separate out the unrelated changes, too - even when it is a whitespace tidy up.
I have found that commits that simply dump the working state into the repository are a lot less useful: I cannot easily backport a bugfix to an earlier version, or include a utility function in another branch, if the commits are all over the place.
One alternative to this approach is to make a lot of tiny commits inside a feature branch and, once the whole feature is done, do heavy history rewriting to tidy up the commits into a logical structure. But I find this approach to be a time waster.
I try to follow these practices, in this order:
1. A commit must not fail the build. This is the most important one!
2. It should be made of one logical unit of change, whether that is a single line or character or a whole file or class with corresponding changes in other parts of the code, while still following #1.
   What is a logical unit of change? In terms of Git: if you can describe the changes in the commit message in the fewest possible characters, in one sentence (without ANDs, of course), and you cannot break that description down into smaller units, I call that one unit.
3. The commit message should clearly state the essence of the commit.
4. The commit message should be short, typically no more than 80 characters. Any further elaboration should be part of the description.
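The message rules above can be turned into a quick self-check. This is only a rough sketch: `check_subject` is a hypothetical helper, not part of Git, and it encodes just the rules of thumb from this answer (a non-empty first line, at most 80 characters, no "and" joining two changes).

```shell
#!/bin/sh
# Hypothetical helper, not a Git command: checks the first line of a commit
# message against the rules of thumb above.
check_subject() {
    # Only the first line of the message is checked.
    subject=$(printf '%s\n' "$1" | head -n 1)

    [ -n "$subject" ] || { echo "empty summary line"; return 1; }
    [ "${#subject}" -le 80 ] || { echo "summary line longer than 80 characters"; return 1; }

    # An "and" between words usually means two logical changes in one commit.
    case " $subject " in
        *" and "*|*" And "*|*" AND "*)
            echo "summary mentions more than one change"
            return 1 ;;
    esac
    echo "ok"
}

check_subject "Fix off-by-one error in pagination"      # prints "ok"
check_subject "Fix pagination and add cow image" || true  # prints "summary mentions more than one change"
```

A check like this could be wired into a `commit-msg` hook, but the word match is deliberately crude, so treat a failure as a prompt to reconsider rather than a hard rule.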
Git recommends that each commit be an atomic, logical change. In other words, it represents one logical change (executed well), and the entire test suite passes both before and after the change.
One situation in which adding and committing may be done separately is when you're working on perfecting a commit. You may have an initial version of your code which works, but is messy or still has debugging code in it. You can add this change to the index using git add while continuing to refine it in your work tree. If you find that you've broken things, you can roll back to the version in the index and try again. On the other hand, if you find that a change is better, you can add it again, until you're finally ready to commit.
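Here is a minimal sketch of that checkpoint workflow, run in a throwaway repository. The file name `app.txt`, its contents, and the identity settings are made up for illustration; `git checkout -- <file>` is one way to restore the work tree copy from the index.

```shell
#!/bin/sh
set -e
# Throwaway repository so the sketch does not touch real work.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "v0" > app.txt
git add app.txt
git commit -q -m "Add app.txt"

# A version that works but is still messy: stage it as a checkpoint.
echo "working but messy" > app.txt
git add app.txt

# Keep refining in the work tree; this attempt breaks things.
echo "broken experiment" > app.txt

# Roll the work tree back to the version stored in the index.
git checkout -- app.txt
cat app.txt    # prints "working but messy"

# Once a refinement really is better, stage it again and commit.
echo "working and clean" > app.txt
git add app.txt
git commit -q -m "Rework app.txt"
```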
Another, equally valid workflow for this situation is to make multiple commits, one each time the change is an improvement over the last, and then squash them together using git rebase -i or git reset --soft. Which you want to do depends on your preferred development style; they will essentially both produce the same result.
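The `git reset --soft` variant of that squash might look roughly like this. It is a sketch in a throwaway repository; `notes.txt` and the commit messages are invented for the example.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "base" > notes.txt
git add notes.txt
git commit -q -m "Add notes"
base=$(git rev-parse HEAD)   # the commit to squash back onto

# Three small work-in-progress commits, each an improvement on the last.
for step in 1 2 3; do
    echo "improvement $step" >> notes.txt
    git commit -q -a -m "wip $step"
done

# Move the branch back to the base commit. --soft leaves the index and
# work tree alone, so all three improvements end up staged together.
git reset -q --soft "$base"
git commit -q -m "Improve notes"

git log --oneline   # two commits: "Improve notes" on top of "Add notes"
```

`git rebase -i` reaches the same end state interactively and also lets you reorder or reword the intermediate commits along the way.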
Another situation in which you may want to use git add separately from git commit is when you want to stash only some of your changes. You can use git add to stage the changes you want to keep in your working tree, and then git stash --keep-index to stash the rest of them away.
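A small sketch of that partial stash, again in a throwaway repository with made-up file names (`keep.txt` and `later.txt`):

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "a" > keep.txt
echo "b" > later.txt
git add keep.txt later.txt
git commit -q -m "Add two files"

# Two unrelated edits end up in the work tree.
echo "change to keep" >> keep.txt
echo "change for later" >> later.txt

# Stage only the edit that should stay in the work tree ...
git add keep.txt
# ... and stash everything, leaving the staged content in place.
git stash --keep-index -q

git status --short   # only keep.txt shows up, as a staged modification
git stash list       # one stash entry holding the set-aside work
```

Note that the stash entry records the staged change as well; `--keep-index` only controls what is left behind in the index and work tree afterwards.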
A question which you didn't ask but which might be on your mind is why the two functions are separate. This is because sometimes when developing you end up with changes from multiple logical commits in your tree and you want to commit only part of them at once, making additional commits for additional logical changes. Being able to add only part of the changes at once means you can commit those, and then add more changes and commit them, and so on.
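As a rough illustration of that point, the sketch below starts with two unrelated edits in the tree and turns them into two separate commits by staging one path at a time. The file names and messages are invented for the example.

```shell
#!/bin/sh
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"      # placeholder identity
git config user.email "user@example.com"

echo "feature v1" > feature.txt
echo "readme v1" > README.txt
git add feature.txt README.txt
git commit -q -m "Initial commit"

# Two logically separate changes are sitting in the work tree at once.
echo "feature v2" > feature.txt
echo "readme v2" > README.txt

# First logical change: stage and commit only feature.txt.
git add feature.txt
git commit -q -m "Update feature"

# Second logical change: stage and commit README.txt on its own.
git add README.txt
git commit -q -m "Update README"

git log --oneline   # three commits, one per logical change
```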
Could you please give me an example from real-life experience where add and commit are not done together?
When you add patches.
git add -p will interactively let you choose hunks of patch between the index and the work tree and add them to the index.
This is often done incrementally, file by file.
Then, after review, you commit.