Every change should be made on a branch

You're probably already doing this but Github pull requests make this a necessity. Branching is cheap and easy so there is no reason not too. It's also worth noting that you can create a branch with a dirty working tree so if you make some changes and then realise you have forgotten to create a new branch it's not too late. Finally, if you do add accidentally a commit straight to master and then realise there's a simple fix. Simply create a new branch from master then reset master back a commit

# at this point we have a commit on master that should be on a branch
> git checkout -b new_branch
# now our new commit is on the new branch but also still on master
> git checkout master
> git reset --hard HEAD~1
# finally the new commit is only on the new branch

Rebase whilst you're working, merge when you're done

Like most dev teams, we are quite particular about our history. We like to have merge commits when a change consists of several commits as we like being able to see the original branch for a commit. However we don't like them when the change just has a single commit as there seems little point. We found that if we merged using the Github web interface, we would always get a merge commit. Therefore we generally merge on the command line and always for a single commit change. We also find it useful to explictly use the --no-ff or --ff-only flags in order to ensure that we do or don't get a merge commit as appropriate.

In order to be able to choose when we do and don't have a merge commit, we also keep rebasing our branches as we go. This means that we deal with any conflicts early. If we waited until we merged to deal with conflicts, there would be times when we were forced to add a merge commit.

In order to make sure that Github automatically closes our pull requests for us, it is important that we update the branch on Github whenever we rebase. Otherwise Github will not recognised that the commits have been merged as the rebasing will have changed the SHAs.

Take pride in your commits

We also like to hide our sausage-making. In other words, we try and ensure that each commit that goes into the system is a consistent and coherent entity. This does not necessarily mean that we have one commit per story or task, just that it is clear what the change is and more importantly, why we are doing it. We have roughly 7 years worth of commit history for our application. Often when tracking down bugs or planning changes, it is vital to understand why code was written in a certain way. It is easy to assume that a complicated or confusing method arose because back then they didn't really know what they were doing or that we used to write rubbish like that but we wouldn't do it any more. In actual fact, there are often reasons that are not so cut and dry and in order to avoid regressions we may need to take these reasons into account. It can be extremely frustrating to find that the code you are trying to decipher was added at the same time as 20 other seemingly random chunks of code around the system and the only clue is a commit message that says Updated website. Try to ensure that the commit message includes information on the purpose and the motivation behind the commit. Imagine yourself in two years time stumbling across this commit. What are the key bits of background information that will make all the difference in the future. Can you put in any links, bug numbers or story ids that give context to your changes.

This is easy to say but in practice can be a lot harder. Sometimes you don't know exactly what will be required in order to implement a change. Other times you encounter an unforseen problem or bug that you need to fix before you can finish the task. In order to maintain a clear and coherent history, we use post-production editing techniques, including a lot of interactive rebasing to re-order, split and squash commits. Whilst the power of these tools can be jaw-dropping when you first use them, in true Spiderman fashion "with great power comes great responsibility". We always do our post production on branches prior to merging the changes into master. Once a commit is merged into master, it is untouchable. If a problem is found then another commit should be added to address the problem. I have another post that goes into these things in more detail.

Incidentally, we have found that Github is very conducive toward taking pride in individual commits simply due to the nice presentation of commits.

Post comments in the right place

Bear in mind that comments can be attached to pull requests as well as individual commits. This means that if you go back and edit a commit, you effectively create a new commit with a different SHA and so any comments attached to the previous commit will not be visible in the pull request anymore. You will also change the SHAs of the subsequent commits and so comments on those will no longer appear either. You can still access these commits using the address bar but they won't be shown as part of the pull request. This doesn't mean that it is a bad idea to post comments on commits, just that you need to be aware of what you post where. For example, if you want to point out an issue in a line of code, then posting on the commit is appropriate. If you want to just OK a pull request, then posting on the request itself may be more appropriate.

Use fixup commits then squash in

A nice way of dealing with issues raised during code reviews of pull requests is to add fixup commits to the end of a branch. This avoids problems with disappearing comments as mentioned above. It also means you can point other developers to specific commits to show that their issue has been dealt with. Before merging the pull request, these can then be merged into an appropriate parent commit so it appears that they looked like that to begin with. Sausage made!