The Secret of Tidy Git Repositories: When Best to Merge and Rebase
I once worked at a company that had been using Subversion for source control. I don’t know how long they had done so because it was already in place when I joined. TortoiseSVN is user friendly, but I found the experience frustrating because I was used to using Git.
Soon after, the team decided to change to Git. I was happy with this: I preferred the branching mechanism. After we implemented the change, I found that a few of my teammates were unfamiliar with the concept of rebasing branches. While rebasing shouldn’t be done at every opportunity, there are times when it is the better option. In the last part of this topic, we took a high-level look at the differences between merging and rebasing. In this part, we’ll take a look at the pros and cons of each.
A Case for Merging
The biggest advantage of merging is simplicity. Conceptually, it’s easy to understand. When we want to update a branch, we take the branch containing the latest code and combine it into our working branch. When the feature we are working on is ready, we merge that working branch into the repository’s main branch.
This approach means that commits are only ever added to the repository. Once written, the commit history is never changed. All commits (regardless of branch) stay in the order that they were created. This may result in related commits being spaced apart on the commit graph. We can see this in Image 1: ‘Commit for feature 1’ is sandwiched between ‘Commit for Feature 2’ and ‘More work on Feature 2’ despite being on a different branch.
Image 1: A repository with interwoven commits on different branches
As the commit history does not change, pushing to a remote version of the same branch can be done without needing to force push. This has the advantage of being safer; there is less chance of accidental data loss.
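To make this concrete, here is a minimal sketch of updating a working branch by merging. It builds a throwaway repository in a temporary directory; the file names, the placeholder identity and the commit_file helper are assumptions made for illustration and are not part of the original example.

```shell
#!/bin/sh
# Sketch: update a working branch by merging, in a throwaway repository.
set -e

commit_file() {  # hypothetical helper: write a file and commit it
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is called main
git config user.name "Example Dev"       # placeholder identity for this repo only
git config user.email "dev@example.com"

commit_file README.md "Initial commit"
git checkout -q -b feature-1
commit_file feature1.txt "Commit for feature 1"
before=$(git rev-parse HEAD)             # tip of feature-1 before the update

git checkout -q main
commit_file main.txt "Latest work on main"

# Update the working branch by merging main into it.
git checkout -q feature-1
git merge -q --no-edit main
after=$(git rev-parse HEAD)

# The old tip is still an ancestor of the new tip: history was only added to.
if git merge-base --is-ancestor "$before" "$after"; then
    echo "Nothing was rewritten, so a plain 'git push' would be enough."
fi
git log --oneline --graph
```

Because the old tip of feature-1 is still part of the branch, a remote copy of the branch could simply be moved forward to the new tip.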
A Case for Rebasing
Commits on the commit graph are displayed in the order that they were made. This means that when more than one branch is displayed, the commits on any given branch can be interwoven with commits from other branches. We can see this happening in Image 1. Depending on how often members of a team commit code, the commits on a branch could appear scattered, even with a small team. This could make the graph difficult to read and require vertical scrolling, even with a modest number of commits. Let’s say we merged feature-1 into main in our repository. The result would look like Image 2.
Image 2: feature-1 has been merged into main (fast forward)
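A fast forward merge like the one in Image 2 could be reproduced along these lines. This is only a sketch: the repository is a temporary one, and the file names, identity and commit_file helper are assumptions. The --ff-only option is used here so that Git refuses to do anything other than a fast forward.

```shell
#!/bin/sh
# Sketch: fast forward main to feature-1 in a throwaway repository.
set -e

commit_file() {  # hypothetical helper: write a file and commit it
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is called main
git config user.name "Example Dev"       # placeholder identity for this repo only
git config user.email "dev@example.com"

commit_file README.md "Initial commit"
git checkout -q -b feature-1
commit_file feature1.txt "Commit for feature 1"

# main has no commits of its own since feature-1 branched off,
# so merging only moves the main pointer forward.
git checkout -q main
git merge -q --ff-only feature-1

main_tip=$(git rev-parse main)
feature_tip=$(git rev-parse feature-1)
echo "main:      $main_tip"
echo "feature-1: $feature_tip"
```

Both branches end up pointing at the same commit, and no merge commit is created.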
Rebasing a branch reapplies all of its commits onto a new starting point. The result is that all commits on a branch will be listed together on the commit graph. By rebasing feature-2 onto main, commits in feature-2 will be shown as a single uninterrupted block applied on the head of main. We can see this in Image 3: both commits for feature-2 are grouped together, making them easy to see. Note that although the commits have been reapplied, each commit keeps its original author timestamp.
Image 3: feature-2 has been rebased onto main
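The sequence leading up to Image 3 might look something like the sketch below. The commit messages follow the ones mentioned earlier, but the file names, identity and commit_file helper are assumptions, and the repository is a temporary one created just for the demonstration.

```shell
#!/bin/sh
# Sketch: rebase feature-2 onto main in a throwaway repository.
set -e

commit_file() {  # hypothetical helper: write a file and commit it
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is called main
git config user.name "Example Dev"       # placeholder identity for this repo only
git config user.email "dev@example.com"

# Interleave commits on two feature branches.
commit_file README.md "Initial commit"
git checkout -q -b feature-2
commit_file feature2.txt "Commit for Feature 2"
git checkout -q -b feature-1 main
commit_file feature1.txt "Commit for feature 1"
git checkout -q feature-2
commit_file feature2-more.txt "More work on Feature 2"

# Bring feature-1 into main with a fast forward merge.
git checkout -q main
git merge -q --ff-only feature-1

old_tip=$(git rev-parse feature-2)
old_date=$(git log -1 --format=%ad feature-2)   # author date before the rebase

# Reapply the feature-2 commits on top of main.
git checkout -q feature-2
git rebase -q main

new_tip=$(git rev-parse feature-2)
new_date=$(git log -1 --format=%ad feature-2)   # author date after the rebase

git log --oneline --graph main feature-2
echo "tip before: $old_tip"
echo "tip after:  $new_tip"
echo "author date before: $old_date"
echo "author date after:  $new_date"
```

The commit hashes differ after the rebase because the commits are new objects with a new parent, while the author dates stay the same.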
In addition to grouping the commits together, feature-2 now includes all the code that was in feature-1. The biggest advantage of rebasing is that branches can be updated to include code from other branches without having to introduce merge commits, so the commits in a branch only contain code changes for that feature. This means that a branch can be rebased, or reapplied, onto other branches more easily too. This could be important if there are separate deployments based on different branches, and you’d like a feature transferred to another environment.
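Despite its advantages, rebasing is destructive: it replaces a branch’s existing commits with new ones, so a branch that has already been pushed has to be force pushed afterwards. This makes rebasing impractical when a branch is shared by more than one developer.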
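The sketch below illustrates why a rebased branch cannot simply be pushed again. It uses a local bare repository as a stand-in for a remote; the paths, file names, identity and commit_file helper are all assumptions. The final push uses --force-with-lease, a more cautious form of force push that I have chosen for the example: it only overwrites the remote branch if it is still where we last saw it.

```shell
#!/bin/sh
# Sketch: a rebased branch needs a force push. Uses a local bare repo as the remote.
set -e

commit_file() {  # hypothetical helper: write a file and commit it
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

work=$(mktemp -d)
git init -q --bare "$work/remote.git"    # stand-in for a shared remote
git init -q "$work/local"
cd "$work/local"
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is called main
git config user.name "Example Dev"       # placeholder identity for this repo only
git config user.email "dev@example.com"
git remote add origin "$work/remote.git"

commit_file README.md "Initial commit"
git checkout -q -b feature-2
commit_file feature2.txt "Commit for Feature 2"
git push -q origin main feature-2        # publish both branches

git checkout -q main
commit_file main.txt "Latest work on main"
git checkout -q feature-2
git rebase -q main                       # replaces the commit on feature-2 with a new one

# The remote branch still holds the old commit, so a plain push is refused.
if git push -q origin feature-2 2>/dev/null; then
    plain_push=accepted
else
    plain_push=rejected
fi
echo "plain push: $plain_push"

# Overwrite the remote branch, but only if nobody else has pushed to it meanwhile.
git push -q --force-with-lease origin feature-2
```

Anyone else who had the old version of feature-2 checked out would now have to reconcile their local branch with the rewritten one, which is where the trouble with shared branches comes from.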
Combining Rebase and Merge
Once rebased, a branch still needs to be merged into the main branch. We can do this in one of two ways. In Image 3, main is directly behind the head of feature-2. We have seen before that we could do a fast forward merge, and doing so would mean that no ‘extra’ (merge) commits are added to the repository in the process.
However, we might want to show that commits on feature-2 make up a feature and that they belong together as a single entity. At the moment, it’s easy to see because they are on feature-2, but we might decide to delete this branch after the merge. In situations like this we can combine rebasing and merging by explicitly creating a merge commit. By using the command git merge feature-2 --no-ff on main, we can choose to override the default fast forward behaviour and create a merge commit instead. The result is shown in Image 4.
Image 4: feature-2 has been merged into main with a merge commit
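As a rough sketch, the merge shown in Image 4 could be reproduced as follows. The repository is a temporary one, and the file names, identity and commit_file helper are assumptions. The --no-edit option is added only so that the script does not open an editor for the merge message.

```shell
#!/bin/sh
# Sketch: merge a branch that could be fast forwarded, but force a merge commit.
set -e

commit_file() {  # hypothetical helper: write a file and commit it
    echo "$2" > "$1"
    git add "$1"
    git commit -q -m "$2"
}

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main    # make sure the first branch is called main
git config user.name "Example Dev"       # placeholder identity for this repo only
git config user.email "dev@example.com"

commit_file README.md "Initial commit"
git checkout -q -b feature-2
commit_file feature2.txt "Commit for Feature 2"
commit_file feature2-more.txt "More work on Feature 2"

# main is directly behind feature-2, so a fast forward would be possible.
# --no-ff creates a merge commit anyway, keeping the feature's commits grouped.
git checkout -q main
git merge -q --no-ff --no-edit feature-2

# The branch can now be deleted; the merge commit still shows what belonged together.
git branch -q -d feature-2
git log --oneline --graph
```

The graph still shows the two feature commits on their own line of history, joined to main by the merge commit, even though the feature-2 branch no longer exists.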
A Summary
Merging and rebasing branches are two different ways to update a working branch. Each approach has its own advantages and disadvantages. We can choose to merge, rebase, or a combination of both. Merging is simple to understand and does not alter history, making it a good choice when multiple people are working on the same branch.
Rebasing can improve the readability of a repository’s commit graph by grouping related commits together. It also lets you update a branch without introducing merge commits. This can make it easier to reapply a branch onto other branches: when transferring features to other deployment environments, for example. Once rebased, the branch can be merged into another branch either as a fast forward or with an explicit merge commit. However, rebasing is destructive. This makes it less suitable when more than one person is using the branch.
By understanding both approaches, you can confidently decide which is most appropriate for each situation. And just as you know how to keep your code clean, you now know how to keep your repositories tidy too!