Git Version Control: Rewriting History

Skill Level: Novice

One of the biggest wishes people have is that they could go back and change things, maybe take a different path. In life this is usually impossible, but when coding it’s easily done with version control software. My favourite is the Git version control system. Git gives you the power to try new things without worrying about making destructive changes. If you work with a large codebase, collaborate with other developers, or need to keep a record of the changes made to your code, then you need to version control it. If you don’t do any of those things, then you still need to version control it!

Rewriting history

When I first started using Git, one thing that was stressed was “DO NOT REWRITE HISTORY!” Now what do we mean by not rewriting history?

When you make a commit in your repository, it gets added to the history. When you push that commit from your local repository to a remote one, it becomes part of the history for all the users of that repository. If you modify commits that have already been pushed to the shared repository, then you are rewriting shared history. This is a major mistake that could make your fellow developers’ lives a nightmare!

What is the worst that could happen if you rewrite shared history? Well, besides being universally despised by your peers for making their lives harder, you will create major confusion for your team. If the repository they have based their work on changes, the commits their work branched from might be changed or removed, and their changes will no longer be based on the same history.

However, what if you wanted to change the message that you wrote for a commit? What if you branched from the develop branch and the codebase has changed significantly since then? What if you had a series of 30 commits on your branch that really should be only 3-5 commits of relevant changes? The good news is that rewriting history is OK, but ONLY if you don’t rewrite other people’s history!
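One way to tell whether commits are still yours alone is to list the commits that exist locally but not on the remote branch. The sketch below is only an illustration: it builds a throwaway "remote" in a temporary directory, and the names origin, develop, file.txt and the demo identity are assumptions made for the example.

```shell
#!/bin/sh
# Sketch only: a throwaway local repository plus a bare "remote".
set -e
work=$(mktemp -d)
git init -q --bare "$work/remote.git"
git init -q "$work/local"
cd "$work/local"
git config user.name "Example Dev"        # placeholder identity for the demo
git config user.email "dev@example.com"
git remote add origin "$work/remote.git"

# First commit: pushed, so it is now shared history.
echo "one" > file.txt
git add file.txt
git commit -q -m "Shared commit"
git branch -M develop
git push -q -u origin develop

# Second commit: exists only locally, so it is still safe to rewrite.
echo "two" >> file.txt
git commit -q -am "Local-only commit"

# List commits that are on our branch but not on the remote branch.
git log --oneline origin/develop..HEAD
```

Only the commits listed by the last command are absent from the remote branch. Anything that does not appear there has been pushed and should be treated as shared.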

The right way to change history

Rebasing, cherry picking, and amending commits allow you to revise, refine and restructure your commits before you share them with your team. This lets you keep the repository neat and tidy. The main goal when rewriting history is to make sure the commits you share are readable, relevant and well structured.

Rebasing

We can rebase a branch to base our code on a different commit than the one we originally branched from. This is very handy and is often used to base our branch on the latest code rather than on the codebase as it was when we branched. This also allows for a much tidier history, since our commits now sit on top of the tip of the target branch and can be viewed as a single chunk without any other commits in between.

In the example below, we have the “feature/testing_new_javascript_library” branch that was based off an older commit in the develop branch. Rather than merging it in, we rebased it onto the develop branch so that it is as if we had branched off the tip of develop in the first place. The Git for Teams book has an excellent chapter on rewriting history. It’s currently available as an early release book and is expected to be released Oct 2015.

Figure 1: Branch from earlier commit
 

Figure 2: Branch rebased on develop branch
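The situation in Figures 1 and 2 can be reproduced on the command line. This is a minimal sketch in a throwaway repository; the file names, commit messages and demo identity are made up for the illustration, and only the branch names come from the example above.

```shell
#!/bin/sh
# Sketch only: a throwaway repository to show a rebase.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity for the demo
git config user.email "dev@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git branch -M develop

# Branch off, then let develop move ahead (Figure 1).
git checkout -q -b feature/testing_new_javascript_library
echo "lib" > library.txt
git add library.txt
git commit -q -m "Add new JavaScript library"

git checkout -q develop
echo "v2" > app.txt
git commit -q -am "Update app on develop"

# Replay the feature commit on top of the tip of develop (Figure 2).
git checkout -q feature/testing_new_javascript_library
git rebase -q develop

git log --oneline --graph
```

After the rebase, the feature commit has a new hash because it is a new commit with a different parent. That is exactly why this should only be done to commits nobody else has based work on.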

Cherry picking

Git gives us the freedom to quickly and easily create as many branches as we want. This allows us the flexibility to experiment without interfering with the established codebase. We may not always use the branches we create, but what happens when we have a branch with a few commits that we can use somewhere else? Rather than recreating the commits, we can cherry pick the specific commits that we want to apply to our current branch and disregard the rest. The PhpStorm IDE integrates with Git so you can cherry pick right in your editor.

Figure 3: Experimental branch with commits we don't need

 

Figure 4: Branch with cherry picked commit applied to develop branch
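On the command line, the same idea looks roughly like this. The sketch uses a throwaway repository; the branch name experiment, the file names and the commit messages are invented for the illustration.

```shell
#!/bin/sh
# Sketch only: a throwaway repository to show a cherry pick.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity for the demo
git config user.email "dev@example.com"

echo "v1" > app.txt
git add app.txt
git commit -q -m "Initial commit"
git branch -M develop

# An experimental branch with one useful commit among throwaway ones.
git checkout -q -b experiment
echo "scratch" > scratch.txt
git add scratch.txt
git commit -q -m "Try a throwaway idea"

echo "fix" > fix.txt
git add fix.txt
git commit -q -m "Fix a useful bug"
wanted=$(git rev-parse HEAD)              # remember the commit we want

echo "more" >> scratch.txt
git commit -q -am "More throwaway work"

# Apply only the useful commit to develop and ignore the rest.
git checkout -q develop
git cherry-pick "$wanted"

git log --oneline
```

The cherry picked commit is copied onto develop as a new commit, so the experiment branch can be deleted or left alone without affecting it.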

Amend commit

Probably the most used form of rewriting history is the amend commit. We’ve all done it: we commit our code, then look at the commit message and the message is inaccurate or there’s a typo in it. What do we do now? If only there were a way to fix that quickly.

We can use “git commit --amend” to quickly fix our last commit. It replaces the last commit with a new one, which lets us correct the commit message (or include a change we forgot to stage) before pushing our changes to the central repository. I’m a big fan of Atlassian: not only do they have the best GUI for Git on PC and Mac (IMHO), they also have excellent Git resources.
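Here is a small sketch of both uses of amend in a throwaway repository. The file names, commit messages and demo identity are assumptions for the illustration.

```shell
#!/bin/sh
# Sketch only: a throwaway repository to show git commit --amend.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"        # placeholder identity for the demo
git config user.email "dev@example.com"

echo "hello" > readme.txt
git add readme.txt
git commit -q -m "Add raedme"             # oops, typo in the message

# Fix the message of the last commit.
git commit -q --amend -m "Add readme"

# Forgot a file? Stage it and fold it into the last commit,
# keeping the message as it is.
echo "notes" > notes.txt
git add notes.txt
git commit -q --amend --no-edit

git log --oneline --stat
```

There is still only one commit in the log, with the corrected message and both files. As with rebasing, the amended commit is a new commit, so this is best kept to commits that have not been pushed yet.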

Responsible rewriting

Rewriting history can make your life a lot easier, as well as keep your repository neat and organized. The main thing to remember when rewriting history is to avoid disrupting other team members’ work while altering the repository. It’s not something to be fearful of, but should be approached with caution and awareness.

A comprehensive Git resource for beginners and experts alike is the “Pro Git” book. The full edition is available online for free and there is also a print edition.

Reference links:

  1. https://git-scm.com/book/en/v2
  2. https://www.safaribooksonline.com/library/view/git-for-teams/9781491911204/ch06.html from “Git for Teams”
  3. https://www.jetbrains.com/idea/help/applying-changes-from-a-specific-commit-to-other-branches-cherry-picking.html
  4. https://www.atlassian.com/git/tutorials/rewriting-history/git-commit--amend
  5. http://code.tutsplus.com/tutorials/rewriting-history-with-git-rebase--cms-23191
  6. https://git-scm.com/book/en/v2/Git-Branching-Rebasing#The-Perils-of-Rebasing