Introduction to rewriting Git history
I hope you had a chance to play the old Metroid games at least once. If not, find an SNES emulator (I promise it will be more fun than this blog post), or just imagine a 2D side-scroller with a character who shoots aliens and finds power-ups for her spacesuit (I could never figure out why the aliens had power-ups for a space suit they can’t wear). During the occasional breaks in fighting, there would be a save point. It was a relief. That last section was hard, but it was over now, and I never had to do it again. I could always get back to the save point. Git commits are like that: they make a point you can always get back to. And, unlike Metroid, you can commit as often as you like. It’s no wonder there are so many small commits:
$ git commit -a -m'first draft of fizzywigg'
$ git commit -a -m'fixes'
$ git commit -a -m'more fixes'
$ git commit -a -m'wip'
...
I take it for granted now, but making lots of little commits was one of the features that helped make Git so popular. Subversion, for instance, had a central server, and devs would typically commit to trunk, which was shared with everyone. So we all held our commits until the code was stable.
But, while small commits are great, it’s not very helpful to look back in the history and find hundreds of 2-line commits with monosyllabic commit messages, some portion of which turn out to be unreleased development dead-ends. Yes, merge commits do wonders for grouping these commits into logical chunks, but I think spending a few minutes to clean up the history before you merge makes for a history that is much easier to understand later.
Rewriting Git’s history is not hard, but it’s not obvious either. If a reviewer says, “can you squash these commits?” it isn’t clear to the uninitiated that this means git rebase -i. So this post introduces the commands you need to produce a clean commit history that will be pleasant to use later. For the examples, I assume you’re working on a feature branch that will be merged into a branch called main.
This will be most helpful if you experiment in your own feature branch. Nothing here is truly destructive, but some commands are hard to undo. So, if it helps you experiment with confidence, make a new branch first:
git switch -c tmp-branch
Play around on that temp branch, and if you screw it up, nothing is lost; just start over:
git reset --hard <original-branch-name>
You can see what code changed by diffing the two branches. If you like the result, run:
$ git switch <original-branch-name>
$ git reset --hard tmp-branch
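As a rough sketch of that safety net, the script below runs the whole loop in a throwaway repository. The repository, the feature branch name, the file, and the commit messages are made up for illustration; only tmp-branch comes from the commands above.

```shell
set -e
# Throwaway repository; every name in here is illustrative.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo one > notes.txt
git add notes.txt
git commit -q -m 'first draft'
git switch -q -c feature
echo two >> notes.txt
git commit -q -a -m 'fixes'
echo three >> notes.txt
git commit -q -a -m 'more fixes'

# Experiment on a copy of the branch: collapse the last two commits.
git switch -q -c tmp-branch
git reset -q HEAD~2
git commit -q -a -m 'add notes two and three'

# The history differs, but the code should not: capture the diff.
changed=$(git diff feature tmp-branch)

# Happy with the result, so move the real branch to it.
git switch -q feature
git reset -q --hard tmp-branch
count=$(git rev-list --count HEAD)
```

An empty diff between the two branches is the signal that only the history changed and the code did not.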
git push --force
The best time to rewrite history is before you push your feature branch. If you try to change any commits later, a plain git push won’t work:
$ git push
To git:pboyd/repo.git
! [rejected] goof -> goof (non-fast-forward)
error: failed to push some refs to 'git:pboyd/repo.git'
You can still push it, but you’ll need to add -f (or --force):
git push -f
You may have been told not to force-push a branch. --force has earned its reputation, because it resets the remote branch to whatever is on your local copy, and you can delete someone else’s code if you aren’t careful. But --force on a remote feature branch is usually OK; just make sure to coordinate with anyone else using it. And, of course, if that’s only you, blast away.
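If you want a guard rail while experimenting, here is a minimal sketch that uses a local bare repository as a stand-in for the remote. Note that --force-with-lease is not covered above: it is a stricter variant of --force that refuses to overwrite the remote branch if it has moved since your last fetch. The directory names and file contents are assumptions for the demo; the goof branch name is borrowed from the rejected push shown earlier.

```shell
set -e
work=$(mktemp -d)
cd "$work"
# A bare repository stands in for the real remote (assumption for the demo).
git init -q --bare remote.git
git clone -q remote.git clone 2>/dev/null
cd clone
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo draft > fizzywigg.txt
git add fizzywigg.txt
git commit -q -m 'first draft of fizzywigg'
git switch -q -c goof
git push -q -u origin goof

# Rewrite the commit that was already pushed.
echo fix >> fizzywigg.txt
git commit -q -a --amend --no-edit

# A plain push is rejected as a non-fast-forward.
if git push -q origin goof 2>/dev/null; then
  plain_push=accepted
else
  plain_push=rejected
fi

# Overwrite the remote branch, but only if nobody else has moved it.
git push -q --force-with-lease origin goof
local_head=$(git rev-parse goof)
remote_head=$(git ls-remote origin refs/heads/goof | cut -f1)
```

On a branch that only you use, this behaves the same as -f. The difference only shows up when someone else has pushed in the meantime.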
git rebase
You probably know rebase already. If not, rebasing is roughly equivalent to setting your commits aside, resetting your branch to some new point, and re-applying those commits one at a time. Effectively, this makes it appear that your changes were derived from somewhere else (in other words, a new base). I mostly use this to update code on a feature branch:
$ git fetch
$ git rebase origin/main
You may have conflicts when Git re-applies your commits. You have to fix the files in an editor, then:
$ git add <path/to/the/file>
$ git add <path/to/the/other/file>
$ git rebase --continue
If you want to start over after a conflict, run git rebase --abort.
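To see the conflict loop end to end, the sketch below forces a conflict in a throwaway repository and resolves it. A local main branch stands in for origin/main, and the file name and its contents are invented for the demo. GIT_EDITOR=true only keeps Git from opening an editor for the commit message.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo 'color = red' > settings.txt
git add settings.txt
git commit -q -m 'add settings'
git branch -M main

# The feature branch changes the line...
git switch -q -c feature
echo 'color = green' > settings.txt
git commit -q -a -m 'use green'

# ...and so does main, which guarantees a conflict.
git switch -q main
echo 'color = blue' > settings.txt
git commit -q -a -m 'use blue'

git switch -q feature
if git rebase main >/dev/null 2>&1; then
  outcome=clean
else
  outcome=conflict
  # Fix the file (here: simply pick a value), stage it, and continue.
  echo 'color = green' > settings.txt
  git add settings.txt
  GIT_EDITOR=true git rebase --continue >/dev/null
fi
base=$(git merge-base main feature)
```

After the rebase, the feature branch starts from the tip of main, which is what the merge-base check at the end confirms.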
git commit --amend
git commit --amend causes Git to update the last commit instead of making a new one. I use this to fix typos or other small problems (CI errors, for instance). You could, conceivably, arrange your workflow around --amend and continually update a single commit.
You can use --amend like any other git commit invocation. For instance, to add all working changes to the most recent commit:
git commit -a --amend
--amend has a few options, which don’t come up every day but are worth knowing about.
For instance, a common problem after rebasing upstream changes is that your timestamps are earlier than the commits that precede them. It bugs me on occasion, so I’ll use --amend --reset-author to reset the timestamp:
git commit -a --amend --reset-author
As you’d expect from the name, --reset-author can also correct commits made from the wrong account.
--no-edit prevents Git from launching an editor, which is handy when you don’t want to change the commit message:
git commit -a --amend --no-edit
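A small sketch of what amending does to the history, run in a throwaway repository with a made-up file and message: the commit count stays the same and the message is kept, but the commit gets a new hash.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo 'helo world' > greeting.txt
git add greeting.txt
git commit -q -m 'add greeting'
before=$(git rev-parse HEAD)

# Fix the typo and fold the change into the commit we just made.
echo 'hello world' > greeting.txt
git commit -q -a --amend --no-edit

after=$(git rev-parse HEAD)
count=$(git rev-list --count HEAD)
subject=$(git log -1 --format=%s)
```

The new hash is the reason an amended commit needs a force-push if the old one was already on the remote.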
git rebase --interactive
git rebase --interactive is not hard to use, but you may find it strange if you have not seen it before (at least, I found it strange at first). You run it like any other rebase, but stick -i (or --interactive) in the command:
git rebase -i origin/main
This opens your editor with a list of commits that looks like this:
pick 0d6a6c8 fixes
pick 49ce1ee more fixes
pick f677508 wip
This is, in fact, a script that rebase runs after you save and exit the editor. The default script adds every commit like a normal rebase would.
By the way, if you change your mind about the rebase, delete everything in the file, save, exit, and nothing will be done.
By changing the commands in the file you can do a great number of useful things. For instance, a really common use is to combine several commits into one:
pick 0d6a6c8 fixes
squash 49ce1ee more fixes
squash f677508 wip
I spelled it out above, but I normally use the short aliases for the commands:
pick 0d6a6c8 fixes
s 49ce1ee more fixes
s f677508 wip
squash is for when you want to include each commit’s commit message. By default, the message for a squashed commit is a mechanical concatenation of the message from every included commit, which is a reasonable default but often a horrible commit message. You can change the message, but you can also use the first commit’s message with fixup:
pick 0d6a6c8 fixes
f 49ce1ee more fixes
f f677508 wip
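If you would like to watch that happen without typing into an editor, the sketch below scripts the edit. GIT_SEQUENCE_EDITOR is the environment variable Git consults for the program that edits the command list; the sed one-liner is only a stand-in for doing the edit by hand (it assumes a sed that accepts -i, such as GNU sed). The repository and file are throwaway, and the commit messages are borrowed from the examples above.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > fizzywigg.txt
git add fizzywigg.txt
git commit -q -m 'initial commit'
git branch -M main

git switch -q -c feature
for msg in 'first draft of fizzywigg' 'fixes' 'more fixes' 'wip'; do
  echo "$msg" >> fizzywigg.txt
  git commit -q -a -m "$msg"
done

# Keep the first "pick" and turn every later one into "fixup".
GIT_SEQUENCE_EDITOR="sed -i -e '2,\$s/^pick/fixup/'" \
  git rebase -i main >/dev/null 2>&1

count=$(git rev-list --count main..feature)
subject=$(git log -1 --format=%s)
```

Four commits go in and one comes out, carrying the message of the first commit because fixup discards the others.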
git rebase runs commands in the order you place them, so you can re-order commits however you like. But it’s easy to create conflicts if a commit depends on changes that follow it. Likewise, it only runs the commands you specify, so if you remove a commit from the list, it will be gone (consider setting rebase.missingCommitsCheck if that bugs you).
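Here is a hedged sketch of that setting at work. rebase.missingCommitsCheck accepts ignore, warn, or error; with error, Git stops the rebase when a line has gone missing from the list. The demo sets it for one throwaway repository only, and the sed command simulates deleting a line by accident (it assumes a sed that accepts -i).

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > fizzywigg.txt
git add fizzywigg.txt
git commit -q -m 'initial commit'
git branch -M main

git switch -q -c feature
for msg in 'fixes' 'more fixes' 'wip'; do
  echo "$msg" >> fizzywigg.txt
  git commit -q -a -m "$msg"
done

# Repository-local setting; add --global to apply it everywhere.
git config rebase.missingCommitsCheck error

# Simulate deleting the second line of the command list by accident.
if GIT_SEQUENCE_EDITOR="sed -i -e '2d'" git rebase -i main >/dev/null 2>&1; then
  outcome=rebased
else
  outcome=stopped
  git rebase --abort
fi
count=$(git rev-list --count main..feature)
```

Instead of aborting, you could also run git rebase --edit-todo to put the missing line back and carry on.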
reword is another useful command. It allows you to update a commit message. I use this often, because I apparently have a chronic condition that causes typos.
edit is used to update the code in a previous commit. It’s helpful sometimes, but I usually prefer to make another commit and do a fixup for simple changes (see below).
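As an illustration of reword, the sketch below fixes a typo in an older commit message without any manual editing. Two editors are involved: GIT_SEQUENCE_EDITOR changes the command list, and GIT_EDITOR changes the commit message when Git presents it. Both sed commands are stand-ins for what you would type yourself (they assume a sed that accepts -i), and the repository, file, and messages are invented.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > fizzywigg.txt
git add fizzywigg.txt
git commit -q -m 'initial commit'
git branch -M main

git switch -q -c feature
echo draft >> fizzywigg.txt
git commit -q -a -m 'frist draft of fizzywigg'
echo fix >> fizzywigg.txt
git commit -q -a -m 'fixes'

# First editor: mark the oldest commit on the branch for rewording.
# Second editor: correct the typo in the message Git presents.
GIT_SEQUENCE_EDITOR="sed -i -e '1s/^pick/reword/'" \
GIT_EDITOR="sed -i -e 's/frist/first/'" \
  git rebase -i main >/dev/null 2>&1

old_subject=$(git log -1 --format=%s HEAD~1)
new_subject=$(git log -1 --format=%s HEAD)
```

Only the message of the older commit changes; the later commit is replayed on top of it untouched.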
git rebase --autosquash
Sometimes I want to make a change that logically belongs in an earlier commit, but I don’t want to completely stop what I’m doing for an interactive rebase. The --fixup and --squash flags to git commit are helpful here:
git commit -a --fixup=HEAD~1
This creates a commit with an auto-generated message like “fixup! more fixes”. That message is interpreted by an interactive rebase when auto squash is enabled and will cause the default command list to look like this:
pick 0d6a6c8 fixes
squash f5646ea squash! fixes
pick 49ce1ee more fixes
fixup 1efc224 fixup! more fixes
pick f677508 wip
Auto squash is disabled by default, so to get this behavior add --autosquash to rebase:
git rebase -i --autosquash origin/main
Or, if you prefer, set rebase.autoSquash:
git config --global rebase.autoSquash true
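Putting the pieces together, here is a minimal sketch of the fixup-then-autosquash flow in a throwaway repository. The file names are invented, and each commit touches its own file so that re-ordering cannot conflict. GIT_SEQUENCE_EDITOR=true accepts the generated command list unchanged, which stands in for saving and closing the editor.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m 'initial commit'
git branch -M main

git switch -q -c feature
echo a > a.txt && git add a.txt && git commit -q -m 'fixes'
echo b > b.txt && git add b.txt && git commit -q -m 'more fixes'
echo c > c.txt && git add c.txt && git commit -q -m 'wip'

# A late change that belongs in "more fixes", which is HEAD~1.
echo 'late correction' >> b.txt
git commit -q -a --fixup=HEAD~1
fixup_subject=$(git log -1 --format=%s)

# Let autosquash arrange the list, then accept it as generated.
GIT_SEQUENCE_EDITOR=true git rebase -i --autosquash main >/dev/null 2>&1

count=$(git rev-list --count main..feature)
```

The fixup commit disappears into “more fixes”, so the branch is back to three commits with the late correction in the right place.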
git reset
Sometimes, the best thing to do is scrap the history and only keep the code. This removes the two most recent commits:
git reset HEAD~2
This doesn’t touch the working directory, so no code has changed. It only resets the index and history. Afterward, you can rebuild your commit history (git add -p may be helpful).
This can be used to split up one big commit into several smaller ones, or to combine several small commits into logical blocks. It works well with the break and edit commands in git rebase --interactive if you want to redo an older commit but preserve what came after it.
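A sketch of the scrap-and-rebuild approach, again in a throwaway repository: two messy commits are dropped with git reset and the same files are recommitted as two logical changes. The file names and messages are hypothetical, and whole files are staged here where real code would more likely call for git add -p.

```shell
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example Dev"
git config user.email "dev@example.com"

echo base > base.txt
git add base.txt
git commit -q -m 'initial commit'
git branch -M main

git switch -q -c feature
echo parser > parser.txt
echo docs > docs.txt
git add parser.txt docs.txt
git commit -q -m 'wip'
echo 'more parser' >> parser.txt
git commit -q -a -m 'more wip'

# Drop both commits from the history; the files on disk are untouched.
git reset -q HEAD~2

# Rebuild the history as one commit per logical change.
git add parser.txt
git commit -q -m 'add parser'
git add docs.txt
git commit -q -m 'document the parser'

count=$(git rev-list --count main..feature)
leftover=$(git status --porcelain)
```

An empty git status at the end confirms that everything from the old commits made it into the new ones.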
Bigger picture
The goal is to make the commit history mean something in the future. The point should not be to rewrite history to look like you wrote perfect code, but simply to communicate what changed effectively. Dead-ends (code that was reverted before it was deployed) can give a future developer wrong ideas about how the software worked. “WIP” and “doh!” commits are noise to slog through.
I should also note that there are diminishing returns with history rewrites. Squashing small commits is usually helpful. Splitting a big commit into logical chunks is sometimes helpful. But, as in all things, use judgment. Spending hours tweaking your commit history is probably not worth it and might make it harder to understand.