Managing Pull Requests

In an effort to practice writing on a consistent basis and to help reinforce the new skills I'll be picking up while working at Dynasty, I'll be trying to explain one new concept I've learned each week.

This week I'll be talking about git merge, git rebase, and pull requests.

Motivation

After finishing up some smaller assignments, I was given a larger assignment that took multiple days to complete. During this time, the master branch of the repo I was working in continued to be updated, meaning my branch was out of date. Luckily, I thought, this is the quintessential problem Git solves, and I know how to Git!

After merging master into my branch and fixing the merge conflicts, I pushed my branch. I was done, or so I thought.

When my manager took a look at my branch, he found it had more than 10,000 modified lines of code! The code that was actually relevant to my feature was buried among the new changes from master (a massive code refactor), making my pull request unreadable. He suggested using git rebase instead of git merge to combat this issue.

The Idea

Let's put the idea of Git aside for a second. Imagine you are proofreading a friend's essay. Right as you finish, your friend hands you a revised essay, saying the first was no good and asking you to use this version instead. What is the easiest way to meet your friend's request?

To begin with, I'd imagine you would take each correction you made on the first essay and try to apply it to the second. You drop a spelling correction because it's already fixed in the new version. Another correction about flow needs to be reevaluated because the structure of the new essay is different.

The idea with rebasing is similar. You take the commits you made on one branch and apply them one by one on top of another branch. You have to resolve any conflicts that come up with one commit before moving on to the next.
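To make the analogy a bit more concrete, here is a toy sketch in Python. It is only a model of the idea, not how Git is actually implemented: a "commit" here is just a change to a single line, and the names Commit, Conflict, apply_commit, and rebase are all made up for illustration. A commit whose change is already present in the new base is dropped (like the spelling correction), and a commit that no longer fits stops the replay (like the flow correction that needs a second look).

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Commit:
    """A toy commit: change one line of the essay from `old` to `new`."""
    message: str
    line_no: int
    old: str
    new: str


class Conflict(Exception):
    """Raised when a commit no longer applies cleanly to the new base."""


def apply_commit(lines: list[str], commit: Commit) -> list[str]:
    # A line that no longer exists in the new base cannot be patched.
    if commit.line_no >= len(lines):
        raise Conflict(f"{commit.message!r}: line {commit.line_no} is gone")
    current = lines[commit.line_no]
    if current == commit.new:
        # The new base already contains this change, so the commit is dropped.
        return lines
    if current != commit.old:
        # The line changed underneath us: stop and ask for a manual fix.
        raise Conflict(f"{commit.message!r} conflicts at line {commit.line_no}")
    updated = list(lines)
    updated[commit.line_no] = commit.new
    return updated


def rebase(new_base: list[str], commits: list[Commit]) -> list[str]:
    # Replay each commit, in order, on top of the new base.
    result = list(new_base)
    for commit in commits:
        result = apply_commit(result, commit)
    return result


# My two corrections, written against the first version of the essay.
my_commits = [
    Commit("fix typo", 0, "teh intro", "the intro"),
    Commit("strengthen ending", 2, "ending", "strong ending"),
]

# The revised essay: the typo is already fixed and the body was rewritten.
new_master = ["the intro", "new body", "ending"]

rebased = rebase(new_master, my_commits)
print(rebased)  # ['the intro', 'new body', 'strong ending']
```

If the revised essay had also rewritten the ending, the second commit would raise Conflict instead, which is roughly the point where a real rebase pauses and asks you to resolve things before continuing.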

But Wait...

Isn't a merge the same thing as a rebase, except that you apply all the commits together? How does this solve the polluted pull request issue? After doing some more digging, I found this Stack Overflow post (created in 2014 but still active in 2020)!

As it turns out, when you make a PR on GitHub, it doesn't keep track of new changes to master. In my specific case, I had created a WIP PR for my branch because I wanted to share progress in the middle of my work and check that I was on the right track. This had the unintended consequence of locking the PR's diff to that older version of the codebase. Later, the refactored code I merged in was counted as my changes, because the diff was still computed against an older version of master.
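Here is a small Python sketch of what I think was going on. It models my understanding of the behaviour described above, not GitHub's actual implementation: each version of the codebase is a plain dictionary of file contents, and the file names and the diff helper are hypothetical.

```python
from __future__ import annotations


def diff(base: dict[str, str], head: dict[str, str]) -> set[str]:
    """Return the names of files whose content differs between two snapshots."""
    return {
        name
        for name in base.keys() | head.keys()
        if base.get(name) != head.get(name)
    }


# master when I opened the WIP PR, and master after the big refactor.
old_master = {"app.py": "v1", "utils.py": "v1"}
new_master = {"app.py": "v2 (refactored)", "utils.py": "v2 (refactored)"}

# My branch after merging the new master in and adding my feature.
branch = {**new_master, "feature.py": "my feature"}

# Diffing against the stale base counts the refactor as my change.
stale_diff = diff(old_master, branch)
print(sorted(stale_diff))  # ['app.py', 'feature.py', 'utils.py']

# Diffing against the current master shows only my actual work.
fresh_diff = diff(new_master, branch)
print(sorted(fresh_diff))  # ['feature.py']
```

In this toy version, the only thing that changes between the unreadable diff and the clean one is which snapshot of master is used as the base for the comparison.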

Conclusion

To be completely honest, I (and apparently BartusZak) still haven't clearly worked out the mechanics of why rebasing combats this issue. I assume it has something to do with how a rebase drops the reference to the old version of master and starts from the new master. This somehow forces GitHub to compute the diff against the most recent version of master instead. But at the moment, understanding this completely doesn't seem necessary for a working knowledge of Git. I'm sure with enough time working with Git, the answer will show itself.

I started writing this post thinking I'd be explaining fairly boilerplate usage of git rebase, but I ended up discovering an interesting point about how GitHub handles PRs, so that's a win in my book. Until next time.
