This guy doesn't understand rebase. Rebase is an organizational tool. It's housekeeping for keeping your commits clean. One reason to learn how to rebase is so that your feature branch is tidy, so that when you merge it to main, main itself is tidy. Another is that as you clean up your commits you are performing code review, something that, up until this point, you've just been throwing at your co-worker without much thought as soon as all the tests pass.
"But my feature branch gets squashed anyway"
1. I believe that in itself is a mistake, especially as feature branches get large and, though you say you're all for small feature branches, your commit history says you do something different.
2. You don't clean up that giant merge commit message with all those "WIP" "fixed stupid mistake" comments in there.
Use rebase to keep your workspace clean and main/master comprehensible to your colleagues.
I finally understand why some people are so against rebase. They're doing it wrong.
I've always heard people talk about how it doesn't scale, but I use rebase like 99% of the time, and have worked on projects with hundreds of ICs. This is the first time I've seen someone explain it in a way where I get it. NO I'M NOT FORCE PUSHING TO MAIN YOU SILLY NILLY! Turns out I'm "squash rebasing." I guess I didn't know I need to specify that.
I do it slightly differently though. I use git commit --amend to build up a single commit as I go.
git checkout -b blah-feature
# do some work
git commit -m "Main description of my feature"
# do more work
# no -m, and then I just add bullet points for each subsequent change in vim (example below)
git commit --amend
Then, once I'm ready to make a PR I do the following:
# pulls from remote without merging
git fetch origin main
# adds my single commit to the end of the current main
git rebase origin/main
# push up feature branch for code review
git push
# get yelled at about --set-upstream, and copy/paste that command :-)
My commit messages typically look like:
Add some new feature
- do some sub task
- do another sub task
- ...
I don't agree that folks who rebase multi-commit branches are "doing it wrong". Even if there are multiple stops along the way, that should not be considered a bad thing if it's understood what it is doing.
The way I look at it, let's say I have 10 commits. If I rebase onto main, commits 3, 4, 7, and 10 have "conflicts" with the code that is on main now compared to when I started writing my feature branch and making commits. But now I have an opportunity to update the code in each of those commits as if I were writing it based on what is currently on main. If done like that, incrementally, it usually doesn't cascade into conflicts on each commit.
The problem, IMO, comes when at commit 3, the dev says "Oh, I did this like this in commit 10, so let me just put that solution here in commit 3 to resolve this merge conflict". Now, instead of 4 conflicts, you have conflicts on all of the commits between 3 and 10 (because the dev in effect moved the fix from 10 up 7 commits from where it originally was committed). Instead, each conflict resolution should aim to maintain the code as close to the committed code as possible while integrating the code from main. That way also, the feature branch commits still reflect the iterative process that having multiple commits is designed to show.
I don't embrace a FULL squash rebase, but I do embrace cleaning up your branch commits with an interactive cleanup rebase (not on main, just going through the commits for the branch and squashing any minor fixes that belong with the previous commit, etc.) THEN, once you have a clean feature branch, rebase main. The feature branch may now have, instead of the 10 commits above, eliminated 6 commits that were just things like minor test fixes, typos, etc, and now only has 4 commits total. Instead of 4 conflicting commits, it may now only have 1 or 2, making the rebase simpler as well. And the branch still maintains the traceable history of the development of that feature (assuming good commit messages were used, which is not something the original poster values either).
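To make the "cleanup rebase first" step concrete, here is a minimal sketch in a throwaway repo. The file names, commit messages, and the `feature` branch are made up for illustration, and it uses `--fixup` plus `--autosquash` as one possible way to fold minor fixes into the commit they belong with; doing the same by hand in the interactive todo list works just as well.

```shell
set -eu
# Hypothetical throwaway repository so the sketch is safe to run anywhere.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "dev@example.com"
git config user.name "Dev"

echo base > app.txt
git add app.txt
git commit -q -m "Initial commit"

git checkout -q -b feature
echo "parser" > parser.txt
git add parser.txt
git commit -q -m "Add parser"
echo "tests" > tests.txt
git add tests.txt
git commit -q -m "Add parser tests"

# A minor fix that belongs with "Add parser": record it as a fixup commit.
echo "parser v2" > parser.txt
git add parser.txt
git commit -q --fixup=HEAD~1

before=$(git rev-list --count main..feature)

# Clean up the branch against its own base (main has not moved here).
# GIT_SEQUENCE_EDITOR=true accepts the generated todo list unedited.
GIT_SEQUENCE_EDITOR=true git rebase -q -i --autosquash main

after=$(git rev-list --count main..feature)
echo "commits before: $before, after: $after"
git log --oneline main..feature
```

Only after this cleanup would you rebase the (now shorter) branch onto an updated main, which leaves fewer commits that can conflict.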
Many times have I seen a green developer throw up their hands at a rebase attempt, after which we learn they were doing this:
git checkout master; git pull
git checkout -b fb
git commit
git commit
# new changes arrive on master branch
git checkout master; git pull;
git checkout fb; git merge master
git commit
# went to the git brownbag and heard about rebase for the first time,
# missing the part up front about not intermingling merges with rebases
# and not having a good mental model of git
git rebase master
# WTF conflict everywhere! rebase sucks
The flow you have described works (besides the last ‘git rebase master’), though. When ‘git merge master’ is run, the developer either gets conflicts (which can then be fixed) or not. After that, ‘git push’ and you can open a PR. I certainly don’t see anything wrong with that flow.
Yah, I don't think anyone minds those kinds of rebases. It's when they're done on shared branches like master that they're incredibly messy and dangerous.
That totally makes sense. I never really dug in. Mostly I'd tell people I always rebase in passing and get a glare or snarky comment, but never bothered to argue about it because it works for me and never caused issues.
> I use git commit --amend to build up a single commit as I go.
While this is a sensible approach, it doesn't work in scenarios where you essentially have to "test in production" to actually test things (Jenkins, looking at you). I've experienced some of that in the real world, and combined with the inability to force-push, it seems like the squash rebasing is the most sensible thing to do to keep main clean within these constraints.
Because they do different things. It's right there in his post:
> # adds my single commit to the end of the current main
The biggest benefit to this IMO is that you can resolve conflicts in YOUR branch, get it all cleaned up, and then when you merge there are no conflicts. This allows you to test any changes made during conflict resolution in your feature branch still.
> Add everything I’m working on (new and edited files).
Bad idea. Extraneous cruft that isn't caught by .gitignore will leak into the repo. Always run git diff and git status first to see what you are about to add.
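A small sketch of the "look before you add" habit, using a throwaway repo. The file names (`app.py`, `.env`, `debug.log`) and the fake key are invented for the example.

```shell
set -eu
# Hypothetical throwaway repository with one intended change and some cruft.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "dev@example.com"
git config user.name "Dev"

echo "*.log" > .gitignore
git add .gitignore
git commit -q -m "Add .gitignore"

echo 'print("hi")' > app.py               # the change we actually want
echo "API_KEY=not-a-real-key" > .env      # forgot to ignore this one
echo "noise" > debug.log                  # already covered by .gitignore

# Review first: .env and app.py are listed as untracked, debug.log is hidden.
git status --short

# Stage only what was intended instead of adding everything.
git add app.py
staged=$(git diff --cached --name-only)
echo "staged: $staged"
```

Whether you then add `.env` to `.gitignore` or keep it elsewhere is a separate choice; the point is only that the status check surfaces it before it is committed.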
I hear you. But nothing goes directly into main. The working branch is not sacrosanct, you know what I mean? I'd rather clean up anything that leaks in before merging, and a good .gitignore can protect against most noise.
Sure but it goes in the repo. Forgot to add .env to the .gitignore? You probably just committed a secret. Sure you can force push to get rid of it but if you're using Github it's still saved so it needs to be rotated now.
What’s “the repo” here? What I commit goes into my copy of the repo. Only what I push goes into any repo anyone else sees. Before pushing I always go through what’s in the branch and clean up/rebase etc. Sure it’s easier to accidentally push a file you shouldn’t have if you have added it locally but committing alone doesn’t necessarily mean you need to rotate a secret.
That is a case where (as you note) you'd need to do some sort of destructive fix to eliminate the file from your local repo. Whether it's a rebase or something else, a person then needs to know the "advanced parts" that the blog post claims aren't really needed if "git is done correctly". Better to just not commit files/hunks accidentally in the first place - I know, it's not a perfect world, yada yada - just trying to provide the counterpoint to the counterpoint.
Yeah I’d say interactive rebase and soft resets are part of my “limited toolbox” for git that I wouldn’t go without. (I don’t stage hunks, don’t merge in either direction etc, so I leave some other parts of the larger toolbox alone).
That's not a reasonable argument. The problem is pushing confidential info into a repository. It doesn't matter what the branch you push it to is called.
> FWIW - we run secret detection in our trunk check precommit action - so we make sure that secrets are never committed into local or remote branches.
Irrelevant. You're describing a failsafe. It's like claiming that you don't need to care about speed limits because a road has guardrails. The whole process is broken if it fails to address the main reason confidential info can be pushed into repositories.
> you need to re-clone from scratch — even if you did nothing wrong.
I don't think you will need to reclone if you did nothing wrong. You can always use `git reset --hard origin/...`, and if that does not work then you definitely did something wrong.
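For anyone who hasn't tried it, a rough sketch of recovering without a re-clone. It fakes a remote with a local bare repo, and `origin/main` stands in for whatever branch you actually track; both are assumptions of the example. Note that this throws away local commits and uncommitted changes on the branch.

```shell
set -eu
# Hypothetical setup: a local bare repo plays the role of the remote.
work=$(mktemp -d)
cd "$work"
git init -q --bare remote.git
git clone -q remote.git clone 2>/dev/null
cd clone
git symbolic-ref HEAD refs/heads/main
git config user.email "dev@example.com"
git config user.name "Dev"

echo v1 > file.txt
git add file.txt
git commit -q -m "Good state"
git push -q origin main

# Make a mess locally: a bad commit that was never pushed.
echo broken > file.txt
git commit -q -am "Local mistake"

# Instead of re-cloning: refresh the remote-tracking ref, then reset to it.
git fetch -q origin
git reset -q --hard origin/main

echo "file.txt now contains: $(cat file.txt)"
```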
> But in Linus’s own words, Git is “the information manager from hell.”
That git is not the same git we have today, and it was handed over to another person quite early in the development. Although I agree that git ux is sometimes confusing.
> Limit your Git actions ... for peak Git efficiency
I pretty much disagree. If I can give my two cents: read the manual. Git has some really handy tooling that can help with non-git issues (e.g.: `git bisect`). Limiting your knowledge brings no benefit.
I like some points of the text, but overall I don't like the premise. It greatly exaggerates a problem to prove a point.
I'll always be a rebase diehard, but the core message here is great. If you have to do more than a little bit of git surgery, you're probably doing something else wrong.
Without rebasing, tools for managing PRs might show the merged mainline commits in the PR.
Sometimes they are labeled as such (“merged master into feature”) and can be skipped if you review the PR commit by commit. But more often I want to review the PR as a whole, and then the tool fails to show a good diff of what’s actually developed in the PR. This to me is a much larger problem than the log pollution, which can be solved by squashing.
Other than that (missing the bigger reason for rebase and focusing on a lesser argument in my opinion) I quite agree with the article.
> This isn’t grade school, you don’t have to show your work
As you become an efficient engineer, the path you took to get to the final state of a pull request becomes far less important — and is academically interesting at best. You shouldn’t have to show your work like you did in school. Land the feature or bug and move on to the next one. The code speaks for itself (alongside some judiciously placed comments).
> Having granular annotations of all your work is unnecessary, ...
This premise underlies the particular workflow that the posted article assumes, and all the described command+option incantations are directed to it.
But there is another, very different git workflow used by a project we've all heard of, and that is the Linux kernel core code. The trunk.io workflow is unsuitable for Linux due to different requirements, among them:
1. Commitment to support for indefinite future.
2. Large, complex feature PRs.
3. Human review of PRs, by maintainers fully empowered to reject.
4. A low-level programming language, in which subtle bugs are easily introduced.
Also different luxuries:
1. Willingness to put off merge of a "hot" new feature indefinitely.
So, kernel PRs are structured as a linear series of numbered patches meeting the requirement that each step along the way compiles cleanly. This is primarily to ease the task of the maintainer responsible for the subsystem involved, who will have to deal with the fallout of bugs introduced by the PR. Example:
I think I need to adopt a practice where if I open an article, like the one you linked, and the title is some alternative take presented as a universal truth without qualification ("Git commits are useless"), I immediately stop reading. Not because I think they are always wrong, though that is usually how they read to me, but because they seemingly fail to demonstrate any humility - a characteristic I think is probably necessary for a balanced take landing closer to truth. (I did read the intro paragraph under the assumption the title was just clickbait but things don't improve).
> Push my code to the remote; likely spinning up lots of machines in CI to check my work.
That's a good callout; something that I have noticed I frequently miss when iterating on my code is the cost of the CI/CD systems already set up. It's something to consider especially when iterating using a system like Sapling for Stacked PRs; each individual PR push may update a chain which causes a lot of wasted CI/CD time.
I feel that a lot of my colleagues use "git push" as a backup mechanism, but GitLab actually supports that workflow via `git push -o ci.skip` to distinguish between "yes, just back up my commits" versus "ok, I'm really ready to test them". I'm aware of the commit-message hack used by other CI systems, but I much prefer that out-of-band mechanism.
Especially if you're using a finite pool of CI runners. Lots of companies do run their own CI runners either for added security, or to get persistent CI machines with a hot local cache so builds/tests are much faster. Then everyone's PRs are blocked waiting for CI machines just because of people's weird push workflows or micro stacked PRs.
> The easiest practice to implement for peak Git efficiency is to stick to a subset of commands
This has been my experience as well.
Great article, thanks! I've been using essentially this same subset of commands for many years, and it's worked extremely well for me: does everything I need/my team needs, and avoids complication. I'm glad to have this as a reference I can point people to when they ask for git advice.
Trunk development is not about rebase, it’s about short-lived branches that get incorporated into the main branch soon. How they get incorporated (git merge is good enough) is not relevant.
I agree; maybe when git was new and actually being used in a decentralized fashion some of the more advanced operations were necessary. But with the typical checkout/branch/PR/merge flow from GitHub and others, I rarely need anything but git merge (with squash merging when merging a PR).
I personally am a proponent of rebase, but not all the time and not for every workflow. It has its place though, IMO. I like/value clean history and good commit messages (although I admit I don't always succeed in providing that). I prefer to stage my commit pieces at the hunk level at the largest. I like to rebase my branches before merging them into release or main (depending on git strategy as a whole), but it isn't always necessary. I use interactive rebase from time to time to clean up my feature branches (especially longer running ones). This is all stuff I prefer in my own workflow.
That said, every time I try to really teach someone rebase, particularly a new dev, I understand why people shy away from it. I did for a very long time. So I totally understand and get why the above style workflow may terrify folks (or just seem unnecessary). It is easy to mess up and there are a lot of little gotchas if someone isn't careful. And worst of all, it can result in lost work (although even that is "usually" recoverable, but not always). I do think there are some benefits to it, and I think it is something that can be integrated into a dev's workflow a bit at a time. And it really doesn't take significantly more time, in my experience.
I'm not gonna argue here for adopting that. Except for "no commit messages", I'd be pretty ok with a workflow as outlined by this post. I do think folks should understand how rebase works, what commits will be moved/changed when they run a rebase (this is vital), and how to recover when a rebase goes bad (no, not reclone, not generally even delete branch and check it out again).
Couple random thoughts I try to communicate to folks who decide to utilize rebase more in their workflow:
- rebase often (if main updates often)
- if worried the rebase may be messy, create a temporary branch prior to starting the rebase at the feature branch HEAD - allows for an easy way back (and prevents lost code)
- don't rebase shared branches - this is a tool to use PRIOR to "sharing" (i.e. pushing) code
- squash/clean up unneeded commits before rebasing on another branch (this may bring it all the way down to a single commit, but for larger features, I think there is value in seeing the main decision points along the way)
- fix conflicts with the code at the specific commit you are on only, don't fix it with the eventual end result X commits down the line - this will generally avoid the dreaded "fix the same conflict over and over for each commit" problem some people encounter with rebasing
- remember rebase creates new history - it doesn't rewrite history (however, old commits will eventually be garbage collected)
- pro tip: understand how `rebase --onto` works, sometimes you shouldn't, or at least don't want to, take all of the commits
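Two of the tips above (the temporary backup branch and `rebase --onto`) can be sketched together. Everything here is a made-up scenario: a `feature` branch that was accidentally started on top of an `experiment` branch, where only the feature commit should end up on main.

```shell
set -eu
# Hypothetical throwaway repository.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/main
git config user.email "dev@example.com"
git config user.name "Dev"

echo base > base.txt
git add base.txt
git commit -q -m "Initial commit"

git checkout -q -b experiment
echo exp > exp.txt
git add exp.txt
git commit -q -m "Experimental spike"

git checkout -q -b feature            # started on top of experiment
echo feat > feat.txt
git add feat.txt
git commit -q -m "Add feature"

git checkout -q main                  # meanwhile, main moves on
echo more > main.txt
git add main.txt
git commit -q -m "Main moves on"

git checkout -q feature
# Safety net: a temporary branch at the feature HEAD before rebasing.
git branch feature-backup

# Replay only the commits after 'experiment' onto main.
git rebase -q --onto main experiment feature

ahead=$(git rev-list --count main..feature)
echo "feature is $ahead commit(s) ahead of main"
# If it had gone wrong: git reset --hard feature-backup
```

The backup branch still points at the old commits, so nothing is lost even before the reflog comes into play; delete it once you are happy with the result.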