I used CVS and ClearCase before moving into Git, and it took me some time to adjust to the fact that the cost of branching in Git is much much less than ClearCase. And getting into the “distributed” mindset didn’t happen overnight.
I used CVS and ClearCase before moving into Git, and it took me some time to adjust to the fact that the cost of branching in Git is much much less than ClearCase. And getting into the “distributed” mindset didn’t happen overnight.
That other people would care as much for a clean history like I do. Specifically, opening branches and leaving them open forever without merging them back into main, many useless commits rather than squashing or amending, or making branches-of-branches-of-branches. Drives me nuts
In many providers it’s possible to set up an automatic squash policy when merging to main. At our company the git history is just linear with well defined commits.
I don’t think this is necessarily better. Some branches/projects are big enough that there are meaningful commits that should be made inside the project.
Yeah I agree but in my experience developers seem to struggle with “keep important things in history; squash unimportant things”. An open “merges allowed” policy leads to people unthinkingly merging branches with 50 “typo”, “fix” commits for a 100 line change.
Dunno what the answer is there.
The answer is Gerrit and learning interactive rebase.
How does that help?
The key thing gerrit provides is relation chains. You can push to the server your local branch and it will make a “patch” (like a pull request) for each of the commits in your branch. The patches are automatically put into a relation chain which lets reviewers go through them in sequence. Also your CI can test them together.
The idea is you need to locally make your commits ready for master. This is where interactive rebase comes in. You’ll have a normal local working branch and when you’re ready for review you use interactive rebase to squash some commits together, redo the commit message on some, change the order, etc. Basically you clean up your working branch into a series of commits ready for main.
Being able to easily push a relation chain of reviews up makes it way easier to make commits that land on main that just do one thing.
The second part of the solution gerrit provides is patchsets. When you need to edit a patch to deal with review comments, you actually rewrite your local history using git commit —amend or interactive rebase and push to gerrit again. In gerrit this will make a new patchset of your patch that you can diff between, see comments on, etc. It works very well for dealing with review feedback and for reviewers to quickly rereview.
It achieves this magic through a change-id that is added via git hook to the commit message. Even if you rewrite your history and your commit id changes it will still consider the rewritten commit to be the same patch as long as the change-id stays the same.
It’s pretty hard to just explain it like this. Using the workflow for a bit makes it much easier to see the value it has.
Yeah I get what you’re saying. Gitlab can pretty much do that too, you just need a branch & MR for each commit, and then you tell it to merge each branch into the previous one. It automatically rebases them when their dependency merges.
Definitely more tedious to set up than just pushing one branch though. Maybe I should make a tool for that… Does Gerrit test each patch in CI sequentially or just all of them together?
But in any case that wasn’t really the problem I was talking about. What I’m saying is that whether or not you should squash a branch depends on what that branch is. Neither “always squash” not “never squash” are right. It depends. And developers seem to have a real problem with knowing when a change is important enough to warrant a commit.
Though I suppose if people have to actually review each commit they would get a lot more push-back to “fix fix fix” type commits so maybe you are right.
Does Gerrit require each individual commit to be approved or can you just approve the whole branch/changeset?
Yeah you’ve gotten the idea I was going for. The workflow of gerrit incentivises forming commits with single ideas. It’s not always squash or never squash, it’s squash when it makes sense to. You still have to stay on top of reviews to get devs to do it, but it’s so much easier that there’s no excuse.
For CI, it depends how you set it up but usually you would test each commit in the relation chain to ensure they’re cherry pickable, back portable. It also helps to push devs to make commits single well contained ideas. Patches higher up in the relation chain will build with the patches below it.
Often you’ll push a relation chain up, get reviews on all of them, get green from CI on all, and then merge them all together. Normally you would set it up to rebase the commits and ff merge them to main.
I guess it’s dependent on project, but IMO if a commit is important enough to be in the main branch, then that should have its own merge request. I like it when all commits in the main branch have gone through all the build pipelines and verification processes.
But if having separate MRs is undesirable, then you could always overrule the squash policy. I know it’s possible on Gitlab at least.
Commits should be reasonably small, logical, and atomic. MRs represent a larger body of work than a commit in many cases. My average number of (intentionally crafted) commits is like 3-5 in an MR. I do not want these commits squashed. If they should be squashed, I would have done so before making the MR.
People should actually just give a damn and craft a quality history for their MRs. It makes reviewing way easier, makes stuff like
git blame
andgit bisect
way more useful, makes it possible to actually make targeted revert commits if necessary, makes cherry picking a lot more useful, and so much more.Merge squashing everything is just a shitty band-aid on poor commit hygiene. You just get a history of huge, inscrutable commits and actively make it harder for people to understand the history of the repo.
Well the MRs in the teams I’ve been working in have been small and mostly atomic. They’re focused on solving only one thing.
The team I’m currently working now in was bad at this before and often bundled way too many things in a single MR. It lead to overly long review processes and was error prone. It was too tough for the reviewer to get an understanding of what was going on.
Since we made the habit to make smaller MRs we have had much less of those issues.
If the MR is anything bigger than a completely trivial change in a file or 2, it most likely should be broken into multiple commits.
A feature is not atomic. It has many parts that comprise the whole.
In that case the feature is multiple MRs.
That’s excessively bureaucratic to the point of being useless in most cases.
Hard and fast rules are generally bad and “squash everything” is pretty much a by definition hard and fast rule with the result being “I’m just not going to care that much about my commit messages.”
I’m glad I was able to establish a very mindful and structured approach to Git history in my team and project.