- cross-posted to:
- [email protected]
- cross-posted to:
- [email protected]
Git Commit Creation
This is an article in which I explore the details and thinking that goes into how you should create git commits, and why. I like to think of it as the article I wish existed when I was just starting out over 20 years ago.
I wanted to cover all the things that you should think about at a high level. That way it at least could work as an entry point to deeper exploration of the particular areas if the reader isn’t completely sold or they want to just gain a deeper understanding. While at the same time trying to provide enough details to show why and how these choices are valuable. This is always a tricky balance.
Anyways, I would love any feedback on thoughts on how this could be improved.
Thanks
Aah. I assumed linting was part of the build also. My bad. I did understand the idea you were mentioning. Just that assumptions kind of threw me off.
I wanted to ask something related to that. As you mentioned, git takes a snapshot of the repo on every commit. So splitting up the bug fix and other activities means you have 3 or 4 commits instead of one. Let us say we are dealing with a very large repo. This does not look ideal in that context right? So do you think the way you proposed is only suitable for smaller repos?
Actually, having more commits is negligible because of the way that Git stores the snapshots behind the scenes. Specifically, it uses a content addressable key value store. So the storage is bound to the file changes irrespective of the commits.
The commits simply hold the sha of each of the files. Technically, it is a bit more complicated than that. But from an understanding of size implications and what it is bound to that mental model should get you there. It also does additional smart things in packing this key value store to store things more efficiently that also help.
If you want to start understanding more about the internals of kid and how it actually stores stuff. The Pro Git book has a Git Internals section, https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain which is a great place to start.
I think I got the idea. So essentially a new copy of the file is created and stored only if there is a change, else it just refer to the older SHA. Am I right? Now I understand why LFS was needed for binaries, else it createds a lot of storage problems, but not the huge monorepos.
I’m not a developer, but a design person who covers much more including architecture. But in my org I happen to teach developers how to use Git. Strange, I know. But that is the case. It gave me a good opportunity to learn Git in depth.
I went through your blogs and patch stack workflow. I have to say that I have not been happy with the branching workflow and I always felt that is not the best (I agree to the point about “unjust popularity”). The patch stack workflow makes more sense to me. Unfortunately we won’t be able to adopt, since getting everyone to Git itself was a huge effort. Also developers are not that keen into creating good code, but just working working code. I’m extremely frustrated with that.
Also your blog design is really good. I love it. I always wanted to create something like that. But never managed to sit down and do it. Can you give me a brief about the tech stack used for the blog?
Do you use RNote for diagrams? The style looks familiar. Or is it something else?