As software developers, we rely heavily on the amazing open-source community for robust, free tools that enable us to code for a living. I feel that the best way to express our gratitude is to contribute back to the global community that has given so much to us. In 2017 I started working on a project called baby-git, and v0.1 of the project has been released on our Bitbucket page.

Baby-git is a fully documented codebase of the very first commit of Git, introduced to the world by Linus Torvalds (the creator of Linux) in 2005. For the non-developers among you, Git is a version control tool (the most popular one for the past several years), which means a tool that teams of developers use to write, manage, and track the code for their projects. Git is ubiquitous in the software development world: the vast majority of developers use it as their version control system of choice.

Git itself is a piece of software. It has a codebase (a set of files and folders containing code) written in the C programming language. Git is an open-source, distributed project, which means it’s free for anyone in the world to download and use, and anyone in the world can contribute to its development. Funnily enough, the Git development team uses Git itself to track the development of its own codebase.

Anyone in the world can download Git’s codebase here for free. From there, anyone can browse through the directory structure of the code and open specific files to try to figure out how the code works. But since Git is so mature and full-featured, the codebase is quite large and complex. It would be almost impossible for a novice to probe through it and figure out how it works. So if beginner or intermediate coders are interested in learning how an incredibly successful tool like Git works at the code level, where should they start?

This brings me to one of Git’s core features: the ability to retrieve the exact state of the codebase at any point in the project’s history, all the way back to the very first (and simplest) version of the code. That means that with a single command, you can retrieve the exact versions of all files and folders of the project at its inception. I did that with Git, and what I found was amazing.
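
For readers who want to try this themselves, here is a minimal sketch in Python of one way to do it. It assumes Git is installed and that you already have a local clone of the repository you want to rewind. The helper names (`find_root_commits_cmd`, `pick_oldest_root`, `checkout_first_commit`) are hypothetical and only for illustration; the Git commands behind them are `git rev-list --max-parents=0 HEAD` and `git checkout`. A repository can have more than one parentless commit, so the sketch assumes the last one listed is the oldest, which is worth double-checking on your own clone.

```python
import subprocess


def find_root_commits_cmd():
    """Build the command that lists commits with no parents (root commits)."""
    return ["git", "rev-list", "--max-parents=0", "HEAD"]


def pick_oldest_root(rev_list_output):
    """Pick the last hash printed by rev-list.

    Assumption: rev-list prints newer commits first, so the last
    line is taken to be the oldest root commit.
    """
    hashes = rev_list_output.split()
    if not hashes:
        raise ValueError("no root commit found")
    return hashes[-1]


def checkout_first_commit(repo_dir):
    """Check out the oldest root commit of the clone in repo_dir."""
    result = subprocess.run(
        find_root_commits_cmd(),
        cwd=repo_dir,
        check=True,
        capture_output=True,
        text=True,
    )
    first = pick_oldest_root(result.stdout)
    # This leaves the working tree in a "detached HEAD" state at that commit.
    subprocess.run(["git", "checkout", first], cwd=repo_dir, check=True)
    return first
```

Calling `checkout_first_commit("path/to/your/clone")` (the path is a placeholder) rewinds the working tree to that first commit. Checking out your usual branch again brings you back to the present.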

It turns out that the very first version of Git comprises only 10 files, totalling approximately 1000 lines of code. This is extremely “small” code, and it actually works. The kernel of Git’s core functionality is described beautifully in those 10 files. I was able to open them up, read through them, and understand how they work, and so can you.

But we’ve made your job even easier. We’ve thoroughly documented the codebase using inline comments, so that you can open up each file, read through it, and hopefully learn how a tool like Git is put together. The current documentation is definitely geared towards folks who are familiar with coding concepts, but in future releases we will try to make the project more and more friendly for novices and even non-coders.

To sum things up, I suspect that the vast majority of Git users are basically oblivious to the inner workings of a tool that they use every day. Baby-git addresses this by clearly and thoroughly documenting the first (and correspondingly simplest, at only ~1000 lines of code!) version of the tool for ease of understanding by the average developer.

As this is an open-source project, we welcome anyone reading this to check out Baby-git and read through it. If you have any suggestions or improvements, either drop us a line via our contact form or submit a pull request to our repo!