Internals of Git Object Storage
To follow the implementation, you need a rough idea of how Git stores its data. Here is a brief introduction. Everything Git knows about a repository is stored inside a directory called the gitdir, usually `.git`.
Example contents of the `.git` directory
Think of Git as a database. Git stores its data in the form of objects. There are three main types of Git objects: Blob, Tree, and Commit. (A fourth type, the annotated Tag, is not covered here.)
- Blob: the object type used to store the contents of each file in a repository.
- Tree: the object type used to store the hierarchy between files in a repository.
- Commit: the plain-text object type used to record a snapshot. It points to a tree and carries metadata such as the author, the commit message, and the parent commits.
Each object is stored as a file. Most of the data stored by Git is compressed with `zlib` to save space. The format of a blob object is:
blob <blob_length>\0<blob_content>
Here, <blob_length> is the length of the file’s contents in bytes, written as a decimal number, and <blob_content> is the raw contents of the file. The header and the contents are compressed together with `zlib` before being written to disk. Each object file is named after the `SHA-1` hash of the uncompressed object, header included, not of the compressed bytes: the first two hex digits of the hash name a subdirectory of `.git/objects`, and the remaining 38 name the file. Note that to save further space, Git periodically packs the object files into “packfiles”, which have their own format.
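As a rough illustration of the blob layout, here is a minimal Python sketch using only the standard library. It assumes a SHA-1 repository and loose (unpacked) objects. The function names `encode_blob`, `object_id`, `loose_object_path`, and `decode_blob` are made up for this example and are not part of Git. Note that the hash is taken over the uncompressed header plus contents, while the bytes on disk are the `zlib`-compressed form of that same data.

```python
import hashlib
import zlib


def encode_blob(content: bytes) -> bytes:
    """Build the uncompressed object: 'blob <length>', a NUL byte, then the raw content."""
    header = b"blob " + str(len(content)).encode("ascii")
    return header + b"\0" + content


def object_id(store: bytes) -> str:
    """The object ID is the SHA-1 of the uncompressed header + content."""
    return hashlib.sha1(store).hexdigest()


def loose_object_path(oid: str) -> str:
    """First two hex digits name the directory, the remaining 38 name the file."""
    return f".git/objects/{oid[:2]}/{oid[2:]}"


def decode_blob(compressed: bytes) -> bytes:
    """Inverse of zlib.compress(encode_blob(...)): inflate, then check and strip the header."""
    store = zlib.decompress(compressed)
    header, _, content = store.partition(b"\0")
    kind, _, size = header.partition(b" ")
    if kind != b"blob" or int(size) != len(content):
        raise ValueError("not a well-formed blob object")
    return content


if __name__ == "__main__":
    store = encode_blob(b"hello\n")
    oid = object_id(store)
    on_disk = zlib.compress(store)  # the bytes a loose object file would hold
    print(oid)
    print(loose_object_path(oid))
    print(decode_blob(on_disk))
```

The ID printed by this sketch should match what `git hash-object` reports for a file with the same contents. The compressed bytes may differ from Git's own output depending on the compression level, which is fine because the ID never depends on them.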
Next
Move on to [[git/git-in-rust]] to read about some implementation details.