Fast Intro To Git Internals

This document provides an overview of Git internals, explaining how Git functions as a database that tracks files (blobs), directories (trees), and commits. It emphasizes the importance of understanding these underlying concepts rather than just using commands, and discusses the roles of refs, the staging area, and the differences between merging and rebasing. Additionally, it includes commands for exploring Git and references to external documentation for further learning.

Uploaded by

Deepak D

A Fast Intro to Git Internals

Many git tutorials focus on a set of commands and instructions to “get you up to speed” in git,
without addressing the underlying concept of “how git works”. While the commands are
important, I feel it’s more important for you to understand what’s going on behind the scenes.
(Full reference material is included in the appendix at the end of this document. Most diagrams
are taken from Pro Git.)

At a high level, git can be thought of as a database/filesystem/backing store that remembers a
truckload of details about your code base. This information is called the git repository, and
contains three types of content: blobs, trees and commits.

Blobs [1] are essentially “files” in git. Each blob is indexed by the SHA-1 hash of its
contents, so if the same file appears twice in your directory, it will resolve to the same
blob in git. And for the record, git assumes that two different files will never produce the
same hash. Relax: an accidental collision is so improbable that you can treat it as
impossible. Seriously. If you don’t believe that, please read up on it until you do.
(Deliberately engineered SHA-1 collisions have been demonstrated since 2017, and newer
versions of git detect that attack.)
Blobs are normally referenced by their hash, although you will rarely need to type these
hashes in.
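
To make the “same content, same blob” point concrete, here is a minimal Python sketch of how a blob's hash can be computed. It assumes a repository using the default SHA-1 object format; the function name blob_id is made up for this example and is not part of git.

```python
import hashlib


def blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID git assigns to a blob with this content."""
    # Git hashes a short header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode() + b"\0"
    return hashlib.sha1(header + content).hexdigest()


# Two files with identical content resolve to one blob, whatever their names.
readme = b"hello world\n"
copy_of_readme = b"hello world\n"
print(blob_id(readme))
print(blob_id(readme) == blob_id(copy_of_readme))  # True
```

If git is installed, “git hash-object <file>” should print the same ID for a file with the same bytes.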

Git uses trees to store directories. Each tree contains a list of entries,
which are either blobs (files) or other trees (sub-directories). Like blobs,
trees are stored by their hashes. This is a significant detail, because
a single change to a single file (say in root/sub/myfile.txt) will cause
myfile.txt’s hash to change, which will cause sub’s content list to change,
which will cause sub’s hash to change, which will cause root’s content list
to change, which will cause root’s hash to change. Thus, the hash of a tree
represents the entire state of every single file in that tree.
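
The ripple effect described above can be sketched in a few lines of Python. This is a simplified model, not git's implementation: it assumes SHA-1 object IDs, sorts entries by plain name (real git orders directory entries slightly differently), and the helper names object_id, tree_id and snapshot are invented for this example.

```python
import hashlib


def object_id(kind: str, body: bytes) -> bytes:
    """Hash "<kind> <size>\\0<body>" and return the raw 20-byte SHA-1."""
    header = kind.encode() + b" " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).digest()


def tree_id(entries: dict) -> bytes:
    """Hash a tree given {name: (mode, raw_id)} entries."""
    # Each entry is "<mode> <name>\0" followed by the raw hash of the child.
    # Simplification: entries are sorted by plain name.
    body = b"".join(
        mode + b" " + name.encode() + b"\0" + raw_id
        for name, (mode, raw_id) in sorted(entries.items())
    )
    return object_id("tree", body)


def snapshot(myfile: bytes) -> dict:
    """Hash root/sub/myfile.txt plus an untouched root/other.txt."""
    blob = object_id("blob", myfile)
    other = object_id("blob", b"unrelated\n")
    sub = tree_id({"myfile.txt": (b"100644", blob)})
    root = tree_id({"other.txt": (b"100644", other), "sub": (b"40000", sub)})
    return {
        "myfile.txt": blob.hex(),
        "other.txt": other.hex(),
        "sub": sub.hex(),
        "root": root.hex(),
    }


before = snapshot(b"version 1\n")
after = snapshot(b"version 2\n")
for name in before:
    status = "changed" if before[name] != after[name] else "same"
    print(f"{name:12} {status}")
```

Only other.txt keeps its hash; the edited file, sub, and root all get new ones.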

The next object is the commit, which is a snapshot of a tree along with
some additional metadata. As discussed above, the tree reference
represents the entire state of every single file in the tree. The metadata
provides more context for the commit, including the author, the commit
message, and the parents of the commit (none for the very first commit, one
for an ordinary commit, two or more for a merge). Like everything else we’ve
seen, commits are referenced by their hashes.
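
As a rough illustration of what a commit contains, the sketch below builds the text of a commit object and hashes it. It assumes SHA-1 object IDs and leaves out optional headers such as signatures; the function name commit_id and the author identity are hypothetical.

```python
import hashlib


def commit_id(tree: str, parents: list, author: str, message: str) -> str:
    """Hash a minimal commit object: tree, parents, author, committer, message."""
    lines = [f"tree {tree}"]
    lines += [f"parent {p}" for p in parents]  # a root commit has no parent line
    lines += [f"author {author}", f"committer {author}", "", message]
    body = ("\n".join(lines) + "\n").encode()
    header = b"commit " + str(len(body)).encode() + b"\0"
    return hashlib.sha1(header + body).hexdigest()


# The hash of a tree with no entries; any tree hash could be used here.
EMPTY_TREE = "4b825dc642cb6eb9a060e54bf8d69288fbee4904"
# Hypothetical identity and timestamp, for illustration only.
who = "A Dev <dev@example.com> 1700000000 +0000"

first = commit_id(EMPTY_TREE, [], who, "initial commit")
second = commit_id(EMPTY_TREE, [first], who, "second commit")
print(first)
print(second)
```

Because the parent's hash is part of the hashed text, a commit's hash pins down its entire history as well as its tree.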

Git maintains a set of refs, which are human-readable names that resolve to specific commits.
For example, the HEAD ref points to the most recent commit in your currently checked out
branch. When you create another commit, the HEAD ref gets “auto-promoted” to point to your
new commit. Most git operations use the HEAD as their default target. Refs are also used to
identify your own branches, and the current branch is updated to the latest commit each time
you commit, too.

[1] The diagrams on this page are taken from Pro Git, which is highly recommended for further reading.
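
The way HEAD and branch refs move can be modelled with a small dictionary. This is a toy sketch rather than git's storage format: the class name RefStore and the commit IDs C1, C2, ... are placeholders invented for the example.

```python
class RefStore:
    """Toy model of refs: branch names map to commit IDs, HEAD names a branch."""

    def __init__(self):
        self.refs = {}                   # e.g. {"refs/heads/master": "C2"}
        self.head = "refs/heads/master"  # HEAD normally points at a branch
        self.count = 0

    def resolve(self, name: str):
        """Turn "HEAD" or a full ref name into a commit ID (None if unset)."""
        if name == "HEAD":
            name = self.head
        return self.refs.get(name)

    def commit(self) -> str:
        """Create a placeholder commit and advance the current branch to it."""
        self.count += 1
        new_id = f"C{self.count}"
        self.refs[self.head] = new_id
        return new_id

    def branch(self, name: str) -> None:
        """Create a branch pointing at the commit HEAD resolves to."""
        self.refs[f"refs/heads/{name}"] = self.resolve("HEAD")

    def checkout(self, name: str) -> None:
        """Point HEAD at another branch."""
        self.head = f"refs/heads/{name}"


repo = RefStore()
repo.commit()          # C1 on master
repo.commit()          # C2 on master
repo.branch("mine")    # mine -> C2
repo.checkout("mine")
repo.commit()          # C3 on mine; master stays at C2
print(repo.resolve("HEAD"), repo.resolve("refs/heads/master"))  # C3 C2
```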

As you work, git maintains three different “views” of your
filesystem. Simultaneously. For some, this is a source of
confusion ;) As you work, a single file (readme.txt) might
be in all three locations, and might be different in each
location.

The git repository is the commit (and corresponding tree)
that you last checked out. Any changes you make to files
on your disk are reflected in the “working directory”. You
can promote these local changes into the staging area
(using “git add”) as often as you like. Then you commit all
of your staged files to the repository.

The staging area is also called the index or the cache. The repository is sometimes called the
tree or the database. Sigh.
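
Here is a small Python model of the three views, showing how readme.txt can differ in each one at the same time. It is only a sketch of the idea: the class ThreeViews and its methods are invented for this example, and real git stores hashes, not file text, in the index and the repository.

```python
class ThreeViews:
    """Toy model: last commit, staging area, and working directory."""

    def __init__(self, files: dict):
        self.repository = dict(files)  # the commit you last checked out
        self.index = dict(files)       # the staging area
        self.working = dict(files)     # files on disk

    def edit(self, path: str, text: str) -> None:
        self.working[path] = text              # only the working copy changes

    def add(self, path: str) -> None:
        self.index[path] = self.working[path]  # like "git add <path>"

    def commit(self) -> None:
        self.repository = dict(self.index)     # like "git commit"

    def views(self, path: str) -> tuple:
        return (self.repository.get(path), self.index.get(path), self.working.get(path))


repo = ThreeViews({"readme.txt": "v1"})
repo.edit("readme.txt", "v2")
repo.add("readme.txt")
repo.edit("readme.txt", "v3")
print(repo.views("readme.txt"))  # ('v1', 'v2', 'v3')

repo.commit()
print(repo.views("readme.txt"))  # ('v2', 'v2', 'v3')
```

Note that the commit records what was staged (v2), not what is currently on disk (v3).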

When you’re using git, you don’t normally think of blobs or trees [2]. Instead, user-facing
commands deal with commits and refs, and most of the work you do in git involves traversing
the DAG (directed acyclic graph) of commits in your repository, or adding new nodes into that
DAG. The DAG always starts at a root commit that has no parent, so if you want to think of
this in terms of pointers, the root’s parent would be the null pointer.

[2] If you ever want to look at the blobs or trees in your git repository, you can use
“git rev-list --objects --all” to see a huge list of objects that git is tracking. To see a
single object, you can use “git show <object>”. And if you don’t feel like typing in the whole
hash, you can just type the first few characters (at least four, and enough to be unambiguous).

Branching, Rebasing, Merging

[Diagram: branch “mine” pointing at the same commit as “master”.]

The first diagram shows a typical git scenario, where a dev has created a branch called
“mine”, currently pointing at the same commit as “master”.

[Diagram: “mine” and “master” after they have diverged.]

The dev does some work and commits C6, C7, and C8. In the meantime, other users have updated
“master” with C3, C4, and C5, and those commits have been pulled down into the local master
branch. In order to resolve this situation, the dev can either perform a merge or a rebase.

If the dev uses “git merge master” from the “mine” branch, the result will look like this.
Note that “C9” is a commit that contains all of the merged source, which may include “new”
code that was introduced to resolve merge conflicts.

[Diagram: history after the merge, with “mine” at the merge commit C9. Caption: git checkout m]
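
The merge scenario can be modelled as a tiny graph in Python. This sketch assumes the two branches split at a commit called C2 (the base commit's name is not stated in the text), and the helper ancestors is invented for this example.

```python
def ancestors(parents: dict, tip: str) -> set:
    """Return every commit reachable from tip by following parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


# commit -> list of parents. Assumption: both branches split at C2.
parents = {
    "C0": [], "C1": ["C0"], "C2": ["C1"],
    "C3": ["C2"], "C4": ["C3"], "C5": ["C4"],  # pulled into master
    "C6": ["C2"], "C7": ["C6"], "C8": ["C7"],  # the dev's work on mine
}
refs = {"master": "C5", "mine": "C8"}

# "git merge master" while on mine: C9 gets two parents, and only mine moves.
parents["C9"] = [refs["mine"], refs["master"]]
refs["mine"] = "C9"

print(sorted(ancestors(parents, refs["mine"])))
print(refs["master"])  # C5
```

After the merge, every commit from both lines of work is reachable from “mine”, while “master” still points at C5.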

The alternate approach is to use a rebase, which creates a new commit for each rebased
commit. Thus, C6’ will contain (mostly) the same changes that were in C6, C7’ will match C7,
and so on.

[Diagram: history after the rebase, with the new commits on top of “master”. Caption: git checkout mine; git rebase master]

For most operations, a rebase is preferred to a merge:
● It remembers each of your commits.
● Your commits will always show up as the last in the list (“the cream rises to the top”)
● Note that the old commits (C6..C8) are no longer referenced by any branch, so they
become available for “garbage collection” once their reflog entries expire.
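
To see why the old commits end up orphaned, here is a toy Python model of the rebase. It is a sketch under assumptions: the branches are taken to split at a commit called C2, the functions ancestors and rebase are invented for this example, and the model ignores the reflog, which in real git keeps the old commits alive for a while.

```python
def ancestors(parents: dict, tip: str) -> set:
    """Return every commit reachable from tip by following parent links."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(parents[commit])
    return seen


def rebase(parents: dict, refs: dict, branch: str, onto: str) -> None:
    """Replay the commits unique to branch on top of onto (linear history only)."""
    upstream = ancestors(parents, refs[onto])
    to_replay = []
    commit = refs[branch]
    while commit not in upstream:      # walk back until we reach shared history
        to_replay.append(commit)
        commit = parents[commit][0]
    new_tip = refs[onto]
    for old in reversed(to_replay):    # oldest first
        new = old + "'"                # a new commit with (mostly) the same changes
        parents[new] = [new_tip]
        new_tip = new
    refs[branch] = new_tip


# Assumption: both branches split at C2.
parents = {
    "C0": [], "C1": ["C0"], "C2": ["C1"],
    "C3": ["C2"], "C4": ["C3"], "C5": ["C4"],
    "C6": ["C2"], "C7": ["C6"], "C8": ["C7"],
}
refs = {"master": "C5", "mine": "C8"}

rebase(parents, refs, "mine", "master")

reachable = set()
for tip in refs.values():
    reachable |= ancestors(parents, tip)
print(refs["mine"])                       # C8'
print(sorted(set(parents) - reachable))   # ['C6', 'C7', 'C8']
```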

The notable exception is that you should not do a rebase if other repositories have seen
your commits. This might happen if others were basing their repositories on yours, or if you
had pushed your own commits “upstream”.

Final Notes...
Here are some extra commands to get you in trouble, er, help you explore git in all its glory...
● “git rev-list --objects --all” will display all of the objects in your repository.
● “git show <object>” will let you see one of those objects in detail.
● “git fsck --unreachable <sha1>” will show you all the “orphans” that are waiting for
garbage collection.
● “git reflog” will show you a list of everything you’ve done to HEAD. (Almost) ever:
entries expire, after 90 days by default. It’s a cool tool that can help you “undo” your
recent activity, and find that code you thought you had lost.

External Git documentation

● This thread contains a good overview of the “fourth” object type in git: the ref. (Note
that refs are intentionally “glossed over” in this discussion.)
● A tour of git: the basics: General, easy to read tutorial for getting started with git.
○ This one is good for basic "commands to get up and running", but none of that content is
in the doc we're writing, so it's good non-overlapping information.
● git ready: General introduction and "cookbook" reference for git.
○ http://gitready.com/beginner/2009/02/17/how-git-stores-your-data.html comes pretty close
to the content I want but really doesn't get deep enough into it...
● Git for Computer Scientists: Great explanation of the architecture on which Git is based.
○ This. I believe this doc is the one I used to finally understand the key concepts in git.
● Git Magic: Yet another Git tutorial.
○ I like this one but need to spend more time reading it.
● GitCasts: Screencasts on Git.
○ For the “video-inclined”.
● Pro Git: Freely available book on Git.
○ This was looking really good in the 'what is a branch' section but then it goes on to
encourage the user to merge without explaining why rebase is better. If the user bailed
early on this doc they'd have some of the foundation they need but then do the wrong
thing (merge) repeatedly.
● Git Reference: Quick reference that links to the Pro Git book.
○ Very command-line oriented (fewer pictures of the tree, more command-line examples)
● Visual Git Reference: A visual Git Reference, explaining quite a few commands visually.
