Git
TL;DR
git clone <url>
git add .
git commit -m "make changes"
git pull
git push
Introduction
Git is truly fantastic software, and potentially my all-time favorite tool.
It simultaneously solves two problems: version control and collaboration.
I use it for all kinds of things.
As a rule of thumb, every time I create a file that I may some day want to edit,
I make sure it’s under Git control, even if it’s entirely local.
(Even my /etc is under Git control!
I use a tool called etckeeper to automatically create commits before and after updates
and to keep track of permissions.)
I’m writing this little post partly as a quick-start guide,
and partly as a means to convince the reader they ought to use Git everywhere too,
even if they’re not a software developer (I’m not).
Git can be a little daunting at first, but I promise it’s worth it. I’ve been preaching about Git since the mid-2010s, and have had a nearly 100% success rate.
Other learning resources
Before I dive in, I want to point out that there are many other Git resources available online. A good place to start is Git’s homepage at https://git-scm.com/. Its learn page provides links to a very nice cheat sheet (which I just saw today for the first time) and the de-facto Git guide, the Pro Git Book (501 pages as of today). In terms of scope, my post here is approximately the geometric mean of these two resources.
History
Git was written in 2005 by Linus Torvalds for the Linux kernel. I’ve used Linux every day (on many devices) since the 2.6.x era, yet I believe Git may be Linus’s best contribution to humanity.
Other VCSs
There are a handful of version control systems (VCSs) out there, some of which are okay.
A notable one is Apache Subversion (svn),
which uses a “checkout” system that can be nice for certain types of projects
(I’ve used it for large collaborative board layouts, for example).
In most cases, though, Git wins, no contest.
And for the cases where you do want svn,
or find yourself working with an upstream svn repo,
Git provides a tool called git-svn,
so you don’t have to learn any new commands!
Git Repositories
A Git repository is a folder on your computer in which (some) files are tracked by Git.
Most of the files in the repository are just files
(some tracked, some untracked).
At the top level of the folder, however, exists a subfolder called .git.
(On Unix-like systems, the . prefix makes it “hidden”.
Windows usually has a separate flag for hiding directories.)
The .git folder contains all the Git-related business which we will discuss.
It is not itself tracked.
Don’t touch the contents of this folder unless you know what you’re doing :).
The git command
Since Git repositories are just folders of files with the special .git subfolder,
you don’t need anything special to possess a Git repository.
But, to interact with it, you need a program called, you guessed it, git.
If I just type git in my terminal, I get the following:
$ git
usage: git [-v | --version] [-h | --help] [-C <path>] [-c <name>=<value>]
[--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
[-p | --paginate | -P | --no-pager] [--no-replace-objects] [--no-lazy-fetch]
[--no-optional-locks] [--no-advice] [--bare] [--git-dir=<path>]
[--work-tree=<path>] [--namespace=<name>] [--config-env=<name>=<envvar>]
<command> [<args>]
These are common Git commands used in various situations:
start a working area (see also: git help tutorial)
clone Clone a repository into a new directory
init Create an empty Git repository or reinitialize an existing one
work on the current change (see also: git help everyday)
add Add file contents to the index
mv Move or rename a file, a directory, or a symlink
restore Restore working tree files
rm Remove files from the working tree and from the index
examine the history and state (see also: git help revisions)
bisect Use binary search to find the commit that introduced a bug
diff Show changes between commits, commit and working tree, etc
grep Print lines matching a pattern
log Show commit logs
show Show various types of objects
status Show the working tree status
grow, mark and tweak your common history
branch List, create, or delete branches
commit Record changes to the repository
merge Join two or more development histories together
rebase Reapply commits on top of another base tip
reset Reset current HEAD to the specified state
switch Switch branches
tag Create, list, delete or verify a tag object signed with GPG
collaborate (see also: git help workflows)
fetch Download objects and refs from another repository
pull Fetch from and integrate with another repository or a local branch
push Update remote refs along with associated objects
'git help -a' and 'git help -g' list available subcommands and some
concept guides. See 'git help <command>' or 'git help <concept>'
to read about a specific subcommand or concept.
See 'git help git' for an overview of the system.
The git command alone is pretty useless, as you can see.
Running it alone is the equivalent of running git -h,
which displays the brief help blurb above.
Some of the other flags can make it do useful things,
such as -v for version:
$ git -v
git version 2.47.3
But generally, you use git by adding a Git command after it: git <command>.
Basic Concepts
I’ve found that diving into the commands is pretty confusing without talking about a few basic concepts first. If you disagree, skip ahead.
Commits
The basic building block of a Git repository is the commit.
A commit represents a snapshot of all the files you care about, and where they reside.
(It does this by pointing to a tree object, which itself points to blob objects.)
Note that Git does not keep track of folders,
but it will of course keep track of a file’s path (location in the tree)
which contains the folder information – it just can’t do empty folders.
(If you really want an empty folder for some reason, add a dummy file;
the GitLab UI for instance adds the empty file .gitkeep.)
In addition to this snapshot (the tree),
each commit also keeps track of a parent, which is another commit.
One exception is initial commits, which have no parent.
The other exception is merge commits, which have two (or, rarely, more) parents.
Merge commits are bad and you should avoid them, though.
Note that while most commits have only one parent,
many commits can have the same parent – no problem at all!
This forms a “tree” of commits (not to be confused with the file tree object),
which is one of the things that makes Git beautiful.
The critical thing to get used to about commits is that they are cheap. They are made in milliseconds and generally take up very little space. So, there is basically no reason not to commit all the time! This is really powerful, because it means you can “save” any point in history, whenever you want, without much downside except taking the 5 seconds necessary to do so.
In addition to a tree and (usually) one parent,
commits also encode some metadata:
- Message
- Author Name
- Author Email
- Author Date
- Committer Name
- Committer Email
- Committer Date
Messages explain what the commit does.
They can be a bit of a pain, and it is not uncommon for folks to just write wip (“work in progress”).
But, I think messages are really, really important – especially if you want to collaborate,
but also if you’re human and sometimes forget why you did things.
How you use your messages is ultimately up to you, but there are some cultural guidelines.
By convention, you should start your commit message with a single line of no more than 50 characters.
This line should describe what the commit does.
It’s best to write it in the imperative mood and in the present tense.
The idea is that when a commit is “applied”, the message is what happens.
For example: “add feature X” or “refactor feature Y”.
Or maybe even “reorganize some files”.
If there is more you want to say about the commit
(e.g. point to an open issue online documenting why you had to work around a bug),
the convention is to skip one line and add as much detail as you’d like.
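Here is a minimal sketch of that convention in a throwaway repository. The file name, the identity, and the issue URL are all made up for illustration. Passing -m twice makes Git join the two paragraphs with a blank line in between.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

echo "hello" > notes.txt                      # hypothetical file
git add notes.txt

# First -m: the short, imperative subject line.
# Second -m: the body, separated from the subject by one blank line.
git commit -q -m "add notes file" \
  -m "Works around the bug described at https://example.org/issues/1 (made-up URL)."

subject=$(git log -1 --format=%s)   # subject line only
body=$(git log -1 --format=%b)      # everything after the blank line
echo "subject: $subject"
echo "body:    $body"
```

If you let the editor open instead (plain git commit), you type the same thing by hand: subject, empty line, details.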
The Author Name is the name of the human who originally created the core content of the commit. The Author Email and Author Date are the email of said human and the date the authoring occurred. The Committer Name is the name of the human who created the actual commit. The Committer Email and Committer Date are the email of said human and the date the committing occurred. When you first create a commit, the concepts of authoring and committing are the same, so the three pairs of fields will match. They may diverge later when we start playing around with commits.
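To make the author/committer split concrete, here is a small sketch in a scratch repository. The names and emails are hypothetical. It amends a commit while posing as a different committer, using Git's GIT_COMMITTER_NAME and GIT_COMMITTER_EMAIL environment variables, which is just one of several ways the two can diverge.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "add notes file"

# Freshly created: author and committer are the same person.
fresh=$(git log -1 --format='%an / %cn')

# Re-create the commit as someone else. The author fields are preserved;
# only the committer fields change.
GIT_COMMITTER_NAME="Other Person" GIT_COMMITTER_EMAIL="other@example.com" \
  git commit -q --amend --no-edit
amended=$(git log -1 --format='%an / %cn')

echo "fresh:   $fresh"
echo "amended: $amended"
```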
Finally, a commit can (and should) be signed. I always use GPG for this (there may be a way to do it with an SSH key as well, but I don’t know how). This is a way to prove to the world that the commit was created by the holder of a particular key.
You can show the latest commit by using the git show -s command:
$ git show -s
commit 4fb9d6b18acf00f7261fc20206de5adafd1f3977
Author: Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu>
Date: Thu Oct 9 11:08:31 2025 -0400
add video support
This is a commit I made to this repository to add video support. You can see the commit hash (see below for more details), the author (me) and the date I authored the commit. Finally, you can see the message, which in this case is only one line.
This representation is a summary of the commit, but there’s actually not a lot more to it!
To see the raw commit itself, run git show -s --pretty=raw:
$ git show -s --pretty=raw
commit 4fb9d6b18acf00f7261fc20206de5adafd1f3977
tree 64df90122b9a5a10521d5c0e00c75a5f59f19ec5
parent 3fe4b48adaa002b909485ee9d7f5ba92764ba45c
author Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu> 1760022511 -0400
committer Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu> 1766782597 -0500
gpgsig -----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSbF9p0dwS2XUFAlr3L618ovLcjogUCaU72hQAKCRDL618ovLcj
oohlAQD00PccH/HXJWyzY8I8IxdLqCec7L0ykCCxFKEUevyHlAD/VzsilO2gr9fm
yqL6eukKIqgoNzdXxVju/9yrfLxdRwk=
=radA
-----END PGP SIGNATURE-----
add video support
Now you can see the tree, which points to the “snapshot”,
and the parent, which is another commit.
Also, you can see the committer field (which is still me, but note that it is a different date).
Finally, you have my GPG signature.
And that’s it!
Hashes
Git represents each commit through a “hash”.
The numbers and letters following the word commit above are the hash that uniquely identifies that commit.
I can use that hash to refer to that commit whenever I want to.
For instance,
if I wanted to run the above command at some other point in history when this commit wasn’t the “latest” one:
$ git show -s --pretty=raw 4fb9d6b18acf00f7261fc20206de5adafd1f3977
commit 4fb9d6b18acf00f7261fc20206de5adafd1f3977
tree 64df90122b9a5a10521d5c0e00c75a5f59f19ec5
parent 3fe4b48adaa002b909485ee9d7f5ba92764ba45c
author Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu> 1760022511 -0400
committer Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu> 1766782597 -0500
gpgsig -----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSbF9p0dwS2XUFAlr3L618ovLcjogUCaU72hQAKCRDL618ovLcj
oohlAQD00PccH/HXJWyzY8I8IxdLqCec7L0ykCCxFKEUevyHlAD/VzsilO2gr9fm
yqL6eukKIqgoNzdXxVju/9yrfLxdRwk=
=radA
-----END PGP SIGNATURE-----
add video support
You actually don’t even need to write out the whole hash.
You just need enough characters to uniquely identify it.
In this case, it just so happens that no other hashes Git knows about right now start with 4fb9, so:
$ git show -s --pretty=raw 4fb9
commit 4fb9d6b18acf00f7261fc20206de5adafd1f3977
tree 64df90122b9a5a10521d5c0e00c75a5f59f19ec5
parent 3fe4b48adaa002b909485ee9d7f5ba92764ba45c
author Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu> 1760022511 -0400
committer Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu> 1766782597 -0500
gpgsig -----BEGIN PGP SIGNATURE-----
iHUEABYKAB0WIQSbF9p0dwS2XUFAlr3L618ovLcjogUCaU72hQAKCRDL618ovLcj
oohlAQD00PccH/HXJWyzY8I8IxdLqCec7L0ykCCxFKEUevyHlAD/VzsilO2gr9fm
yqL6eukKIqgoNzdXxVju/9yrfLxdRwk=
=radA
-----END PGP SIGNATURE-----
add video support
If you don’t provide enough to uniquely identify one commit, Git tells you it’s ambiguous:
$ git show -s --pretty=raw 4fb
fatal: ambiguous argument '4fb': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'
Knowing about these hashes is a great safety blanket, because no matter how badly you “mess up” your repo,
you can rest assured that you can always go back to any point in history –
so long as you know the commit hash!
(And if you don’t, there’s a great little feature called the reflog to help you, but more on that later).
By the way, if you look back at the content of the commit – the parent is referred to by its hash, which makes sense. The tree is actually referred to by a hash as well! Same idea there. Cool, right?
How they’re calculated
This bit is not so important, but I think it’s kind of cool.
Firstly, a hash is a fixed-length encoding of any number of bytes. Hashing algorithms work in a way where small changes to the input create huge changes to the output, and it’s extremely unlikely two pieces of content ever produce the same hash. Hashes are not a Git concept; they are used in many places, particularly cryptography. I’m not qualified to say much more here, but if the concept is new to you, it’s a good rabbit hole.
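How commit hashes are calculated is actually super simple.
As mentioned above, essentially the entire commit is in what is shown by git show -s --pretty=raw.
If you remove the first line, which is the hash of the commit, and the indentation that git show adds to the message, you get the commit exactly
(git cat-file commit <hash> prints those exact bytes).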
Now, just prepend commit <size>\0, replacing <size> with the number of bytes following,
and run SHA-1 on the whole thing.
And that’s it! That’s your hash.
So, if the tree it points to, or the parent, or the author or committer, or message change,
the commit hash changes.
So, you can rest assured that your hash encodes all of those things.
On the flip side, if nothing changes, the hash doesn’t change either.
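The recipe above can be checked by hand. This is a sketch under a few assumptions: the repository uses SHA-1 object names (the default in the Git version shown in this post), sha1sum is installed (on macOS, shasum would be the equivalent), and the file name and identity are made up.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "add notes file"

# The exact bytes of the commit object: no "commit <hash>" header line,
# no indentation added to the message.
git cat-file commit HEAD > commit.raw
size=$(wc -c < commit.raw | tr -d ' ')

# Hash "commit <size>\0" followed by those bytes.
manual=$( { printf 'commit %s\0' "$size"; cat commit.raw; } | sha1sum | cut -d' ' -f1 )
actual=$(git rev-parse HEAD)

echo "manual: $manual"
echo "actual: $actual"
```

The two printed values should be identical. Trees and blobs are hashed the same way, just with tree or blob in place of commit.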
Diffs
A diff is the difference between two commits.
Diffs can naturally concern one or many files.
They can include file additions, file deletions, and file modifications
(Git will also report “renaming” if an added file is similar to a deleted file).
Even though commits store a snapshot (i.e. point to a tree),
they are often talked about as a diff with respect to their parent
(hence the “apply” concept in the discussion of messages above).
As a matter of fact, if I don’t include the -s flag in the git show command,
I see the diff with respect to the parent by default:
$ git show
commit 4fb9d6b18acf00f7261fc20206de5adafd1f3977
Author: Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu>
Date: Thu Oct 9 11:08:31 2025 -0400
add video support
diff --git a/.gitmodules b/.gitmodules
index 5c05f90..9e711e3 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -1,3 +1,6 @@
[submodule "themes/hugo-theme-relearn"]
path = themes/hugo-theme-relearn
url = https://github.com/McShelby/hugo-theme-relearn.git
+[submodule "themes/hugo-video"]
+ path = themes/hugo-video
+ url = https://github.com/martignoni/hugo-video.git
diff --git a/hugo.toml b/hugo.toml
index d9afd74..167aaf8 100644
--- a/hugo.toml
+++ b/hugo.toml
@@ -1,7 +1,7 @@
baseURL = 'https://example.org/' # I plan to use GitLab's CI/CD, but not sure what this will end up being
languageCode = 'en-us'
title = "Dimitar's CBA Blog"
-theme = 'hugo-theme-relearn'
+theme = ['hugo-video', 'hugo-theme-relearn']
enableGitInfo = true
[params]
diff --git a/themes/hugo-video b/themes/hugo-video
new file mode 160000
index 0000000..bc3ef03
--- /dev/null
+++ b/themes/hugo-video
@@ -0,0 +1 @@
+Subproject commit bc3ef03fe9c7a33616f1612923cf124681e61ce2
This is perhaps not a great commit for this example, since there’s a lot going on.
But you can see that I modified the .gitmodules file by adding three lines (they start with +).
I also modified hugo.toml by replacing one line.
Finally, I created a new file, themes/hugo-video
(which happens to be a submodule; more on that later).
Let’s look at an easier example.
This time, instead of getting the diff implicitly inside git show of a commit,
I will diff two commits directly:
$ git diff 4ca660e 7787845
diff --git a/content/tools/_index.md b/content/tools/_index.md
new file mode 100644
index 0000000..c4e8a29
--- /dev/null
+++ b/content/tools/_index.md
@@ -0,0 +1,9 @@
++++
+title = "Tools"
+description = "My favorite tools"
+weight = 3
++++
+
+This section contains documentations for tools that I love.
+It is written for nobody in particular,
+but may be useful to somebody.
diff --git a/hugo.toml b/hugo.toml
index 8290a0a..6f7f89d 100644
--- a/hugo.toml
+++ b/hugo.toml
@@ -2,3 +2,6 @@ baseURL = 'https://example.org/' # I plan to use GitLab's CI/CD, but not sure w
languageCode = 'en-us'
title = "Dimitar's CBA Blog"
theme = 'hugo-theme-relearn'
+
+[params]
+themeVariant = 'zen-dark'
The difference between those two points in history is that I added some lines to hugo.toml
and added the new file content/tools/_index.md.
If I flip the order, the difference will be the opposite:
$ git diff 7787845 4ca660e
diff --git a/content/tools/_index.md b/content/tools/_index.md
deleted file mode 100644
index c4e8a29..0000000
--- a/content/tools/_index.md
+++ /dev/null
@@ -1,9 +0,0 @@
-+++
-title = "Tools"
-description = "My favorite tools"
-weight = 3
-+++
-
-This section contains documentations for tools that I love.
-It is written for nobody in particular,
-but may be useful to somebody.
diff --git a/hugo.toml b/hugo.toml
index 6f7f89d..8290a0a 100644
--- a/hugo.toml
+++ b/hugo.toml
@@ -2,6 +2,3 @@ baseURL = 'https://example.org/' # I plan to use GitLab's CI/CD, but not sure w
languageCode = 'en-us'
title = "Dimitar's CBA Blog"
theme = 'hugo-theme-relearn'
-
-[params]
-themeVariant = 'zen-dark'
Diffs are important because they can also be used as “patches”.
Patches are a concept that pre-dates Git, and is a hack-y way of distributing changes to a codebase.
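When you “apply” a commit somewhere else (for example with git cherry-pick), you’re conceptually applying the patch that results from its inherent diff; git apply does the same thing with a patch stored in a file.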
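To see a diff act as a patch, here is a sketch in a scratch repository (file names and identity are made up). It saves the diff between two commits to a file, goes back to the older snapshot, and replays the change with git apply.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

echo "one" > file.txt
git add file.txt
git commit -q -m "add file"
old=$(git rev-parse HEAD)

echo "two" >> file.txt
git commit -q -am "add second line"
new=$(git rev-parse HEAD)

# Save the diff between the two commits as a patch file outside the repo.
patch=$(mktemp)
git diff "$old" "$new" > "$patch"

# Go back to the old snapshot and replay the change from the patch.
git switch -q --detach "$old"
git apply "$patch"

result=$(cat file.txt)
echo "$result"
```

Note that git apply only touches the working tree; it does not create a commit.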
Remotes
Remotes are what enables collaboration.
Remotes are simply another repository,
on someone else’s computer (or even on your own computer, in another location).
Generally, the idea is that they represent the same codebase,
but technically speaking that’s not even a requirement.
The way you interact with remotes is you can get commits from them (“fetching” or “pulling”),
or you can give them commits (“pushing”).
Then, you can use those commits as parents for new commits.
It should be clear from the discussion above that there’s no risk of overwriting anything.
You can think of commits just existing in a big bucket,
with references to each other through the parent field.
You can add remotes to your repository using git remote add.
Each remote has to have a unique name from your repository’s perspective.
Git allows communication with your remotes through many protocols. The best one to use is SSH. HTTPS is also common. HTTP shouldn’t be used. For remotes on your local computer, Git just uses your filesystem. There are others, too.
Nothing about Git mandates any sort of organization between remotes,
but generally speaking, most people use a “hub and spoke approach”.
You have one central repo, colloquially known as origin,
and everyone only fetches from and pushes to origin.
This way, origin acts as a sort of “source of truth”.
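Since a remote can live on your own filesystem, the hub-and-spoke idea can be tried without any server. In this sketch, hub.git and spoke are made-up directory names, the identity is hypothetical, and -b main (available in reasonably recent Git versions) just pins the branch name.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

git init -q --bare hub.git       # plays the role of the central repo
git init -q -b main spoke        # one contributor's working repo
cd spoke
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "add notes file"

# Register the hub under the conventional name and give it our commits.
git remote add origin "$work/hub.git"
git push -q origin main
git fetch -q origin

remotes=$(git remote)
local_tip=$(git rev-parse main)
remote_tip=$(git rev-parse origin/main)
echo "remotes: $remotes"
echo "main:        $local_tip"
echo "origin/main: $remote_tip"
```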
Hosting Services
A number of services exist for hosting Git repos to play the role of origin.
GitHub is a common one.
Others include Bitbucket and SourceForge.
We use GitLab,
because they let you run a server locally (for free).
Note that in addition to hosting a repo, these services usually have many other features. The most important one for collaboration is the Merge Request (a.k.a. Pull Request) feature.
References
Obviously, keeping track of a million hashes is not a convenient system. Most of the time, we don’t keep track of the hashes themselves, but use “references”.
One special reference is HEAD.
This is the “latest” commit: the one you currently have checked out.
When we ran git show above, that was the same as running git show HEAD.
When you create a new commit, the current HEAD becomes its parent,
and the new commit becomes the new HEAD.
Easy!
If you want to talk about HEAD’s parent, you can use HEAD^, HEAD~ or HEAD~1,
if you want to talk about its grandparent, you can use HEAD^^, HEAD~~ or HEAD~2,
and so on.
As a matter of fact, you can stick these suffixes to bare hashes or to any of the other references below!
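A quick way to convince yourself these spellings are equivalent is to resolve them to hashes with git rev-parse. This is a sketch in a scratch repository with three made-up commits.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

# Three commits in a row: 1 <- 2 <- 3 (HEAD)
for n in 1 2 3; do
  echo "$n" > count.txt
  git add count.txt
  git commit -q -m "set count to $n"
done

parent_a=$(git rev-parse 'HEAD^')
parent_b=$(git rev-parse 'HEAD~1')
grand_a=$(git rev-parse 'HEAD^^')
grand_b=$(git rev-parse 'HEAD~2')

# The same suffixes work on a bare hash too.
head_hash=$(git rev-parse HEAD)
grand_c=$(git rev-parse "${head_hash}~2")

grand_subject=$(git log -1 --format=%s 'HEAD~2')
echo "grandparent: $grand_subject"
```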
Another type of reference is the “tag”. Tags are like a nickname for a commit. The idea is that once you tag a commit, that tag will remain with that commit forever (you can change them of course, but you shouldn’t). You can pull and push tags to make sure everyone is on the same page about them. They are great for releases, where you want to know exactly what the state of the world was when you, for example, built and distributed a piece of software, or submitted a draft of a paper for review.
Perhaps the most common type of reference is the “branch”.
Like a tag, that’s a name referring to a commit.
But unlike a tag, it’s not meant to forever refer to one particular commit.
Branches are meant to track progress.
By convention, the main branch tracks the latest stable version of a codebase,
but this is something that varies from organization to organization and even from repo to repo.
One nice thing about branches is that you can be “on” a branch. When you’re “on” a branch, two things happen:
- Your HEAD is whatever the branch is pointing to
- When you commit, not only does HEAD point to the new commit, but now your branch points to it too!
Working on a branch is the most common way to work.
You can also not do that (called a detached HEAD state),
but then you’ll have a hard time keeping track of any new commits you make.
That’s great for looking around, but beware losing your commits
(when I say losing here, I just mean losing track of the commit hash; nothing is ever lost in Git,
and even the hash is not that hard to track down using the reflog as I mentioned before).
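The difference between committing on a branch and committing in a detached HEAD state can be seen in a scratch repository. This is only a sketch: the file, the identity, and the branch name rescued are made up.

```shell
set -eu
cd "$(mktemp -d)"
git init -q -b main
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "add notes file"
before=$(git rev-parse main)

# On a branch: committing moves the branch along with HEAD.
echo "two" >> notes.txt
git commit -q -am "extend notes"
after=$(git rev-parse main)

# Detached: committing moves only HEAD; no branch follows.
git switch -q --detach
echo "three" >> notes.txt
git commit -q -am "experiment"
experiment=$(git rev-parse HEAD)

# Back on main, the experiment is reachable only by its hash (or the reflog).
git switch -q main
still=$(git rev-parse main)

# Giving the hash a branch name makes it easy to find again.
git branch rescued "$experiment"
```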
Unlike tags, your local branches may not match the branches on remotes.
When you fetch from a remote, you can see which commits their branches are pointing to.
As one common example, you may have a copy of main,
which will just be referred to as main,
and origin may have a copy of main,
which will be referred to as origin/main.
One final note about branches, which is often a point of confusion.
You can be “tracking” a remote branch.
This is basically an optional field for every local branch you have.
For now, it doesn’t mean a whole lot,
except that you get nice messages when things diverge.
For instance, when I run git status:
$ git status
On branch main
Your branch and 'origin/main' have diverged,
and have 21 and 39 different commits each, respectively.
(use "git pull" if you want to integrate the remote branch with yours)
Oh boy, I need to clean that up!
Later, we will see that it’s also convenient for the git push and git pull commands.
Configuration
Git uses a key-value configuration system.
Configurations can be system-wide.
Those are stored in /etc/gitconfig.
They can also be user-specific,
which takes precedence over system-wide configurations.
Those are stored in ~/.gitconfig.
Finally, they can be per-repo, which takes precedence over user-specific configurations.
Those are stored in .git/config.
You can find things like branch and remote settings in that last one.
One particularly neat configuration is the alias.
You can create custom commands that map to other git commands or even to non-git commands!
(I track my .gitconfig here.)
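As a sketch of how aliases work, the following sets two of them at the per-repo level, so it won't touch your own ~/.gitconfig. The graphall alias mirrors the one I describe in the git log section; the hello alias and the identity are made up for illustration.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "add notes file"

# An alias that expands to another git command (stored in .git/config).
git config alias.graphall "log --oneline --decorate --graph --all"

# An alias starting with "!" runs an arbitrary shell command instead.
git config alias.hello '!echo hello from the shell'

git graphall
git hello
```

To make an alias available in every repository, add --global to the git config call, which writes to your user-specific configuration instead.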
Basic Commands
Now let’s talk about some common commands.
Git has two sets of commands.
“Plumbing” commands
are stable, ancient, and annoying.
Use them in scripts.
“Porcelain” commands
are more user-friendly, but may change slightly from version to version.
Use them on the CLI.
You can get the full manuals for all of the commands below and more using git help <command>
or online at https://git-scm.com/docs.
Okay, enough philosophizing. How do you actually Git?
git init
You won’t use this command often, but it’s a good place to start.
This is how you make a new repo.
It creates a .git subdirectory in your current directory,
and that directory becomes the root of your repo.
Any files in there start as untracked,
and you don’t have a HEAD – so your first commit will be a parent-less initial commit.
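A minimal sketch of what git init leaves you with, using a temporary directory and a made-up file and identity:

```shell
set -eu
cd "$(mktemp -d)"
echo "hello" > notes.txt      # a pre-existing file

git init -q
ls -a                         # .git now sits next to notes.txt
git status --short            # notes.txt shows up as untracked (??)

# No commits yet, so HEAD does not resolve to anything.
if git rev-parse --verify -q HEAD >/dev/null; then has_head=yes; else has_head=no; fi
echo "HEAD exists before first commit: $has_head"

git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"
git add notes.txt
git commit -q -m "add notes file"

# The first commit has no parent, so this prints an empty list.
parents=$(git log -1 --format=%P)
echo "parents of first commit: '$parents'"
```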
git clone
More often, you will “clone” a repository from somewhere else.
Add the URL after clone.
For example, to clone this repo:
$ git clone ssh://git@gitlab.cba.mit.edu:846/dimitar/blog.git
Cloning into 'blog'...
remote: Enumerating objects: 321, done.
remote: Counting objects: 100% (41/41), done.
remote: Compressing objects: 100% (40/40), done.
remote: Total 321 (delta 9), reused 0 (delta 0), pack-reused 280 (from 1)
Receiving objects: 100% (321/321), 5.31 MiB | 12.92 MiB/s, done.
Resolving deltas: 100% (90/90), done.
git clone will initialize your local repo,
automatically create a remote called origin with that URL,
and then fetch from it.
It will also create a main branch for you tracking origin/main,
and it will start you up to date with it.
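You can watch git clone do all of this without a network by cloning from a local path. In this sketch, the directories upstream and copy and the identity are made up.

```shell
set -eu
work=$(mktemp -d)
cd "$work"

# A small repository to clone from.
git init -q -b main upstream
cd upstream
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"
echo "hello" > notes.txt
git add notes.txt
git commit -q -m "add notes file"

# Clone it into a sibling directory.
cd "$work"
git clone -q upstream copy
cd copy

remotes=$(git remote)                                   # created automatically
tracking=$(git rev-parse --abbrev-ref 'main@{upstream}') # what main tracks
echo "remotes:  $remotes"
echo "tracking: $tracking"
git remote get-url origin
```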
git status
This is probably the command you will use the most.
$ git status
On branch main
Your branch and 'origin/main' have diverged,
and have 21 and 39 different commits each, respectively.
(use "git pull" if you want to integrate the remote branch with yours)
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: _index.md
Untracked files:
(use "git add <file>..." to include in what will be committed)
_index.md.bak
no changes added to commit (use "git add" and/or "git commit -a")
We see a few things here.
First, we are told we are on the main branch.
(If we were in a detached HEAD state, it would tell us here.)
Because main is set up to track origin/main in .git/config,
we are given some helpful information on how that’s going.
Next, we get a list of files that Git knows about that we have modified with respect to HEAD.
Then, we get a list of files that Git doesn’t know about.
Git doesn’t touch those.
Technically, there is a third category of files. Those are “ignored” files, and you can get a list of them as well:
$ git status --ignored
On branch main
Your branch and 'origin/main' have diverged,
and have 21 and 39 different commits each, respectively.
(use "git pull" if you want to integrate the remote branch with yours)
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: _index.md
Untracked files:
(use "git add <file>..." to include in what will be committed)
_index.md.bak
Ignored files:
(use "git add -f <file>..." to include in what will be committed)
../../../.hugo_build.lock
._index.md.swp
../../../public/
no changes added to commit (use "git add" and/or "git commit -a")
The way you get files to be ignored is by dropping a .gitignore file in the base directory,
listing what file patterns you want ignored.
This is useful for things like swapfiles and build artifacts.
There are pre-populated .gitignore files on the internet available for various languages
(for example, from GitHub).
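Here is a small sketch of a .gitignore in action. The two patterns are only examples (editor swapfiles and a build output folder named public/), and the file names are made up.

```shell
set -eu
cd "$(mktemp -d)"
git init -q

# One pattern per line: swapfiles anywhere, and the whole public/ folder.
printf '%s\n' '*.swp' 'public/' > .gitignore

mkdir public
touch notes.txt .notes.txt.swp public/index.html

# In the short format, "??" marks untracked files and "!!" marks ignored ones.
status=$(git status --short --ignored)
echo "$status"

# check-ignore answers the question for a single path.
git check-ignore -v .notes.txt.swp
```

The .gitignore file itself is an ordinary file, and you normally commit it so that everyone shares the same rules.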
git add
git add is used to “stage” a file for committing.
You can stage any file in the Changes not staged for commit list,
or you can stage a file from Untracked files.
You can also stage ignored files if you really want to with the -f flag, but that’s rare.
You can even stage parts of a file with the -p flag!
For example, if I run git add _index.md,
it moves _index.md to a fourth category called Changes to be committed:
$ git add _index.md
$ git status
On branch main
Your branch and 'origin/main' have diverged,
and have 21 and 39 different commits each, respectively.
(use "git pull" if you want to integrate the remote branch with yours)
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
modified: _index.md
Untracked files:
(use "git add <file>..." to include in what will be committed)
_index.md.bak
And if I keep making changes to _index.md before committing:
$ git status
On branch main
Your branch and 'origin/main' have diverged,
and have 21 and 39 different commits each, respectively.
(use "git pull" if you want to integrate the remote branch with yours)
Changes to be committed:
(use "git restore --staged <file>..." to unstage)
modified: _index.md
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git restore <file>..." to discard changes in working directory)
modified: _index.md
Untracked files:
(use "git add <file>..." to include in what will be committed)
_index.md.bak
Staging is one of the concepts that feels like a gotcha at first, but without it, you’d be stuck jumping through hoops when you want to create a commit that doesn’t include everything in your directory.
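A sketch of why staging is useful: two files are modified, but only one of them goes into the next commit. The file names and identity are made up.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

echo "a" > a.txt
echo "b" > b.txt
git add a.txt b.txt
git commit -q -m "add a and b"

# Modify both files, but stage only one of them.
echo "a2" >> a.txt
echo "b2" >> b.txt
git add a.txt
git status --short            # a.txt is staged, b.txt is modified but not staged

git commit -q -m "update a"

committed=$(git diff --name-only HEAD~1 HEAD)   # files changed by the new commit
pending=$(git diff --name-only)                 # files still modified afterwards
echo "committed: $committed"
echo "pending:   $pending"
```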
git commit
git commit creates a commit using your staged files.
Any modifications that are not staged are not committed.
Also, any untracked files that are not staged are not committed.
When you run git commit just like that, an editor opens up for you to write your commit message.
You have to “save and quit” the document once you write your message.
(If your editor is vi or vim,
you can edit with the i key, then stop editing with Esc and save-and-quit with ZZ.
You can specify your editor with the core.editor configuration key.)
If you leave the message blank, it will cancel the commit.
If you don’t want to deal with the editor and only intend to do a one-line commit message,
you can add the message to the command with the -m flag:
git commit -m "make changes"
Then, the editor won’t open.
Another great shortcut is -a:
git commit -a
This automatically adds everything in the Changes not staged for commit list,
but not anything from the Untracked files or Ignored files lists.
You can use both:
git commit -a -m "make changes"
Or if you really want to save keystrokes:
git commit -am "make changes"
Warning – you will get in the habit of doing -am, and forget to add new files when you make them!
Run git status often to get a lay of the land.
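The -am pitfall is easy to reproduce. In this sketch (made-up file names and identity), a tracked file is modified and a brand-new file is created, and only the first one makes it into the commit.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

echo "one" > tracked.txt
git add tracked.txt
git commit -q -m "add tracked file"

echo "two" >> tracked.txt        # modification to a tracked file
echo "new" > brand-new.txt       # new file, never added

git commit -q -am "make changes"

# The new file was left behind and is still untracked.
leftover=$(git status --short)
echo "$leftover"
```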
git diff
We already saw git diff above.
Used like that, it can produce a diff between two commits.
Using it with a single commit, git diff <commit>, compares your current working tree (not HEAD) against that commit.
Using it without any arguments does something slightly different.
It shows you what changes you currently have in flight that are not staged for commit.
Finally, git diff --cached will, confusingly enough, show you what you have staged.
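The different forms are easiest to tell apart with one staged and one unstaged change in flight at the same time. This is a sketch in a scratch repository; the file and identity are made up.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "add notes file"

echo "two" >> notes.txt
git add notes.txt                # "two" is now staged
echo "three" >> notes.txt        # "three" is not staged

unstaged=$(git diff)             # working tree vs. staged content: +three
staged=$(git diff --cached)      # staged content vs. HEAD: +two
vs_head=$(git diff HEAD)         # working tree vs. HEAD: +two and +three

echo "--- git diff";          echo "$unstaged"
echo "--- git diff --cached"; echo "$staged"
echo "--- git diff HEAD";     echo "$vs_head"
```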
git show
This is another one we have seen. It shows you a commit. I don’t have more to add than what I wrote in the concepts section.
git branch
git branch will list the local branches.
git branch -r will list the remote branches.
git branch -a will list both.
git branch <name> will create a new branch called <name> pointing to HEAD.
git branch -d <name> will delete branch <name>
(just the branch “name” – the commits never go anywhere).
You can’t delete the branch you’re currently on; switch to another one first.
git branch -D <name> will delete branch <name> even if Git thinks that’s a bad idea.
git branch -m <new-name> renames the current branch to <new-name>.
git branch -m <old-name> <new-name> renames branch <old-name> to <new-name>.
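A short sketch that runs through create, rename, and delete in a scratch repository. The branch names foo and bar and the identity are made up.

```shell
set -eu
cd "$(mktemp -d)"
git init -q -b main
git config user.name "Example Author"        # hypothetical identity
git config user.email "author@example.com"

echo "hello" > notes.txt
git add notes.txt
git commit -q -m "add notes file"

git branch foo                   # create foo, pointing at HEAD
git branch -m foo bar            # rename foo to bar
after_rename=$(git branch --format='%(refname:short)')

git branch -d bar                # delete bar (we are on main, so this is allowed)
after_delete=$(git branch --format='%(refname:short)')

echo "after rename: $after_rename"
echo "after delete: $after_delete"
```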
git switch
This command will move you to a different branch than the one you’re on. If you want to create a branch and move to it you can do:
$ git branch foo
$ git switch foo
or as a shortcut
$ git switch -c foo
You can also go to a specific reference in a “detached HEAD” state:
$ git switch -d <ref>
(see discussion of References above).
git log
Running git log is like running git show -s, then running it for the parent,
then for its parent, and so on until the initial commit
(like many git commands, it gets wrapped in less, so it won’t blow up your terminal).
git log is not terribly useful bare, but it has some fun flags.
I like to run it like this:
$ git log --oneline --decorate --graph --all
Except that’s a lie. I have an alias for the above in my .gitconfig called graphall, so I just run
$ git graphall
Try it, you’ll love it.
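One way to set up such an alias is sketched below. To keep it self-contained, it writes the alias into a scratch repository's own config; putting it in your personal .gitconfig would mean using git config --global instead. The repository contents are made up.

```shell
set -e
# Scratch repository with two commits; identity is hypothetical.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"

# Define the alias for this repository only.
# (Use "git config --global ..." to store it in ~/.gitconfig instead.)
git config alias.graphall "log --oneline --decorate --graph --all"

git graphall
lines=$(git graphall | wc -l | tr -d ' ')
```

With a linear two-commit history, the graph is just two lines, each starting with an asterisk.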
git fetch
This updates the references to remote branches,
and downloads any commits, trees and blobs it needs to make that happen.
By default, I believe it fetches from the remote which your current branch is tracking
(defaulting to origin if that’s not a thing), and updates the references to that remote’s branches.
I usually run it with --all though, which fetches from every remote, because why not.
Note that since this only updates remote references,
it doesn’t change anything about the current state or about your local branches.
Also, it doesn’t remove references to remote branches which no longer exist on the remote.
If you want to get rid of them, add -p.
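The following sketch shows that fetching moves the remote reference and leaves your local branch alone. It fakes a remote with a bare repository on disk and two made-up users, alice and bob; none of these names come from a real setup.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# A bare repository plays the role of the remote (hypothetical layout).
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/main

# "alice" creates the first commit and pushes it.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/main
git config user.name "Alice"
git config user.email "alice@example.com"
git commit -q --allow-empty -m "first"
git remote add origin "$sandbox/origin.git"
git push -q origin main

# "bob" clones, commits and pushes, so the remote moves ahead of alice.
cd "$sandbox"
git clone -q origin.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"
git commit -q --allow-empty -m "bob work"
git push -q origin main

# Back at alice: fetch updates origin/main but does not touch main.
cd "$sandbox/alice"
before=$(git rev-parse main)
git fetch -q --all
after=$(git rev-parse main)
remote=$(git rev-parse origin/main)
echo "main: $before -> $after"
echo "origin/main: $remote"
```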
git pull
This command first runs git fetch for the branch you’re tracking.
Then it tries to update your branch.
If the remote is “ahead” (meaning, the commit your local branch is currently on is part of the history
of the commit that the remote branch is currently on)
it will simply move your branch to the remote reference and you’re up to date!
If the remote is “behind” (meaning, the commit the remote branch is currently on is part of the history
of the commit that your local branch is currently on)
it won’t do anything.
If neither is true, the branches have “diverged”.
This means that you made some commits locally and someone else made some commits on the remote branch.
At this point, git pull will either git merge or git rebase.
Merging is the absolute worst, and for a long time it was git pull’s default behavior; I believe recent versions of Git instead stop and ask you to choose.
To use rebase instead, add --rebase or, even better, --rebase=interactive.
(The docs call rebasing dangerous because it re-writes history,
but you’re only going to be rewriting history which hasn’t been pushed, so it’s okay in my book.)
More on rebasing later.
Diverged branches are the single biggest headache with git!
(It comes from the fact that git gives you the freedom to do this thing which you really shouldn’t do).
I’m going to write a section below that shows you how to always avoid this situation.
If you do find yourself in this situation, though, just remember:
as long as you’ve committed something in git, it will never be lost.
Just make sure you commit before you pull
(git might not even let you pull if you have uncommitted changes – I don’t know, I’ve never tried.)
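Here is a sketch of the diverged case resolved with --rebase. It assumes a bare repository standing in for the remote and two made-up users, alice and bob, who edit different files so that no conflicts come up.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Fake remote (hypothetical layout).
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/main

# alice makes the first commit and sets main to track origin/main.
git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/main
git config user.name "Alice"
git config user.email "alice@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "first"
git remote add origin "$sandbox/origin.git"
git push -q -u origin main

# bob pushes a commit to the remote...
cd "$sandbox"
git clone -q origin.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"
echo bob > bob.txt
git add bob.txt
git commit -q -m "bob work"
git push -q origin main

# ...while alice commits locally: the branches have now diverged.
cd "$sandbox/alice"
echo alice > alice.txt
git add alice.txt
git commit -q -m "alice work"

# Replay alice's commit on top of the remote instead of merging.
git pull -q --rebase

history=$(git log --format=%s)
merges=$(git rev-list --merges --count HEAD)
echo "$history"
```

After the pull, alice's commit sits on top of bob's and the history is a straight line with no merge commit.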
git push
This command first checks where the remote branch you’re tracking currently is.
Then it tries to update the remote branch.
If you’re “ahead” of the remote, it will move its branch reference to your branch reference
and upload anything required to do so.
Then, it will update your remote reference so that it matches the remote’s new branch reference.
If the remote is “ahead” or if the branches have “diverged”, it will error out.
git push doesn’t try to do any merging or rebasing like git pull does.
You can run git push -f which will just force the remote to match you no matter what.
On a protected branch, you have to enable force-pushes on the remote side first.
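The sketch below shows a push being refused when the branches have diverged, and going through once the local branch has been rebased on top of the remote. The remote is a bare repository on disk, and alice and bob are made-up users.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Fake remote (hypothetical layout).
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/main

git init -q alice
cd alice
git symbolic-ref HEAD refs/heads/main
git config user.name "Alice"
git config user.email "alice@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "first"
git remote add origin "$sandbox/origin.git"
git push -q origin main

# bob gets a commit onto the remote first.
cd "$sandbox"
git clone -q origin.git bob
cd bob
git config user.name "Bob"
git config user.email "bob@example.com"
echo bob > bob.txt
git add bob.txt
git commit -q -m "bob work"
git push -q origin main

# alice commits without knowing about bob's work, then tries to push.
cd "$sandbox/alice"
echo alice > alice.txt
git add alice.txt
git commit -q -m "alice work"
if git push -q origin main 2>/dev/null; then
    first_push=accepted
else
    first_push=rejected   # git push never merges or rebases for you
fi

# Catch up by rebasing, then push again.
git pull -q --rebase origin main
if git push -q origin main; then
    second_push=accepted
else
    second_push=rejected
fi
echo "first: $first_push, second: $second_push"
```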
git checkout
If you’ve read other git documentation in the past, you’ll notice that I haven’t mentioned git checkout yet.
I don’t plan to.
git checkout is bad because it does a million unrelated things with very similar-looking syntax.
There is nothing you need git checkout for that you can’t do with other commands
(like git switch and git restore, which were introduced to replace git checkout).
That’s it for now! I will add a section of more commands I think are helpful. Some honorable mentions:
git apply
git blame
git cat-file
git cherry-pick
git clean
git config
git fsck
git gc
git mv
git rebase
git reflog
git remote
git reset
git restore
git revert
git rm
git stash
git submodule
git tag
git worktree
How to avoid dealing with diverged branches
As mentioned above, diverged branches can be a pain. There are two workflows that can side-step the problem completely.
Workflow 1
This workflow is simple:
- Do your work on main
- When you feel good about your work, push to origin/main
This is the worse of the two workflows, but I want to recognize that sometimes it wins on simplicity. Only do this if you are working alone and never work in parallel.
Also, don’t do any history rewriting past origin/main.
If you really need to edit something that’s already been pushed,
enable force-pushing
(e.g. in GitLab Settings > Repository > Protected branches > main > Allowed to force push),
push with git push -f, then immediately disable the option again.
If you need to switch computers or work on a remote server,
make sure main on your first computer is up to date with origin/main (e.g. git push)
then make sure main on your second computer is up to date with origin/main (e.g. git pull),
then only work on the second computer until it’s time to switch back.
This is convenient for small projects, but quickly becomes untenable for larger projects and is always a bad idea if other folks are involved (even if they’re just reading your work – if you end up force-pushing, they may not be able to find your work anymore).
Workflow 2
Never work on main.
When you start working on a feature, start on main and git pull from origin/main.
Then, make a new branch.
I find it nice to preface the branch name with something that identifies you
(right now, I’m on a branch called dsd36/git, because dsd36 is my Kerberos; the / is cosmetic only).
As a reminder, you can make a new branch and switch to it using git switch -c dsd36/git.
Then, you can git push to origin at will.
The first time you try to run git push on your new branch, you’ll see something like this:
$ git push
fatal: The current branch dsd36/git has no upstream branch.
To push the current branch and set the remote as upstream, use
    git push --set-upstream origin dsd36/git
To have this happen automatically for branches without a tracking
upstream, see 'push.autoSetupRemote' in 'git help config'.
You see this because your new branch is not tracking anything yet,
and there is no branch named dsd36/git on origin.
This is a precaution (that can be disabled).
Simply run the command that git helpfully suggested the first time: git push --set-upstream origin dsd36/git.
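As a sketch of what that command sets up, the following uses a bare repository on disk as a stand-in for origin. The file name and user identity are made up; the branch name is the one from the example above.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Fake remote (hypothetical layout).
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/main

git init -q me
cd me
git symbolic-ref HEAD refs/heads/main
git config user.name "Example User"
git config user.email "user@example.com"
git commit -q --allow-empty -m "first"
git remote add origin "$sandbox/origin.git"
git push -q -u origin main

# New feature branch with one commit on it.
git switch -q -c dsd36/git
echo draft > post.txt
git add post.txt
git commit -q -m "start post"

# Create the branch on origin and start tracking it in one go.
git push -q --set-upstream origin dsd36/git

# From now on a bare "git push" or "git pull" knows where to go.
upstream=$(git rev-parse --abbrev-ref '@{upstream}')
echo "$upstream"
```

If typing this once per branch gets old, the push.autoSetupRemote setting mentioned in git's own message is the option to look at.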
As you work, feel free to git push -f to your branch if you rewrite history locally.
Note that GitLab and GitHub only protect main from force pushes by default
(I know, I didn’t invent this workflow :) ).
If rewriting history is not your cup of tea, then you don’t have to worry about it
because nobody else should be touching your branch.
If others are working in parallel, they should be working on their own branches.
When you feel good about your feature, it’s time to get it into main.
Follow these steps:
- Make sure there are no uncommitted changes on your branch (git commit -am "make changes")
- Switch to main (git switch main)
- Pull the latest main (git pull). Note that because you’re not working on main, and nobody is force-pushing to main, you won’t have divergent branches.
- Switch back to your branch (git switch dsd36/git)
- Rebase onto main. I haven’t talked about rebasing yet, but here are the basic steps:
  - Run git rebase -i main
  - Save-and-quit the file that shows up using e.g. ZZ if you’re in vim
  - Wait for it to run. For most commits, git will be able to auto-merge
  - If you get any conflicts (you and someone else edited the same line), then for each conflicting file:
    - Open the file
    - Look for the conflicts (marked with <<<<<<<, =======, >>>>>>>)
    - Fix the conflicts
    - Close the file
    - git add <the file>
  - If you had to fix conflicts, continue with git rebase --continue
  - If instead you panic, git rebase --abort, take a break, try again
- Push your branch (git push). If your rebase wasn’t a no-op (i.e. main changed since you originally branched from it), and you had previously pushed, then you will have to do the dreaded git push -f here.
- Go to GitLab and create a Merge Request (or go to GitHub and create a Pull Request; same thing, worse name)
- Merge your change into main in the UI
  - Bonus points for rebasing it onto main, which is something you can set up in the Repository Settings
  - The merge here won’t be painful because your branch is going to be ahead of main thanks to the rebase you already did
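The command-line part of these steps can be sketched end to end in a sandbox. Everything here is a stand-in: a bare repository plays origin, a made-up teammate called mate pushes straight to main to simulate a merged request, and the rebase is run without -i so that no editor pops up. The file names are hypothetical too.

```shell
set -e
sandbox=$(mktemp -d)
cd "$sandbox"

# Fake remote with one commit on main, created by a teammate.
git init -q --bare origin.git
git --git-dir=origin.git symbolic-ref HEAD refs/heads/main
git init -q mate
cd mate
git symbolic-ref HEAD refs/heads/main
git config user.name "Mate"
git config user.email "mate@example.com"
echo base > base.txt
git add base.txt
git commit -q -m "base"
git remote add origin "$sandbox/origin.git"
git push -q -u origin main

# I clone, branch off main, commit, and push my branch.
cd "$sandbox"
git clone -q origin.git me
cd me
git config user.name "Me"
git config user.email "me@example.com"
git switch -q -c dsd36/git
echo feature > feature.txt
git add feature.txt
git commit -q -m "add feature"
git push -q --set-upstream origin dsd36/git

# Meanwhile main moves on (stand-in for someone's merged request).
cd "$sandbox/mate"
echo other > other.txt
git add other.txt
git commit -q -m "other work"
git push -q origin main

# The steps from the list: update main, rebase my branch, force-push it.
cd "$sandbox/me"
git switch -q main
git pull -q --ff-only          # main never diverges, so this only fast-forwards
git switch -q dsd36/git
git rebase -q main             # interactively this would be: git rebase -i main
git push -q -f origin dsd36/git

history=$(git log --format=%s)
echo "$history"
```

After the rebase, the feature commit sits directly on top of the latest main, which is why the merge in the UI has nothing left to argue about.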
Note that if there are real merge conflicts, you still end up having to solve them, no way around that.
But, you do it on your branch, in the commits that create the issue as they come up, which is a lot more logical
and much easier than solving everything in a hairy Merge commit.
If you follow this workflow,
there is actually another setting in the Repository Settings that completely disables pushing to main.
Better yet, there are a million CI/CD features you can take advantage of using this workflow.
For instance, you can set up Merge Request / Pull Request approvals.
This can be an informal thing with your team, if you just want eyes on your work,
or you can use the Repository Settings to enforce an approval before a request is merged.
Some repositories may require one or two approvals by anyone on the team.
Other repositories may have a specific list of people who have to approve a request before merging
(called “Code Owners”).
The UI makes it easy to review changes.
So easy in fact, I use it to review my own code!
You can also set up pipelines that need to pass before a merge is possible
(e.g. a set of tests to make sure that main doesn’t break after the merge).
Or, you can set up actions that happen after the merge (e.g. publishing a website).
The possibilities are endless.
All this may seem like a lot, but the whole process can take less than 30 seconds once you do it a few times, and I guarantee that it will save time in the long run.