Git

TL;DR

git clone <url>
git add .
git commit -m "make changes"
git pull
git push

Introduction

Git is truly fantastic software, and potentially my all-time favorite tool. It simultaneously solves two problems: version control and collaboration. I use it for all kinds of things. As a rule of thumb, every time I create a file that I may some day want to edit, I make sure it’s under Git control, even if it’s entirely local. (Even my /etc is under Git control! I use a tool called etckeeper to automatically create commits before and after updates and to keep track of permissions.) I’m writing this little post partly as a quick-start guide, and partly as a means to convince the reader they ought to use Git everywhere too, even if they’re not a software developer (I’m not).

Git can be a little daunting at first, but I promise it’s worth it. I’ve been preaching about Git since the mid-2010s, and have had a nearly 100% success rate.

Other learning resources

Before I dive in, I want to point out that there are many other Git resources available online. A good place to start is Git’s homepage at https://git-scm.com/. Its learn page provides links to a very nice cheat sheet (which I just saw today for the first time) and the de facto Git guide, the Pro Git Book (501 pages as of today). In terms of scope, my post here is approximately the geometric mean of these two resources.

History

Git was written in 2005 by Linus Torvalds for the Linux kernel. I’ve used Linux every day (on many devices) since the 2.6.x era, yet I believe Git may be Linus’s best contribution to humanity.

Other VCSs

There are a handful of version control systems (VCSs) out there, some of which are okay. A notable one is Apache Subversion (svn), which uses a “checkout” system that can be nice for certain types of projects (I’ve used it for large collaborative board layouts, for example). In most cases, though, Git wins, no contest. And for the cases where you do want svn, or find yourself working with an upstream svn repo, Git has long shipped a bridge called git-svn, so you can keep using the Git commands you already know for most of your work!

Git Repositories

A Git repository is a folder on your computer in which (some) files are tracked by Git. Most of the files in the repository are just files (some tracked, some untracked). At the top level of the folder, however, exists a subfolder called .git. (On Unix-like systems, the . prefix makes it “hidden”. Windows usually has a separate flag for hiding directories.) The .git folder contains all the Git-related business which we will discuss. It is not itself tracked. Don’t touch the contents of this folder unless you know what you’re doing :).

The git command

Since Git repositories are just folders of files with the special .git subfolder, you don’t need anything special to possess a Git repository. But, to interact with it, you need a program called, you guessed it, git.

If I just type git in my terminal, I get the following:

$ git
usage: git [-v | --version] [-h | --help] [-C <path>] [-c <name>=<value>]
           [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
           [-p | --paginate | -P | --no-pager] [--no-replace-objects] [--no-lazy-fetch]
           [--no-optional-locks] [--no-advice] [--bare] [--git-dir=<path>]
           [--work-tree=<path>] [--namespace=<name>] [--config-env=<name>=<envvar>]
           <command> [<args>]

These are common Git commands used in various situations:

start a working area (see also: git help tutorial)
   clone     Clone a repository into a new directory
   init      Create an empty Git repository or reinitialize an existing one

work on the current change (see also: git help everyday)
   add       Add file contents to the index
   mv        Move or rename a file, a directory, or a symlink
   restore   Restore working tree files
   rm        Remove files from the working tree and from the index

examine the history and state (see also: git help revisions)
   bisect    Use binary search to find the commit that introduced a bug
   diff      Show changes between commits, commit and working tree, etc
   grep      Print lines matching a pattern
   log       Show commit logs
   show      Show various types of objects
   status    Show the working tree status

grow, mark and tweak your common history
   branch    List, create, or delete branches
   commit    Record changes to the repository
   merge     Join two or more development histories together
   rebase    Reapply commits on top of another base tip
   reset     Reset current HEAD to the specified state
   switch    Switch branches
   tag       Create, list, delete or verify a tag object signed with GPG

collaborate (see also: git help workflows)
   fetch     Download objects and refs from another repository
   pull      Fetch from and integrate with another repository or a local branch
   push      Update remote refs along with associated objects

'git help -a' and 'git help -g' list available subcommands and some
concept guides. See 'git help <command>' or 'git help <concept>'
to read about a specific subcommand or concept.
See 'git help git' for an overview of the system.

The git command alone is pretty useless, as you can see. Running it alone is the equivalent of running git -h, which displays the brief help blurb above. Some of the other flags can make it do useful things, such as -v for version:

$ git -v
git version 2.47.3

But generally, you use git by adding a Git command after it: git <command>.

Basic Concepts

I’ve found that diving into the commands is pretty confusing without talking about a few basic concepts first. If you disagree, skip ahead.

Commits

The basic building block of a Git repository is the commit. A commit represents a snapshot of all the files you care about, and where they reside. (It does this by pointing to a tree object, which itself points to blob objects.) Note that Git does not keep track of folders, but it will of course keep track of a file’s path (location in the tree) which contains the folder information – it just can’t do empty folders. (If you really want an empty folder for some reason, add a dummy file; the GitLab UI for instance adds the empty file .gitkeep.)

In addition to this snapshot (the tree), each commit also keeps track of a parent, which is another commit. One exception is initial commits, which have no parent. The other exception is merge commits, which have two (or, rarely, more) parents. Merge commits are bad and you should avoid them, though. Note that while most commits have only one parent, many commits can have the same parent – no problem at all! This forms a “tree” of commits (strictly speaking a directed acyclic graph once merges are involved, and not to be confused with the file tree object), which is one of the things that makes Git beautiful.

The critical thing to get used to about commits is that they are cheap. They are made in milliseconds and generally take up very little space. So, there is basically no reason not to commit all the time! This is really powerful, because it means you can “save” any point in history, whenever you want, without much downside except taking the 5 seconds necessary to do so.

In addition to a tree and (usually) one parent, commits also encode some metadata:

  • Message
  • Author Name
  • Author Email
  • Author Date
  • Committer Name
  • Committer Email
  • Committer Date

Messages explain what the commit does. They can be a bit of a pain, and it is not uncommon for folks to just write wip (“work in progress”). But, I think messages are really, really important – especially if you want to collaborate, but also if you’re human and sometimes forget why you did things. How you use your messages is ultimately up to you, but there are some cultural guidelines. By convention, you should start your commit message with a single line of no more than 50 characters. This line should describe what the commit does. It’s best to write it in the imperative mood and in the present tense. The idea is that when a commit is “applied”, the message is what happens. For example: “add feature X” or “refactor feature Y”. Or maybe even “reorganize some files”. If there is more you want to say about the commit (e.g. point to an open issue online documenting why you had to work around a bug), the convention is to skip one line and add as much detail as you’d like.
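
To make the subject-plus-body convention concrete, here is a small sketch. It assumes a throwaway repository in a temporary directory, and the name, email and message text are made up for illustration. It relies on the fact that passing -m more than once makes each message its own paragraph.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
# Identity and signing are set locally so the sketch does not depend on global config.
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

# The first -m is the subject line; the second -m becomes the body after a blank line.
git commit -q --allow-empty \
    -m "add feature X" \
    -m "Work around the upstream bug described in the (hypothetical) issue tracker."

subject=$(git log -1 --format=%s)   # subject line only
body=$(git log -1 --format=%b)      # everything after the blank line
echo "subject: $subject"
echo "body:    $body"
```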

The Author Name is the name of the human who originally created the core content of the commit. The Author Email and Author Date are the email of said human and the date the authoring occurred. The Committer Name is the name of the human who created the actual commit. The Committer Email and Committer Date are the email of said human and the date the committing occurred. When you first create a commit, the concepts of authoring and committing are the same, so the three pairs of fields will match. They may diverge later when we start playing around with commits.
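
One easy way to see the two sets of fields diverge is to commit on someone else’s behalf with the --author flag. This is only a sketch in a scratch repository; “Alice Author” and “Example User” are invented identities.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

# The configured user becomes the committer; --author overrides only the author fields.
git commit -q --allow-empty --author="Alice Author <alice@example.com>" -m "add feature X"

author=$(git log -1 --format='%an <%ae>')
committer=$(git log -1 --format='%cn <%ce>')
echo "author:    $author"
echo "committer: $committer"
```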

Finally, a commit can (and should) be signed. I always use GPG for this (Git 2.34 and later can also sign with an SSH key via the gpg.format setting, but I don’t know the details). This is a way to prove to the world that the commit was created by someone holding a particular private key.

You can show the latest commit by using the git show -s command:

$ git show -s
commit 4fb9d6b18acf00f7261fc20206de5adafd1f3977
Author: Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu>
Date:   Thu Oct 9 11:08:31 2025 -0400

    add video support

This is a commit I made to this repository to add video support. You can see the commit hash (see below for more details), the author (me) and the date I authored the commit. Finally, you can see the message, which in this case is only one line.

This representation is a summary of the commit, but there’s actually not a lot more to it! To see the raw commit itself, run git show -s --pretty=raw

$ git show -s --pretty=raw
commit 4fb9d6b18acf00f7261fc20206de5adafd1f3977
tree 64df90122b9a5a10521d5c0e00c75a5f59f19ec5
parent 3fe4b48adaa002b909485ee9d7f5ba92764ba45c
author Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu> 1760022511 -0400
committer Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu> 1766782597 -0500
gpgsig -----BEGIN PGP SIGNATURE-----

 iHUEABYKAB0WIQSbF9p0dwS2XUFAlr3L618ovLcjogUCaU72hQAKCRDL618ovLcj
 oohlAQD00PccH/HXJWyzY8I8IxdLqCec7L0ykCCxFKEUevyHlAD/VzsilO2gr9fm
 yqL6eukKIqgoNzdXxVju/9yrfLxdRwk=
 =radA
 -----END PGP SIGNATURE-----

    add video support

Now you can see the tree, which points to the “snapshot”, and the parent, which is another commit. Also, you can see the committer field (which is still me, but note that it is a different date). Finally, you have my GPG signature. And that’s it!
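
If you are curious what the tree actually points to, the plumbing command git cat-file will show you. The following is a sketch in a scratch repository (the file docs/readme.txt is made up). Notice that the folder shows up only as a nested tree object holding the file.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

mkdir docs
echo "hello" > docs/readme.txt
git add docs/readme.txt
git commit -q -m "add readme"

kind=$(git cat-file -t 'HEAD^{tree}')       # type of the object the commit points to
git cat-file -p 'HEAD^{tree}'               # lists one entry: a nested tree named docs
blob=$(git rev-parse HEAD:docs/readme.txt)  # hash of the blob holding the file content
blob_kind=$(git cat-file -t "$blob")
content=$(git cat-file -p "$blob")
echo "$kind / $blob_kind / $content"
```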

Hashes

Git represents each commit through a “hash”. The numbers and letters following the word commit above are the hash that uniquely identifies that commit. I can use that hash to refer to that commit whenever I want to. For instance, if I wanted to run the above command at some other point in history when this commit wasn’t the “latest” one:

$ git show -s --pretty=raw 4fb9d6b18acf00f7261fc20206de5adafd1f3977
commit 4fb9d6b18acf00f7261fc20206de5adafd1f3977
tree 64df90122b9a5a10521d5c0e00c75a5f59f19ec5
parent 3fe4b48adaa002b909485ee9d7f5ba92764ba45c
author Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu> 1760022511 -0400
committer Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu> 1766782597 -0500
gpgsig -----BEGIN PGP SIGNATURE-----

 iHUEABYKAB0WIQSbF9p0dwS2XUFAlr3L618ovLcjogUCaU72hQAKCRDL618ovLcj
 oohlAQD00PccH/HXJWyzY8I8IxdLqCec7L0ykCCxFKEUevyHlAD/VzsilO2gr9fm
 yqL6eukKIqgoNzdXxVju/9yrfLxdRwk=
 =radA
 -----END PGP SIGNATURE-----

    add video support

You actually don’t even need to write out the whole hash. You just need enough characters to uniquely identify it. In this case, it just so happens that no other hashes Git knows about right now start with 4fb9, so:

$ git show -s --pretty=raw 4fb9
commit 4fb9d6b18acf00f7261fc20206de5adafd1f3977
tree 64df90122b9a5a10521d5c0e00c75a5f59f19ec5
parent 3fe4b48adaa002b909485ee9d7f5ba92764ba45c
author Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu> 1760022511 -0400
committer Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu> 1766782597 -0500
gpgsig -----BEGIN PGP SIGNATURE-----

 iHUEABYKAB0WIQSbF9p0dwS2XUFAlr3L618ovLcjogUCaU72hQAKCRDL618ovLcj
 oohlAQD00PccH/HXJWyzY8I8IxdLqCec7L0ykCCxFKEUevyHlAD/VzsilO2gr9fm
 yqL6eukKIqgoNzdXxVju/9yrfLxdRwk=
 =radA
 -----END PGP SIGNATURE-----

    add video support

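If you don’t provide enough characters, Git can’t resolve the commit. Git needs at least four characters before it will treat the argument as an abbreviated hash (a longer prefix shared by several objects is instead reported as an ambiguous short object ID), so 4fb is rejected outright: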

$ git show -s --pretty=raw 4fb
fatal: ambiguous argument '4fb': unknown revision or path not in the working tree.
Use '--' to separate paths from revisions, like this:
'git <command> [<revision>...] -- [<file>...]'

Knowing about these hashes is a great safety blanket, because no matter how badly you “mess up” your repo, you can rest assured that you can always go back to any point in history – so long as you know the commit hash! (And if you don’t, there’s a great little feature called the reflog to help you, but more on that later).

By the way, if you look back at the content of the commit – the parent is referred to by its hash, which makes sense. The tree is actually referred to by a hash as well! Same idea there. Cool, right?
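
You can also ask Git for an abbreviated hash rather than guessing how many characters you need. Here is a rough sketch in a scratch repository: git rev-parse --short prints an unambiguous prefix, and git rev-parse --verify expands it back to the full hash.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false
git commit -q --allow-empty -m "initial commit"

full=$(git rev-parse HEAD)
short=$(git rev-parse --short HEAD)          # an unambiguous prefix (typically 7 characters)
resolved=$(git rev-parse --verify "$short")  # expands the prefix back to the full hash

# A three-character prefix is below Git's minimum, so it is never treated as a hash.
three=$(printf '%.3s' "$full")
if git rev-parse --verify -q "$three" >/dev/null; then
    too_short=accepted
else
    too_short=rejected
fi
echo "$short -> $resolved ($three is $too_short)"
```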

How they’re calculated

This bit is not so important, but I think it’s kind of cool.

Firstly, a hash is a fixed-length encoding of any number of bytes. Hashing algorithms work in a way where small changes to the input create huge changes to the output, and it’s extremely unlikely two pieces of content ever produce the same hash. Hashes are not a Git concept; they are used in many places, particularly cryptography. I’m not qualified to say much more here, but if the concept is new to you, it’s a good rabbit hole.

How commit hashes are calculated is actually super simple. As mentioned above, the entire commit is in what is shown by git show -s --pretty=raw. If you remove the first line, which is the hash of the commit, and undo the four-space indentation of the message, you get the commit exactly (git cat-file commit <hash> prints these raw bytes directly). Now, just prepend commit <size>\0, replacing <size> with the number of bytes following the null byte, and run SHA-1 on the whole thing. And that’s it! That’s your hash. So, if the tree it points to, or the parent, or the author or committer, or message change, the commit hash changes. So, you can rest assured that your hash encodes all of those things. On the flip side, if nothing changes, the hash doesn’t change either.
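
You can check the recipe yourself. The sketch below assumes a scratch repository using Git’s default SHA-1 object format and a sha1sum binary (from GNU coreutils) on your PATH. It rebuilds the header by hand and compares the result with what Git reports.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false
git commit -q --allow-empty -m "initial commit"

expected=$(git rev-parse HEAD)
size=$(git cat-file -s HEAD)   # size in bytes of the raw commit object

# Hash "commit <size>\0" followed by the raw commit bytes.
computed=$( { printf 'commit %s\0' "$size"; git cat-file commit HEAD; } | sha1sum | cut -d' ' -f1 )

echo "git says: $expected"
echo "by hand:  $computed"
```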

Diffs

A diff is the difference between two commits. Diffs can naturally concern one or many files. They can include file additions, file deletions, and file modifications (Git will also report “renaming” if an added file is similar to a deleted file).

Even though commits store a snapshot (i.e. point to a tree), they are often talked about as a diff with respect to their parent (hence the “apply” concept in the discussion of messages above). As a matter of fact, if I don’t include the -s flag in the git show command, I see the diff with respect to the parent by default:

$ git show
commit 4fb9d6b18acf00f7261fc20206de5adafd1f3977
Author: Dimitar Dimitrov <dimitar.dimitrov@cba.mit.edu>
Date:   Thu Oct 9 11:08:31 2025 -0400

    add video support

diff --git a/.gitmodules b/.gitmodules
index 5c05f90..9e711e3 100644
--- a/.gitmodules
+++ b/.gitmodules
@@ -1,3 +1,6 @@
 [submodule "themes/hugo-theme-relearn"]
        path = themes/hugo-theme-relearn
        url = https://github.com/McShelby/hugo-theme-relearn.git
+[submodule "themes/hugo-video"]
+       path = themes/hugo-video
+       url = https://github.com/martignoni/hugo-video.git
diff --git a/hugo.toml b/hugo.toml
index d9afd74..167aaf8 100644
--- a/hugo.toml
+++ b/hugo.toml
@@ -1,7 +1,7 @@
 baseURL = 'https://example.org/'  # I plan to use GitLab's CI/CD, but not sure what this will end up being
 languageCode = 'en-us'
 title = "Dimitar's CBA Blog"
-theme = 'hugo-theme-relearn'
+theme = ['hugo-video', 'hugo-theme-relearn']
 enableGitInfo = true

 [params]
diff --git a/themes/hugo-video b/themes/hugo-video
new file mode 160000
index 0000000..bc3ef03
--- /dev/null
+++ b/themes/hugo-video
@@ -0,0 +1 @@
+Subproject commit bc3ef03fe9c7a33616f1612923cf124681e61ce2

This is perhaps not a great commit for this example, since there’s a lot going on. But you can see that I modified the .gitmodules file by adding three lines (they start with +). I also modified hugo.toml by replacing one line. Finally, I created a new file, themes/hugo-video (which happens to be a submodule; more on that later).

Let’s look at an easier example. This time, instead of getting the diff implicitly inside git show of a commit, I will diff two commits directly:

$ git diff 4ca660e 7787845
diff --git a/content/tools/_index.md b/content/tools/_index.md
new file mode 100644
index 0000000..c4e8a29
--- /dev/null
+++ b/content/tools/_index.md
@@ -0,0 +1,9 @@
++++
+title = "Tools"
+description = "My favorite tools"
+weight = 3
++++
+
+This section contains documentations for tools that I love.
+It is written for nobody in particular,
+but may be useful to somebody.
diff --git a/hugo.toml b/hugo.toml
index 8290a0a..6f7f89d 100644
--- a/hugo.toml
+++ b/hugo.toml
@@ -2,3 +2,6 @@ baseURL = 'https://example.org/'  # I plan to use GitLab's CI/CD, but not sure w
 languageCode = 'en-us'
 title = "Dimitar's CBA Blog"
 theme = 'hugo-theme-relearn'
+
+[params]
+themeVariant = 'zen-dark'

The difference between those two points in history is that I added some lines to hugo.toml and added the new file content/tools/_index.md.

If I flip the order, the difference will be the opposite:

$ git diff 7787845 4ca660e
diff --git a/content/tools/_index.md b/content/tools/_index.md
deleted file mode 100644
index c4e8a29..0000000
--- a/content/tools/_index.md
+++ /dev/null
@@ -1,9 +0,0 @@
-+++
-title = "Tools"
-description = "My favorite tools"
-weight = 3
-+++
-
-This section contains documentations for tools that I love.
-It is written for nobody in particular,
-but may be useful to somebody.
diff --git a/hugo.toml b/hugo.toml
index 6f7f89d..8290a0a 100644
--- a/hugo.toml
+++ b/hugo.toml
@@ -2,6 +2,3 @@ baseURL = 'https://example.org/'  # I plan to use GitLab's CI/CD, but not sure w
 languageCode = 'en-us'
 title = "Dimitar's CBA Blog"
 theme = 'hugo-theme-relearn'
-
-[params]
-themeVariant = 'zen-dark'

Diffs are important because they can also be used as “patches”. Patches are a concept that pre-dates Git, and are a hack-y way of distributing changes to a codebase. When you “apply” a commit somewhere else (for example with git cherry-pick), you’re effectively applying the patch that results from its diff against its parent; git apply does the same job for a patch stored in a file.
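
Here is a sketch of the diff-as-patch idea in a scratch repository (notes.txt is a made-up file). The diff between two commits is saved to a file, and then applied on top of the older commit with git apply, which reproduces the newer content.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

echo "one" > notes.txt
git add notes.txt
git commit -q -m "add notes"
first=$(git rev-parse HEAD)

echo "two" >> notes.txt
git commit -q -am "extend notes"
second=$(git rev-parse HEAD)

# Save the difference between the two commits as a patch file outside the repo.
patch=$(mktemp)
git diff "$first" "$second" > "$patch"

# Go back to the older snapshot, then replay the change from the patch file.
git switch -q --detach "$first"
git apply "$patch"
result=$(cat notes.txt)
echo "$result"
```

Note that git apply only changes the working tree; it does not create a commit.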

Remotes

Remotes are what enable collaboration. A remote is simply another repository, on someone else’s computer (or even on your own computer, in another location). Generally, the idea is that they represent the same codebase, but technically speaking that’s not even a requirement. The way you interact with remotes is you can get commits from them (“fetching” or “pulling”), or you can give them commits (“pushing”). Then, you can use those commits as parents for new commits. It should be clear from the discussion above that there’s no risk of overwriting anything. You can think of commits just existing in a big bucket, with references to each other through the parent field.

You can add remotes to your repository using git remote add. Each remote has to have a unique name from your repository’s perspective.

Git allows communication with your remotes through many protocols. The best one to use is SSH. HTTPS is also common. HTTP shouldn’t be used. For remotes on your local computer, Git just uses your filesystem. There are others, too.

Nothing about Git mandates any sort of organization between remotes, but generally speaking, most people use a “hub and spoke approach”. You have one central repo, colloquially known as origin, and everyone only fetches from and pushes to origin. This way, origin acts as a sort of “source of truth”.
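
Since a remote can live on your own filesystem, you can try the hub-and-spoke setup without any server. In this sketch the directory names hub.git and laptop are invented. A bare repository plays the role of origin, and a second repository pushes to it.

```shell
set -eu
work=$(mktemp -d)

git init -q --bare "$work/hub.git"   # the central repo: no working files, just Git data
git init -q "$work/laptop"           # your own working repo
cd "$work/laptop"
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false
git commit -q --allow-empty -m "initial commit"

# Register the hub under the conventional name "origin" and push the current branch.
git remote add origin "$work/hub.git"
branch=$(git symbolic-ref --short HEAD)
git push -q -u origin "$branch"

local_tip=$(git rev-parse HEAD)
remote_tip=$(git --git-dir="$work/hub.git" rev-parse "refs/heads/$branch")
echo "local:  $local_tip"
echo "remote: $remote_tip"
```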

Hosting Services

A number of services exist for hosting Git repos to play the role of origin. GitHub is a common one. Others include Bitbucket and SourceForge. We use GitLab, because they let you run a server locally (for free).

Note that in addition to hosting a repo, these services usually have many other features. The most important one for collaboration is the Merge Request (a.k.a. Pull Request) feature.

References

Obviously, keeping track of a million hashes is not a convenient system. Most of the time, we don’t keep track of the hashes themselves, but use “references”.

One special reference is HEAD. This is the commit you currently have checked out, which is usually the “latest” one. When we ran git show above, that was the same as running git show HEAD. When you create a new commit, the current HEAD becomes its parent, and the new commit becomes the new HEAD. Easy!

If you want to talk about HEAD’s parent, you can use HEAD^, HEAD~ or HEAD~1. If you want to talk about its grandparent, you can use HEAD^^, HEAD~~ or HEAD~2, and so on. As a matter of fact, you can stick these suffixes onto bare hashes or onto any of the other references below!
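
A quick way to convince yourself that these spellings agree is to make three commits in a scratch repository and resolve each form. This is just a sketch; the commit messages are arbitrary.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"
git commit -q --allow-empty -m "third"

parent=$(git log -1 --format=%s 'HEAD~1')        # one step back
grandparent=$(git log -1 --format=%s 'HEAD~2')   # two steps back
caret=$(git rev-parse 'HEAD^^')                  # same commit as HEAD~2
tilde=$(git rev-parse 'HEAD~2')
echo "parent: $parent, grandparent: $grandparent"
```

The quotes around the references are there because some shells treat ^ and ~ specially.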

Another type of reference is the “tag”. Tags are like a nickname for a commit. The idea is that once you tag a commit, that tag will remain with that commit forever (you can change them of course, but you shouldn’t). You can pull and push tags to make sure everyone is on the same page about them. They are great for releases, where you want to know exactly what the state of the world was when you, for example, built and distributed a piece of software, or submitted a draft of a paper for review.

Perhaps the most common type of reference is the “branch”. Like a tag, that’s a name referring to a commit. But unlike a tag, it’s not meant to forever refer to one particular commit. Branches are meant to track progress. By convention, the main branch tracks the latest stable version of a codebase, but this is something that varies from organization to organization and even from repo to repo.

One nice thing about branches is that you can be “on” a branch. When you’re “on” a branch, two things happen:

  • Your HEAD is whatever the branch is pointing to
  • When you commit, not only does HEAD point to the new commit, but now your branch points to it too!

Working on a branch is the most common way to work. You can also not do that (called a detached HEAD state), but then you’ll have a hard time keeping track of any new commits you make. That’s great for looking around, but beware losing your commits (when I say losing here, I just mean losing track of the commit hash; Git keeps unreferenced commits around for weeks by default before garbage-collecting them, and the hash is not that hard to track down using the reflog as I mentioned before).
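
The difference between being “on” a branch and being detached is easy to see in a scratch repository. In this sketch, the branch follows the first two commits, but a commit made after git switch --detach moves only HEAD.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

git commit -q --allow-empty -m "first"
branch=$(git symbolic-ref --short HEAD)   # whatever the default branch is called here
git commit -q --allow-empty -m "second"
branch_tip=$(git rev-parse "refs/heads/$branch")
head_on_branch=$(git rev-parse HEAD)      # same commit: the branch moved with HEAD

git switch -q --detach                    # leave the branch, stay on the same commit
git commit -q --allow-empty -m "detached work"
detached_head=$(git rev-parse HEAD)
branch_after=$(git rev-parse "refs/heads/$branch")   # unchanged by the detached commit

echo "branch: $branch_after"
echo "HEAD:   $detached_head"
```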

Unlike tags, your local branches may not match the branches on remotes. When you fetch from a remote, you can see which commits their branches are pointing to. As one common example, you may have a copy of main, which will just be referred to as main, and origin may have a copy of main, which will be referred to as origin/main.

One final note about branches, which is often a point of confusion. You can be “tracking” a remote branch. This is basically an optional field for every local branch you have. For now, it doesn’t mean a whole lot, except that you get nice messages when things diverge. For instance, when I run git status:

$ git status
On branch main
Your branch and 'origin/main' have diverged,
and have 21 and 39 different commits each, respectively.
  (use "git pull" if you want to integrate the remote branch with yours)

Oh boy, I need to clean that up!

Later, we will see that it’s also convenient for the git push and git pull commands.

Configuration

Git uses a key-value configuration system. Configurations can be system-wide. Those are stored in /etc/gitconfig. They can also be user-specific, which takes precedence over system-wide configurations. Those are stored in ~/.gitconfig. Finally, they can be per-repo, which takes precedence over user-specific configurations. Those are stored in .git/config. You can find things like branch and remote settings in that last one.
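
To see the precedence in action without touching your real settings, the sketch below points HOME at a scratch directory, so “user-specific” means a throwaway ~/.gitconfig. The editor values are arbitrary examples.

```shell
set -eu
work=$(mktemp -d)

# Use a scratch HOME so the real ~/.gitconfig is never read or modified.
export HOME="$work/home"
unset XDG_CONFIG_HOME
mkdir -p "$HOME"

git init -q "$work/repo"
cd "$work/repo"

git config --global core.editor "vim"    # written to $HOME/.gitconfig
git config --local  core.editor "nano"   # written to .git/config

effective=$(git config core.editor)            # the per-repo value wins
global_only=$(git config --global core.editor)
echo "effective: $effective (user-level: $global_only)"
```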

One particularly neat configuration is the alias. You can create custom commands that map to other git commands or even to non-git commands!

(I track my .gitconfig here.)

Basic Commands

Now let’s talk about some common commands. Git has two sets of commands. “Plumbing” commands are stable, ancient, and annoying. Use them in scripts. “Porcelain” commands are more user-friendly, but may change slightly from version to version. Use them on the CLI. You can get the full manuals for all of the commands below and more using git help <command> or online at https://git-scm.com/docs.

Okay, enough philosophizing. How do you actually Git?

git init

You won’t use this command often, but it’s a good place to start. This is how you make a new repo. It creates a .git subdirectory in your current directory, and that directory becomes the root of your repo. Any files in there start as untracked, and you don’t have a HEAD – so your first commit will be a parent-less initial commit.
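
Here is what that looks like in a scratch directory (notes.txt is a made-up file). After git init, the .git folder exists, there is no commit for HEAD to resolve to, and the existing file shows up as untracked.

```shell
set -eu
cd "$(mktemp -d)"
echo "draft" > notes.txt

git init -q

if [ -d .git ]; then git_dir=present; else git_dir=missing; fi

# HEAD cannot be resolved yet because no commit exists.
if git rev-parse --verify -q HEAD >/dev/null; then has_commit=yes; else has_commit=no; fi

untracked=$(git status --porcelain)   # "??" marks untracked files
echo "$git_dir / $has_commit / $untracked"
```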

git clone

More often, you will “clone” a repository from somewhere else. Add its URL after clone. For example, to clone this repo:

$ git clone ssh://git@gitlab.cba.mit.edu:846/dimitar/blog.git
Cloning into 'blog'...
remote: Enumerating objects: 321, done.
remote: Counting objects: 100% (41/41), done.
remote: Compressing objects: 100% (40/40), done.
remote: Total 321 (delta 9), reused 0 (delta 0), pack-reused 280 (from 1)
Receiving objects: 100% (321/321), 5.31 MiB | 12.92 MiB/s, done.
Resolving deltas: 100% (90/90), done.

git clone will initialize your local repo, automatically create a remote called origin pointing at that URL, and then fetch from it. It will also create a local branch for the remote’s default branch (main here), set up to track origin/main, and it will start you up to date with it.
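
You can watch clone do all of this locally. In the sketch below, the directories upstream and copy are invented names, and the “remote” is just another folder. The clone ends up with an origin remote and a local branch tracking the matching origin branch.

```shell
set -eu
work=$(mktemp -d)

# A tiny repository to clone from.
git init -q "$work/upstream"
git -C "$work/upstream" -c user.name="Example User" -c user.email="user@example.com" \
    -c commit.gpgsign=false commit -q --allow-empty -m "initial commit"

git clone -q "$work/upstream" "$work/copy"
cd "$work/copy"

remotes=$(git remote)                                  # the remote clone created for us
branch=$(git symbolic-ref --short HEAD)                # local branch created by clone
upstream=$(git rev-parse --abbrev-ref '@{upstream}')   # the branch it tracks
echo "$branch tracks $upstream on remote $remotes"
```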

git status

This is probably the command you will use the most.

$ git status
On branch main
Your branch and 'origin/main' have diverged,
and have 21 and 39 different commits each, respectively.
  (use "git pull" if you want to integrate the remote branch with yours)

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   _index.md

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        _index.md.bak

no changes added to commit (use "git add" and/or "git commit -a")

We see a few things here. First, we are told we are on the main branch. (If we were in a detached HEAD state, it would tell us here.) Because main is set up to track origin/main in .git/config, we are given some helpful information on how that’s going. Next, we get a list of files that Git knows about that we have modified with respect to HEAD. Then, we get a list of files that Git doesn’t know about. Git doesn’t touch those.

Technically, there is a third category of files. Those are “ignored” files, and you can get a list of them as well:

$ git status --ignored
On branch main
Your branch and 'origin/main' have diverged,
and have 21 and 39 different commits each, respectively.
  (use "git pull" if you want to integrate the remote branch with yours)

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   _index.md

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        _index.md.bak

Ignored files:
  (use "git add -f <file>..." to include in what will be committed)
        ../../../.hugo_build.lock
        ._index.md.swp
        ../../../public/

no changes added to commit (use "git add" and/or "git commit -a")

The way you get files to be ignored is by dropping a .gitignore file in the base directory, listing what file patterns you want ignored. This is useful for things like swapfiles and build artifacts. There are pre-populated .gitignore files on the internet available for various languages (for example, from GitHub).
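
As a small illustration, the sketch below writes a two-line .gitignore in a scratch repository and asks Git which paths it would ignore. The patterns and file names are examples only, loosely modelled on the swapfile and build output above.

```shell
set -eu
cd "$(mktemp -d)"
git init -q

# One pattern per line: editor swapfiles anywhere, and the whole public/ directory.
printf '%s\n' '*.swp' 'public/' > .gitignore

mkdir public
touch public/index.html .notes.md.swp notes.md

# check-ignore prints only the paths that match an ignore pattern.
ignored=$(git check-ignore .notes.md.swp public/index.html notes.md)
echo "$ignored"
```

Here notes.md is not listed, so it would show up as an ordinary untracked file.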

git add

git add is used to “stage” a file for committing. You can stage any file in the Changes not staged for commit list, or you can stage a file from Untracked files. You can also stage ignored files if you really want to with the -f flag, but that’s rare. You can even stage parts of a file with the -p flag!

For example, if I run git add _index.md, it moves _index.md to a fourth category called Changes to be committed:

$ git add _index.md
$ git status
On branch main
Your branch and 'origin/main' have diverged,
and have 21 and 39 different commits each, respectively.
  (use "git pull" if you want to integrate the remote branch with yours)

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   _index.md

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        _index.md.bak


And if I keep making changes to _index.md before committing:

$ git status
On branch main
Your branch and 'origin/main' have diverged,
and have 21 and 39 different commits each, respectively.
  (use "git pull" if you want to integrate the remote branch with yours)

Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        modified:   _index.md

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   _index.md

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        _index.md.bak


Staging is one of the concepts that feels like a gotcha at first, but without it, you’d be stuck jumping through hoops when you want to create a commit that doesn’t include everything in your directory.

git commit

git commit creates a commit using your staged files. Any modifications that are not staged are not committed. Also, any untracked files that are not staged are not committed.

When you run git commit just like that, an editor opens up for you to write your commit message. You have to “save and quit” the document once you write your message. (If your editor is vi or vim, you can edit with the i key, then stop editing with Esc and save-and-quit with ZZ. You can specify your editor with the core.editor configuration key.) If you leave the message blank, it will cancel the commit.

If you don’t want to deal with the editor and only intend to do a one-line commit message, you can add the message to the command with the -m flag:

git commit -m "make changes"

Then, the editor won’t open.

Another great shortcut is -a:

git commit -a

This automatically adds everything in the Changes not staged for commit list, but not anything from the Untracked files or Ignored files lists.

You can use both:

git commit -a -m "make changes"

Or if you really want to save keystrokes:

git commit -am "make changes"

Warning: you will get in the habit of doing -am, and forget to add new files when you make them! Run git status often to get a lay of the land.
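
The pitfall is easy to reproduce in a scratch repository (the file names are made up). In this sketch, -am picks up the change to the tracked file, while the brand-new file is left behind as untracked.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

echo "v1" > tracked.txt
git add tracked.txt
git commit -q -m "add tracked file"

echo "v2" > tracked.txt       # modify a file Git already knows about
echo "new" > untracked.txt    # create a file Git has never seen

git commit -q -am "update tracked file"

committed=$(git diff --name-only 'HEAD~1' HEAD)   # files changed by the last commit
leftover=$(git status --porcelain)                # what is still uncommitted
echo "committed: $committed"
echo "leftover:  $leftover"
```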

git diff

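We already saw git diff above. Used like that, it can produce a diff between two commits. Using it with a single commit, git diff <commit>, compares your current working tree (staged or not) against that commit; if you have no uncommitted changes, that’s the same as saying git diff <commit> HEAD.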

Using it without any arguments does something slightly different. It shows you what changes you currently have in flight that are not staged for commit. Finally, git diff --cached will, confusingly enough, show you what you have staged (git diff --staged is a synonym).
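
To see the variants side by side, this sketch stages one change and leaves another unstaged in a scratch repository (notes.txt is a made-up file). The grep simply keeps the added lines and drops the +++ file header.

```shell
set -eu
cd "$(mktemp -d)"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
git config commit.gpgsign false

echo "one" > notes.txt
git add notes.txt
git commit -q -m "add notes"

echo "two" >> notes.txt
git add notes.txt            # "two" is now staged
echo "three" >> notes.txt    # "three" is only in the working tree

unstaged=$(git diff | grep '^+[^+]')            # working tree vs. staging area
staged=$(git diff --cached | grep '^+[^+]')     # staging area vs. HEAD
against_head=$(git diff HEAD | grep '^+[^+]')   # working tree vs. HEAD
echo "unstaged: $unstaged"
echo "staged:   $staged"
```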

git show

This is another one we have seen. It shows you a commit. I don’t have more to add than what I wrote in the concepts section.

git branch

git branch will list the local branches. git branch -r will list the remote branches. git branch -a will list both.

git branch <name> will create a new branch called <name> pointing to HEAD. (git branch -c copies an existing branch instead.)

git branch -d <name> will delete branch <name> (just the branch “name”; the commits don’t go anywhere right away). Git won’t let you delete the branch you’re currently on, so switch away first. git branch -D <name> will delete branch <name> even if Git thinks that’s a bad idea.

git branch -m <new-name> renames the current branch to <new-name>. git branch -m <old-name> <new-name> renames branch <old-name> to <new-name>.
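A quick sketch of these commands in a scratch repository. The branch names foo, bar, baz and wip are made up for illustration.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "initial commit"

git branch foo              # create foo at HEAD (we stay on main)
git branch -m foo bar       # rename foo to bar
git branch baz
git branch -d baz           # fine: baz has nothing that main lacks

# A branch with unmerged work is refused by -d, but not by -D.
git switch -q -c wip
git commit -q --allow-empty -m "unmerged work"
git switch -q main
if git branch -d wip 2>/dev/null; then refused=no; else refused=yes; fi
git branch -D wip

branches=$(git branch --format='%(refname:short)')
echo "$branches"
```

At the end only bar and main are left.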

git switch

This command will move you to a different branch than the one you’re on. If you want to create a branch and move to it you can do:

$ git branch foo
$ git switch foo

or as a shortcut

$ git switch -c foo

You can also go to a specific reference in a “detached HEAD” state:

$ git switch -d <ref>

(see discussion of References above).
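The sketch below walks through both cases in a scratch repository. It uses git branch --show-current (Git 2.22 or newer), which prints the current branch name, or nothing at all when HEAD is detached.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Example"
git config user.email "example@example.com"
git commit -q --allow-empty -m "first"
git commit -q --allow-empty -m "second"

git switch -q -c foo                       # create foo and move to it
on_branch=$(git branch --show-current)

git switch -q -d HEAD~1                    # detached HEAD at the "first" commit
detached=$(git branch --show-current)      # empty: we are on no branch
subject=$(git log -1 --format=%s)

git switch -q main                         # back onto a branch
echo "was on: $on_branch, detached at: $subject"
```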

git log

Running git log is like running git show -s, then running it for the parent, then for its parent, and so on until the initial commit (like many git commands, it gets wrapped in less, so it won’t blow up your terminal).

git log is not terribly useful bare, but it has some fun flags. I like to run it like this:

$ git log --oneline --decorate --graph --all

Except that’s a lie. I have an alias for the above in my .gitconfig called graphall, so I just run

$ git graphall

Try it, you’ll love it.
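One way to set up such an alias is with git config. This sketch assumes the alias simply expands to the log command shown above, and it sets the alias for a single scratch repository. Use git config --global instead to write it to your ~/.gitconfig.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Example"
git config user.email "example@example.com"

# Repository-local alias; add --global to make it available everywhere.
git config alias.graphall "log --oneline --decorate --graph --all"

# Build a tiny history with two branches so the graph has something to show.
git commit -q --allow-empty -m "initial commit"
git switch -q -c feature
git commit -q --allow-empty -m "add feature"
git switch -q main
git commit -q --allow-empty -m "main work"

graph=$(git graphall)
echo "$graph"
```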

git fetch

This updates the references to remote branches, and downloads any commits, trees and blobs it needs to make that happen.

By default, it fetches from the remote that your current branch is tracking (falling back to origin if it isn’t tracking anything) and updates all of that remote’s branch references. I usually run it with --all, which fetches from every configured remote, because why not. Note that since this only updates remote references, it doesn’t change anything about the current state or about your local branches.

Also, it doesn’t remove references to remote branches which no longer exist on the remote. If you want to get rid of them, add -p.
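To make that concrete, here is a sketch with a local bare repository standing in for the remote and two clones (alice and bob, both made-up names). After Alice pushes, Bob's git fetch moves origin/main but leaves his own main alone.

```shell
set -eu
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
work=$(mktemp -d)
cd "$work"

# A bare repository plays the role of the server.
git init -q --bare -b main origin.git
git init -q -b main alice
cd alice
git commit -q --allow-empty -m "first commit"
git remote add origin "$work/origin.git"
git push -q -u origin main
cd ..
git clone -q origin.git bob

# Alice pushes a new commit.
cd alice
git commit -q --allow-empty -m "second commit"
git push -q

# Bob fetches: origin/main moves, his local main does not.
cd ../bob
before=$(git rev-parse main)
git fetch -q
after=$(git rev-parse main)
remote_subject=$(git log -1 --format=%s origin/main)
behind=$(git rev-list --count main..origin/main)
echo "origin/main is at '$remote_subject'; local main is $behind commit(s) behind"
```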

git pull

This command first runs git fetch for the branch you’re tracking. Then it tries to update your branch. If the remote is “ahead” (meaning, the commit your local branch is currently on is part of the history of the commit that the remote branch is currently on) it will simply move your branch to the remote reference and you’re up to date! If the remote is “behind” (meaning, the commit the remote branch is currently on is part of the history of the commit that your local branch is currently on) it won’t do anything.

If neither is true, the branches have “diverged”. This means that you made some commits locally and someone else made some commits on the remote branch. At this point, git pull has to either git merge or git rebase. Merging is the absolute worst, and it used to be git pull’s default behavior. Recent versions of Git instead stop with an error and ask you to choose, unless you’ve configured a preference. To use rebase, add --rebase or, even better, --rebase=interactive. (The docs call rebasing dangerous because it rewrites history, but you’re going to be rewriting history which hasn’t been pushed, so it’s okay in my book.) More on rebasing later.

This is the single biggest headache with git! (It comes from the fact that git gives you the freedom to do this thing which you really shouldn’t do). I’m going to write a section below that shows you how to always avoid this situation. If you do find yourself in this situation, though, just remember: as long as you’ve committed something in git, it will never be lost. Just make sure you commit before you pull (git might not even let you pull if you have uncommitted changes – I don’t know, I’ve never tried.)
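Here is a sketch of the diverged case and the --rebase way out, again with a local bare repository as the remote and two made-up clones. Alice and Bob touch different files, so the rebase goes through without conflicts.

```shell
set -eu
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
work=$(mktemp -d)
cd "$work"

git init -q --bare -b main origin.git
git init -q -b main alice
cd alice
git commit -q --allow-empty -m "initial commit"
git remote add origin "$work/origin.git"
git push -q -u origin main
cd ..
git clone -q origin.git bob

# Alice commits and pushes.
cd alice
echo "a" > a.txt
git add a.txt
git commit -q -m "alice: add a.txt"
git push -q

# Bob commits locally without pulling first: the branches have now diverged.
cd ../bob
echo "b" > b.txt
git add b.txt
git commit -q -m "bob: add b.txt"

# Replay Bob's commit on top of Alice's instead of creating a merge commit.
git pull -q --rebase

history=$(git log --format=%s)
merges=$(git rev-list --merges --count HEAD)
echo "$history"
```

The resulting history is a straight line with Bob's commit on top and no merge commit in it.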

git push

This command tries to update the remote branch you’re tracking. (Unlike git pull, it doesn’t run git fetch first.) If you’re “ahead” of the remote, it will move its branch reference to your branch reference and upload anything required to do so. Then, it updates your remote reference to match the remote’s new branch reference.

If the remote is “ahead” of you or if the branches have “diverged”, it will error out. git push doesn’t try to do any merging or rebasing like git pull does. You can run git push -f which will just force the remote to match you no matter what. If the branch is protected on the remote, you have to enable force-pushes on the remote side first.
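The sketch below shows a rejected push followed by a forced one. It uses a plain local bare repository as the remote, which has no branch protection, so nothing needs enabling here. The clone names are made up.

```shell
set -eu
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
work=$(mktemp -d)
cd "$work"

git init -q --bare -b main origin.git
git init -q -b main alice
cd alice
git commit -q --allow-empty -m "initial commit"
git remote add origin "$work/origin.git"
git push -q -u origin main
cd ..
git clone -q origin.git bob

# Alice pushes first...
cd alice
git commit -q --allow-empty -m "alice: more work"
git push -q

# ...so Bob's push is rejected: his main has diverged from the remote's.
cd ../bob
git commit -q --allow-empty -m "bob: more work"
if git push -q 2>/dev/null; then result=accepted; else result=rejected; fi
echo "plain push: $result"

# Forcing makes the remote match Bob; Alice's commit drops off the remote's main.
git push -q -f
remote_tip=$(git -C "$work/origin.git" log -1 --format=%s main)
echo "remote main is now at: $remote_tip"
```

Note what the force push did to Alice's work, which is why it is best kept for branches that only you touch.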

git checkout

If you’ve read other git documentation in the past, you’ll notice that I haven’t mentioned git checkout yet. I don’t plan to. git checkout is bad because it does a million unrelated things with very similar-looking syntax. There is nothing you need git checkout for that you can’t do with other commands (like git switch and git restore, which were introduced to replace git checkout).
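If you run into git checkout in older material, this sketch shows the replacements for its most common uses. The older spellings appear only in comments, and the mapping is a rough guide rather than an exact equivalence.

```shell
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q -b main
git config user.name "Example"
git config user.email "example@example.com"

echo "committed" > file.txt
git add file.txt
git commit -q -m "add file"

# Throw away an uncommitted edit to a file.
echo "scratch edit" > file.txt
git restore file.txt            # older spelling: git checkout -- file.txt
content=$(cat file.txt)

# Create a branch and move to it, then move back.
git switch -q -c feature        # older spelling: git checkout -b feature
git switch -q main              # older spelling: git checkout main
branch=$(git branch --show-current)
echo "file.txt contains '$content'; now on $branch"
```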

That’s it for now! I will add a section of more commands I think are helpful. Some honorable mentions:

git apply
git blame
git cat-file
git cherry-pick
git clean
git config
git fsck
git gc
git mv
git rebase
git reflog
git remote
git reset
git restore
git revert
git rm
git stash
git submodule
git tag
git worktree

How to avoid dealing with diverged branches

As mentioned above, diverged branches can be a pain. There are two workflows that can side-step the problem completely.

Workflow 1

This workflow is simple:

  1. Do your work on main
  2. When you feel good about your work, push to origin/main

This is the worse of the two workflows, but I want to recognize that sometimes it wins on simplicity. Only do this if you are working alone and never work in parallel.

Also, don’t do any history rewriting past origin/main. If you really need to edit something that’s already been pushed, enable force-pushing (e.g. in GitLab Settings > Repository > Protected branches > main > Allowed to force push), push with git push -f, then immediately disable the option again.

If you need to switch computers or work on a remote server, make sure main on your first computer is up to date with origin/main (e.g. git push) then make sure main on your second computer is up to date with origin/main (e.g. git pull), then only work on the second computer until it’s time to switch back.

This is convenient for small projects, but quickly becomes untenable for larger projects and is always a bad idea if other folks are involved (even if they’re just reading your work – if you end up force-pushing, they may not be able to find your work anymore).

Workflow 2

Never work on main. When you start working on a feature, start on main and git pull from origin/main. Then, make a new branch. I find it nice to preface the branch name with something that identifies you (right now, I’m on a branch called dsd36/git, because dsd36 is my Kerberos; the / is cosmetic only). As a reminder, you can make a new branch and switch to it using git switch -c dsd36/git.

Then, you can git push to origin at will. The first time you try to run git push on your new branch, you’ll see something like this:

$ git push
fatal: The current branch dsd36/git has no upstream branch.
To push the current branch and set the remote as upstream, use

    git push --set-upstream origin dsd36/git

To have this happen automatically for branches without a tracking
upstream, see 'push.autoSetupRemote' in 'git help config'.

You see this because your new branch is not tracking anything yet, and there is no branch named dsd36/git on origin. This is a precaution (that can be disabled). Simply run the command that git helpfully suggested the first time: git push --set-upstream origin dsd36/git.
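The same sequence as a runnable sketch, with a local bare repository playing the part of origin. It assumes the default push settings, in particular that push.autoSetupRemote is not enabled.

```shell
set -eu
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
work=$(mktemp -d)
cd "$work"

git init -q --bare -b main origin.git
git init -q -b main me
cd me
git commit -q --allow-empty -m "initial commit"
git remote add origin "$work/origin.git"
git push -q -u origin main

# New branch: it is not tracking anything yet, so a bare "git push" fails.
git switch -q -c dsd36/git
git commit -q --allow-empty -m "start feature"
if git push -q 2>/dev/null; then first_push=ok; else first_push=failed; fi

# Create the branch on origin and start tracking it.
git push -q --set-upstream origin dsd36/git
upstream=$(git rev-parse --abbrev-ref '@{upstream}')

# From now on a plain "git push" works.
git commit -q --allow-empty -m "more feature work"
git push -q
echo "first push: $first_push; now tracking $upstream"
```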

As you work, feel free to git push -f to your branch if you rewrite history locally. Note that GitLab and GitHub only protect main from force pushes by default (I know, I didn’t invent this workflow :) ). If rewriting history is not your cup of tea, then you don’t have to worry about it because nobody else should be touching your branch. If others are working in parallel, they should be working on their own branches.

When you feel good about your feature, it’s time to get it into main. Follow these steps:

  1. Make sure there are no uncommitted changes on your branch (git commit -am "make changes")
  2. Switch to main (git switch main)
  3. Pull the latest main (git pull). Note that because you’re not working on main, and nobody is force-pushing to main, you won’t have divergent branches.
  4. Switch back to your branch (git switch dsd36/git)
  5. Rebase onto main. I haven’t talked about rebasing yet, but here are the basic steps:
    1. Run git rebase -i main
    2. Save-and-quit the file that shows up using e.g. ZZ if you’re in vim
    3. Wait for it to run. For most commits, git will be able to auto-merge
    4. If you get any conflicts (you and someone else edited the same line), then for each conflicting file:
      1. Open the file
      2. Look for the conflicts (marked with <<<<<<< ======= >>>>>>>)
      3. Fix the conflicts
      4. Close the file
      5. git add <the file>
    5. If you had to fix conflicts, continue with git rebase --continue
    6. If instead you panic, git rebase --abort, take a break, try again
  6. Push your branch (git push). If your rebase wasn’t a no-op (i.e. main changed since you originally branched from it), and you had previously pushed, then you will have to do the dreaded git push -f here.
  7. Go to GitLab and create a Merge Request (or go to GitHub and create a Pull Request; same thing, worse name)
  8. Merge your change into main in the UI
    1. Bonus points for rebasing it onto main, which is something you can set up in the Repository Settings
    2. The merge here won’t be painful because your branch is going to be ahead of main thanks to the rebase you already did
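Steps 1 through 6 can be sketched end to end against a local bare repository standing in for GitLab or GitHub. The teammate clone and the file names are made up. Setting GIT_SEQUENCE_EDITOR=true is a scripting trick that accepts the interactive rebase's list unchanged, the same as doing save-and-quit by hand. The Merge Request steps happen in the web UI and are not shown.

```shell
set -eu
export GIT_AUTHOR_NAME="Example" GIT_AUTHOR_EMAIL="example@example.com"
export GIT_COMMITTER_NAME="Example" GIT_COMMITTER_EMAIL="example@example.com"
work=$(mktemp -d)
cd "$work"

git init -q --bare -b main origin.git
git init -q -b main me
cd me
git commit -q --allow-empty -m "initial commit"
git remote add origin "$work/origin.git"
git push -q -u origin main
cd ..
git clone -q origin.git teammate

# Start a feature branch and push it.
cd me
git switch -q -c dsd36/git
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "add feature"
git push -q --set-upstream origin dsd36/git

# Meanwhile, a teammate's change lands on main.
cd ../teammate
echo "other" > other.txt
git add other.txt
git commit -q -m "teammate: add other.txt"
git push -q

# Steps 1-6: everything is committed, so update main and rebase onto it.
cd ../me
git switch -q main
git pull -q                                       # fast-forward only
git switch -q dsd36/git
GIT_SEQUENCE_EDITOR=true git rebase -q -i main    # no conflicts: different files
git push -q -f                                    # history changed, so force

history=$(git log --format=%s)
echo "$history"
```

After this the feature branch sits directly on top of the latest main, so merging it in the UI is a simple fast-forward.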

Note that if there are real merge conflicts, you still end up having to solve them, no way around that. But, you do it on your branch, in the commits that create the issue as they come up, which is a lot more logical and much easier than solving everything in a hairy Merge commit.

If you follow this workflow, there is actually another setting in the Repository Settings that completely disables pushing to main. Better yet, there are a million CI/CD features you can take advantage of using this workflow. For instance, you can set up Merge Request / Pull Request approvals. This can be an informal thing with your team, if you just want eyes on your work, or you can use the Repository Settings to enforce an approval before a request is merged. Some repositories may require one or two approvals by anyone on the team. Other repositories may have a specific list of people who have to approve a request before merging (called “Code Owners”). The UI makes it easy to review changes. So easy in fact, I use it to review my own code! You can also set up pipelines that need to pass before a merge is possible (e.g. a set of tests to make sure that main doesn’t break after the merge). Or, you can set up actions that happen after the merge (e.g. publishing a website). The possibilities are endless.

All this may seem like a lot, but the whole process can take less than 30 seconds once you do it a few times, and I guarantee that it will save time in the long run.