git recitation (htm 2024)

2024-09-12

slides

This is documentation for the version control recitation for how to make almost anything 2024.

Members of the class can reach me for help at my MIT email (kerb = npry) or on the class Discord.

references

Git Book: great coverage of git in some depth

This recitation is based in part on Camron's version from last year

cheatsheet

ssh keys (authentication for gitlab)

If you don't have one already:

# can leave the passphrase empty at the prompt
$ ssh-keygen

# open gitlab -> click on your user icon -> preferences -> ssh keys -> add new key
# copy and paste the text after this command into the 'key' box
# then click 'add key'
#
# on modern macos, ssh-keygen may generate a different type of key, so you may have
# id_ed25519.pub or another variant instead; you can use that
$ cat ~/.ssh/id_rsa.pub
ssh-rsa AAA...(truncated) $hostname

# verify that your key works (don't bother on windows unless you know you have
# an ssh client)
$ ssh -T git@gitlab.cba.mit.edu -p 846
Welcome to GitLab, @$username!

install git cli

You almost certainly need to do this, even if you want to use a GUI client. Follow these instructions — let us know if you have trouble.

git configuration, clone

$ git config --global user.name "My Name"

# use an email associated with your gitlab account
$ git config --global user.email "my_email@my_doma.in"

# clone your personal repo
$ git clone ssh://git@gitlab.cba.mit.edu:846/$YOUR_REPO

normal commit flow

# make some edits...

# (from repo root) add all files you've changed
$ git add .

# check what will be committed
$ git status

# commit the changes to the repo
$ git commit -m "<message>"

# push new commit(s) to gitlab
$ git push

i want to change a commit

$ git add .

# alter the _last_ commit with additional changes staged by `git add .`
$ git commit --amend -m "<message>"

# it is very bad manners to do this on a shared branch -- it can break everyone
# else's repos
$ git push -f

For a newcomer, --amend is mildly perilous. Prefer to make another commit unless you really have to edit history (e.g. you committed a huge file). If you do have to edit history, probably just reach out to me or another TA.
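To see why amending counts as editing history, here is a minimal sketch you can run in a throwaway directory. The file names and the "Demo User" identity are placeholders I made up for the demo; nothing here touches your real repo.

```shell
# Throwaway repo in a temp directory, so nothing real is touched.
demo=$(mktemp -d) && cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo one > notes.txt
git add notes.txt
git commit -q -m "add notes"
before=$(git rev-parse HEAD)

# Forgot a file: stage it and fold it into the last commit.
echo two > more.txt
git add more.txt
git commit -q --amend -m "add notes and more"
after=$(git rev-parse HEAD)

# Still a single commit in the history, but it has a different hash.
git log --oneline
echo "before: $before"
echo "after:  $after"
```

The amended commit is a brand new commit that replaces the old one. If the old one was already pushed, the remote and your local repo now disagree, which is why a force push is needed afterwards.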

i want an old version of a file or directory

# find the commit hash
$ git log --graph --oneline

# warning: this will overwrite any local changes you have made to the file
$ git checkout $commit -- $file
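As a rough illustration of the above, this sketch builds a two-commit history in a throwaway directory and then brings back the first version of a file. The file name report.txt and the demo identity are made up for the example.

```shell
# Throwaway repo in a temp directory.
demo=$(mktemp -d) && cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo "version 1" > report.txt
git add report.txt
git commit -q -m "first version"
old=$(git rev-parse --short HEAD)         # the hash you would normally find with git log

echo "version 2" > report.txt
git add report.txt
git commit -q -m "second version"

# Bring back the old contents of just this file.
git checkout "$old" -- report.txt
cat report.txt

# The restored file shows up as a staged change, ready to commit if you want to keep it.
git status --short
```

Note that the history itself is untouched: both commits are still there, and the old contents are simply staged as a new change on top.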

help! i broke my repo!

Dumb, easy fix -- just re-clone it and make the changes again:

# reclone repo
$ git clone ssh://git@gitlab.cba.mit.edu:846/$YOUR_REPO repo_copy
# make your changes again, commit, push

help! i deleted a branch / lost commits!

Don't panic. It's hard to lose data in git, so if there's something you need to get back and can't, come find a TA and we can help you.

See also: git reflog.
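Here is a small sketch of what a reflog rescue can look like, run in a throwaway directory. The branch name "experiment" and the demo identity are placeholders, and this simple case assumes the lost commit was the last place HEAD was before you switched away. Real situations are often messier, so still come find a TA if you're unsure.

```shell
# Throwaway repo in a temp directory.
demo=$(mktemp -d) && cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo base > file.txt
git add file.txt
git commit -q -m "base"
start=$(git symbolic-ref --short HEAD)    # master or main, depending on your git config

git checkout -q -b experiment
echo work > file.txt
git commit -q -am "important work"

git checkout -q "$start"
git branch -q -D experiment               # oops: the branch name is gone

# The reflog still lists every commit HEAD has pointed at.
git reflog

# HEAD@{1} is where HEAD was one move ago: here, the tip of the deleted branch.
lost=$(git rev-parse 'HEAD@{1}')
git branch experiment "$lost"
git log --oneline experiment
```

Deleting a branch only deletes the name. The commits stay in the repo for a while, and the reflog is how you find their hashes again.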

command reference

git add [files...]

Stage $files for commit. Most commonly, git add . stages all files in the current directory (recursively).

Remember that git add stages a current snapshot of file state to be committed — if you make further changes, you need to git add again.
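The snapshot behaviour is easy to trip over, so here is a minimal sketch in a throwaway directory. The file name essay.txt and the demo identity are invented for the example.

```shell
# Throwaway repo in a temp directory.
demo=$(mktemp -d) && cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo "draft" > essay.txt
git add essay.txt                         # stages the contents "draft"
echo "final" > essay.txt                  # edit again WITHOUT re-adding

# First column: staged (A = added). Second column: changed again since staging (M).
git status --short

git commit -q -m "add essay"

# The commit holds the staged snapshot, not what is on disk now.
git show HEAD:essay.txt
cat essay.txt
```

The commit contains "draft" even though the file on disk says "final". A second git add and commit would be needed to record the later edit.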

git status

Show which changes are staged for the next commit, along with changes that aren't staged and files git isn't tracking. I recommend always running this before you commit to make sure you're doing what you think you are.

git commit -m [message]

Record a new commit with message $message. In general, you can't make a commit when nothing is staged. If git refuses with a message like "nothing to commit" or "no changes added to commit", check git status and make sure you've staged what you think you have.

git restore --staged [files...]

Unstage $files -- they won't be committed.

git checkout -- [files...]

Discard unstaged changes to $files, restoring them to their staged (or last committed) state. The discarded edits can't be recovered -- be careful!

If you need to discard changes to staged files, use git restore --staged ... first.
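Put together, throwing away a change you already staged takes two steps. This is a sketch in a throwaway directory, with a made-up file name and demo identity. It assumes a git new enough to have git restore (2.23 or later).

```shell
# Throwaway repo in a temp directory.
demo=$(mktemp -d) && cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo original > config.txt
git add config.txt
git commit -q -m "add config"

echo "bad edit" > config.txt
git add config.txt                        # staged by mistake

# Step 1: unstage. The bad edit is still in the working tree.
git restore --staged config.txt
git status --short

# Step 2: discard the edit itself. There is no undo for this.
git checkout -- config.txt
cat config.txt
```

After step 1 the file still contains the bad edit, just unstaged. Only step 2 actually changes the file on disk.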

git checkout <branch>

Switch to an existing branch called $branch.

git checkout -b <branch>

Create and switch to a new branch called $branch.

git log --graph --oneline

Show a graphical layout of the commit history, with one line per commit.

class specifics

filesize limits

Pace your uploads -- plan to be storing low single-digit megabytes of data in your repo each week at the beginning of the class, and low double-digit megabytes closer to the end. You will need to resize your media in order to achieve this.

ffmpeg (commandline) supports resizing videos: Neil's cheatsheet is a good reference.

ImageMagick supports resizing images:

# on ImageMagick 7, the command is `magick` rather than `convert`
$ convert $src_image -resize 720x480 $dst_image
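One way to catch oversized files before they end up in a commit is to list anything above a size threshold. This is only a sketch: list_large_files is a hypothetical helper name, and the 1 MiB cutoff is an arbitrary choice of mine, not a class rule. The demo runs in a throwaway directory with a fake 2 MiB file.

```shell
# Hypothetical helper: list files over 1 MiB under the current directory,
# skipping git's own data in .git.
list_large_files() {
  find . -path ./.git -prune -o -type f -size +1024k -print
}

# Demo in a throwaway directory.
demo=$(mktemp -d) && cd "$demo"
echo "small text file" > notes.txt
dd if=/dev/zero of=big_video.bin bs=1024 count=2048 2>/dev/null   # fake 2 MiB file

list_large_files
```

Running the helper from your repo root before git add . gives you a chance to resize or leave out whatever it prints.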

GitLab?

I'm sure you've heard of GitHub; GitLab fills a similar niche. Both are git repo hosts with web frontends, and they exist to be collaboration platforms. GitHub is a commercial product owned by Microsoft, while GitLab is an open source project that can be self-hosted. This is what CBA does.

ui clients

why/what is version control?

Does this look familiar: paper final final revised (Copy).docx? In the language of version control, this document and its versions represent a linear edit history (read bottom-to-top for chronological order):

* fix date                              (-> paper final final revised (Copy).docx)
|
* more revisions                        (-> paper final final revised.docx)
|
* grammar and spelling cleanup          (-> paper final final.docx)
|
* first round of edits, restructuring   (-> paper final.docx)
|
* first draft                           (-> paper.docx)

Version control systems let you explicitly manage this history — rather than copying files manually for backup, you tell a VCS to record a new version of a file, and it records the change in the history for you. You can tell it to go to a different version, and it updates the file tree to look like it did at that version. There are many version control systems, but the one we use in this class (and the most popular for modern software development) is called git.

git concepts

Git operates at the scope of a directory. It stores the old versions of files and file history information in a "repository" (repo), which is just a directory on the filesystem. (Files aren't stored as literal copies; git does a lot of clever work to deduplicate and compress file versions.) To create a repo, run "git init":

$ git init
Initialized empty Git repository in $PWD/.git/
# This is telling you that git will store the data for the repo (version,
# history data, etc.) in the `.git` directory.

A "version" in git is called a "commit". When you record a new version of your files in git, you are "committing" them to the repo:

$ git add my_file
$ git commit -m "add my file"
[master 12a8bef] add my file                     # 12a8bef is the commit id (hash)
 1 file changed, 0 insertions(+), 0 deletions(-) # new file, but it's empty
 create mode 100644 my_file

Committing is a two-step process: git add stages changes, then git commit records whatever is staged:

# make a change to my_file
$ echo foo > my_file

# create two new files
$ echo bar > another_file
$ echo qux > ignored_file

$ tree
.
├── my_file
├── another_file
└── ignored_file

# only add another_file to be committed -- a `commit` right now will not
# record ignored_file or changes to my_file!
$ git add another_file

# show files:
# - that will be committed ("staged")
# - that won't ("not staged")
# - that git hasn't seen before ("untracked")
$ git status
On branch master
Changes to be committed:
  (use "git restore --staged <file>..." to unstage)
        new file:   another_file

Changes not staged for commit:
  (use "git add <file>..." to update what will be committed)
  (use "git restore <file>..." to discard changes in working directory)
        modified:   my_file

Untracked files:
  (use "git add <file>..." to include in what will be committed)
        ignored_file
$ git commit -m "create another important file"
[master 5552b2b] create another important file
 1 file changed, 1 insertion(+)
 create mode 100644 another_file

You can view the history with git log:

$ git log
commit 5552b2bebd5e185db2891ba2eb0fc91502e52780
Author: Nathan Perry <np@npry.dev>
Date:   Mon Sep 9 14:10:23 2024 -0400

    create another important file

commit 12a8befb21236ae527aa0cffbe59384b5493238b
Author: Nathan Perry <np@npry.dev>
Date:   Mon Sep 9 13:52:23 2024 -0400

    add my file

I find this output is more readable with git log --oneline:

$ git log --oneline
5552b2b create another important file
12a8bef add my file

branching

I mentioned in the first example that we were looking at a "linear" commit history. This is in contrast to a "branching" commit history. Git supports multiple concurrent versions of the repo, in separate "branches".

A git branch is just a name that points to a commit. When you are "on" a branch, we call that having the branch "checked out". As you make commits, the branch updates to point to the most recent commit.

You can switch branches with git checkout $branch, and you can create a branch with git checkout -b $new_branch. When you create a new branch this way, it starts out pointing to the commit you currently have checked out.
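The "a branch is just a name pointing to a commit" idea can be checked directly with git rev-parse, which prints the commit hash a name points to. This sketch runs in a throwaway directory. The branch name "feature" and the demo identity are placeholders.

```shell
# Throwaway repo in a temp directory.
demo=$(mktemp -d) && cd "$demo"
git init -q
git config user.name "Demo User"          # placeholder identity, local to this repo
git config user.email "demo@example.com"

echo a > a.txt
git add a.txt
git commit -q -m "first"
start=$(git symbolic-ref --short HEAD)    # master or main, depending on your git config

# Create and switch to a new branch: both names point at the same commit.
git checkout -q -b feature
git rev-parse "$start" feature

# Commit on the new branch: only the checked-out branch moves forward.
echo b > b.txt
git add b.txt
git commit -q -m "second"
git rev-parse "$start" feature

git log --graph --oneline --all
```

The first rev-parse prints the same hash twice. The second prints two different hashes, because only feature advanced to the new commit.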

remotes

So far, the git we've described has been completely local — a way for you to store and browse the history of a project. Git also supports (indeed, is built for) collaboration with others. The most common way to accomplish this is by adding remote repositories:

# configure a remote called "origin" at the given address
$ git remote add origin ssh://git@gitlab.cba.mit.edu:846/$YOUR_REPO

# print configured remotes
$ git remote -v
origin  ssh://git@gitlab.cba.mit.edu:846/$YOUR_REPO (fetch)
origin  ssh://git@gitlab.cba.mit.edu:846/$YOUR_REPO (push)

# "push" (send) your commits to the master branch on origin
# associating (-u) the local master branch with the remote's
$ git push -u origin master

# "pull" (receive) any new commits from the remote's master branch
$ git pull origin master

clone downloads an existing remote repo and creates a local one based on it (with remote origin configured to point at the remote):

# create a new git repo, configure origin with the given url,
# and pull the default branch (main or master, depending)
$ git clone ssh://git@gitlab.cba.mit.edu:846/$YOUR_REPO

$ cd $YOUR_REPO

# print configured remotes
$ git remote -v
origin  ssh://git@gitlab.cba.mit.edu:846/$YOUR_REPO (fetch)
origin  ssh://git@gitlab.cba.mit.edu:846/$YOUR_REPO (push)

Since your GitLab repos are already created, you should clone them.
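If you want to practice the remote workflow without touching GitLab, a bare repo on your own disk can stand in for the server. This is a sketch under that assumption: server.git, laptop, and second_copy are made-up names, and a real remote would use the ssh URL instead of a local path.

```shell
# Everything lives in a throwaway temp directory.
work=$(mktemp -d) && cd "$work"

# A bare repo (history only, no working files) plays the role of the server.
git init -q --bare server.git

# A normal repo plays the role of your laptop.
git init -q laptop && cd laptop
git config user.name "Demo User"          # placeholder identity, local to this repo
git config user.email "demo@example.com"
echo hello > readme.txt
git add readme.txt
git commit -q -m "hello"
branch=$(git symbolic-ref --short HEAD)   # master or main, depending on your git config

# Point origin at the stand-in server and push to it.
git remote add origin "$work/server.git"
git push -q -u origin "$branch"

# Someone else (or you on another machine) can now clone it.
cd "$work"
git clone -q server.git second_copy
git -C second_copy log --oneline
git -C second_copy remote -v
```

The clone arrives with the pushed commit and with origin already configured, which is the same state you get from cloning your GitLab repo.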