5 Git and Github
The first thing to get familiar with is Git and GitHub. They will be your new best friends! Make sure you have gone through [#R, Rstudio, Git and Github] before continuing on.
5.1 Git
Git is a version control system that, among other things, lets you manage and keep track of your source code history. We use Git every day to keep track of the edits we make, allowing us to work on the same projects at the same time and making it easy to roll back changes when necessary. Happy with Git is a great reference for getting started with integrating Git and GitHub with R and Rmarkdown through RStudio.
Git commands are used in the terminal (or the Git window of RStudio) to perform a variety of functions. The official git documents are stored here, with a few key commands from the terminal below to get you started.
Here are some git commands to become familiar with:
- git init - initializes a local git repo
- git add <filename> - adds a file to the staging area
- git add . - adds all files to the staging area
- git status - checks the status of the working tree (including files in the staging area)
- git commit - puts everything from the staging area into the local repo
- git push - sends commits from the local repo to the remote repo
- git pull - brings the latest changes from the remote repo into the local repo
- git clone - clones a repo into a new directory
- git branch <branch name> - adds a branch (with no name given, lists existing branches)
- git help -a/-c/-g etc. - adding an option will pull up specific help info: -a will bring up all help, -c configuration info. More option info is available here
- More commands here: Top 20 git commands with examples
Note - use the Show/Hide Code button (top right) to see the chunk inputs used to generate the chunk outputs:
Click here to see the Rmarkdown chunk output of git help when we set the chunk engine.option to bash.
## usage: git [-v | --version] [-h | --help] [-C <path>] [-c <name>=<value>]
## [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path]
## [-p | --paginate | -P | --no-pager] [--no-replace-objects] [--bare]
## [--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>]
## [--super-prefix=<path>] [--config-env=<name>=<envvar>]
## <command> [<args>]
##
## These are common Git commands used in various situations:
##
## start a working area (see also: git help tutorial)
## clone Clone a repository into a new directory
## init Create an empty Git repository or reinitialize an existing one
##
## work on the current change (see also: git help everyday)
## add Add file contents to the index
## mv Move or rename a file, a directory, or a symlink
## restore Restore working tree files
## rm Remove files from the working tree and from the index
##
## examine the history and state (see also: git help revisions)
## bisect Use binary search to find the commit that introduced a bug
## diff Show changes between commits, commit and working tree, etc
## grep Print lines matching a pattern
## log Show commit logs
## show Show various types of objects
## status Show the working tree status
##
## grow, mark and tweak your common history
## branch List, create, or delete branches
## commit Record changes to the repository
## merge Join two or more development histories together
## rebase Reapply commits on top of another base tip
## reset Reset current HEAD to the specified state
## switch Switch branches
## tag Create, list, delete or verify a tag object signed with GPG
##
## collaborate (see also: git help workflows)
## fetch Download objects and refs from another repository
## pull Fetch from and integrate with another repository or a local branch
## push Update remote refs along with associated objects
##
## 'git help -a' and 'git help -g' list available subcommands and some
## concept guides. See 'git help <command>' or 'git help <concept>'
## to read about a specific subcommand or concept.
## See 'git help git' for an overview of the system.
Click here to see the Rmarkdown chunk output from adding one of the -a, -c or -g options to pull up specific help info (-a will bring up all help, -c configuration info) (ex. on command line - git help -a).
# more detail with the -a option
# with chunk engine option set to bash - command would be git help -a
system2(command = "git", args = c("help", "-a"), stdout = TRUE)
## [1] "See 'git help <command>' to read about a specific subcommand"
## [2] ""
## [3] "Main Porcelain Commands"
## [4] " add Add file contents to the index"
## [5] " am Apply a series of patches from a mailbox"
## [6] " archive Create an archive of files from a named tree"
## [7] " bisect Use binary search to find the commit that introduced a bug"
## [8] " branch List, create, or delete branches"
## [9] " bundle Move objects and refs by archive"
## [10] " checkout Switch branches or restore working tree files"
## [11] " cherry-pick Apply the changes introduced by some existing commits"
## [12] " citool Graphical alternative to git-commit"
## [13] " clean Remove untracked files from the working tree"
## [14] " clone Clone a repository into a new directory"
## [15] " commit Record changes to the repository"
## [16] " describe Give an object a human readable name based on an available ref"
## [17] " diff Show changes between commits, commit and working tree, etc"
## [18] " fetch Download objects and refs from another repository"
## [19] " format-patch Prepare patches for e-mail submission"
## [20] " gc Cleanup unnecessary files and optimize the local repository"
## [21] " gitk The Git repository browser"
## [22] " grep Print lines matching a pattern"
## [23] " gui A portable graphical interface to Git"
## [24] " init Create an empty Git repository or reinitialize an existing one"
## [25] " log Show commit logs"
## [26] " maintenance Run tasks to optimize Git repository data"
## [27] " merge Join two or more development histories together"
## [28] " mv Move or rename a file, a directory, or a symlink"
## [29] " notes Add or inspect object notes"
## [30] " pull Fetch from and integrate with another repository or a local branch"
## [31] " push Update remote refs along with associated objects"
## [32] " range-diff Compare two commit ranges (e.g. two versions of a branch)"
## [33] " rebase Reapply commits on top of another base tip"
## [34] " reset Reset current HEAD to the specified state"
## [35] " restore Restore working tree files"
## [36] " revert Revert some existing commits"
## [37] " rm Remove files from the working tree and from the index"
## [38] " scalar A tool for managing large Git repositories"
## [39] " shortlog Summarize 'git log' output"
## [40] " show Show various types of objects"
## [41] " sparse-checkout Reduce your working tree to a subset of tracked files"
## [42] " stash Stash the changes in a dirty working directory away"
## [43] " status Show the working tree status"
## [44] " submodule Initialize, update or inspect submodules"
## [45] " switch Switch branches"
## [46] " tag Create, list, delete or verify a tag object signed with GPG"
## [47] " worktree Manage multiple working trees"
## [48] ""
## [49] "Ancillary Commands / Manipulators"
## [50] " config Get and set repository or global options"
## [51] " fast-export Git data exporter"
## [52] " fast-import Backend for fast Git data importers"
## [53] " filter-branch Rewrite branches"
## [54] " mergetool Run merge conflict resolution tools to resolve merge conflicts"
## [55] " pack-refs Pack heads and tags for efficient repository access"
## [56] " prune Prune all unreachable objects from the object database"
## [57] " reflog Manage reflog information"
## [58] " remote Manage set of tracked repositories"
## [59] " repack Pack unpacked objects in a repository"
## [60] " replace Create, list, delete refs to replace objects"
## [61] ""
## [62] "Ancillary Commands / Interrogators"
## [63] " annotate Annotate file lines with commit information"
## [64] " blame Show what revision and author last modified each line of a file"
## [65] " bugreport Collect information for user to file a bug report"
## [66] " count-objects Count unpacked number of objects and their disk consumption"
## [67] " diagnose Generate a zip archive of diagnostic information"
## [68] " difftool Show changes using common diff tools"
## [69] " fsck Verifies the connectivity and validity of the objects in the database"
## [70] " gitweb Git web interface (web frontend to Git repositories)"
## [71] " help Display help information about Git"
## [72] " instaweb Instantly browse your working repository in gitweb"
## [73] " merge-tree Perform merge without touching index or working tree"
## [74] " rerere Reuse recorded resolution of conflicted merges"
## [75] " show-branch Show branches and their commits"
## [76] " verify-commit Check the GPG signature of commits"
## [77] " verify-tag Check the GPG signature of tags"
## [78] " version Display version information about Git"
## [79] " whatchanged Show logs with difference each commit introduces"
## [80] ""
## [81] "Interacting with Others"
## [82] " archimport Import a GNU Arch repository into Git"
## [83] " cvsexportcommit Export a single commit to a CVS checkout"
## [84] " cvsimport Salvage your data out of another SCM people love to hate"
## [85] " cvsserver A CVS server emulator for Git"
## [86] " imap-send Send a collection of patches from stdin to an IMAP folder"
## [87] " p4 Import from and submit to Perforce repositories"
## [88] " quiltimport Applies a quilt patchset onto the current branch"
## [89] " request-pull Generates a summary of pending changes"
## [90] " send-email Send a collection of patches as emails"
## [91] " svn Bidirectional operation between a Subversion repository and Git"
## [92] ""
## [93] "Low-level Commands / Manipulators"
## [94] " apply Apply a patch to files and/or to the index"
## [95] " checkout-index Copy files from the index to the working tree"
## [96] " commit-graph Write and verify Git commit-graph files"
## [97] " commit-tree Create a new commit object"
## [98] " hash-object Compute object ID and optionally creates a blob from a file"
## [99] " index-pack Build pack index file for an existing packed archive"
## [100] " merge-file Run a three-way file merge"
## [101] " merge-index Run a merge for files needing merging"
## [102] " mktag Creates a tag object with extra validation"
## [103] " mktree Build a tree-object from ls-tree formatted text"
## [104] " multi-pack-index Write and verify multi-pack-indexes"
## [105] " pack-objects Create a packed archive of objects"
## [106] " prune-packed Remove extra objects that are already in pack files"
## [107] " read-tree Reads tree information into the index"
## [108] " symbolic-ref Read, modify and delete symbolic refs"
## [109] " unpack-objects Unpack objects from a packed archive"
## [110] " update-index Register file contents in the working tree to the index"
## [111] " update-ref Update the object name stored in a ref safely"
## [112] " write-tree Create a tree object from the current index"
## [113] ""
## [114] "Low-level Commands / Interrogators"
## [115] " cat-file Provide content or type and size information for repository objects"
## [116] " cherry Find commits yet to be applied to upstream"
## [117] " diff-files Compares files in the working tree and the index"
## [118] " diff-index Compare a tree to the working tree or index"
## [119] " diff-tree Compares the content and mode of blobs found via two tree objects"
## [120] " for-each-ref Output information on each ref"
## [121] " for-each-repo Run a Git command on a list of repositories"
## [122] " get-tar-commit-id Extract commit ID from an archive created using git-archive"
## [123] " ls-files Show information about files in the index and the working tree"
## [124] " ls-remote List references in a remote repository"
## [125] " ls-tree List the contents of a tree object"
## [126] " merge-base Find as good common ancestors as possible for a merge"
## [127] " name-rev Find symbolic names for given revs"
## [128] " pack-redundant Find redundant pack files"
## [129] " rev-list Lists commit objects in reverse chronological order"
## [130] " rev-parse Pick out and massage parameters"
## [131] " show-index Show packed archive index"
## [132] " show-ref List references in a local repository"
## [133] " unpack-file Creates a temporary file with a blob's contents"
## [134] " var Show a Git logical variable"
## [135] " verify-pack Validate packed Git archive files"
## [136] ""
## [137] "Low-level Commands / Syncing Repositories"
## [138] " daemon A really simple server for Git repositories"
## [139] " fetch-pack Receive missing objects from another repository"
## [140] " http-backend Server side implementation of Git over HTTP"
## [141] " send-pack Push objects over Git protocol to another repository"
## [142] " update-server-info Update auxiliary info file to help dumb servers"
## [143] ""
## [144] "Low-level Commands / Internal Helpers"
## [145] " check-attr Display gitattributes information"
## [146] " check-ignore Debug gitignore / exclude files"
## [147] " check-mailmap Show canonical names and email addresses of contacts"
## [148] " check-ref-format Ensures that a reference name is well formed"
## [149] " column Display data in columns"
## [150] " credential Retrieve and store user credentials"
## [151] " credential-cache Helper to temporarily store passwords in memory"
## [152] " credential-store Helper to store credentials on disk"
## [153] " fmt-merge-msg Produce a merge commit message"
## [154] " hook Run git hooks"
## [155] " interpret-trailers Add or parse structured information in commit messages"
## [156] " mailinfo Extracts patch and authorship from a single e-mail message"
## [157] " mailsplit Simple UNIX mbox splitter program"
## [158] " merge-one-file The standard helper program to use with git-merge-index"
## [159] " patch-id Compute unique ID for a patch"
## [160] " sh-i18n Git's i18n setup code for shell scripts"
## [161] " sh-setup Common Git shell script setup code"
## [162] " stripspace Remove unnecessary whitespace"
## [163] ""
## [164] "User-facing repository, command and file interfaces"
## [165] " attributes Defining attributes per path"
## [166] " cli Git command-line interface and conventions"
## [167] " hooks Hooks used by Git"
## [168] " ignore Specifies intentionally untracked files to ignore"
## [169] " mailmap Map author/committer names and/or E-Mail addresses"
## [170] " modules Defining submodule properties"
## [171] " repository-layout Git Repository Layout"
## [172] " revisions Specifying revisions and ranges for Git"
## [173] ""
## [174] "Developer-facing file formats, protocols and other interfaces"
## [175] " format-bundle The bundle file format"
## [176] " format-chunk Chunk-based file formats"
## [177] " format-commit-graph Git commit-graph format"
## [178] " format-index Git index format"
## [179] " format-pack Git pack format"
## [180] " format-signature Git cryptographic signature formats"
## [181] " protocol-capabilities Protocol v0 and v1 capabilities"
## [182] " protocol-common Things common to various protocols"
## [183] " protocol-http Git HTTP-based protocols"
## [184] " protocol-pack How packs are transferred over-the-wire"
## [185] " protocol-v2 Git Wire Protocol, Version 2"
## [186] ""
## [187] "External commands"
## [188] " lfs"
## [189] ""
## [190] "Command aliases"
## [191] " aliases config --get-regexp alias"
## [192] " amend commit --amend --reuse-message=HEAD"
## [193] " branches branch -a"
## [194] " c clone --recursive"
## [195] " ca !git add -A && git commit -av"
## [196] " contributors shortlog --summary --numbered"
## [197] " credit !f() { git commit --amend --author \"$1 <$2>\" -C HEAD; }; f"
## [198] " d !git diff-index --quiet HEAD -- || clear; git --no-pager diff --patch-with-stat"
## [199] " di !d() { git diff --patch-with-stat HEAD~$1; }; git diff-index --quiet HEAD -- || clear; d"
## [200] " dm !git branch --merged | grep -v '\\*' | xargs -n 1 git branch -d"
## [201] " fb !f() { git branch -a --contains $1; }; f"
## [202] " fc !f() { git log --pretty=format:'%C(yellow)%h %Cblue%ad %Creset%s%Cgreen [%cn] %Cred%d' --decorate --date=short -S$1; }; f"
## [203] " fm !f() { git log --pretty=format:'%C(yellow)%h %Cblue%ad %Creset%s%Cgreen [%cn] %Cred%d' --decorate --date=short --grep=$1; }; f"
## [204] " ft !f() { git describe --always --contains $1; }; f"
## [205] " go !f() { git checkout -b \"$1\" 2> /dev/null || git checkout \"$1\"; }; f"
## [206] " l log --pretty=oneline -n 20 --graph --abbrev-commit"
## [207] " p git pull --recurse-submodules"
## [208] " reb !r() { git rebase -i HEAD~$1; }; r"
## [209] " remotes remote -v"
## [210] " retag !r() { git tag -d $1 && git push origin :refs/tags/$1 && git tag $1; }; r"
## [211] " s status -s"
## [212] " tags tag -l"
Click here to see the Rmarkdown chunk output when we drill down into the specifics of a command by adding the -h option after the initial command (ex. git checkout -h).
# get help on a command with the -h option
# on the command line this would be git checkout -h (short usage summary); git checkout --help (two dashes) opens the full manual page
# NOTE: in a bash chunk of Rmarkdown use git checkout --help (two dashes); the output formatting can be a bit funky
system2(command = "git", args = c("checkout", "-h"), stdout = TRUE)
## [1] "usage: git checkout [<options>] <branch>"
## [2] " or: git checkout [<options>] [<branch>] -- <file>..."
## [3] ""
## [4] " -b <branch> create and checkout a new branch"
## [5] " -B <branch> create/reset and checkout a branch"
## [6] " -l create reflog for new branch"
## [7] " --guess second guess 'git checkout <no-such-branch>' (default)"
## [8] " --overlay use overlay mode (default)"
## [9] " -q, --quiet suppress progress reporting"
## [10] " --recurse-submodules[=<checkout>]"
## [11] " control recursive updating of submodules"
## [12] " --progress force progress reporting"
## [13] " -m, --merge perform a 3-way merge with the new branch"
## [14] " --conflict <style> conflict style (merge, diff3, or zdiff3)"
## [15] " -d, --detach detach HEAD at named commit"
## [16] " -t, --track[=(direct|inherit)]"
## [17] " set branch tracking configuration"
## [18] " -f, --force force checkout (throw away local modifications)"
## [19] " --orphan <new-branch>"
## [20] " new unparented branch"
## [21] " --overwrite-ignore update ignored files (default)"
## [22] " --ignore-other-worktrees"
## [23] " do not check if another worktree is holding the given ref"
## [24] " -2, --ours checkout our version for unmerged files"
## [25] " -3, --theirs checkout their version for unmerged files"
## [26] " -p, --patch select hunks interactively"
## [27] " --ignore-skip-worktree-bits"
## [28] " do not limit pathspecs to sparse entries only"
## [29] " --pathspec-from-file <file>"
## [30] " read pathspec from file"
## [31] " --pathspec-file-nul with --pathspec-from-file, pathspec elements are separated with NUL character"
## [32] ""
## attr(,"status")
## [1] 129
Some files in our repos, such as bcfishpass.sqlite and .xls files, get logged as a change in Git every time we open the file (even if no changes have been made).
- Run this command if no changes have been made to the sqlite, so Git stops reporting the file as modified:
git update-index --assume-unchanged data/bcfishpass.sqlite
- Run this command when changes have been made to the sqlite, so Git tracks the file again and the changes can be committed:
git update-index --no-assume-unchanged data/bcfishpass.sqlite
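To see what the two commands do, here is a small sketch in a throwaway repo. The file data/example.sqlite is a hypothetical stand-in for data/bcfishpass.sqlite (a plain text file is used so the example stays self-contained), and the identity is a placeholder.

```shell
set -e
# Placeholder identity so the commit works without any git config (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo="$(mktemp -d)"
cd "$demo"
git init -q
mkdir data
echo "v1" > data/example.sqlite          # hypothetical stand-in file
git add data/example.sqlite
git commit -q -m "Add example database"

# Tell Git to stop checking this file for changes.
git update-index --assume-unchanged data/example.sqlite
echo "opened and touched" >> data/example.sqlite
hidden="$(git status --porcelain)"       # empty: the change is not reported

# Tell Git to start checking the file again.
git update-index --no-assume-unchanged data/example.sqlite
visible="$(git status --porcelain)"      # the file shows up as modified

echo "while assumed unchanged: '$hidden'"
echo "after switching back:    '$visible'"
```

The flag only affects your local index. It is not shared with teammates, so each person needs to run it on their own machine.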
5.2 Github
GitHub is a cloud-based hosting service that lets you manage Git repositories.
GitHub is where you will find all our files. Check it out here.
Our content is organized into repositories (repo for short), which you can clone to your laptop and then edit the files (see section 2.2 for how to do this).
Here is some Github lingo to get you started:
- Repository (repo) - folder used by git to track all changes of a given project. You must be inside a git repo to use most git commands
- Commits - a saved version of the code
- Branches - used to work on code without changing the main branch
- Checkout - move between different commits and branches
- Pull requests - let you tell others about changes you’ve pushed to a branch in a repository. Usually a proposal to merge a set of changes from one branch into another
- Issues - a place to document issues with the code, tasks that need to be done, or just a place to document what is happening. See below for more info.
Learn how to set up a Personal Access Token in R
Find lots more info about Github here
5.3 Issues
The issues section on GitHub is a place to document processes, point out bugs in code, suggest enhancements to repos, and keep track of progress.
It is easy for conversations to get lost or forgotten. Filing an issue is a great way to ensure problems are prioritized and kept in one place.
Be as descriptive as possible when writing an issue by using keywords, markdown syntax, links, screenshots, etc. when appropriate.
Add labels to issues when appropriate. This will help to categorize and prioritize issues. Check out gh for info on how we could make labels programmatically with the project setup script.
See Writing Good GitHub Issues for an intro on how to write an issue, with examples.
You can reference issues in commits and GitHub comments by using # followed by the issue number (ex: #74).
To close an issue in a commit, put a keyword in front of the # and issue number in the commit message (ex: close #74). Any of the following keywords are accepted: close, closes, closed, fix, fixes, fixed, resolve, resolves, resolved. The issue will be closed once the branch is pushed/merged on GitHub, insanely handy.
You can close an issue from another repo by adding the repo name ahead of the #issue number (ex: fixes onboarding#4).
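As a quick illustration of a closing keyword in a commit message, here is a sketch in a throwaway repo. The file notes.md, the message wording and the identity are made up for the example; #74 is the issue number used above. GitHub only acts on the keyword once the commit reaches the repo on GitHub, so nothing is closed by running this locally.

```shell
set -e
# Placeholder identity so the commit works without any git config (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo="$(mktemp -d)"
cd "$demo"
git init -q
echo "fixed text" > notes.md             # hypothetical file
git add notes.md

# Describe the change, then add the keyword and issue number.
git commit -q -m "Fix typo in notes, closes #74"

git log -1 --format=%s                   # show the subject line of the last commit
```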
5.4 Watching repos
To stay up to date with other team members' work, make sure to watch the repos. This means you will get email notifications about issues, PRs, etc. for that repo. Below is a screenshot of how to watch a repo.
5.5 Commits
A commit functions like a snapshot of all the files in the repo at a specific moment. Commits can be thought of as “safe” versions of a project. Git will never change them unless you explicitly ask it to.
When working on a project, we make commits often so that we can always revert to an older version of the code if we need to. This is the magic of Git!
To learn more about commits and how to make them, check out https://www.atlassian.com/git/tutorials/saving-changes/git-commit
It’s important to use descriptive messages when making commits so that others have an idea of what you did in the commit.
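The snapshot idea can be sketched in a throwaway repo: make two commits with descriptive messages, then bring a file back to how it looked in the first one. The file name report.Rmd, the messages and the identity are hypothetical, and restoring a file with git checkout is only one of several ways to go back to an older version.

```shell
set -e
# Placeholder identity so commits work without any git config (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

demo="$(mktemp -d)"
cd "$demo"
git init -q

echo "first draft" > report.Rmd          # hypothetical file
git add report.Rmd
git commit -q -m "Add first draft of report"
first="$(git rev-parse HEAD)"            # remember the first snapshot

echo "experimental rewrite" > report.Rmd
git commit -q -am "Rewrite report intro"

git log --oneline                        # two snapshots so far

# Bring the file back as it was in the first commit, then record that as a new commit.
git checkout -q "$first" -- report.Rmd
git commit -q -m "Restore first draft of report"
cat report.Rmd
```

The history now holds three commits. The rewrite is still there if it is ever needed again.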
5.7 Your file system
Set up the file system on your laptop. It is important to keep a similar file structure between machines so we can share files more easily and keep relative paths the same. Below is a screenshot of what your file structure should look like.
Create your file system by:
- First make the appropriate folders (i.e. current, gis, repo) on your laptop
- Navigate to the repo you would like on https://github.com/NewGraphEnvironment and create a fork. A fork allows you to work on the repo individually. When ready, you can create a pull request to merge your changes back to the main branch, which often has to be approved.
- Click the green Code button and copy the HTTPS link.
- In your terminal, navigate to your repo directory and clone the selected repo using
git clone <HTTPS link>
- Repeat this process for all desired repos
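The steps above can be sketched as follows. This is an illustration only: a temporary directory stands in for the place on your laptop where the folders live, and a small local repository stands in for the fork on GitHub, so the example runs without a network connection. With a real fork, fork_url would be the HTTPS link you copied.

```shell
set -e
# Placeholder identity for the stand-in repository (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

base="$(mktemp -d)"                      # stand-in for where your folders live
mkdir -p "$base/current" "$base/gis" "$base/repo"

# Build a tiny local repository to play the role of the GitHub fork.
git init -q "$base/seed"
echo "# onboarding" > "$base/seed/README.md"
git -C "$base/seed" add README.md
git -C "$base/seed" commit -q -m "Initial commit"
git clone -q --bare "$base/seed" "$base/onboarding.git"
fork_url="$base/onboarding.git"          # on GitHub: the copied HTTPS link

# Clone into the repo folder; the new directory is named after the repository.
cd "$base/repo"
git clone -q "$fork_url"
ls "$base/repo"
```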
5.8 Github Workflow
5.8.1 General Workflow
The way we edit files in repos is very important so that the changes can be merged appropriately. The general workflow is illustrated below with instructions further down.
- Set up the repo on your computer following the instructions here
- Make sure your fork is up to date with the New Graph Environment main branch by using git pull or syncing.
- After making sure your fork is up to date with New Graph Environment main, create a new local branch by using git checkout -b <new branch name>.
- Make your edits. After each major edit, create a commit by either using your terminal or by using the Git tab in RStudio.
- Once you are done editing the file(s), use git push to push your branch (local) to GitHub (remote). The first push of a new branch usually needs git push -u origin <new branch name>.
- On GitHub, navigate to the branch you just pushed.
- It will now say that this branch is “x-number of commits ahead of New Graph Environment main”. To create a Pull Request select contribute and then open pull request.
- Make sure that your PR is correctly set up, comparing your working branch with the New Graph Environment main. Below you will see the commits that are being merged, and if you scroll down further you will see the code that is being changed, cool! If you have already made a PR but want to make more edits, just push to the same branch and it will automatically update that PR with your latest commits (refrain from making multiple PRs). Once you’ve created the PR it will need to be approved by New Graph Environment. It will also give you the option to delete the branch you were working in. This is a good practice to get used to because it will help you remember to make a new branch before making future edits.
- Once the PR has been approved, DON’T FORGET to sync your fork with the New Graph Environment main before creating a new branch and making more edits!
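The terminal side of this workflow can be sketched with local stand-ins: upstream.git plays the New Graph Environment repo and fork.git plays your fork on GitHub. The remote name upstream, the branch name update-readme and the placeholder identity are assumptions for illustration. Opening and approving the pull request happens on GitHub and is not shown.

```shell
set -e
# Placeholder identity so commits work without any git config (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sandbox="$(mktemp -d)"
cd "$sandbox"

# Stand-ins for the GitHub repositories.
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "hello" > seed/README.md
git -C seed add README.md
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed upstream.git          # plays the organization repo
git clone -q --bare upstream.git fork.git      # plays your fork

# Clone the fork; origin now points at it.
git clone -q fork.git work
cd "$sandbox/work"
git remote add upstream "$sandbox/upstream.git"

git checkout -q main
git pull -q --ff-only upstream main            # make sure main is up to date
git checkout -q -b update-readme               # new local branch for the edits

echo "more detail" >> README.md
git commit -q -am "Add detail to README"       # commit after each major edit

git push -q -u origin update-readme            # first push of a new branch
git log --oneline main..update-readme          # commits a PR would contain
```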
5.8.2 Getting ahead of main
If you get ahead of main in a branch (your-new-branch) before, say, a PR is done, you can update your branch by:
- Updating your origin fork with the main branch
- Switching to main and running git pull
- Then switching back to your-new-branch and running git merge main.
This will work as long as your changes are not conflicting with what was done in the PR and subsequent changes to main before you pulled it.
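Here is a sketch of that catch-up sequence using local stand-ins. origin.git plays your (already synced) fork, and the commit pushed from the seed repo simulates a PR being merged into main while you were working. The file names, commit messages and identity are hypothetical.

```shell
set -e
# Placeholder identity so commits work without any git config (assumption).
export GIT_AUTHOR_NAME="Example User" GIT_AUTHOR_EMAIL="user@example.com"
export GIT_COMMITTER_NAME="$GIT_AUTHOR_NAME" GIT_COMMITTER_EMAIL="$GIT_AUTHOR_EMAIL"

sandbox="$(mktemp -d)"
cd "$sandbox"
git init -q seed
git -C seed symbolic-ref HEAD refs/heads/main
echo "hello" > seed/README.md
git -C seed add README.md
git -C seed commit -q -m "Initial commit"
git clone -q --bare seed origin.git            # plays your synced fork
git clone -q origin.git work

# You start a branch and commit on it.
cd "$sandbox/work"
git checkout -q -b your-new-branch
echo "my edit" > analysis.R                    # hypothetical file
git add analysis.R
git commit -q -m "Add analysis script"

# Meanwhile main moves on (simulating a merged PR).
cd "$sandbox/seed"
echo "merged PR" >> README.md
git commit -q -am "Update README via PR"
git push -q "$sandbox/origin.git" main

# Catch up: switch to main, pull, switch back, merge main into your branch.
cd "$sandbox/work"
git checkout -q main
git pull -q --ff-only
git checkout -q your-new-branch
git merge -q --no-edit main
git log --oneline
```

The two sets of changes touch different files here, so the merge goes through without conflicts.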