6 R and Rstudio

6.1 Rstudio settings

It’s important for the whole team to use the same settings in Rstudio, which makes it easier to help each other and troubleshoot problems. Some settings can be personalized (e.g., editor_theme), but others should be consistent across the team.

  1. The settings under Tools/Global Options/General/Basic should match the image below. It’s very important to set “Save workspace to .RData on exit” to Never and to deselect “Restore .RData into workspace at startup”.

  2. We also keep settings consistent by all using the same rstudio-prefs.json. You can find this file at Home/.config/rstudio/rstudio-prefs.json; use Cmd+Shift+. to show your dotfiles. Replace your current rstudio-prefs.json with the one in our dotfiles repo to get the same settings as the rest of the team. Make sure to change "document_author" to your name, and feel free to change "editor_theme" to your preference.
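
A quick way to confirm the two workspace settings from step 1 is to read them back from rstudio-prefs.json. This is a minimal sketch: `check_prefs()` is a hypothetical helper, the key names `save_workspace` and `load_workspace` are assumed to be how Rstudio stores those two options, and jsonlite is assumed to be installed.

```r
# Path given in step 2 above.
prefs_path <- path.expand("~/.config/rstudio/rstudio-prefs.json")

# Hypothetical helper: takes the parsed preferences as a list and reports
# whether the two workspace settings match the team convention.
# Assumes the keys are named save_workspace and load_workspace.
check_prefs <- function(prefs) {
  c(
    save_workspace_never = identical(prefs$save_workspace, "never"),
    load_workspace_off   = identical(prefs$load_workspace, FALSE)
  )
}

# Only read the file when it exists and jsonlite is available.
if (file.exists(prefs_path) && requireNamespace("jsonlite", quietly = TRUE)) {
  print(check_prefs(jsonlite::fromJSON(prefs_path)))
}
```

Both values should print as TRUE once the team file is in place.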

6.2 Packages

  • Use pak to install packages when possible.

  • To get info about any function or package:

    • To see all functions in the package, run ls("package:package_name").
    • To pull up the documentation for a specific function, run ?function_name.
    • To see the raw function code in the console, run function_name without parentheses.
  • Here are some packages to become familiar with:

  • Introduction to Writing Packages in R
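
The commands in this section can be tried on a package that ships with R. The sketch below uses stats and `median` purely as stand-ins; the pak calls are left commented out so that running the block does not change your library.

```r
# Install from CRAN, or from GitHub using the "owner/repo" form.
# pak::pkg_install("dplyr")
# pak::pkg_install("gadenbuie/shrtcts")

# List every object exported by an attached package.
# stats is attached by default, so no library() call is needed.
stats_functions <- ls("package:stats")
head(stats_functions)

# Open the documentation for one function (same as typing ?median).
if (interactive()) {
  help("median")
}

# Print the source of a function by typing its name without parentheses.
median
```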

6.3 Helpful Resources

  • R for Data Science - A helpful resource for learning how to get your data into R, organize it, transform it, visualize it, and model it.

  • wtf is a good resource for learning more about R/Rstudio.

6.4 Rstudio Keyboard Shortcuts

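  • A list of some of the most common Rstudio keyboard shortcuts. Once you get used to them, writing scripts becomes much quicker. On a Mac, several of these use Cmd in place of Ctrl and Option in place of Alt; the cheat sheet shortcut below lists the exact bindings for your machine.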

    • Commenting and uncommenting code: Ctrl + Shift + C
    • Pull up shortcuts cheat sheet in R: Alt + Shift + K
    • Insert pipe operator: Ctrl + Shift + M
    • Insert assignment operator: Alt + -
    • Make source pane full screen: Ctrl + Shift + 1
    • Make console pane full screen: Ctrl + Shift + 2. Ctrl + Shift + 3 does the same for the help pane and Ctrl + Shift + 4 for the environment pane.
    • Cursor-select multiple lines: Ctrl + Alt + Up/Down/Click
    • Find in files: Ctrl + Shift + F
    • Restart R: Ctrl + Shift + F10
    • List recent commands: Ctrl + ↑
    • Execute command on currently selected line: Ctrl + Enter
    • Execute complete script: Ctrl + Shift + S
  • We have also created some of our own shortcuts that are specific to our workspace. Here are a few examples:

    • Completely restart Rstudio (not just R) so you can do things like reload shrtcts: Shift+Cmd+9
    • Open Chattr Background Job: Ctrl+C+B
  • You can find these shortcuts in the .shrtcts.R file in our dotfiles repo. To use these shortcuts:

    1. Install the shrtcts package by running pak::pkg_install("gadenbuie/shrtcts").
    2. Save the .shrtcts.R file in your home directory
    3. Add the following code to your .Rprofile file (also located in your home directory):
if (interactive() && requireNamespace("shrtcts", quietly = TRUE)) {
  shrtcts::add_rstudio_shortcuts(set_keyboard_shortcuts = TRUE)
}
  • More info can be found in the shrtcts GitHub repo. Note that glitches can occur when conflicting shortcuts are bound to the same command, for example one defined in .shrtcts.R and another added manually through Tools/Modify Keyboard Shortcuts in Rstudio.
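
For orientation, an entry in .shrtcts.R is an R function preceded by roxygen2-style comments. The entry below is a hypothetical illustration, not one of the team's shortcuts. It assumes the `@interactive` and `@shortcut` tags described in the shrtcts README, so check the README if your installed version differs.

```r
#' Save all open documents
#'
#' Hypothetical example entry: saves every file open in the source pane.
#'
#' @interactive
#' @shortcut Ctrl+Alt+Shift+S
function() {
  # rstudioapi only works inside Rstudio, so guard the call.
  if (requireNamespace("rstudioapi", quietly = TRUE) && rstudioapi::isAvailable()) {
    rstudioapi::documentSaveAll()
  }
}
```

Choosing a key combination that is not already bound elsewhere avoids the conflicts mentioned above.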

6.5 Rmarkdown

  • Rmarkdown is a great way to write reports, papers, and presentations. It allows you to write text and code in the same document. You can also create tables, figures, and equations. The document you are reading right now is in fact an .rmd (Rmarkdown) file! Check out the Rmarkdown cheatsheet for a quick reference guide.

    • Take a look at the Remedy package for some nice Rmarkdown shortcuts.
    • You can use rsam to then manage addins such as Remedy and others.
  • Cross-referencing in Rmarkdown is really useful for referencing figures, tables, and sections. This can be done multiple ways using the bookdown package. Check out how to cross-reference here, specifically the pandoc auto-naming conventions if you are going to reference using header identifiers. Some examples of cross referencing are below:

    • To reference a figure: Figure \@ref(fig:label)
    • To reference a table: Table \@ref(tab:label)
    • To reference a section by its identifier: Section \@ref(label)
    • To reference a section by its heading text: [heading name]
    • Example: Section \@ref(git-and-github) makes Section 5
    • Example: [Git and Github] makes Git and Github
  • It is important not to use _ in your chunk labels, because the cross-reference will not resolve properly. Use - instead.
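
The two naming rules above can be sketched in a few lines of R. Both helpers are hypothetical and not part of bookdown or pandoc. `heading_to_id()` only approximates pandoc's automatic identifier rule for plain-text headings, so treat its output as a guess to verify against the rendered document.

```r
# Hypothetical helper: approximate the identifier pandoc assigns to a heading.
# Handles a single plain-text heading (no links, footnotes, or formatting).
heading_to_id <- function(heading) {
  # Drop punctuation other than underscores, periods, and hyphens.
  id <- gsub("[^[:alnum:][:space:]_.-]", "", heading)
  # Replace runs of whitespace with a single hyphen.
  id <- gsub("[[:space:]]+", "-", trimws(id))
  id <- tolower(id)
  # Identifiers start at the first letter.
  id <- sub("^[^[:alpha:]]+", "", id)
  if (nchar(id) == 0) "section" else id
}

# Hypothetical helper: make a chunk label safe for \@ref() by swapping _ for -.
fix_chunk_label <- function(label) {
  gsub("_", "-", label, fixed = TRUE)
}

heading_to_id("Git and Github")
fix_chunk_label("build_check")
```

The first call returns the identifier used in the Section \@ref(git-and-github) example above.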

6.6 Dataframes

  • You can get lots of info about the structure of a dataframe just by looking at the details in the environment pane. For example, # of obs = number of rows, # variables = number of columns

  • When dataframes or tables have more than 50 columns, Rstudio only shows 50 at a time. To see the rest, use the arrows at the top left of the df/table view.

  • Hovering your mouse over the df tells you the class of the object. In this example, bcfishpass is of class sf (simple feature).

  • You can also check the class using:
class(bcfishpass)
  • You can glimpse the columns using
str(bcfishpass)

or

head(bcfishpass)
  • You can use the waldo package to compare columns
waldo::compare(bcfishpass$aggregated_crossings_id, bcfishpass$stream_crossing_id)

gives:

old is a character vector ('1005400575', '1005400576', '1005400577', '1005400578', '1005400579', ...)
new is an integer vector (NA, NA, NA, NA, NA, ...)
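
Because bcfishpass is not available outside our projects, the same inspection steps can be tried on a toy data frame. In this sketch `crossings` and its values are made up for illustration, and the waldo call only runs when waldo is installed.

```r
# Hypothetical stand-in for bcfishpass with the same two id columns.
crossings <- data.frame(
  aggregated_crossings_id = c("1005400575", "1005400576", "1005400577"),
  stream_crossing_id = c(NA_integer_, NA_integer_, 1L),
  stringsAsFactors = FALSE
)

class(crossings)    # class of the object
dim(crossings)      # rows (obs) and columns (variables)
str(crossings)      # column names, types, and first values
head(crossings, 2)  # first two rows

# Base R view of the same mismatch waldo reports: the column types differ.
col_types <- vapply(crossings, function(col) class(col)[1], character(1))
col_types

if (requireNamespace("waldo", quietly = TRUE)) {
  print(waldo::compare(crossings$aggregated_crossings_id, crossings$stream_crossing_id))
}
```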

6.7 Writing functions

  • Turning scripts into functions for workflows that we use often is very helpful for reproducibility and efficiency. Please read through writing good functions before moving on. Here are some additional things to remember when writing functions:

    • Add roxygen2 documentation to your function.
    • Name the function with a t (test) prefix to indicate that it is still in the testing phase, until it gets pulled into fpr. For example, tfpr_db_query().
    • Call functions explicitly using ::, for example dplyr::mutate(), and make sure all functions are included using the @importFrom syntax.
    • Include safety checks. You can use chk.
    • Define default values for parameters, even if they are = NULL.
    • Keep pipes simple by calling functions once and using , to separate variables. For example, put all case_when() calls in one mutate() call instead of multiple.
    • If certain numbers are being repeated in the function, consider defining them as params at the top of the function.
    • Avoid using getwd(). For most of our projects the working directory is already the main directory of the project, so relative paths resolve from there.
    • Avoid using data.table functions, and instead use stringr.
  • Here are the steps for testing your function once it is written:

    • Add your script as an R file in the R directory.
    • Run devtools::document().
    • Run devtools::check(), or use Check in the Build window (see Figure 6.1). There must be no errors to proceed.
    • Run Test in that same window. Tests must pass to proceed.
    • Add a bit of detail to news.md in the main directory.
    • Run Install/Clean and Install (see Figure 6.1).
    • Restart R in a repo outside of fpr, load the package with library(), and test it.
knitr::include_graphics("fig/build_check.png")

Figure 6.1: Using the devtools::check function
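
The function guidelines in this section can be sketched as one small example. Everything here is hypothetical: `tfpr_length_class()`, the `length_m` column, and the 30 m default are made up for illustration and are not part of fpr. The sketch assumes dplyr and chk are installed, and the usage lines only run when both are available.

```r
#' Classify crossings by length
#'
#' Hypothetical example: labels each row as "short" or "long".
#'
#' @param dat A data frame with a numeric column `length_m`. Default NULL.
#' @param threshold_m Number. Lengths at or above this are "long". Default 30.
#' @return `dat` with two added columns, `length_class` and `review`.
#' @importFrom dplyr mutate case_when
#' @importFrom chk chk_data chk_number chk_subset
#' @export
tfpr_length_class <- function(dat = NULL, threshold_m = 30) {
  # Safety checks: fail early with a clear message.
  chk::chk_data(dat)
  chk::chk_number(threshold_m)
  chk::chk_subset("length_m", names(dat))

  # One mutate() call holds every case_when(), separated by commas.
  dplyr::mutate(
    dat,
    length_class = dplyr::case_when(
      is.na(length_m) ~ NA_character_,
      length_m >= threshold_m ~ "long",
      TRUE ~ "short"
    ),
    review = dplyr::case_when(
      is.na(length_m) ~ "missing length",
      TRUE ~ "ok"
    )
  )
}

# Usage on made-up data, guarded so the block runs without the packages.
if (requireNamespace("dplyr", quietly = TRUE) && requireNamespace("chk", quietly = TRUE)) {
  toy <- data.frame(length_m = c(12, 45, NA))
  print(tfpr_length_class(toy))
}
```

The threshold is a parameter rather than a number repeated in the body, which follows the advice above about repeated values.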