Day Two:
Infrastructures

~20 min

Overview

Questions

  • What is a library (and a repository) of R packages, and how many of them can we have on a system?
  • What is renv, and why is it crucial for mid- to long-term projects?
  • How can we activate and use renv on our R projects?

Lesson Objectives

To be able to

Local environments - {renv}

Glossary

  • Package: a container for functions and data

  • Library: a folder on a computer in which installed packages are stored

  • Repository: a source of packages (often on the Internet)

Reminder

Functions and data are identified by their names (which cannot be duplicated within the same package).

When you library() a package you attach it to the current R session.

Attaching a package to the R session means making its function and data names available for you to use. That is, within the session, you have expanded the language!
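
As a minimal sketch of what attaching does, the snippet below uses the tools package, which ships with R but is usually not attached at startup (an assumption about a default session). It shows that a function can be reached with the pkg::fun form without attaching anything, and that library() adds the package to the search path so the bare name works.

```r
# Without attaching, a function can still be reached through its package
# name with the `::` operator.
ext <- tools::file_ext("analysis.R")
ext

# search() lists what is attached to the current session.
attached_before <- "package:tools" %in% search()

# library() attaches the package: its names become usable directly.
library(tools)
attached_after <- "package:tools" %in% search()

# Now the bare function name works too.
file_ext("functions.R")
```

Comparing attached_before and attached_after in a fresh session shows the effect of the library() call.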

Repositories [optional]

The most common repository is CRAN (the Comprehensive R Archive Network), from which you can install packages in any R session with install.packages().

Other repositories are:

Tip

Call getOption("repos") to find out which repositories the current session uses.

getOption("repos")
    CRAN 
"@CRAN@" 
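
The value "@CRAN@" is a placeholder: no CRAN mirror has been chosen yet, and R will ask for one at the first installation in an interactive session. A minimal sketch of setting the repository explicitly for the current session follows. The mirror URL is only one common choice, used here as an assumption.

```r
# Keep the current setting so it can be restored afterwards.
old_repos <- getOption("repos")

# Point the session to a specific CRAN mirror (example URL, pick your own).
options(repos = c(CRAN = "https://cloud.r-project.org"))
getOption("repos")

# install.packages() would now download from that mirror, e.g.:
# install.packages("here")

# To undo the change:
# options(repos = old_repos)
```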

Libraries [optional]

  • System: shared across all users and projects

  • User: shared across all of a user's projects

  • Project: powered by renv, the project's own independent collection of packages

Tip

Call .libPaths() to find out which libraries the current session uses.

.libPaths()
[1] "C:/Users/corra/Documents/GitHub/ubep/2023-ecdc-rws/_day-two/renv/library/R-4.3/x86_64-w64-mingw32"
[2] "C:/Program Files/R/R-4.3.2/library"                                                               
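
To see what each library actually contains, one possible sketch is to count the installed packages per library path. It uses base R only; the helper name count_packages is made up for this example.

```r
# Count the packages installed in each library of the current session.
# `count_packages` is a hypothetical helper name, not part of R or renv.
count_packages <- function(lib_paths = .libPaths()) {
  vapply(
    lib_paths,
    function(path) nrow(utils::installed.packages(lib.loc = path)),
    integer(1)
  )
}

# One count per library, named by the library path.
count_packages()
```

In an renv project the first entry (the project library) typically starts small and grows as you install packages.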

Create projects powered by {renv} [side]

To create a project powered by renv you can simply tick the corresponding option in RStudio at creation time.

The {renv} main workflow


  1. init(): set up the renv infrastructure
  2. install()/update()/install.packages(): install/update packages
  3. snapshot(): update the renv.lock file by writing metadata about the current state of the project library
  4. restore(): restore the library according to what is prescribed in the renv.lock file

status(): check for differences between the renv.lock file and the packages installed in the project library.
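
The workflow above can be sketched as console commands. This is an illustration, not a script to source in one go: run the calls one at a time from the project's root folder, since init() may restart the R session. The package name "ggplot2" is only an example, and installing it needs network access.

```r
# 1. Set up the renv infrastructure for the current project.
renv::init()

# 2. Install (or update) packages into the project library.
renv::install("ggplot2")

# 3. Record the current state of the project library in renv.lock.
renv::snapshot()

# Check whether renv.lock and the project library agree.
renv::status()

# 4. On another machine (or after a fresh clone), reinstall the
#    package versions recorded in renv.lock.
renv::restore()
```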

Convert projects to use {renv}

To convert an existing project to use renv call renv::init().

This creates:

  • A renv/library folder that will be the project's library, containing all the packages used within the project

  • A renv.lock file, which is the current package inventory of your project. It stores metadata about the packages the project uses, so that anyone can re-install them all (with exactly the same versions) on any other machine.

  • A project-dedicated .Rprofile, which is an R script that runs automatically at every R start, just before you can interact with the R session. renv uses it to configure the project library in the current session.
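
A quick way to confirm that the infrastructure is in place after renv::init() is to check for the three items listed above. The sketch below uses base R only; renv_files_present is a hypothetical helper written for this example, not an renv function.

```r
# Report which pieces of the renv infrastructure exist in a project folder.
# `renv_files_present` is a hypothetical helper, not an renv function.
renv_files_present <- function(project = ".") {
  c(
    library  = dir.exists(file.path(project, "renv", "library")),
    lockfile = file.exists(file.path(project, "renv.lock")),
    profile  = file.exists(file.path(project, ".Rprofile"))
  )
}

# In a project initialised with renv::init(), all three should be TRUE.
renv_files_present()
```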

My turn

ME: Connect to the course-scripts project in RStudio cloud (https://bit.ly/ubep-rws-rstudio): script 06-renv.R

{renv} cache [optional]

Every renv project starts with an empty library (containing only the renv package itself).

When working on many projects, you will probably use the same packages in several of them, so you will need to install them multiple times!

Important

Installing a package means:

  • download it from a repository
  • install (put) it in the project library

every time…

This is managed efficiently by renv's global cache, which makes it possible to download and install a specific package (at a specific version) only once. Installing the same package in multiple projects takes time only the first time, and is lightning fast in all subsequent ones.
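
To find out where the global cache lives on your machine, a minimal sketch is to ask renv for its cache path. The snippet is guarded so that it does nothing when renv is not installed; the exact location depends on your operating system and configuration.

```r
# Show where renv keeps its global cache.
# Guarded so the snippet does nothing if renv is not installed.
if (requireNamespace("renv", quietly = TRUE)) {
  cache_dir <- renv::paths$cache()
  print(cache_dir)
}
```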

Your turn (main: A; bk1: B; bk2: C)


  • Create a new project
  • Install renv and activate it
  • Install ggplot2 and here, and update the renv.lock file
  • Create the folders R/, analyses/, and output/
  • Create a script R/functions.R and define a function make_plot() returning a ggplot2 plot of mtcars
  • Create a script analyses/analysis.R that:
      • attaches the ggplot2 package
      • attaches the here package
      • sources the R/functions.R script
      • calls the make_plot() function
      • saves the plot in the output/ folder
  • Run the whole analyses/analysis.R script.
  • Answer the question reported in the pad (https://bit.ly/ubep-rws-pad-ed3).
  • Within the R/ folder, create a script named functions.R and type in
make_plot <- function() {
  ggplot2::ggplot(
    mtcars,
    ggplot2::aes(x = cyl, y = mpg)
  ) +
    ggplot2::geom_point()
}
  • Within the analyses/ folder, create a script analysis.R including the following code
library(here)
library(ggplot2)

source(here("R", "functions.R"))

plot <- make_plot()

# ggsave() takes the file name first and the plot second.
ggsave(
  here("output", "plot.png"),
  plot
)

status(): check differences between renv.lock and packages installed in the project’s library.

My turn

YOU: Connect to our pad (https://bit.ly/ubep-rws-pad-ed3) and write your questions and doubts there (and tell me if I am going too slow or too fast)

Acknowledgment

To create the current lesson, we explored, used, and adapted content from the following resources:

The slides are made using Posit's Quarto open-source scientific and technical publishing system, powered in R by Yihui Xie's knitr.

Additional resources

License

This work by Corrado Lanera, Ileana Baldi, and Dario Gregori is licensed under CC BY 4.0