Day One:
Basics of R/RStudio, projects, and working with data files

~30 min

Overview

Questions

  • What can I actually do with R?
  • How can I start working in R from scratch?
  • What are extension packages in R, and how can I use them?

Lesson Objectives

To be able to:

  • Use basic operations in R (math, logic, assignment, subsetting).
  • Install and attach packages, and get help.

R basics

Language [side-by-side]

Comments #
# R will ignore this!!
1 + 1 # +1 also this!!
[1] 2
Assignment <-
two <- 1 + 1
two
[1] 2

Important

Variable names can contain only letters, numbers, underscores, and periods. They must start with a letter (or with a period that is not followed by a number).
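As a small sketch of the naming rule above, base R's make.names() can be used as a quick check: it rewrites a string into a syntactically valid name, so a name that comes back unchanged was already valid. The example names are made up for illustration.

```r
# Valid: starts with a letter, then letters, numbers, underscores, or periods.
make.names("three_two_one") == "three_two_one"

# Invalid: starts with a number, so make.names() has to change it.
make.names("2fast") == "2fast"
make.names("2fast")

# Invalid: a hyphen is not allowed, so it is replaced by a period.
make.names("a-b")
```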

Functions fun(arg = val)
# Use `=` to assign values to arguments
mean(x = c(1, 2, 3))
[1] 2
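A short sketch of how arguments are matched, using base R's round() as an assumed example (it is not part of the original slide): arguments can be given by name, by position, or by name in any order.

```r
# Arguments matched by name
round(x = 3.14159, digits = 2)

# Arguments matched by position (x first, digits second)
round(3.14159, 2)

# Named arguments can be given in any order
round(digits = 2, x = 3.14159)
```

Naming arguments is usually the safer habit when a function has several of them.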
Help function ?fun
Help operator ?"<op>"
?sum
?"+"

Create vectors c
three_two_one <- c(3, two, 1)
three_two_one
[1] 3 2 1
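Once a vector exists, you can measure it, subset it with square brackets, and do arithmetic on all of its elements at once. The following is a minimal sketch that reuses the two objects defined above.

```r
two <- 1 + 1
three_two_one <- c(3, two, 1)

length(three_two_one)              # number of elements
three_two_one[1]                   # first element (R counts from 1)
three_two_one[three_two_one > 1]   # only the elements greater than 1
three_two_one * 2                  # arithmetic is applied element by element
```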
List objects ls
ls()
[1] "three_two_one" "two"
Remove objects rm
rm(three_two_one)
ls()
[1] "two"
R’s null object NULL
NULL
NULL
R’s missing object NA
NA
[1] NA
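The difference between the two is easier to see in use. As a rough sketch: NULL means "nothing at all", while NA is a placeholder for one value that is missing.

```r
length(NULL)         # NULL has no elements
length(NA)           # NA is one (missing) element

c(1, NULL, 3)        # NULL vanishes inside a vector
c(1, NA, 3)          # NA keeps its place

is.null(NULL)        # test for NULL
is.na(c(1, NA, 3))   # test each element for missingness

1 + NA               # computing with a missing value gives a missing result
```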

Math [overview - side-by-side]

Brackets (, )
Add +
Subtract -
Multiply *
Divide /
Exponent ^ or **
Square root sqrt




# Standard order of precedence
5 + sqrt(4) / 2 * 3^(2 - 1)
[1] 8
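A few more cases of the order of precedence, as a sketch. The one that most often surprises people is that the exponent is evaluated before a leading minus sign.

```r
-2^2              # exponent first, then the minus sign
(-2)^2            # brackets force the minus sign first

1 + 2 * 3         # multiplication before addition
(1 + 2) * 3       # brackets first

2 ** 3 == 2 ^ 3   # both exponent operators give the same result
```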
Magnitude <num>E<exp10>
1E2
[1] 100
Exponential exp
# Euler's number
exp(1)
[1] 2.718282
Logarithm log, log10, log2
# You can compose functions
log(exp(1))
[1] 1
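Note that log() is the natural logarithm by default; another base can be requested through its base argument. A minimal sketch comparing it with the shortcuts log10() and log2():

```r
log(100, base = 10)   # logarithm in base 10, via the base argument
log10(100)            # same thing, using the shortcut

log(8, base = 2)      # logarithm in base 2, via the base argument
log2(8)               # same thing, using the shortcut
```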
Pi pi
pi
[1] 3.141593
Sine sin
sin(pi/2)
[1] 1
Cosine cos
cos(pi)
[1] -1

Logic and tests [overview - side-by-side]

True TRUE
False FALSE
And &
Or |
Not !


# Standard order of precedence
TRUE & FALSE
[1] FALSE
# Standard order of precedence: `!` is evaluated before `|`
!TRUE | TRUE
[1] TRUE
!(TRUE | TRUE)
[1] FALSE
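Two more things are worth trying here, shown as a sketch: `&` is evaluated before `|` (just as multiplication comes before addition), and both operators work element by element on vectors.

```r
TRUE | FALSE & FALSE             # `&` first, then `|`
(TRUE | FALSE) & FALSE           # brackets change the result

c(TRUE, FALSE) & c(TRUE, TRUE)   # element-by-element "and"
c(TRUE, FALSE) | c(FALSE, FALSE) # element-by-element "or"
```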
Comparison <, <=, >, >=
sum(1:3) > 4
[1] TRUE
“Exact!” Equal ==
(1 + 2) == 3
[1] TRUE
Test equal numbers all.equal
3/5 == 0.6
[1] TRUE
3*(1/5) == 0.6 # finite machine precision
[1] FALSE
all.equal(3*(1/5), 0.6)
[1] TRUE
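One caveat, sketched below: when the numbers differ, all.equal() does not return FALSE but a text describing the difference. If you need a plain TRUE/FALSE (for example inside a condition), wrap it in isTRUE().

```r
all.equal(1, 1.1)                     # a message, not FALSE

isTRUE(all.equal(1, 1.1))             # a plain FALSE
isTRUE(all.equal(3 * (1 / 5), 0.6))   # a plain TRUE
```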
Different !=
1 != 2
[1] TRUE

Packages

A package is a container of functions and data sets.

A library is a folder on your computer that stores packages. Libraries can be of three types:

  1. project (may exist; powered by the {renv} package, which we will see later)
  2. user (may exist)
  3. system (always exists)
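To see which libraries your own R session is using, you can ask R directly. This is a minimal sketch; the folders printed depend on your computer and on whether a project or user library exists.

```r
# All the libraries R currently searches for packages, in order
.libPaths()

# The system library that ships with R
.Library
```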
Install packages ?install.packages
install.packages("tibble")
Attach packages ?library
library("tibble")
Help help(package = "<pkg_name>")
help(package = "tibble")

Important

Installing a package stores its executable R code in a library.
Attaching a package lets you use its functions and data sets within the current R session.

You need to install a package only once.
Every time you (re)start an R session, you need to call library(<pkg>) again.
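The "install once, attach every session" pattern can be sketched as follows. The helper name ensure_installed is hypothetical (it is not part of R or of any package), and the example uses the built-in stats package so that it runs without an internet connection.

```r
# Hypothetical helper: install a package only if it is not already in a library
ensure_installed <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  # Return (invisibly) whether the package is now available
  invisible(requireNamespace(pkg, quietly = TRUE))
}

ensure_installed("stats")   # already installed, so nothing is downloaded
library("stats")            # attaching is still needed in every new session
```

Replacing "stats" with "tibble" would install {tibble} the first time only, and simply attach it afterwards.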

Important

Functions and data are identified by their names (which cannot be duplicated within the same package).

Attaching a package to an R session means making its function and data names available for you to call and use. In other words, you have expanded the language (within the session)!
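As a sketch of what "available names" means in practice: search() lists what is currently attached, and the `::` operator (an addition here, not covered in the text above) calls a function from an installed package by its full name without attaching it. The example uses the stats package, which R attaches by default.

```r
# Packages (and other environments) attached to the current session
search()

# Is the stats package attached?
"package:stats" %in% search()

# Call a function by its full name: <package>::<function>
stats::median(c(1, 2, 3))

# Because stats is attached, the short name works too
median(c(1, 2, 3))
```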

The Tidyverse

The tidyverse is an opinionated collection of R packages designed for data science. All packages share an underlying design philosophy, grammar, and data structures.


Design Principles

  • Reuse existing data structures.
  • Compose simple functions with the pipe.
  • Embrace functional programming.
  • Design for humans.
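To give an idea of "composing simple functions with the pipe", here is a minimal sketch using R's native pipe `|>`. It assumes R version 4.1 or later; the tidyverse also offers its own pipe, `%>%`, which is used in a similar way once the packages are attached.

```r
# Nested calls are read from the inside out
sum(sqrt(c(1, 4, 9)))

# With the pipe, the same steps are read from left to right:
# take the numbers, then take their square roots, then sum them
c(1, 4, 9) |> sqrt() |> sum()
```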

Packages

  • {ggplot2}: create data visualizations using the Grammar of Graphics.
  • {dplyr}: a set of verbs to solve data manipulation challenges.
  • {tidyr}: a set of functions that help you get to tidy data.
  • {readr}: a fast and friendly way to read rectangular data.
  • {purrr}: a consistent tool suite for functional programming.
  • {tibble}: a modern re-imagining of the data frame.
  • {stringr}: a set of functions that make working with strings as easy as possible.
  • {forcats}: a tool suite that solves common problems with factors.
  • {lubridate}: a tool suite to work with dates and date-times.

Your turn (main: B; bk1: C; bk2: A)

Connect to the Day-1 project in RStudio cloud (https://bit.ly/ubep-rws-rstudio)

  1. Create a new script, write the code below in it, and save the script. (Install the {tidyverse} package first.)

  2. Run the script and check the result in the console.

  3. Open the help page of the tibble package and read it; focus more on the structure of the help page instead of the content.

Groups

  • Main room: group B
  • Breakout room 1: group C
  • Breakout room 2: group A

library(tidyverse)

db_tbl <- tibble(
  age = c(70, 85, 69),
  height = c(1.5, 1.72, 1.81),
  at_risk = c(TRUE, FALSE, TRUE),
  gender = factor(
    c("male", "female", "female"),
    levels = c("female", "male", "other")
  )
)
db_tbl
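If you want to explore the same data before the tidyverse is installed, a base R data frame behaves similarly for simple subsetting. This is only a sketch: the name db_df is made up for illustration, and a data frame prints differently from a tibble.

```r
# Same data as above, stored in a base R data frame (no packages needed)
db_df <- data.frame(
  age = c(70, 85, 69),
  height = c(1.5, 1.72, 1.81),
  at_risk = c(TRUE, FALSE, TRUE),
  gender = factor(
    c("male", "female", "female"),
    levels = c("female", "male", "other")
  )
)

db_df$age                        # one column, as a vector
db_df[db_df$at_risk, ]           # only the rows where at_risk is TRUE
mean(db_df$age[db_df$at_risk])   # mean age of the at-risk rows
levels(db_df$gender)             # "other" is a level even if no row uses it
table(db_df$gender)              # counts per level, including empty ones
```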

My turn

YOU: Connect to our pad (https://bit.ly/ubep-rws-pad-3ed) and write your questions and doubts there (and tell me if I am going too slow or too fast)

ME: Connect to the Day-1 project in RStudio cloud (https://bit.ly/ubep-rws-rstudio): script 05-packages.R

Acknowledgment

To create the current lesson, we explored, used, and adapted content from the following resources:

The slides are made using Posit’s Quarto open-source scientific and technical publishing system, powered in R by Yihui Xie’s knitr.

Additional Resources

License

This work by Corrado Lanera, Ileana Baldi, and Dario Gregori is licensed under CC BY 4.0