class: center, middle, inverse, title-slide

.title[
# Introduction to <img src="img/r-logo.png" width="200" />
]
.author[
### Jason Thomas
]
.institute[
### R Working Group
]
.date[
### Sept. 9th, 2023
]

---

# Welcome to the R Working Group

* Website [https://buckipr.github.io/R_Working_Group/](https://buckipr.github.io/R_Working_Group/)

* We are Slackers (email Jason at thomas.3912 for more details)

* Plan for this semester: build R into your workflow (even if not for analysis)

* Plan for next semester?

---
# Goals for this session

* Learn about...

    + basic R syntax
    
    + different R objects (things that hold data) & **indexing** them
    
    + useful functions for working with data

* Become familiar with [R Studio](https://posit.co/download/rstudio-desktop/) & 
  develop good coding habits 
  
    * R Studio is an *additional* program that provides many useful features
    for working with R
    
    * (you need to download and install both [R](https://cran.r-project.org/) and 
    [R Studio](https://posit.co/download/rstudio-desktop/))

---
class: inverse, center, middle

# R Studio

---
# R Studio

* Let's dive in by starting R Studio and opening a new R script

    + menu bar: &nbsp; `File` &rarr; `New File` &rarr; `R Script`
    + (in R: &nbsp; `File` &rarr; `New Script`)

* You should now have 4 panes open (like on the next slide)

    + **Source** -- Our script where we will type and save our comments & commands
    + **Console** -- Where we can give R commands and where the output will appear
    + **Output** -- File explorer, plots, help files, and more!
    + **Environments** -- Useful information about the R session

---
.center[<img src="img/rstudio-panes-labeled.jpeg" style="width: 75%" />]

.center[.bottom[downloaded from [user guide on posit.co](https://docs.posit.co/ide/user/ide/guide/ui/ui-panes.html)]]

---
# R Studio: Good Habits

* Add a comment to our new script:
    
    ```r
    # Comment: My R script from Working Group Session (1/20/2023)
    # (R ignores all lines that begin with a pound/hash/number sign/#)
    ```

* Save our script
    + menu bar: &nbsp; `File` &rarr; `Save As...`

* Set our **working directory**
    + this is where R will start looking for & saving files (e.g., data files or plots)
    + menu bar: &nbsp; `Session` &rarr; `Set Working Directory` &rarr; 
    &emsp; &emsp; &emsp; &emsp; `Choose Directory...`

---
class: inverse, center, middle

# Basic R Syntax

---
# Basic R Syntax

* Assigning a value to an object in R takes the form

```r
# object_name <- object_value 
mean_age <- 33
```

* The symbol "`<-`" is called the assignment operator

    + we are creating a new variable called `mean_age` and assigning it a value of 33

    + `mean_age = 33` will also work (but `<-` is the convention)

---
class: slide-font-25
# Basic R Syntax (cont.)

If we enter the name of a variable in the `Console`, then R will list the value(s)

```r
> Mean_age2 <- 22 ## note: object names are case-sensitive
> Mean_age2
```

```
## [1] 22
```

BUT we are in the business of good habits...

* type this syntax into our script and (with the cursor on the same line) press the following keys together:

    + On a Mac: &nbsp; `<command> <enter>`

    + In Windows: &nbsp; `<control> <enter>` &emsp; (in R Studio) 
      &emsp; &emsp; &emsp; &emsp; &emsp; `<control> r` &emsp; &emsp; &emsp; &ensp; (in the R app)

* these keyboard shortcuts run the current line in the `Console` 
(or the highlighted region, if you have selected one)

---
class: slide-font-25
# Basic R Syntax: functions

We have seen a simple object for holding data, but R has many useful **functions**

```r
ls()                         # list all the objects in memory
rm(Mean_age2)                # remove the object called Mean_age2
getwd()                      # print the working directory (wd)
setwd("Thesis/Analysis")     # set the wd to the folder Thesis/Analysis
dir()                        # list the files in the current directory
dir("../")                   # list the files in the parent directory
save.image("my_data.RData")  # save all the objects in memory
load("my_data.RData")        # load all the objects in the data file
```

*Quick note*:

* suppose you create an object called `abc` that holds the value 2
* then you load `data.RData` that also has an object named `abc` but holds the value 99
* the first version of the object (`abc` holding 2) will get replaced
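
A minimal sketch of this overwrite, using a temporary file (`tmp_file` and `file_objects` are names made up for illustration, not files or objects from the session). Loading into a separate environment is one way to look at a file's contents without touching the objects already in memory:

```r
# tmp_file is a throwaway file used only for this illustration
tmp_file <- tempfile(fileext = ".RData")

abc <- 99
save(abc, file = tmp_file)   # the file now stores abc = 99

abc <- 2                     # the object in memory now holds 2
load(tmp_file)               # no warning: abc in memory is replaced
abc                          # now 99

# load into a separate environment to leave the workspace alone
abc <- 2
file_objects <- new.env()
load(tmp_file, envir = file_objects)
abc                          # still 2
file_objects$abc             # 99
```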

---
# Basic R Syntax: help files

* Google searches are a very effective way to find help

    + and so is asking the R Working Group 😎

* R documentation can be accessed in the `Help` tab in the `Output` pane

* Some additional syntax and functions

```r
?read.csv                     # show the help file for the function read.csv
help.search("weighted mean")  # search help files for the phrase 'weighted mean'
```

---
class: inverse, center, middle

# Data Structures in R

---

## **Data Structures**: motivation

We are not going to solve the world's problems with a single number...

```r
> all_ages <- c(22, 33, 44, 55) # c() concatenates numbers together
> all_ages
```

```
## [1] 22 33 44 55
```

```r
> mean(all_ages)                 # calculate the mean
```

```
## [1] 38.5
```

```r
> all_ed <- c("HS", "Col", "Grad Sch", "HS")
> all_ed
```

```
## [1] "HS"       "Col"      "Grad Sch" "HS"
```

---
## **Data Structures**: motivation (cont.)

R handles different *types* of data as well

```r
> important_data <- c("OSU", "R", "Group", 4)
> important_data
```

```
## [1] "OSU"   "R"     "Group" "4"
```

Wait, what is going on here?

* we are mixing different types of data, and a vector can only hold one type, so R
converts everything to strings (as if we had just forgotten to wrap the 4 in quotation marks)
    
* sometimes R's assumptions are useful, sometimes they are not! 🤔
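
A short sketch of how to check what R did. `class()` reports the type of data a vector holds, and `as.numeric()` converts a string back to a number when that is what we wanted:

```r
important_data <- c("OSU", "R", "Group", 4)
class(important_data)               # "character"

class(c(22, 33, 44, 55))            # "numeric"

# convert the 4th element back to a number before doing arithmetic
as.numeric(important_data[4]) + 1   # 5
```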

---
## **Data Structures**: motivation (cont.)

Here is another example with missing data

```r
> test_scores <- c(88, 99, 110, 66, NA) # NA is for missing values
> mean_scores <- mean(test_scores)
> mean_scores / 100
```

```
## [1] NA
```

😾 Ugh! Why didn't R tell me there was a problem when I tried to calculate the mean?!?

* another R assumption
    
* can you figure out how to calculate the mean for non-missing values? (help file
is helpful 😄)
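
One possible answer (spoiler!), using the `na.rm` argument described in the help file for `mean()`. The vector is rebuilt here so the sketch runs on its own:

```r
test_scores <- c(88, 99, 110, 66, NA)

# na.rm = TRUE drops the missing values before averaging
mean_scores <- mean(test_scores, na.rm = TRUE)
mean_scores          # 90.75
mean_scores / 100    # 0.9075
```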

---
## **Data Structures**: vectors

* We have been creating **vectors** when we use `c()` to concatenate data

* Here are some more useful functions for working with vectors

```r
> # test that we have a vector
> is.vector(test_scores)  # returns another data type: TRUE or FALSE (called logical)
```

```
## [1] TRUE
```

```r
> summary(test_scores)    # numerical summary (less helpful for strings)
```

```
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
##   66.00   82.50   93.50   90.75  101.75  110.00       1
```

---
## **Data Structures**: vectors (cont.)

```r
> length(test_scores)     # how many elements in the vector
```

```
## [1] 5
```

```r
> is.na(test_scores)      # test if each element is NA
```

```
## [1] FALSE FALSE FALSE FALSE  TRUE
```

```r
> TRUE + TRUE + FALSE     # useful trick with logical objects (TRUE/FALSE)
```

```
## [1] 2
```

```r
> n_missing <- sum(is.na(test_scores))
> n_missing
```

```
## [1] 1
```

---
## **Data Structures**: indexing vectors

We can access the `$i^{th}$` element in a vector with the syntax `vector_name[ i ]`

```r
> test_scores[1]    # first element
```

```
## [1] 88
```

```r
> test_scores[2]    # second element
```

```
## [1] 99
```

```r
> 1:3   # a vector of c(1, 2, 3)
```

```
## [1] 1 2 3
```

```r
>       # so what will test_scores[3:1] give us?
```

---
## **Data Structures**: indexing vectors (cont.)

The syntax &ensp; `3:1` &ensp; gives the vector &ensp; `c(3, 2, 1)`, so...

```r
> test_scores[3:1]  # returns 3rd element, then the 2nd, then the first
```

```
## [1] 110  99  88
```

```r
> test_scores       # sanity check
```

```
## [1]  88  99 110  66  NA
```

* So what will the following command do? 🤔

```r
test_scores[c(3, 5, 11)]
```
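
A sketch of the answer, rebuilding the vector so the example stands alone (`picked` is a made-up name). Asking for a position beyond the end of a vector is not an error; R returns `NA` for it:

```r
test_scores <- c(88, 99, 110, 66, NA)
length(test_scores)         # 5, so there is no 11th element

picked <- test_scores[c(3, 5, 11)]
picked                      # 110 NA NA
# the 5th element is NA because the data are missing,
# the 11th is NA because that position does not exist
```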

---
## **Data Structures**: changing vectors

We can use indexing to change vectors as well, e.g., reassign the first element

```r
> test_scores[1] <- NA # change the first element to NA
> test_scores[1]
```

```
## [1] NA
```

Again, we can use vectors to index as well:

```r
index_missing_scores <- is.na(test_scores) # create an index vector of TRUE & FALSE
test_scores[index_missing_scores] <- -99 # change NA to -99
```

Let's walk through this... 
(🦉 but note a good habit would be to create a new vector,
`new_test_scores`, so we can retain the original data!)
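
That good habit could look like the sketch below. The vector is rebuilt here so the example runs on its own:

```r
test_scores <- c(NA, 99, 110, 66, NA)

new_test_scores <- test_scores                  # work on a copy
index_missing_scores <- is.na(new_test_scores)  # TRUE where the value is missing
new_test_scores[index_missing_scores] <- -99    # recode NA to -99 in the copy only

test_scores       # original data, NAs intact
new_test_scores   # recoded copy
```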

---
class: slide-font-25
## **Data Structures**: changing vectors (cont.)

```r
> # create an index vector of TRUE & FALSE
> index_missing_scores <- is.na(test_scores)
> index_missing_scores
```

```
## [1]  TRUE FALSE FALSE FALSE  TRUE
```

```r
> # attach these 2 vectors together as columns
> cbind(index_missing_scores, test_scores)
```

```
##      index_missing_scores test_scores
## [1,]                    1          NA
## [2,]                    0          99
## [3,]                    0         110
## [4,]                    0          66
## [5,]                    1          NA
```

* with &nbsp; `cbind` &nbsp; we are actually creating a new **data structure** called a **matrix**

* as we will see, matrices can only hold the same *data type*, so R changes `TRUE`/`FALSE`
to `1`/`0` (respectively)

---
## **Data Structures**: changing vectors (cont.)

```r
> test_scores[index_missing_scores]  #  access all of the indices with TRUE 
```

```
## [1] NA NA
```

```r
> # recode NA to -99
> test_scores[index_missing_scores] <- -99
> test_scores
```

```
## [1] -99  99 110  66 -99
```

---
## **Data Structures**: changing vectors (recap)

When you want to change a vector, do the *delta 2-step*:

1. create an index vector that identifies the elements you want to change

    * what data type should this vector hold?
        * `logical`, i.e. `TRUE`s and `FALSE`s

2. assign new values to the vector using your vector of indices
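
The same two steps work for any condition, not just missing values. A minimal sketch that caps scores above 100 (the vector `capped_scores`, the name `index_too_high`, and the cap itself are made up for illustration):

```r
capped_scores <- c(88, 99, 110, 66, 105)

# step 1: logical index vector, TRUE for the elements to change
index_too_high <- capped_scores > 100

# step 2: assign new values using the index
capped_scores[index_too_high] <- 100
capped_scores
```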

---
## **Data Structures**: vector recap

* We are not going to become 💰 famous 💰 by working with
a single vector

* However, we have learned a powerful way to work with vectors, **indexing**, that extends to
other types of **data structures**

* A **matrix** made a brief appearance earlier, but before going further let's review a useful framework
for thinking about **data structures**

---
## **Data Structures**: overview

R has different structures for holding data, which can be 
organized by...

1. How many dimensions does the structure have?

2. Do the types of data need to be the same?

* Example: **vectors**

    + only 1 dimension (it is just a single row or a column)
    + we saw earlier that R changes the elements so they all have the same data type (e.g., `4` &rarr; `"4"`)

* We'll now (re)introduce different data structures, and learn about
different data types along the way.

---

## **Data Structures**: overview (cont.)

* **Vectors**
  1. 1 dimension
  1. same data type
    + special case: **factor** (predefined categories)

* **Matrices**
  1. rows and columns
  1. same data type

* **Arrays** 
  1. any number of dimensions
  1. same data type
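
A quick sketch of how each of these structures can be created (the object names and values are made up for illustration):

```r
ed_factor <- factor(c("HS", "Col", "Grad Sch", "HS"))
levels(ed_factor)                # the predefined categories

score_matrix <- matrix(1:6, nrow = 2, ncol = 3)
dim(score_matrix)                # 2 rows, 3 columns

score_array <- array(1:24, dim = c(2, 3, 4))
dim(score_array)                 # 3 dimensions: 2 x 3 x 4
```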

---

## **Data Structures**: overview (cont.)

* **Data Frames**
  1. rows and columns
  1. different data types
  - particularly useful for holding a data set with quantitative & qualitative variables

* **Lists**
  1. 1 dimension
  1. different data types (or structures!)
  - actually, this is just a special type of vector (can you verify this?)
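
A small sketch of both structures (all names and values are made up for illustration). The last line is one way to check the claim about lists:

```r
people <- data.frame(age = c(22, 33, 44),
                     ed  = c("HS", "Col", "Grad Sch"))
str(people)              # shows the data type of each column

# a list can hold different types, and even other structures
my_list <- list(ages = c(22, 33), school = "OSU", data = people)
length(my_list)          # 3 elements

is.vector(my_list)       # TRUE
```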

---
## **Data Structures**: working with data frames

* For the rest of this session we will focus on **Data frames**, the R structure
typically used for data sets (i.e., variables as columns and an observation for each row).

* Let's get some practice working with data frames using one
of R's example data sets

```r
> data(mtcars)            ## load one of R's example data sets mtcars
> ls()
```

```
## [1] "all_ages"             "all_ed"               "important_data"      
## [4] "index_missing_scores" "Mean_age2"            "mean_scores"         
## [7] "mtcars"               "n_missing"            "test_scores"
```

```r
> is.data.frame(mtcars)   ## check that mtcars is a data frame
```

```
## [1] TRUE
```

---
## **Data Structures**: reading in data sets

Before we proceed with `mtcars`, a quick example of how to read in a data set.

```r
> # write data to a CSV file called 'copy_mtcars.csv' in the working directory
> write.csv(mtcars, "copy_mtcars.csv") 
> mtcars2 <- read.csv("copy_mtcars.csv") # load data set from CSV file
> ls()
```

```
##  [1] "all_ages"             "all_ed"               "important_data"      
##  [4] "index_missing_scores" "Mean_age2"            "mean_scores"         
##  [7] "mtcars"               "mtcars2"              "n_missing"           
## [10] "test_scores"
```

```r
> is.data.frame(mtcars2)
```

```
## [1] TRUE
```
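
One thing to watch for, sketched with a temporary file rather than the working directory (the names `csv_file`, `cars_original`, `cars_read`, and `cars_read2` are made up for illustration): `write.csv()` saves the row names as an extra first column, so the data frame that is read back in has one more column than the original. The `row.names = 1` argument of `read.csv()` tells R to use that first column as the row names again.

```r
csv_file <- tempfile(fileext = ".csv")   # throwaway file for this example
cars_original <- datasets::mtcars        # fresh copy of the example data

write.csv(cars_original, csv_file)

cars_read <- read.csv(csv_file)
dim(cars_read)             # one extra column...
names(cars_read)[1]        # ...called "X", holding the car names

cars_read2 <- read.csv(csv_file, row.names = 1)
dim(cars_read2)            # same dimensions as the original
```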

---
## **Data Structures**: exploring data frames

* Since **data frames** have 2 dimensions, the index requires 2 pieces of
info: `[row index, column index]`

```r
> dim(mtcars)
## [1] 32 11
> mtcars[1, 1]  # 1st observation in 1st variable
## [1] 21
```

* Many times, however, we just work with one variable/column at a time, so all our skills
working with vectors still apply

```r
> # if we leave out the row part of the address, we get all rows and a vector
> is.vector(mtcars[, 1])
```

```
## [1] TRUE
```
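
A few more index patterns, as a sketch (`cars_df` and `one_col` are made-up names). The `drop = FALSE` argument is standard base R but was not covered above; it keeps the result as a data frame instead of simplifying it to a vector.

```r
cars_df <- datasets::mtcars    # fresh copy of the example data

cars_df[2, ]                   # 2nd row, all columns
cars_df[1:3, c(1, 2)]          # first 3 rows, first 2 columns
cars_df[1:3, c("mpg", "cyl")]  # same thing, using column names

one_col <- cars_df[, 1, drop = FALSE]
is.data.frame(one_col)         # TRUE
```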

---
## **Data Structures**: `dplyr`

* `dplyr` is part of [`tidyverse`](https://www.tidyverse.org/)

  + `ggplot2`, `forcats`, `tibble`, `readr`, `stringr`, `tidyr`, `purrr`
  + may also want to check out [`tidycensus`](https://walker-data.com/tidycensus/articles/basic-usage.html)

* `dplyr` logic: "By constraining your options, it helps you think about your data manipulation challenges."

  + 5 commands will take you a long way
  + readability and simplifying code (with pipes)

```r
> install.packages("dplyr")  ## only run once (not in script)
> library(dplyr)
```
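
To see what the pipe buys us, here is a sketch of the same two steps written with nested function calls and with pipes (it assumes `dplyr` is installed; `nested` and `piped` are made-up names). The pipe passes the result on its left as the first argument of the function on its right, so the code reads top to bottom in the order the steps happen.

```r
library(dplyr)

# nested calls: read from the inside out
nested <- arrange(select(mtcars, mpg, cyl), mpg)

# pipes: read from top to bottom
piped <- mtcars %>%
  select(mpg, cyl) %>%
  arrange(mpg)

identical(nested, piped)   # TRUE
```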

---
class: slide-font-25
## **Data Structures**: `dplyr` arrange rows

```r
> # only look at a few columns
> names(mtcars)
> mtcars %>% 
+   select(mpg, cyl) %>%
+   arrange(mpg, desc(cyl))
```

```
##  [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"   "gear"
## [11] "carb"
```

```
##                      mpg cyl
## Cadillac Fleetwood  10.4   8
## Lincoln Continental 10.4   8
## Camaro Z28          13.3   8
## Duster 360          14.3   8
## Chrysler Imperial   14.7   8
## Maserati Bora       15.0   8
## Merc 450SLC         15.2   8
## AMC Javelin         15.2   8
## Dodge Challenger    15.5   8
## Ford Pantera L      15.8   8
```
(truncated output)

---
## **Data Structures**: `dplyr` filter rows

```r
> # only look at a few rows
> mtcars %>% 
+   select(mpg, cyl) %>%
+   filter(cyl == 6)
```

```
##                 mpg cyl
## Mazda RX4      21.0   6
## Mazda RX4 Wag  21.0   6
## Hornet 4 Drive 21.4   6
## Valiant        18.1   6
## Merc 280       19.2   6
## Merc 280C      17.8   6
## Ferrari Dino   19.7   6
```

keyboard shortcuts in RStudio for the pipe (`%>%`)
 
 + MacOS: &nbsp; `<command> <shift> M`
 + Windows: &nbsp; `<control> <shift> M`

---
## **Data Structures**: `dplyr` filter more rows

```r
> # only look at a few rows
> mtcars %>% 
+   select(mpg, cyl) %>%
+   filter(cyl > 4 & mpg > 22)
```

```
## [1] mpg cyl
## <0 rows> (or 0-length row.names)
```

---
## **Data Structures**: `dplyr` filter more rows

```r
> # only look at a few rows
> mtcars %>% 
+   select(mpg, cyl) %>%
+   filter(cyl > 4 & mpg > 18)
```

```
##                    mpg cyl
## Mazda RX4         21.0   6
## Mazda RX4 Wag     21.0   6
## Hornet 4 Drive    21.4   6
## Hornet Sportabout 18.7   8
## Valiant           18.1   6
## Merc 280          19.2   6
## Pontiac Firebird  19.2   8
## Ferrari Dino      19.7   6
```

---
## **Data Structures**: `dplyr` take a slice

```r
> # only look at a few rows
> mtcars %>% 
+   select(mpg, cyl) %>%
+   slice(grep("Mazda", row.names(mtcars)))
```

```
##               mpg cyl
## Mazda RX4      21   6
## Mazda RX4 Wag  21   6
```

(also look at [`stringr`](https://stringr.tidyverse.org/) package)

---
## **Data Structures**: `dplyr` take another slice

```r
> # only look at a few rows
> mtcars %>% 
+   select(mpg, cyl) %>%
+   slice(c(1, 9, 20))
```

```
##                 mpg cyl
## Mazda RX4      21.0   6
## Merc 230       22.8   4
## Toyota Corolla 33.9   4
```

---
class: slide-font-25
## **Data Structures**: `dplyr` make new column

```r
> # create new column named mpg2
> mtcars %>% 
+   select(mpg, cyl) %>%
+   mutate(mpg2 = mpg/1000)
```

```
##                    mpg cyl   mpg2
## Mazda RX4         21.0   6 0.0210
## Mazda RX4 Wag     21.0   6 0.0210
## Datsun 710        22.8   4 0.0228
## Hornet 4 Drive    21.4   6 0.0214
## Hornet Sportabout 18.7   8 0.0187
## Valiant           18.1   6 0.0181
## Duster 360        14.3   8 0.0143
## Merc 240D         24.4   4 0.0244
## Merc 230          22.8   4 0.0228
## Merc 280          19.2   6 0.0192
## Merc 280C         17.8   6 0.0178
## Merc 450SE        16.4   8 0.0164
```
(truncated output)

---
class: slide-font-25
## **Data Structures**: exploring data frames

* And now, some Old School techniques for working with data frames
* One way to access a single column in a data frame is to use `$`

```r
> names(mtcars)  ## print the variable names
> mtcars$mpg     ## return the mpg variable
```

```
##  [1] "mpg"  "cyl"  "disp" "hp"   "drat" "wt"   "qsec" "vs"   "am"  
## [10] "gear" "carb"
```

```
##  [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3
## [14] 15.2 10.4 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3
## [27] 26.0 30.4 15.8 19.7 15.0 21.4
```

* Now we will (re)introduce several functions for exploring data frames
* We will also see a more advanced example of indexing

---
## **Data Frames**: exploring columns (cont.)

```r
> dim(mtcars)    ## print the number of rows and columns
```

```
## [1] 32 11
```

```r
> str(mtcars)    ## print structure of data frame
```

```
## 'data.frame':	32 obs. of  11 variables:
##  $ mpg : num  21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
##  $ cyl : num  6 6 4 6 8 6 8 4 4 6 ...
##  $ disp: num  160 160 108 258 360 ...
##  $ hp  : num  110 110 93 110 175 105 245 62 95 123 ...
##  $ drat: num  3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
##  $ wt  : num  2.62 2.88 2.32 3.21 3.44 ...
##  $ qsec: num  16.5 17 18.6 19.4 17 ...
##  $ vs  : num  0 0 1 1 0 1 0 1 1 1 ...
##  $ am  : num  1 1 1 0 0 0 0 0 0 0 ...
##  $ gear: num  4 4 4 3 3 3 3 4 4 4 ...
##  $ carb: num  4 4 1 1 2 1 4 2 2 4 ...
```

---
## **Data Frames**: summarizing columns

```r
> summary(mtcars)
```

```
##       mpg             cyl             disp             hp       
##  Min.   :10.40   Min.   :4.000   Min.   : 71.1   Min.   : 52.0  
##  1st Qu.:15.43   1st Qu.:4.000   1st Qu.:120.8   1st Qu.: 96.5  
##  Median :19.20   Median :6.000   Median :196.3   Median :123.0  
##  Mean   :20.09   Mean   :6.188   Mean   :230.7   Mean   :146.7  
##  3rd Qu.:22.80   3rd Qu.:8.000   3rd Qu.:326.0   3rd Qu.:180.0  
##  Max.   :33.90   Max.   :8.000   Max.   :472.0   Max.   :335.0  
##       drat             wt             qsec             vs        
##  Min.   :2.760   Min.   :1.513   Min.   :14.50   Min.   :0.0000  
##  1st Qu.:3.080   1st Qu.:2.581   1st Qu.:16.89   1st Qu.:0.0000  
##  Median :3.695   Median :3.325   Median :17.71   Median :0.0000  
##  Mean   :3.597   Mean   :3.217   Mean   :17.85   Mean   :0.4375  
##  3rd Qu.:3.920   3rd Qu.:3.610   3rd Qu.:18.90   3rd Qu.:1.0000  
##  Max.   :4.930   Max.   :5.424   Max.   :22.90   Max.   :1.0000  
##        am              gear            carb      
##  Min.   :0.0000   Min.   :3.000   Min.   :1.000  
##  1st Qu.:0.0000   1st Qu.:3.000   1st Qu.:2.000  
##  Median :0.0000   Median :4.000   Median :2.000  
##  Mean   :0.4062   Mean   :3.688   Mean   :2.812  
##  3rd Qu.:1.0000   3rd Qu.:4.000   3rd Qu.:4.000  
##  Max.   :1.0000   Max.   :5.000   Max.   :8.000
```

---
## **Data Frames**: exploring columns (cont.)

Alternative ways to access a data frame's variable(s):

```r
> mtcars[["mpg"]]
```

```
##  [1] 21.0 21.0 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 17.8 16.4 17.3
## [14] 15.2 10.4 10.4 14.7 32.4 30.4 33.9 21.5 15.5 15.2 13.3 19.2 27.3
## [27] 26.0 30.4 15.8 19.7 15.0 21.4
```

```r
> mtcars[1:10, c("mpg", "cyl")]
```

```
##                    mpg cyl
## Mazda RX4         21.0   6
## Mazda RX4 Wag     21.0   6
## Datsun 710        22.8   4
## Hornet 4 Drive    21.4   6
## Hornet Sportabout 18.7   8
## Valiant           18.1   6
## Duster 360        14.3   8
## Merc 240D         24.4   4
## Merc 230          22.8   4
## Merc 280          19.2   6
```

---
## **Data Frames**: creating new variables

```r
> mtcars$mpg_squared <- mtcars$mpg * mtcars$mpg
> mtcars[1:10, c("mpg", "mpg_squared")]
```

```
##                    mpg mpg_squared
## Mazda RX4         21.0      441.00
## Mazda RX4 Wag     21.0      441.00
## Datsun 710        22.8      519.84
## Hornet 4 Drive    21.4      457.96
## Hornet Sportabout 18.7      349.69
## Valiant           18.1      327.61
## Duster 360        14.3      204.49
## Merc 240D         24.4      595.36
## Merc 230          22.8      519.84
## Merc 280          19.2      368.64
```

---
## **Data Frames**: more on indexing

When creating an index, we can also use multiple conditions

* to satisfy BOTH conditions use `&` (and)
* to satisfy EITHER condition use `|` (or)

```r
> mtcars[mtcars$mpg < 25 & mtcars$mpg > 21, c("mpg", "cyl")]
```

```
##                 mpg cyl
## Datsun 710     22.8   4
## Hornet 4 Drive 21.4   6
## Merc 240D      24.4   4
## Merc 230       22.8   4
## Toyota Corona  21.5   4
## Volvo 142E     21.4   4
```

---
## **Data Frames**: more on indexing (cont.)

(remember: variables are just vectors, so we can use what we learned earlier)

```r
> cbind(mtcars$mpg, mtcars$mpg < 15 | mtcars$mpg > 20)[1:10,]
```

```
##       [,1] [,2]
##  [1,] 21.0    1
##  [2,] 21.0    1
##  [3,] 22.8    1
##  [4,] 21.4    1
##  [5,] 18.7    0
##  [6,] 18.1    0
##  [7,] 14.3    1
##  [8,] 24.4    1
##  [9,] 22.8    1
## [10,] 19.2    0
```

---
## **Data Frames**: more on indexing (cont.)

And we can use multiple variables

```r
> table(mtcars$mpg > 30 & mtcars$cyl == 6)
```

```
## 
## FALSE 
##    32
```

```r
> table(mtcars$mpg > 30 & mtcars$cyl == 4)
```

```
## 
## FALSE  TRUE 
##    28     4
```

---
## **Data Frames**: final indexing example

```r
> hi_mpg <- mtcars$mpg > mean(mtcars$mpg)
> hi_cyl <- mtcars$cyl == 4
> table(hi_mpg, hi_cyl)
```

```
##        hi_cyl
## hi_mpg  FALSE TRUE
##   FALSE    18    0
##   TRUE      3   11
```

---
## **Data Frames**: final indexing example (cont.)

```r
> mtcars$good_car <- FALSE
> mtcars$good_car[hi_mpg & hi_cyl] <- TRUE
> table(mtcars$good_car)
```

```
## 
## FALSE  TRUE 
##    21    11
```
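
Because `hi_mpg & hi_cyl` is already a vector of `TRUE`s and `FALSE`s, the same result can also be reached in one step. A sketch using a stand-alone vector (`good_car_direct` is a made-up name) so that `mtcars` is left as it is:

```r
hi_mpg <- mtcars$mpg > mean(mtcars$mpg)
hi_cyl <- mtcars$cyl == 4

good_car_direct <- hi_mpg & hi_cyl   # TRUE only when both conditions hold
table(good_car_direct)
```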

---
## **Data Frames**: final indexing example (cont.)

Sanity check

```r
> # cbind(mtcars$good_car, hi_mpg, hi_cyl, mtcars$mpg, mtcars$cyl)
> cbind(mtcars$good_car, hi_mpg, hi_cyl)[1:15,]
```

```
##             hi_mpg hi_cyl
##  [1,] FALSE   TRUE  FALSE
##  [2,] FALSE   TRUE  FALSE
##  [3,]  TRUE   TRUE   TRUE
##  [4,] FALSE   TRUE  FALSE
##  [5,] FALSE  FALSE  FALSE
##  [6,] FALSE  FALSE  FALSE
##  [7,] FALSE  FALSE  FALSE
##  [8,]  TRUE   TRUE   TRUE
##  [9,]  TRUE   TRUE   TRUE
## [10,] FALSE  FALSE  FALSE
## [11,] FALSE  FALSE  FALSE
## [12,] FALSE  FALSE  FALSE
## [13,] FALSE  FALSE  FALSE
## [14,] FALSE  FALSE  FALSE
## [15,] FALSE  FALSE  FALSE
```

---
## **Recap & Moving Forward**

* You should now be familiar with a few of R's data structures

    + (and know when each should be used: # of dimensions & data types)
  
* We have also been introduced to some useful functions for manipulating, summarizing,
and exploring data

    + There are many more(!), and users contribute **R packages** that implement a wide
      range of tools, models, and methods: [list of some packages on CRAN](https://cran.r-project.org/)

---
## **Recap & Moving Forward** (cont.)

* R comes installed with many packages that you can explore & access with the `library()`
function

```r
# library()              # list all the packages installed on your computer
library(stats)           # load the stats package
# help(package="stats")  # look at the package documentation
```

* In future sessions, we will explore some packages that are particularly useful
for

    + sending results to paper: [rmarkdown](https://rmarkdown.rstudio.com/)
    + making maps: [sf](https://r-spatial.org/book/07-Introsf.html)

* Please join us 😄