Code
<- read.csv("https://raw.githubusercontent.com/melaniewalsh/Neat-Datasets/main/1979-2020-National-Park-Visits-By-State.csv", stringsAsFactors = FALSE) np_data
August 1, 2024
What is the average number of visits for each state?
Save as avg_state_visits
and then view the resulting dataframe.
Discuss/consider: What state has the most and least average visits? What patterns or surprises do you notice?
What is the average number of visits for each National Park?
Save as avg_park_visits
and then view the resulting dataframe.
`summarise()` has grouped output by 'ParkName'. You can override using the
`.groups` argument.
Discuss/consider: Which National Park has the most and least average visits? What patterns or surprises do you notice?
How many National Parks are there in each state?
Save your answer as distinct_parks
.
Discuss/consider: What state has the most and least average visits? What patterns or surprises do you notice?
---
title: "DPLYR Groupby with National Park Visitation Data (Solution)"
date: "2024-08-01"
categories: [dplyr, exercise, solution]
format:
html:
code-links:
- text: R Script
href: NP-Data-Groupby-Solutions.R
icon: file-code
code-overflow: wrap
code-fold: show
editor: visual
df-print: kable
R.options:
warn: false
code-tools: true
execute:
eval: true
---
# Load the data
```{r}
np_data <- read.csv("https://raw.githubusercontent.com/melaniewalsh/Neat-Datasets/main/1979-2020-National-Park-Visits-By-State.csv", stringsAsFactors = FALSE)
```
# Load dplyr library
```{r warning="ignore"}
library("dplyr")
```
# Exercise 1
What is the average number of visits for *each state*?
Save as `avg_state_visits` and then view the resulting dataframe.
```{r}
avg_state_visits <- np_data %>%
group_by(State) %>%
summarize(avg_visits = mean(RecreationVisits))
```
Discuss/consider: What state has the most and least average visits? What patterns or surprises do you notice?
# Exercise 2
What is the average number of visits for *each National Park*?
Save as `avg_park_visits` and then view the resulting dataframe.
```{r}
avg_park_visits <- np_data %>%
group_by(ParkName, State) %>%
summarize(avg_visits = mean(RecreationVisits))
```
Discuss/consider: Which National Park has the most and least average visits? What patterns or surprises do you notice?
# Exercise 3:
How many National Parks are there in *each state*?
Save your answer as `distinct_parks`.
```{r}
distinct_parks <- np_data %>%
group_by(State) %>%
summarize(num_parks = n_distinct(ParkName))
```
Discuss/consider: What state has the most and least average visits? What patterns or surprises do you notice?