Introduction to DPLYR with National Park Visitation Data (Solution)

Published

February 26, 2024

Solution

Load National Park Visitation data

Code
np_data <- read.csv("https://raw.githubusercontent.com/melaniewalsh/Neat-Datasets/main/1979-2020-National-Park-Visits-By-State.csv",
 stringsAsFactors = FALSE)

View the np_data dataframe by clicking the spreadsheet icon next to it in RStudio's Global Environment pane.
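If you prefer the console to the spreadsheet view, a few base R functions give a similar overview. The sketch below uses a small made-up data frame, toy_parks, as a hypothetical stand-in with the same column names as np_data (the park names and visit counts are invented, not real figures), so it runs without downloading anything.

```r
# Toy stand-in for np_data: same column names, invented values
toy_parks <- data.frame(
  ParkName = c("Park A", "Park A", "Park B", "Park C"),
  Region = c("Pacific West", "Pacific West", "Pacific West", "Intermountain"),
  State = c("WA", "WA", "CA", "UT"),
  Year = c(1979, 1980, 1979, 1979),
  RecreationVisits = c(100, 150, 300, 50),
  stringsAsFactors = FALSE
)

str(toy_parks)      # column names and types
head(toy_parks, 3)  # first three rows
n_rows <- nrow(toy_parks)    # number of rows
col_names <- names(toy_parks)  # column names as a character vector
```

The same calls work on np_data once it has been loaded.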

Install tidyverse

Code
# Only needed once per machine; installing tidyverse also installs dplyr
install.packages("tidyverse")

Load dplyr library

Code
library(dplyr)

Exercise 1

Select 2 columns from the data using a dplyr function.
Save this 2-column dataframe to the variable smaller_df.
Make sure to use the pipe operator, %>%!

Code
smaller_df <- np_data %>% select(Year, RecreationVisits)

head(smaller_df)
Year  RecreationVisits
1979           2787366
1980           2779666
1981           2997972
1982           3572114
1983           4124639
1984           3734763
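To see what the pipe is doing in the solution above, it can help to compare the piped call with the equivalent nested call. This is a minimal sketch on a made-up data frame, toy_parks (a hypothetical stand-in for np_data with invented values); the pipe simply passes the left-hand side as the first argument of the function on the right.

```r
library(dplyr)

# Toy stand-in for np_data: same column names, invented values
toy_parks <- data.frame(
  ParkName = c("Park A", "Park A", "Park B", "Park C"),
  Region = c("Pacific West", "Pacific West", "Pacific West", "Intermountain"),
  State = c("WA", "WA", "CA", "UT"),
  Year = c(1979, 1980, 1979, 1979),
  RecreationVisits = c(100, 150, 300, 50),
  stringsAsFactors = FALSE
)

# Piped form: the data frame flows into select()
piped <- toy_parks %>% select(Year, RecreationVisits)

# Nested form: the data frame is passed as the first argument directly
nested <- select(toy_parks, Year, RecreationVisits)

# Both forms produce the same 2-column data frame
same_result <- identical(piped, nested)
print(same_result)
```

The piped form tends to read more naturally once several steps are chained together.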

Question: How does the number of visits to Washington's national parks compare to the number of visits to another state's parks?

Exercise 2

Filter the dataframe to keep only the rows for the state of Washington, and save the result to the variable wa_parks.

Code
wa_parks <- np_data %>% filter(State == "WA")

head(wa_parks)
ParkName          Region        State  Year  RecreationVisits
Mount Rainier NP  Pacific West  WA     1979           1516703
Mount Rainier NP  Pacific West  WA     1980           1268256
Mount Rainier NP  Pacific West  WA     1981           1233671
Mount Rainier NP  Pacific West  WA     1982           1007300
Mount Rainier NP  Pacific West  WA     1983           1106306
Mount Rainier NP  Pacific West  WA     1984           1152411
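filter() is not limited to a single equality test. As a sketch of two common variations, the example below combines conditions with a comma (which dplyr treats as AND) and matches several states with %in%. The data frame toy_parks and its values are made up for illustration and are not taken from np_data.

```r
library(dplyr)

# Toy stand-in for np_data: same column names, invented values
toy_parks <- data.frame(
  ParkName = c("Park A", "Park A", "Park B", "Park C"),
  Region = c("Pacific West", "Pacific West", "Pacific West", "Intermountain"),
  State = c("WA", "WA", "CA", "UT"),
  Year = c(1979, 1980, 1979, 1979),
  RecreationVisits = c(100, 150, 300, 50),
  stringsAsFactors = FALSE
)

# Rows that are in Washington AND from 1980 onward
wa_recent <- toy_parks %>% filter(State == "WA", Year >= 1980)

# Rows from either Washington or California
west_coast <- toy_parks %>% filter(State %in% c("WA", "CA"))

print(wa_recent)
print(west_coast)
```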

Exercise 3

Calculate the total of RecreationVisits for Washington by calling summarize() on the smaller dataframe wa_parks.

Code
wa_parks %>% summarize(sum(RecreationVisits))
sum(RecreationVisits)
188798152
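The output column above is named after the expression, sum(RecreationVisits), which is awkward to refer to later. One option, sketched here on a made-up data frame (toy_wa, a hypothetical stand-in for wa_parks with invented values), is to name the summary column inside summarize() and then extract it as a plain number with pull(). The column name total_visits is an arbitrary choice for this example.

```r
library(dplyr)

# Toy stand-in for wa_parks: invented values
toy_wa <- data.frame(
  ParkName = c("Park A", "Park A"),
  State = c("WA", "WA"),
  Year = c(1979, 1980),
  RecreationVisits = c(100, 150),
  stringsAsFactors = FALSE
)

# Name the summary column so it is easy to reference
wa_total <- toy_wa %>% summarize(total_visits = sum(RecreationVisits))

# pull() extracts the column as a vector instead of a 1x1 data frame
wa_total_number <- wa_total %>% pull(total_visits)

print(wa_total)
print(wa_total_number)
```

Having the total as a plain number makes later arithmetic, such as subtracting one state's total from another's, a little simpler.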

Exercise 4

Filter the dataframe to keep only the rows for another state (your choice) and save the result to a variable. Calculate the total of RecreationVisits for this state by using summarize().

Code
ca_parks <- np_data %>% filter(State == "CA")
ca_parks %>% summarize(sum(RecreationVisits))
sum(RecreationVisits)
367814229

Question: How does the number of visits to these 2 states compare?

Code
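# Washington total minus California total
(wa_parks %>% summarize(sum(RecreationVisits))) -
  (ca_parks %>% summarize(sum(RecreationVisits)))
sum(RecreationVisits)
-179016077

The difference is negative: Washington's parks recorded about 179 million fewer recreation visits than California's.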
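The comparison can also be done in one pass, without creating a separate dataframe per state. The sketch below uses group_by(), a dplyr function not covered in the exercises above, on a made-up data frame (toy_parks, with invented values rather than real visitation figures); treat it as one possible approach, not the exercise's intended solution.

```r
library(dplyr)

# Toy stand-in for np_data: same column names, invented values
toy_parks <- data.frame(
  ParkName = c("Park A", "Park A", "Park B", "Park C"),
  Region = c("Pacific West", "Pacific West", "Pacific West", "Intermountain"),
  State = c("WA", "WA", "CA", "UT"),
  Year = c(1979, 1980, 1979, 1979),
  RecreationVisits = c(100, 150, 300, 50),
  stringsAsFactors = FALSE
)

state_totals <- toy_parks %>%
  filter(State %in% c("WA", "CA")) %>%                  # keep the two states of interest
  group_by(State) %>%                                   # one group per state
  summarize(total_visits = sum(RecreationVisits)) %>%   # one total per group
  arrange(desc(total_visits))                           # largest total first

print(state_totals)
```

Swapping toy_parks for np_data should give one row per state, with the totals side by side for comparison.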