filter()
This lesson is called filter(), part of the Fundamentals of R course.
Code Shown in Video
# Load Packages -----------------------------------------------------------
library(tidyverse)
# Import Data -------------------------------------------------------------
penguins <- read_csv("penguins.csv")
# filter() ----------------------------------------------------------------
# We use filter() to choose a subset of observations.
# We use == to select all observations that meet the criteria.
penguins |>
  filter(species == "Adelie")
# We use != to select all observations that don't meet the criteria.
penguins |>
  filter(species != "Adelie")
# We can use filter_out() to do the same thing.
penguins |>
  filter_out(species == "Adelie")
# We can combine comparisons and logical operators.
penguins |>
  filter(species == "Adelie" | species == "Chinstrap")
# We can use %in% to collapse multiple comparisons into one.
penguins |>
  filter(species %in% c("Adelie", "Chinstrap"))
# We can also use when_any() to do the same thing.
penguins |>
  filter(when_any(species == "Adelie", species == "Chinstrap"))
# We can chain together multiple filter functions.
# Doing it this way, we don't have to create complex logic in one line.
# Complicated version
penguins |>
  filter(
    species %in% c("Adelie", "Chinstrap") & island == "Torgersen"
  )
# Simpler version
penguins |>
  filter(species %in% c("Adelie", "Chinstrap")) |>
  filter(island == "Torgersen")
# when_all() version
penguins |>
  filter(
    when_all(
      species %in% c("Adelie", "Chinstrap"),
      island == "Torgersen"
    )
  )
# We can use <, >, <=, and >= for numeric data.
penguins |>
  filter(body_mass_g > 4000)
# We can drop NAs with !is.na().
penguins |>
  filter(!is.na(sex))
# But the double negative is confusing.
# We can also drop NAs with drop_na().
penguins |>
  drop_na(sex)
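One consequence of the missing-value handling in the code shown in the video is easy to miss: filter() keeps only the rows where the condition is TRUE, so rows where a comparison evaluates to NA are dropped as well. The sketch below illustrates this on a small made-up tibble called toy. The tibble and its values are assumptions for illustration, a hypothetical stand-in for penguins.csv, not the real data.

```r
library(tidyverse)

# Hypothetical toy data standing in for penguins.csv (values are made up).
toy <- tibble(
  id = 1:5,
  species = c("Adelie", "Chinstrap", "Gentoo", "Adelie", "Gentoo"),
  sex = c("female", NA, "male", "male", NA),
  body_mass_g = c(3700, NA, 5000, 4100, 4600)
)

# A comparison with NA evaluates to NA, and filter() keeps only TRUE rows,
# so the two rows with a missing sex are dropped here too.
not_male <- toy |>
  filter(sex != "male")

# To keep the rows with missing values, ask for them explicitly.
not_male_or_missing <- toy |>
  filter(sex != "male" | is.na(sex))

# The same rule applies to numeric comparisons:
# the row with a missing body mass is dropped.
heavy <- toy |>
  filter(body_mass_g > 4000)

not_male$id            # 1
not_male_or_missing$id # 1 2 5
heavy$id               # 3 4 5
```

On the real penguins data, the same rule means that filter(sex != "male") would not return the penguins whose sex is missing.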
Your Turn
# Load Packages -----------------------------------------------------------
# Load the tidyverse package
library(tidyverse)
# Import Data -------------------------------------------------------------
penguins <- read_csv("penguins.csv")
# filter() ----------------------------------------------------------------
# Use filter() to only keep female penguins
# YOUR CODE HERE
# Use filter() to only keep penguins NOT on Torgersen island
# YOUR CODE HERE
# Use filter() to only keep penguins on Torgersen island or Biscoe island
# Use the or logical operator (|) to do this
# YOUR CODE HERE
# Rewrite your filter() code above to keep the penguins from Torgersen island or Biscoe island
# This time, though, use the %in% operator
# YOUR CODE HERE
# Use a comparison operator to keep penguins with flipper lengths greater than or equal to 193 millimeters
# YOUR CODE HERE
# Drop any rows that have missing data in the flipper_length_mm variable
# Do this first with !is.na()
# YOUR CODE HERE
# Do this a second time with drop_na()
# YOUR CODE HERE
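Several of these exercises ask for the same result written two ways. One way to check your own answers is to save each version to an object and compare them. The sketch below does this on a made-up tibble called toy, using different columns than the exercises. The tibble, its values, and the object names are assumptions for illustration.

```r
library(tidyverse)

# Hypothetical toy data (made-up values), not the real penguins.csv.
toy <- tibble(
  id = 1:6,
  species = c("Adelie", "Chinstrap", "Gentoo", "Adelie", NA, "Gentoo"),
  bill_length_mm = c(39.1, 48.7, NA, 36.7, 45.2, 50.0)
)

# Two spellings of the same species filter.
with_or <- toy |>
  filter(species == "Adelie" | species == "Chinstrap")
with_in <- toy |>
  filter(species %in% c("Adelie", "Chinstrap"))

# Two spellings of the same missing-value filter.
with_is_na <- toy |>
  filter(!is.na(bill_length_mm))
with_drop_na <- toy |>
  drop_na(bill_length_mm)

# all.equal() reports TRUE when the two results match,
# or describes the differences otherwise.
all.equal(with_or, with_in)
all.equal(with_is_na, with_drop_na)

# Row counts are a quick second check.
nrow(with_or)    # 3
nrow(with_is_na) # 5
```

The same pattern works on the penguins data: save each version of your answer to its own object, then compare the two.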
Learn More
To learn more about the filter() function, check out Chapter 3 of R for Data Science.