Going Deeper with R

Bring It All Together (Advanced Data Wrangling)

Your Turn

The import-data.R file is here:

# Load Packages ----------------------------------------------------------

library(tidyverse)
library(janitor)

# Import Data ------------------------------------------------------------

survey_data_raw <-
  read_tsv("data-raw/2020-combined-survey-final.tsv") |>
  clean_names()

coding_languages <-
  survey_data_raw |>
  mutate(id = row_number()) |>
  select(id, qcoding_languages) |>
  separate_longer_delim(
    cols = qcoding_languages,
    delim = ", "
  ) |>
  filter(qcoding_languages != "R")

demographics <-
  survey_data_raw |>
  mutate(id = row_number()) |>
  select(id, qyear_born:qcountry)

# Export Data ------------------------------------------------------------

coding_languages |>
  write_rds("data/coding_languages.rds")

demographics |>
  write_rds("data/demographics.rds")

And the report.qmd file is here:

---
title: R Community Survey
format: html
execute: 
  echo: false
  warning: false
  message: false
---


```{r}
library(tidyverse)
```


```{r}
coding_languages <-
  read_rds("data/coding_languages.rds")

demographics <-
  read_rds("data/demographics.rds")
```

Here is a chart showing the most common languages other than R in the survey data.

```{r}
coding_languages |>
  count(qcoding_languages) |>
  slice_max(
    order_by = n,
    n = 10
  ) |>
  ggplot(
    aes(
      x = n,
      y = qcoding_languages
    )
  ) +
  geom_col()
```

Here is a chart showing the three most common languages other than R among respondents with a bachelor's, master's, or doctoral degree.

```{r}
coding_languages |>
  left_join(demographics, join_by(id)) |>
  group_by(qdegree) |>
  count(qcoding_languages) |>
  slice_max(
    order_by = n,
    n = 3
  ) |>
  ungroup() |>
  filter(
    qdegree %in%
      c(
        "Bachelor’s degree (e.g. BA, BS)",
        "Master’s degree (e.g. MA, MS, MEd)",
        "Doctorate (e.g. PhD, EdD)"
      )
  ) |>
  ggplot(
    aes(
      x = n,
      y = qcoding_languages
    )
  ) +
  geom_col() +
  facet_wrap(vars(qdegree), ncol = 1)
```
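The join, grouped `count()`, and `slice_max()` pattern in the last chunk can be tried on toy data. This is a sketch, not part of the report: the `toy_*` tibbles and the shortened degree labels are invented for illustration. It also shows that `slice_max()` keeps ties by default, so a group can return more rows than `n`.

```r
library(tidyverse)

# Hypothetical long-format languages table (one row per respondent-language pair)
toy_languages <- tibble(
  id = c(1L, 1L, 2L, 3L, 3L, 4L),
  qcoding_languages = c("Python", "SQL", "Python", "SQL", "Julia", "Python")
)

# Hypothetical demographics table (one row per respondent)
toy_demographics <- tibble(
  id = 1:4,
  qdegree = c("Masters", "Masters", "Doctorate", "Doctorate")
)

# Count languages within each degree group
toy_counts <-
  toy_languages |>
  left_join(toy_demographics, join_by(id)) |>
  group_by(qdegree) |>
  count(qcoding_languages)

# Default behaviour: tied rows are all kept
top_with_ties <-
  toy_counts |>
  slice_max(order_by = n, n = 1) |>
  ungroup()

# with_ties = FALSE returns exactly n rows per group
top_no_ties <-
  toy_counts |>
  slice_max(order_by = n, n = 1, with_ties = FALSE) |>
  ungroup()
```

In the toy data all three Doctorate languages are tied at one mention each, so `top_with_ties` has four rows while `top_no_ties` has two.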

Learn More

The data comes from the 2020 R Community Survey.
