Resources

Rapid RAG Prototyping

Rapid RAG Prototyping shows how to build a Retrieval-Augmented Generation (RAG) prototype in R with the ellmer package and DuckDB, augmenting a large language model with domain-specific knowledge. RAG addresses a common limitation of large language models: they often lack current or specialized information. The ellmer package provides an interface to various LLM providers, with features such as tool calling and data extraction. DuckDB contributes high-performance data processing and efficient query handling. Together they make a practical toolkit for quickly prototyping LLM-powered applications.

Go to Resource

Re-constructing Google Forms responses with Quarto and {glue}

This blog post by Eric R. Scott explains how to use Quarto and the {glue} R package to turn Google Forms responses into a more readable, application-like format. Starting from a Google Sheet that collated data from 50 applicants to a short course, Scott walks through converting long, cumbersome answers into clean documents, one per applicant. The post details importing the data with the {googlesheets4} package and cleaning up column names with {janitor} and {stringr}. The key step is combining Quarto's 'asis' output chunk option with {glue} to generate markdown programmatically from the dataset.

Go to Resource

Reactable Tutorial

This tutorial shows how to use the reactable package in R to create interactive data tables. It covers setting up the package, modifying column properties, adjusting width and alignment, rendering cell values as HTML, and grouping and aggregating rows.

Go to Resource

Read files on the web into R

June Choe's tutorial is for R users who want to read files directly from the web into their R session, skipping the manual download step. It covers several data sources: public GitHub repos, gists, private repos, and OSF. Techniques include reading CSV files through 'raw.githubusercontent.com' URLs and handling binary files that can't be displayed as plain text. The post also covers streaming with {duckdb}, offers miscellaneous tips for efficient data import in R, and includes sessionInfo() output.

Go to Resource

Read hundreds of Excel files into one dataset with one line of code #shorts #excel #rstats - YouTube

Learn how to read multiple Excel files into one dataset using R with just one line of code.

Go to Resource

Rebecca Barter - Learn to purrr

Learn about the purrr package in R, which provides map functions for iterating over and manipulating lists.

Go to Resource

Recreate a real-world, complex dataviz with R & ggplot - YouTube

Go to Resource

Remove or Hide Legends in ggplot2 – Theme, Guides, Scales & Tips

This practical tutorial demonstrates how to remove or hide legends in {ggplot2} plots, covering both complete legend removal and selective legend management for plots with multiple layers. The post shows several approaches: theme settings, the guides() function, and scale modifications. It is a handy reference for the common situation where you need fine-grained control over which legends to display, especially in complex multi-layered visualizations.

Go to Resource

Rendering your README with GitHub Actions

This tutorial explains how to use GitHub Actions to automatically render your README.Rmd file to README.md on GitHub.

Go to Resource

Reproducible Data Science in R: Iterate, don't duplicate

This blog post on the Water Data For The Nation Blog guides novice to intermediate R users on how to achieve reproducible data science by replacing code duplication with iteration techniques. It introduces the 'map()' function from the purrr package, explaining its advantages over copy/paste approaches and for loops. The post covers mapping techniques, the usage of lists, various map_*() function variants, and working with multiple inputs or no outputs. It is part of a series aimed at building functional programming skills and creating efficient data workflows with the targets R package.

Go to Resource

Reproducible Data Science in R: Say the quiet part out loud with assertion tests

This blog post explores the role of assertion tests in reproducible data science with R. The author, Anthony Martinez, shows how to make R functions more robust with assertions, beginning with basic checks and progressing to more expressive error messages. The post, intended for novice and intermediate R users, is part of a series on functional programming and reproducibility using the targets package. It highlights the importance of failing early with clear messages in multi-person projects and offers examples that use the geoconnex.us database to retrieve Hydrologic Unit boundary polygons.

Go to Resource

Reproducible Data Science in R: Writing functions that work for you

This blog post from the Water Data For The Nation Blog shows how to write custom functions in R for reproducible data science, particularly with water-related data. Starting from the basics, it emphasizes the benefits of replacing repeated tasks with functions: consistency, fewer errors, and shorter code. The post covers function essentials and environments in R, with a step-by-step tutorial using Water Quality Portal data. It is aimed at readers with basic programming experience who want to advance their skills, and it prepares them for more advanced function usage in R.

Go to Resource