Resources
This curated collection will help you find packages and learning materials for your R journey.
Positron’s AI-powered Databot Tool
Ted Laderas introduces Databot, an experimental AI-powered analysis tool built into Positron (Posit’s new IDE for data science). In this video, Laderas demonstrates how Databot can accelerate exploratory data analysis by automatically writing and executing code to help you understand your data. Using the NHANES public health dataset as an example, he shows how this tool can dramatically speed up the initial stages of data exploration, turning what might take hours into a matter of minutes while keeping the data scientist in control of the process.
Go to Resource
Price index data for the US economy
The 'realtalk' R package provides datasets for common US price indexes such as the CPI-U-RS and PCE, and makes price index data easy to access and manipulate. Users can list the available indexes with available_price_indexes and retrieve a specific one with get_price_index(). The package includes annual, monthly, and quarterly data with various start and end dates, along with installation instructions for getting started. It is especially useful for economists and data analysts working with economic time series.
Go to Resource
Privacy and AI Assistants
This blog post by Simon Couch and Sara Altman from Posit discusses privacy concerns around AI assistants. It provides insights into how AI technology, especially large language models (LLMs), can align with privacy standards. Simon Couch, a software engineer with expertise in R and LLMs, shares his experience developing R packages that enhance LLM capabilities. Sara Altman, a data science educator, highlights the resources available through Posit for open-source data science. The post emphasizes the importance of privacy as these technologies become more prevalent in data analysis and software development.
Go to Resource
Project-oriented workflow
This blog post discusses the importance of a project-oriented workflow in R and provides recommendations for organizing data analysis into self-contained projects.
Go to Resource
purrr
purrr is an R package that enhances R's functional programming (FP) toolkit by providing a complete and consistent set of tools for working with functions and vectors.
Go to Resource
purrr tutorial
A tutorial on using the purrr package in R, including examples and lessons on various topics such as vectors, lists, mapping, list columns, and more.
Go to Resource
Python is not a great language for data science. Part 1: The experience
Claus Wilke disputes the idea that Python is the ideal data science language, while acknowledging its dominance, which he attributes to historical reasons. He argues that Python has notable limitations for certain data science tasks, setting aside deep learning, where Python excels with PyTorch. Wilke shares experiences from his computational biology lab, observing that tasks he considers simple in R often become cumbersome for his Python-using students, which suggests that Python is less well suited to data science work outside of deep learning.
Go to Resource
Qualitative Analysis with Large Language Models • quallmer
The quallmer package uses AI, particularly large language models, for qualitative data analysis. It assists researchers in coding texts, images, PDFs, and tabular and structured data. quallmer simplifies AI-assisted qualitative coding and helps ensure the quality and reliability of AI-generated codes, with functions for codebook creation, coding, comparison, validation, replication, and documenting audit trails. It supports all LLMs available with the ellmer package and includes a Shiny app for an interactive experience.
Go to Resource
qualtRics
The qualtRics R package implements the retrieval of survey data using the Qualtrics API and aims to reduce the pre-processing steps needed in analyzing such surveys.
Go to Resource
Quantitative Analysis of Textual Data with {quanteda}
The quanteda package is an R tool for text analysis, offering functions for processing natural language and managing textual data. It was developed by Kenneth Benoit and Kohei Watanabe with the support of the European Research Council and Quanteda Initiative CIC. quanteda is particularly useful for researchers and students, providing many features that rival proprietary software, but with the openness and flexibility of R. Version 4.0 introduced improved functionality and smarter tokenisation with Unicode support. The package ecosystem has grown into a family, including modules for text modeling, text statistics, text plotting, and sentiment analysis.
Go to Resource
Quarto Extensions
A catalog of Quarto extensions, with metadata such as release date, author, version, and number of GitHub stars. The extensions add functionality to HTML documents, websites, blogs, books, and academic publications. Features include embedding webR, minimalist themes for presentations, APA7 document formatting, countdown timers, vector icons, and interactive elements such as Shinylive and code editors. The extensions support a range of output formats, including PDF, HTML, and slides, and cater to needs in publishing, theming, and interactivity.
Go to Resource