Posts

Final Project: Build and Share Your Own R Package

GitHub Repository: AustinTCurtis/AustinCurtis: For LIS4370
For my final project, I created an R package called co2Package, which analyzes annual CO₂ emissions by country. The package uses a real dataset from Our World in Data and includes custom functions, defensive programming, documentation, and a full vignette. This project helped me understand how professional R packages are built, tested, and documented, while also giving me hands-on experience with GitHub version control.
Choosing the Dataset
I selected the Annual CO2 Emissions per Country dataset from Our World in Data because:
It is real-world, meaningful, and relevant to climate policy.
It contains multiple variables suitable for analysis.
It provides an opportunity to visualize long-term global trends.
It fits naturally into a package with summary and plotting functions.
After downloading the CSV file, I imported, cleaned, and standardized the variable names inside my package using a script stored in...

Module 12: Introduction to R Markdown

GitHub Repository: AustinTCurtis/r-programming-assignments
This week, I learned how R Markdown combines plain text, code, and mathematical notation into one cohesive document. Markdown syntax makes it easy to structure reports using headings, lists, bold or italic text, and hyperlinks, while LaTeX provides a clean way to display mathematical expressions both inline (e.g., $\alpha + \beta = \gamma$) and in block form. I found that code chunks and narrative sections integrate seamlessly. Each R code block runs automatically during knitting, and the results appear directly below the explanations. This helps ensure transparency and reproducibility: anyone can see both the code and its output together in context. One of the main challenges I faced was understanding the correct placement of code chunks and ensuring they were surrounded by triple backticks (```{r}). At first, my math syntax didn’t render properly, and my code didn’t execute because I had written it outside the ch...
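To make the chunk-placement point concrete, here is a minimal sketch that writes a tiny R Markdown file from R. The file name, title, and contents are made up for illustration and are not taken from the assignment. The fence string is built with `strrep()` only to keep the example readable.

```r
# Three backticks, the delimiter that opens and closes a code chunk
fence <- strrep("`", 3)

rmd_lines <- c(
  "---",
  'title: "Chunk placement demo"',   # hypothetical title
  "output: html_document",
  "---",
  "",
  "Inline math: $\\alpha + \\beta = \\gamma$",
  "",
  paste0(fence, "{r}"),   # chunk opens: lines below are run when knitting
  "x <- c(1, 2, 3)",
  "mean(x)",
  fence,                  # chunk closes: lines below are narrative again
  "",
  "The result is shown directly under the chunk above."
)

# Write the document to a temporary location
rmd_path <- file.path(tempdir(), "demo.Rmd")
writeLines(rmd_lines, rmd_path)

# With the rmarkdown package installed, rmarkdown::render(rmd_path)
# would knit this file; that step is left out to keep the sketch base-R only.
```

Any R code placed outside the two fence lines would be treated as plain text and would not run, which appears to match the problem described above.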

Module 11: Debugging and Defensive Programming in R

GitHub Repository: AustinTCurtis/r-programming-assignments
We’re working with a function designed to flag rows in a numeric matrix that are outliers in every column according to the Tukey rule. However, the original code contains a deliberate bug, specifically in the line that uses the && operator.
Original (Buggy) Code:
    # Helper function for Tukey's rule
    tukey.outlier <- function(v) {
      qs  <- stats::quantile(v, c(0.25, 0.75), na.rm = TRUE)
      iqr <- diff(qs)
      (v < qs[1] - 1.5 * iqr) | (v > qs[2] + 1.5 * iqr)
    }
    # Original (buggy) function
    tukey_multiple <- function(x) {
      outliers <- array(TRUE, dim = dim(x))
      for (j in 1:ncol(x)) {
        outliers[, j] <- outliers[, j] && tukey.outlier(x[, j])  # <-- BUG: '&&'
      }
      outlier.vec <- vector("logical", length = nrow(x))
      for (i in 1:nrow(x)) {
        outlier.vec[i] <- all(outliers[i, ])
      }
      r...
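One way to repair the loop, sketched under the assumption that the intended behavior is an element-wise AND down each column: replace `&&` with the vectorized `&`. The `&&` operator expects single values (recent R versions raise an error for longer inputs, older ones silently use only the first element). The name `tukey_multiple_fixed`, the input check, and the test matrix are illustrative additions, not the post's own code.

```r
# Helper from the post: flags values outside the Tukey fences
tukey.outlier <- function(v) {
  qs  <- stats::quantile(v, c(0.25, 0.75), na.rm = TRUE)
  iqr <- diff(qs)
  (v < qs[1] - 1.5 * iqr) | (v > qs[2] + 1.5 * iqr)
}

# Hypothetical corrected version (name chosen for illustration)
tukey_multiple_fixed <- function(x) {
  # Defensive check: fail early on input the loop cannot handle
  stopifnot(is.matrix(x), is.numeric(x))
  outliers <- array(TRUE, dim = dim(x))
  for (j in seq_len(ncol(x))) {
    # '&' is vectorized, so each row keeps its own TRUE/FALSE
    outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])
  }
  # A row is flagged only if it is an outlier in every column
  apply(outliers, 1, all)
}

# Made-up data: only row 1 is extreme in both columns
m <- cbind(c(100, 1:9), c(-50, 11:19))
flags <- tukey_multiple_fixed(m)
print(flags)
```

Using `seq_len(ncol(x))` instead of `1:ncol(x)` is a small defensive choice: it yields an empty loop for a zero-column matrix rather than iterating over `1:0`.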

Module 10: Building Your Own R Package

The Curtis R package is designed to help students, researchers, and data analysts explore and visualize vehicle efficiency and emissions data from the U.S. Environmental Protection Agency’s SmartWay program. Its purpose is to simplify the process of cleaning, analyzing, and comparing fuel efficiency trends across different vehicle classes and model years. The target audience includes learners in data science and environmental analytics courses, as well as professionals interested in sustainability reporting and transportation efficiency research. By combining robust data-handling functions with clear visualization tools, Curtis allows users to generate high-quality plots and insights without needing extensive coding experience.
Key Functions
The Curtis package will include several core functions designed for accessibility and analytical depth:
read_smartway(path) – Imports and cleans EPA SmartWay CSV files, ensuring consistent column names and data types.
plot...
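As a rough illustration of what an import-and-clean function like `read_smartway(path)` could look like, here is a minimal sketch. Only the function name and its `path` argument come from the post; the body, the naming rule, and the sample columns are assumptions, and the real SmartWay file layout is not shown here.

```r
# Hypothetical sketch, not the package's actual implementation
read_smartway <- function(path) {
  # Defensive checks before touching the file
  stopifnot(is.character(path), length(path) == 1, file.exists(path))
  # read.csv infers column types; keep text columns as character
  raw <- read.csv(path, stringsAsFactors = FALSE, check.names = FALSE)
  # Standardize names: lower case, runs of other characters become "_"
  nm <- tolower(trimws(names(raw)))
  nm <- gsub("[^a-z0-9]+", "_", nm)
  names(raw) <- gsub("^_+|_+$", "", nm)
  raw
}

# Usage with a tiny made-up file (not real EPA data)
tmp <- tempfile(fileext = ".csv")
writeLines(c("Model Year,Veh Class,Cmb MPG",
             "2020,small car,33",
             "2021,SUV,27"), tmp)
cars <- read_smartway(tmp)
print(names(cars))
```

Normalizing names once at import time means downstream plotting and summary functions can rely on a single spelling of each column.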

Module 9: Visualization in R – Base Graphics, Lattice, and ggplot2

GitHub Repository: AustinTCurtis/r-programming-assignments
Choose one dataset from the Rdatasets collection:
    data("iris", package = "datasets")
    head(iris)
    df <- iris
Base R Graphics
Create at least two plots using base R functions.
    # Scatterplot:
    species_cols <- c(setosa = "steelblue", versicolor = "orange", virginica = "forestgreen")
    plot(df$Sepal.Length, df$Sepal.Width,
         main = "Base: Sepal.Width vs Sepal.Length",
         xlab = "Sepal Length", ylab = "Sepal Width",
         col = species_cols[df$Species], pch = 19)
    legend("topright", legend = levels(df$Species),
           col = species_cols, pch = 19, bty = "n")
    # Histogram:
    hist(df$Petal.Length,
         main = "Base: Distribution of Petal Length",
         xlab = "Petal Length")
Lattice Graphics
Use the lattice package to produce conditioned or multivariate plots.
    # Lattice Graphics
    library(lattice)
    # Conditional scatt...
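The post's own lattice code is cut off above, so the following is only a sketch of what conditioned lattice plots on `iris` might look like, not a reconstruction of the original. The plot titles and object names are assumptions.

```r
library(lattice)  # ships with standard R installations

df <- iris

# Conditioned scatterplot: one panel per species
p_scatter <- xyplot(Sepal.Width ~ Sepal.Length | Species, data = df,
                    main = "Lattice: Sepal.Width vs Sepal.Length by Species",
                    xlab = "Sepal Length", ylab = "Sepal Width")

# Box-and-whisker plot of petal length across species
p_box <- bwplot(Petal.Length ~ Species, data = df,
                main = "Lattice: Petal Length by Species",
                ylab = "Petal Length")

# Lattice functions return "trellis" objects; printing draws them
print(p_scatter)
print(p_box)
```

Unlike base graphics, which draw as a side effect, lattice builds a plot object first. That is why an explicit `print()` is needed inside scripts, functions, and loops.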

Module 8: Input/Output, String Manipulation, and the plyr Package

GitHub Repository: AustinTCurtis/r-programming-assignments
In this post, I’ll walk through each line of code I used for Assignment 8, explaining what every step does and why it’s important. The goal of this assignment was to import a dataset, analyze grades by gender, filter specific names, and export the results into different file formats using RStudio.
Step 1: Importing the Dataset
    student6 <- read.csv(file.choose(), header = TRUE, stringsAsFactors = FALSE)
This line opens an interactive file-chooser window so I can select my dataset (a CSV file). header = TRUE tells R that the first row contains the column names (Name, Age, Sex, Grade). stringsAsFactors = FALSE ensures that text columns (like Name or Sex) stay as character strings rather than being automatically converted to factors.
Step 2: Checking the Data
    head(student6)
    str(student6)
head(student6) displays the first few rows so I can preview the dataset. str(student6) shows each column’s data ty...
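The import, summary, filter, and export steps described above can be sketched end to end as follows. This is a non-interactive illustration: the rows are made up, the filter pattern (names containing the letter "i") is an assumption, and base R's `aggregate()` stands in for a plyr summary so that the example runs without extra packages. Only the column names come from the post.

```r
# Made-up rows for illustration only; column names follow the post
csv_text <- "Name,Age,Sex,Grade
Ava,20,Female,90
Liam,21,Male,80
Mia,22,Female,70
Noah,23,Male,60"

# read.csv(text = ...) replaces file.choose() so the sketch is reproducible
students <- read.csv(text = csv_text, header = TRUE, stringsAsFactors = FALSE)

# Mean grade per gender (base R stand-in for a plyr summary)
grade_by_sex <- aggregate(Grade ~ Sex, data = students, FUN = mean)

# Keep rows whose name contains the letter "i" (pattern is an assumption)
i_names <- students[grepl("i", students$Name, ignore.case = TRUE), ]

# Export the filtered rows to a temporary CSV file
out_file <- tempfile(fileext = ".csv")
write.csv(i_names, out_file, row.names = FALSE)

print(grade_by_sex)
print(i_names)
```

Passing `row.names = FALSE` keeps R's automatic row numbers out of the exported file, which makes it easier to re-import later.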

Module 7: Exploring R’s Object Oriented Systems (S3 & S4)

GITHUB REPOSITORY: r-programming-assignments/Assignment_07.R at main · AustinTCurtis/r-programming-assignments
Choose or Download Data
Load an existing dataset (e.g., data("mtcars")) or download/create your own. Show the first few rows with head() and describe its structure with str().
R CODE:
    # Built-in dataset
    data("mtcars")
    # Peek at the data
    head(mtcars)
    str(mtcars)
Test Generic Functions
Pick one or more base generic functions (e.g., print(), summary(), plot()). Apply them to your dataset or an object derived from it. If a generic does *not* dispatch on your object, explain *why* (e.g., no method defined for that class).
R CODE:
    # These are all generics that dispatch methods based on class
    print(mtcars)          # Uses print.data.frame
    summary(mtcars)        # Uses summary.data.frame
    # Derive an object
    fit <- lm(mpg ~ hp + wt, data = mtcars)
    class(fit)
    summary(fit)  ...
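To illustrate when a generic does and does not find a class-specific method, here is a minimal S3 sketch. The class name `dataset_info` and its constructor are hypothetical and are not part of the assignment.

```r
# Hypothetical S3 class holding the dimensions of a data frame
new_dataset_info <- function(df) {
  stopifnot(is.data.frame(df))
  structure(list(rows = nrow(df), cols = ncol(df)), class = "dataset_info")
}

# print() is an S3 generic, so this method is found through UseMethod()
print.dataset_info <- function(x, ...) {
  cat("dataset_info:", x$rows, "rows x", x$cols, "columns\n")
  invisible(x)
}

info <- new_dataset_info(mtcars)
print(info)   # dispatches to print.dataset_info

# No summary.dataset_info is defined, so summary() falls back to
# summary.default and treats the object as a plain list
s <- summary(info)
print(class(s))
```

The fallback is the S3 answer to the "why" question in the prompt above: when no method exists for any class in `class(x)`, the generic uses its default method if it has one.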