Module 9: Visualization in R – Base Graphics, Lattice, and ggplot2


Choose one dataset from the Rdatasets collection:

data("iris", package = "datasets")
head(iris)

df <- iris

Base R Graphics
Create at least two plots using base R functions.

# Scatterplot:
species_cols <- c(setosa = "steelblue", versicolor = "orange", virginica = "forestgreen")
plot(df$Sepal.Length, df$Sepal.Width,
     main = "Base: Sepal.Width vs Sepal.Length",
     xlab = "Sepal Length", ylab = "Sepal Width",
     # Index by name; indexing by the factor itself would use its integer codes
     col = species_cols[as.character(df$Species)], pch = 19)
legend("topright", legend = levels(df$Species),
       col = species_cols, pch = 19, bty = "n")

# Histogram:
hist(df$Petal.Length,
     main = "Base: Distribution of Petal Length",
     xlab = "Petal Length")

Lattice Graphics
Use the lattice package to produce conditioned or multivariate plots.

# Lattice Graphics
library(lattice)

# Conditional scatter (small multiples)
xyplot(Sepal.Width ~ Sepal.Length | Species,
       data = df,
       main = "Lattice: Sepal.Width vs Sepal.Length by Species",
       pch = 16)

# Box-and-whisker plot:
bwplot(Petal.Length ~ Species,
       data = df,
       main = "Lattice: Petal Length by Species")

ggplot2
Use ggplot2’s grammar of graphics to create layered visuals.

# ggplot2
library(ggplot2)

# Scatter plot with smoothing
ggplot(df, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point() +
  geom_smooth(method = "lm", se = FALSE) +
  labs(title = "ggplot2: Sepal.Width vs Sepal.Length with Linear Trend")

# Faceted histogram:
ggplot(df, aes(Petal.Length)) +
  geom_histogram(binwidth = 0.2) +
  facet_wrap(~ Species) +
  labs(title = "ggplot2: Petal Length Distribution by Species",
       x = "Petal Length", y = "Count")

  • How do the syntax and workflow differ between base, lattice, and ggplot2?
  • Which system gave you the most control or produced the most “publication‑quality” output with minimal code?
  • What challenges or surprises did you encounter when switching between systems?

The syntax and workflow of base R graphics, lattice, and ggplot2 differ mainly in when and how a plot gets drawn. Base R graphics is hands-on and step-by-step: each command draws on the current graphics device as soon as it runs. For example, to make a scatter plot with colors and a legend, you call plot() to draw the points and then legend() to add the key on top. This gives you a lot of control over how the graph looks and lets you see changes right away, but it gets repetitive and time-consuming for more complex or multi-panel visualizations.
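To make the step-by-step workflow concrete, here is a minimal sketch using the same iris data. The choice of petal measurements, the fitted line, and the colors are my own illustrative assumptions, not part of the assignment.

```r
# Base graphics: each call draws immediately on the current device.
df <- iris

# Step 1: plot() opens a new plot and draws the points.
plot(df$Petal.Length, df$Petal.Width,
     main = "Base: building a plot one call at a time",
     xlab = "Petal Length", ylab = "Petal Width",
     pch = 19, col = "grey40")

# Step 2: fit a simple linear model (illustrative only).
fit <- lm(Petal.Width ~ Petal.Length, data = df)

# Step 3: abline() and grid() add to the plot that is already on screen.
abline(fit, col = "firebrick", lwd = 2)
grid()
```

Nothing is stored along the way; to change the point color you rerun the whole sequence from plot() onward.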

Lattice graphics work a bit differently because they use a formula-style syntax, such as y ~ x | group, to show how a relationship changes across groups. This makes it easy to create a panel of related plots with a single command. Unlike base R, where each call draws immediately, lattice functions return a plot object that is drawn when it is printed, which makes plots easier to store, modify, and reuse. However, for a newer coder, customizing lattice plots can be tricky, since changes usually go through extra arguments (for example, panel functions or par.settings) rather than follow-up function calls. It’s powerful once you get the hang of it, but it takes more practice to make the plots look exactly how you want.
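The "plots are objects" point can be sketched as follows. The variable names p and p2 and the specific layout are assumptions chosen for illustration.

```r
library(lattice)

df <- iris

# xyplot() returns an object; nothing is drawn until it is printed.
p <- xyplot(Sepal.Width ~ Sepal.Length | Species,
            data = df, pch = 16)
class(p)  # "trellis"

# update() returns a modified copy, so the original p is left unchanged.
p2 <- update(p,
             layout = c(3, 1),  # three panels in one row
             main = "Lattice: updated copy of a stored plot")

# Printing the object is what actually draws it.
print(p2)
```

At the console the print step happens automatically, which is why lattice can look like it draws immediately; inside a function or loop you need the explicit print().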

ggplot2 takes a different approach from both. It is built on the “Grammar of Graphics,” in which a plot is assembled by adding layers. You start by mapping data to aesthetics with aes(), then add pieces like points, lines, or smooth trend lines using geom_ functions. This method is flexible and easy to adjust: if you want to change one part of the plot, you can update that layer instead of redoing the whole thing. Another significant advantage is that ggplot2 builds legends, color scales, and a default theme for you, so plots look professional with very little code. For someone newer to R, this makes ggplot2 a powerful and user-friendly option for creating high-quality visualizations.
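Here is a small sketch of the layering idea, saving each stage under its own name. The names base_plot, with_trend, and polished, and the choice of theme_minimal(), are illustrative assumptions.

```r
library(ggplot2)

df <- iris

# Start with data, aesthetic mappings, and a single layer of points.
base_plot <- ggplot(df, aes(x = Petal.Length, y = Petal.Width, color = Species)) +
  geom_point()

# Add a second layer: one linear trend line per species.
with_trend <- base_plot +
  geom_smooth(method = "lm", formula = y ~ x, se = FALSE)

# Change only the styling; the data and layers are untouched.
polished <- with_trend +
  theme_minimal() +
  labs(title = "ggplot2: layers added one at a time",
       x = "Petal Length", y = "Petal Width")

print(polished)
```

Because each stage is its own object, you can print base_plot or with_trend at any time to compare them, and the legend for Species is generated from the color mapping without a separate call.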

Comparing the three systems, I found that ggplot2 provided the best balance of control and simplicity. Its default visuals already look clean and professional, and adding layers made it easy to keep improving the plot step by step. Lattice was great for showing different groups of data with very little code, but it took some time to understand how to customize it. Base R graphics provided the most direct control, but required more lines of code and additional steps to make the plots look polished. The hardest part was adjusting to the different coding styles, especially moving from base R’s draw-as-you-go commands to ggplot2’s layered approach. Once I got used to it, though, ggplot2 felt the most beginner-friendly and the best choice for creating clear, professional-looking graphs.
