Module 11: Debugging and Defensive Programming in R

GitHub Repository: AustinTCurtis/r-programming-assignments

We’re working with a function designed to flag rows in a numeric matrix that are outliers in every column according to the Tukey rule (a value counts as an outlier when it lies more than 1.5 × IQR below the first quartile or above the third quartile). However, the original code contains a deliberate bug in the line that uses the && operator.

Original (Buggy) Code:

# Helper function for Tukey's rule
tukey.outlier <- function(v) {
  qs  <- stats::quantile(v, c(0.25, 0.75), na.rm = TRUE)
  iqr <- diff(qs)
  (v < qs[1] - 1.5 * iqr) | (v > qs[2] + 1.5 * iqr)
}

# Original (buggy) function
tukey_multiple <- function(x) {
  outliers <- array(TRUE, dim = dim(x))
  for (j in 1:ncol(x)) {
    outliers[, j] <- outliers[, j] && tukey.outlier(x[, j])  # <-- BUG: '&&'
  }
  outlier.vec <- vector("logical", length = nrow(x))
  for (i in 1:nrow(x)) {
    outlier.vec[i] <- all(outliers[i, ])
  }
  return(outlier.vec)
}

Reproducing the Error:

set.seed(123)
test_mat <- matrix(rnorm(50), nrow = 10)
tukey_multiple(test_mat)

Error in outliers[, j] && tukey.outlier(x[, j]) : 
  'length = 10' in coercion to 'logical(1)'

Diagnosing the Bug:
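- The error occurs because && is a scalar operator. In R 4.3.0 and later it requires each operand to have length one, so passing the length-10 vectors outliers[, j] and tukey.outlier(x[, j]) raises the error shown above. In earlier versions, && looked only at the first element of each vector and returned a single TRUE/FALSE (R 4.2 added a warning).
- The failure comes from && itself, not from the assignment: in older R versions the length-1 result would have been silently recycled down all 10 rows of outliers[, j].
- So even where no error is thrown, the logic is still incorrect: each column collapses into one Boolean value instead of one flag per row.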
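
To make the difference concrete, here is a minimal sketch comparing the two operators on short vectors. The names a, b, elementwise, scalar_attempt, and col are made up for illustration, and the behavior of && depends on the R version as noted in the comments.

```r
a <- c(TRUE, TRUE, FALSE)
b <- c(TRUE, FALSE, FALSE)

# `&` works element by element and keeps the full length
elementwise <- a & b
print(elementwise)

# `&&` expects length-one operands. On R >= 4.3.0 this is an error, so catch it
# and keep the message; older versions return a single value instead
# (R 4.2 also emits a warning).
scalar_attempt <- tryCatch(a && b, error = function(e) conditionMessage(e))
print(scalar_attempt)

# Assigning a single value into a longer slice is recycled, not an error,
# which is why the old behavior of `&&` failed silently.
col <- logical(3)
col[] <- TRUE
print(col)
```

Either way, scalar_attempt ends up as a single value (an error message or one logical), never one flag per element, which is what the loop in tukey_multiple needs.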

Corrected Function:

corrected_tukey <- function(x) {
  # Defensive checks
  if (!is.matrix(x)) stop("`x` must be a matrix.")
  if (!is.numeric(x)) stop("`x` must be a numeric matrix (no character/factor columns).")
  if (ncol(x) == 0L || nrow(x) == 0L) stop("`x` must have at least 1 row and 1 column.")
  
  outliers <- array(TRUE, dim = dim(x))
  for (j in seq_len(ncol(x))) {
    outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])  # <-- FIX: '&'
  }
  outlier.vec <- logical(nrow(x))
  for (i in seq_len(nrow(x))) {
    outlier.vec[i] <- all(outliers[i, ])
  }
  outlier.vec
}
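
Since & is already vectorized, the two loops are not strictly needed. The following is a sketch of one possible loop-free variant, not part of the original assignment; the name tukey_vectorized and the small demo matrix are hypothetical, and it assumes the same tukey.outlier helper (copied here so the block runs on its own).

```r
# Copy of the post's helper so this block is self-contained
tukey.outlier <- function(v) {
  qs  <- stats::quantile(v, c(0.25, 0.75), na.rm = TRUE)
  iqr <- diff(qs)
  (v < qs[1] - 1.5 * iqr) | (v > qs[2] + 1.5 * iqr)
}

# Hypothetical alternative to corrected_tukey()
tukey_vectorized <- function(x) {
  stopifnot(is.matrix(x), is.numeric(x), nrow(x) > 0L, ncol(x) > 0L)
  # One logical column of flags per column of x
  flags <- apply(x, 2, tukey.outlier)
  # apply() returns a plain vector when x has one row, so restore the shape
  flags <- matrix(flags, nrow = nrow(x))
  # A row counts only if it is flagged in every column
  apply(flags, 1, all)
}

# Small deterministic check: row 10 is extreme in both columns
demo_mat   <- cbind(c(1:9, 100), c(1:9, 200))
demo_flags <- tukey_vectorized(demo_mat)
print(which(demo_flags))
```

The explicit loops in corrected_tukey stay closer to the original code, which makes the one-character fix easier to see; the vectorized form is simply shorter.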

Validating the Fix:

set.seed(123)
test_mat <- matrix(rnorm(50), nrow = 10)
corrected_tukey(test_mat)

Output (no row is an outlier in all five columns, so every entry is FALSE):

[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
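
An all-FALSE result shows the function runs, but not that it can return TRUE or that the defensive checks fire. As a further check, the sketch below plants an extreme value in every column of one row and passes a non-matrix input. The planted value of 100, the choice of row 3, and the variable names are assumptions made for illustration; the functions are copied from above so the block runs on its own.

```r
# Copies of the post's functions so this block is self-contained
tukey.outlier <- function(v) {
  qs  <- stats::quantile(v, c(0.25, 0.75), na.rm = TRUE)
  iqr <- diff(qs)
  (v < qs[1] - 1.5 * iqr) | (v > qs[2] + 1.5 * iqr)
}
corrected_tukey <- function(x) {
  if (!is.matrix(x)) stop("`x` must be a matrix.")
  if (!is.numeric(x)) stop("`x` must be a numeric matrix (no character/factor columns).")
  if (ncol(x) == 0L || nrow(x) == 0L) stop("`x` must have at least 1 row and 1 column.")
  outliers <- array(TRUE, dim = dim(x))
  for (j in seq_len(ncol(x))) {
    outliers[, j] <- outliers[, j] & tukey.outlier(x[, j])
  }
  outlier.vec <- logical(nrow(x))
  for (i in seq_len(nrow(x))) outlier.vec[i] <- all(outliers[i, ])
  outlier.vec
}

# Plant an obvious outlier in every column of row 3 (illustrative value)
set.seed(123)
planted <- matrix(rnorm(50), nrow = 10)
planted[3, ] <- 100
flagged <- corrected_tukey(planted)
print(which(flagged))

# A data frame is not a matrix, so the first defensive check should stop it
bad_input <- tryCatch(
  corrected_tukey(data.frame(a = 1:3)),
  error = function(e) conditionMessage(e)
)
print(bad_input)
```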
