Instructions (read first): This is a test‑style, open‑notes, open‑AI formative assessment. Knit a single R Markdown (.Rmd) file to HTML or PDF for submission. You may use AI tools (ChatGPT/Copilot, etc.) to assist with code and writing, but you are responsible for correctness. You may discuss the work with classmates or lab mates, but sharing files is absolutely not allowed.
Policy reminders: Explain reasoning, check assumptions, visualize data, and justify choices. Avoid p‑hacking. Prefer figures that show the data and are accessible (e.g., color‑blind‑safe palettes).
Create two groups (n = 30 each) of a biological measurement:
Group A: Normal with mean 10, SD 2.
Group B: Right‑skewed (log‑normal) with meanlog = 2.3, sdlog = 0.3.
Deliverables: A short paragraph describing the data; a table with sample size, mean, SD; and exploratory plots (histogram and box/violin with points).
# --- Option 1: simulate example scaffold (edit as needed) ---
set.seed(1) # any fixed seed; keeps the simulated data identical across knits
n <- 30
grpA <- rnorm(n, mean = 10, sd = 2)
grpB <- rlnorm(n, meanlog = 2.3, sdlog = 0.3)
group <- factor(rep(c("A","B"), each = n))
y <- c(grpA, grpB)
df <- data.frame(group, y)
summary(df)
# Plots (add labels/titles; consider viridisLite for colors)
hist(grpA, main = "Group A histogram", xlab = "Value")
hist(grpB, main = "Group B histogram", xlab = "Value")
boxplot(y ~ group, data = df, main = "Group comparison", ylab = "Value")
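The deliverables ask for a table of sample size, mean, and SD per group, which summary(df) does not report. A minimal sketch of one way to build it, assuming the same two-group layout as the scaffold above; the data are re-simulated with an arbitrary seed so the block runs on its own, and the names groups and summ are illustrative only.

```r
# Re-create the example data so this block is self-contained (seed is arbitrary)
set.seed(1)
n <- 30
df <- data.frame(
  group = factor(rep(c("A", "B"), each = n)),
  y = c(rnorm(n, mean = 10, sd = 2), rlnorm(n, meanlog = 2.3, sdlog = 0.3))
)

# Split the measurements by group, then compute one summary row per group
groups <- split(df$y, df$group)
summ <- data.frame(
  group  = names(groups),
  n      = vapply(groups, length, integer(1)),
  mean   = vapply(groups, mean, numeric(1)),
  sd     = vapply(groups, sd, numeric(1)),
  median = vapply(groups, median, numeric(1)), # helpful for the skewed group
  row.names = NULL
)
print(summ, digits = 3)
```

In a knitted report, knitr::kable(summ, digits = 2) would render the same data frame as a formatted table.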
Briefly describe the data and the hypothesized biological mechanism generating it. (3–5 sentences.)
# TODO: Diagnostics and proposed transform
# Example scaffolds (edit/extend)
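par(mfrow = c(1,2))
qqnorm(grpA, main = "Group A Q-Q"); qqline(grpA)
qqnorm(grpB, main = "Group B Q-Q"); qqline(grpB)
par(mfrow = c(1,1)) # restore the single-panel layout for later plots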
shapiro.test(grpA)
shapiro.test(grpB)
# Example transform:
y_log <- if (all(y > 0)) log(y) else y # log only if positive
Write 4–6 sentences interpreting your diagnostics and whether a transform is warranted.
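To judge whether a transform is warranted, it can help to put the same diagnostics side by side on the raw and log scales. The sketch below does this for Group B only, assuming the log-normal simulation from the scaffold (re-simulated here with an arbitrary seed); skewness() is a hypothetical helper using a simple moment estimator, not a function from a package.

```r
# Re-create Group B so this block is self-contained (seed is arbitrary)
set.seed(1)
grpB <- rlnorm(30, meanlog = 2.3, sdlog = 0.3)

# Hypothetical helper: sample skewness as the mean cubed z-score
skewness <- function(v) {
  z <- (v - mean(v)) / sd(v)
  mean(z^3)
}

# Same diagnostics before and after the log transform
diag_tab <- data.frame(
  scale     = c("raw", "log"),
  skewness  = c(skewness(grpB), skewness(log(grpB))),
  shapiro_p = c(shapiro.test(grpB)$p.value, shapiro.test(log(grpB))$p.value)
)
print(diag_tab, digits = 3)
```

With n = 30 and sdlog = 0.3 the skew is fairly mild, so the Shapiro-Wilk test may or may not reject on the raw scale; a Q-Q plot of log(grpB) alongside the raw one is usually more informative than the p-value alone.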
Goal: Test whether groups differ in central tendency and communicate uncertainty.
# TODO: Analyses
# Welch t-test
t.test(y ~ group, data = df)
# If transformed:
t.test(y_log ~ group, data = transform(df, y_log = ifelse(y>0, log(y), NA)), na.action = na.omit)
# Mann–Whitney
wilcox.test(y ~ group, data = df, exact = FALSE)
# Effect size (Hedges' g) — quick implementation
hedges_g <- function(x, y){
nx <- length(x); ny <- length(y)
sx2 <- var(x); sy2 <- var(y)
sp <- sqrt(((nx-1)*sx2 + (ny-1)*sy2)/(nx+ny-2))
g <- (mean(x) - mean(y))/sp
J <- 1 - 3/(4*(nx+ny)-9) # small-sample correction
g * J
}
with(df, hedges_g(y[group=="A"], y[group=="B"]))
# CIs for group means (mean +/- 1.96*SE, a normal approximation; qt(0.975, n - 1) gives a slightly wider t-based interval)
agg <- aggregate(y ~ group, df, function(v) c(mean=mean(v), se=sd(v)/sqrt(length(v))))
agg <- data.frame(group = agg$group, mean = agg$y[, "mean"], se = agg$y[, "se"])
agg$lower <- agg$mean - 1.96*agg$se
agg$upper <- agg$mean + 1.96*agg$se
print(agg)
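# Simple CI plot (base R); use numeric x positions, because plot() on a factor draws boxplots instead of points
x_pos <- seq_along(agg$group)
plot(x_pos, agg$mean, xlim = c(0.5, length(x_pos) + 0.5), ylim = range(c(agg$lower, agg$upper)), xaxt = "n", xlab = "Group", ylab = "Mean with 95% CI", pch = 19)
axis(1, at = x_pos, labels = as.character(agg$group))
arrows(x0 = x_pos, y0 = agg$lower, x1 = x_pos, y1 = agg$upper, angle = 90, code = 3, length = 0.05)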
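The effect size above is a point estimate, while the stated goal is also to communicate uncertainty. One option, sketched here under stated assumptions, is a percentile bootstrap interval for Hedges' g: resample each group with replacement and recompute g. The data are re-simulated with an arbitrary seed, hedges_g() repeats the scaffold's formula, and the number of resamples B is an arbitrary choice.

```r
# Re-create the example data so this block is self-contained (seed is arbitrary)
set.seed(1)
n <- 30
grpA <- rnorm(n, mean = 10, sd = 2)
grpB <- rlnorm(n, meanlog = 2.3, sdlog = 0.3)

# Hedges' g: pooled-SD standardized mean difference with small-sample correction
hedges_g <- function(x, y) {
  nx <- length(x); ny <- length(y)
  sp <- sqrt(((nx - 1) * var(x) + (ny - 1) * var(y)) / (nx + ny - 2))
  J <- 1 - 3 / (4 * (nx + ny) - 9)
  J * (mean(x) - mean(y)) / sp
}

# Percentile bootstrap: resample within each group, recompute g each time
B <- 2000
g_boot <- replicate(B, hedges_g(sample(grpA, replace = TRUE),
                                sample(grpB, replace = TRUE)))
g_hat <- hedges_g(grpA, grpB)
g_ci <- quantile(g_boot, c(0.025, 0.975))
cat(sprintf("Hedges' g = %.2f, 95%% bootstrap CI [%.2f, %.2f]\n",
            g_hat, g_ci[1], g_ci[2]))
```

Resampling within groups keeps the two sample sizes fixed. The percentile interval is the simplest bootstrap interval and is only one of several reasonable choices.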
Interpretation (≤150 words): Are results consistent across tests and (if used) transformations? What do the effect sizes and CIs suggest about biological relevance?
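To check whether conclusions agree across tests and scales, the three analyses can be collected into one small table. This is a sketch, assuming all values are positive so the log is defined; the data are re-simulated with an arbitrary seed, and the names tests, res, and ratio are illustrative.

```r
# Re-create the example data so this block is self-contained (seed is arbitrary)
set.seed(1)
n <- 30
df <- data.frame(
  group = factor(rep(c("A", "B"), each = n)),
  y = c(rnorm(n, mean = 10, sd = 2), rlnorm(n, meanlog = 2.3, sdlog = 0.3))
)
stopifnot(all(df$y > 0)) # the log transform below requires positive values
df$y_log <- log(df$y)

# Run the three analyses and gather their p-values in one table
tests <- list(
  welch_raw = t.test(y ~ group, data = df),
  welch_log = t.test(y_log ~ group, data = df),
  wilcoxon  = wilcox.test(y ~ group, data = df, exact = FALSE)
)
res <- data.frame(
  test    = names(tests),
  p_value = vapply(tests, function(h) h$p.value, numeric(1)),
  row.names = NULL
)
print(res, digits = 3)

# On the log scale the Welch test compares mean log values; exponentiating the
# difference and its CI gives a ratio of geometric means (A relative to B)
log_fit <- tests$welch_log
ratio <- exp(c(ratio = unname(log_fit$estimate[1] - log_fit$estimate[2]),
               lower = log_fit$conf.int[1],
               upper = log_fit$conf.int[2]))
print(round(ratio, 2))
```

Note that the three tests address slightly different questions (difference in means, ratio of geometric means, and a shift in distribution), so similar p-values indicate broad agreement, not identical hypotheses.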
Produce one publication‑quality figure comparing groups that shows the data (e.g., box/violin + jitter). Use a color‑blind‑safe palette (e.g., viridisLite::viridis(n=2)). Add an informative caption stating the key take‑home message.
# TODO: Publishable figure (base or ggplot2 ok). Example (base):
cols <- viridisLite::viridis(2)
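# stripchart() takes one color per group, so pass the two palette colors directly (cols[group] would give both groups the same color)
stripchart(y ~ group, data = df, vertical = TRUE, pch = 16, col = adjustcolor(cols, 0.7), method = "jitter", main = "Group comparison (data shown)", ylab = "Value")
# outline = FALSE avoids drawing outliers a second time on top of the jittered points
boxplot(y ~ group, data = df, add = TRUE, outline = FALSE, border = "gray30", col = NA)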
Caption (2–3 sentences) explaining the figure and what readers should notice.
Briefly document (bullets are fine):
- Which prompts you used (paste 1–2 best prompts).
- Where AI helped you move faster.
- One place AI was wrong, unclear, or needed correction, and how you fixed it.
sessionInfo()
## R version 4.5.1 (2025-06-13)
## Platform: aarch64-apple-darwin20
## Running under: macOS Sequoia 15.5
##
## Matrix products: default
## BLAS: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRblas.0.dylib
## LAPACK: /Library/Frameworks/R.framework/Versions/4.5-arm64/Resources/lib/libRlapack.dylib; LAPACK version 3.12.1
##
## locale:
## [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
##
## time zone: America/Chicago
## tzcode source: internal
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## loaded via a namespace (and not attached):
## [1] digest_0.6.37 R6_2.6.1 fastmap_1.2.0 xfun_0.52
## [5] cachem_1.1.0 knitr_1.50 htmltools_0.5.8.1 rmarkdown_2.29
## [9] lifecycle_1.0.4 cli_3.6.5 sass_0.4.10 jquerylib_0.1.4
## [13] compiler_4.5.1 rstudioapi_0.17.1 tools_4.5.1 evaluate_1.0.4
## [17] bslib_0.9.0 yaml_2.3.10 rlang_1.1.6 jsonlite_2.0.0