library(tidyverse)
library(readxl)
library(pwr)
library(ggdist)
library(knitr)
library(kableExtra)
This report presents a statistical analysis of sperm motility data collected from Atelopus sp. specimens at Fundación Jambatu as part of an ongoing amphibian cryopreservation research programme. The dataset used here is a simulated representation of pilot experimental findings, produced to protect confidential project information while preserving the statistical characteristics of the original data.
Two cryoprotectant formulations (CPA1 and CPA2) were evaluated across four time lapses (5, 10, 15, and 20 minutes post-activation). The primary outcome measure is the percentage of motile sperm at each time point.
xlr <- read_xlsx("prd.xlsx", range = "A1:C40", col_names = FALSE)
colnames(xlr) <- c("Treatment", "Time_Lapse", "Motility")
xlr$Time_Lapse <- factor(xlr$Time_Lapse,
levels = c(5, 10, 15, 20))
cpa1r <- xlr %>% filter(Treatment == "CPA1")
cpa2r <- xlr %>% filter(Treatment == "CPA2")
cpa1_summary <- cpa1r %>%
group_by(Time_Lapse) %>%
summarize(Mean = mean(Motility, na.rm = TRUE),
SD = sd(Motility, na.rm = TRUE),
N = n())
cpa2_summary <- cpa2r %>%
group_by(Time_Lapse) %>%
summarize(Mean = mean(Motility, na.rm = TRUE),
SD = sd(Motility, na.rm = TRUE),
N = n())
kable(cpa1_summary,
digits = 2,
caption = "Table 1. Mean motility (%) per time lapse — CPA1") %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE)
| Time_Lapse | Mean | SD | N |
|---|---|---|---|
| 5 | 71.0 | 18.40 | 5 |
| 10 | 62.6 | 20.70 | 5 |
| 15 | 52.6 | 7.27 | 5 |
| 20 | 58.0 | 27.09 | 5 |
kable(cpa2_summary,
digits = 2,
caption = "Table 2. Mean motility (%) per time lapse — CPA2") %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE)
| Time_Lapse | Mean | SD | N |
|---|---|---|---|
| 5 | 79.2 | 4.02 | 5 |
| 10 | 69.6 | 3.97 | 5 |
| 15 | 55.6 | 17.43 | 5 |
| 20 | 61.8 | 10.13 | 5 |
Raincloud plots combine a half-violin (distribution shape), a box plot (median and IQR), and the raw jittered data points. This format is recommended for small samples because it communicates distribution shape and individual variation that bar plots hide (Allen et al., 2019; Weissgerber et al., 2015).
pal <- c("CPA1" = "#2E6F8E", "CPA2" = "#1D6A52")
ggplot(xlr, aes(x = Time_Lapse, y = Motility,
fill = Treatment, colour = Treatment)) +
stat_halfeye(adjust = 0.5, width = 0.4, justification = -0.3,
.width = 0, point_colour = NA, alpha = 0.7) +
geom_boxplot(width = 0.15, outlier.shape = NA,
alpha = 0.5, colour = "grey30") +
geom_jitter(width = 0.05, alpha = 0.6, size = 1.8) +
scale_fill_manual(values = pal) +
scale_colour_manual(values = pal) +
facet_wrap(~ Treatment, ncol = 2) +
scale_y_continuous(limits = c(40, 110), breaks = seq(40, 100, 20)) +
labs(x = "Time lapse (minutes)", y = "Motility (%)") +
theme_minimal(base_size = 12) +
theme(legend.position = "none",
strip.text = element_text(face = "bold", size = 11),
panel.grid.minor = element_blank())
Figure 1. Sperm motility across time lapses for CPA1 and CPA2. Half-violins show the distribution; box plots show the median and IQR; points show individual observations.
A two-way ANOVA was fitted with Treatment and Time_Lapse as fixed factors and their interaction term, using Motility as the continuous dependent variable. The null hypotheses are: (1) Treatment has no effect on mean motility; (2) Time_Lapse has no effect on mean motility; (3) the effect of time does not differ between treatments.
anova_two_way <- aov(Motility ~ Treatment * Time_Lapse, data = xlr)
kable(broom::tidy(anova_two_way),
digits = 4,
caption = "Table 3. Two-way ANOVA results.") %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE)
| term | df | sumsq | meansq | statistic | p.value |
|---|---|---|---|---|---|
| Treatment | 1 | 302.5 | 302.5000 | 1.2148 | 0.2786 |
| Time_Lapse | 3 | 2422.8 | 807.6000 | 3.2432 | 0.0347 |
| Treatment:Time_Lapse | 3 | 46.7 | 15.5667 | 0.0625 | 0.9792 |
| Residuals | 32 | 7968.4 | 249.0125 | NA | NA |
Treatment alone did not produce a significant difference in mean motility between CPA1 and CPA2. Time_Lapse was significant, indicating motility changes over time regardless of cryoprotectant. The non-significant interaction term suggests both formulations follow a similar time-dependent trajectory.
To determine whether motility differs significantly across time lapses within each formulation independently, one-way ANOVAs were fitted separately for CPA1 and CPA2.
anova_cpa1 <- aov(Motility ~ factor(Time_Lapse), data = cpa1r)
anova_cpa2 <- aov(Motility ~ factor(Time_Lapse), data = cpa2r)
kable(broom::tidy(anova_cpa1),
digits = 4,
caption = "Table 4. One-way ANOVA — CPA1.") %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE)
| term | df | sumsq | meansq | statistic | p.value |
|---|---|---|---|---|---|
| factor(Time_Lapse) | 3 | 910.55 | 303.5167 | 0.7815 | 0.5215 |
| Residuals | 16 | 6214.40 | 388.4000 | NA | NA |
kable(broom::tidy(anova_cpa2),
digits = 4,
caption = "Table 5. One-way ANOVA — CPA2.") %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE)
| term | df | sumsq | meansq | statistic | p.value |
|---|---|---|---|---|---|
| factor(Time_Lapse) | 3 | 1558.95 | 519.650 | 4.7403 | 0.015 |
| Residuals | 16 | 1754.00 | 109.625 | NA | NA |
CPA1 showed no significant time effect (p > 0.05). CPA2 showed a significant time effect (p < 0.05), warranting post-hoc testing.
model_cpa2 <- aov(Motility ~ factor(Time_Lapse), data = cpa2r)
tukey_res <- TukeyHSD(model_cpa2)
kable(broom::tidy(tukey_res),
digits = 4,
caption = "Table 6. Tukey HSD pairwise comparisons — CPA2.") %>%
kable_styling(bootstrap_options = c("striped", "hover"),
full_width = FALSE) %>%
row_spec(which(broom::tidy(tukey_res)$adj.p.value < 0.05),
bold = TRUE, color = "white", background = "#1D6A52")
| term | contrast | null.value | estimate | conf.low | conf.high | adj.p.value |
|---|---|---|---|---|---|---|
| factor(Time_Lapse) | 10-5 | 0 | -9.6 | -28.5455 | 9.3455 | 0.4886 |
| factor(Time_Lapse) | 15-5 | 0 | -23.6 | -42.5455 | -4.6545 | 0.0124 |
| factor(Time_Lapse) | 20-5 | 0 | -17.4 | -36.3455 | 1.5455 | 0.0776 |
| factor(Time_Lapse) | 15-10 | 0 | -14.0 | -32.9455 | 4.9455 | 0.1907 |
| factor(Time_Lapse) | 20-10 | 0 | -7.8 | -26.7455 | 11.1455 | 0.6487 |
| factor(Time_Lapse) | 20-15 | 0 | 6.2 | -12.7455 | 25.1455 | 0.7861 |
Significant declines in motility were observed between Time 5 and Time 15, and between Time 5 and Time 20 (p adj < 0.05, highlighted above). All other pairwise comparisons were non-significant, suggesting motility stabilises after the initial drop.
The current dataset derives from a pilot sample. A power analysis estimates the minimum observations per group required to detect a medium effect in a future full study with adequate statistical confidence.
pwr_result <- pwr.anova.test(k = 4,
f = 0.25,
sig.level = 0.05,
power = 0.80)
pwr_result
##
## Balanced one-way analysis of variance power calculation
##
## k = 4
## n = 44.59927
## f = 0.25
## sig.level = 0.05
## power = 0.8
##
## NOTE: n is number in each group
With k = 4 groups, Cohen’s f = 0.25 (medium effect), α = 0.05, and power = 0.80, approximately 45 observations per group are required. The current pilot sample falls below this threshold, which explains why the two-way ANOVA found no significant treatment effect — the study is underpowered for detecting between-CPA differences at this sample size.
sessionInfo()
## R version 4.5.3 (2026-03-11)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
##
## Matrix products: default
## BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0 LAPACK version 3.12.0
##
## locale:
## [1] LC_CTYPE=es_ES.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=es_EC.UTF-8 LC_COLLATE=es_ES.UTF-8
## [5] LC_MONETARY=es_EC.UTF-8 LC_MESSAGES=es_ES.UTF-8
## [7] LC_PAPER=es_EC.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=es_EC.UTF-8 LC_IDENTIFICATION=C
##
## time zone: America/Guayaquil
## tzcode source: system (glibc)
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] kableExtra_1.4.0 knitr_1.50 ggdist_3.3.3 pwr_1.3-0
## [5] readxl_1.4.5 lubridate_1.9.4 forcats_1.0.0 stringr_1.5.1
## [9] dplyr_1.1.4 purrr_1.1.0 readr_2.1.5 tidyr_1.3.1
## [13] tibble_3.3.0 ggplot2_3.5.2 tidyverse_2.0.0
##
## loaded via a namespace (and not attached):
## [1] sass_0.4.10 generics_0.1.4 xml2_1.3.8
## [4] stringi_1.8.7 rematch_2.0.0 hms_1.1.3
## [7] digest_0.6.37 magrittr_2.0.3 evaluate_1.0.4
## [10] grid_4.5.3 timechange_0.3.0 RColorBrewer_1.1-3
## [13] fastmap_1.2.0 cellranger_1.1.0 jsonlite_2.0.0
## [16] backports_1.5.0 viridisLite_0.4.2 scales_1.4.0
## [19] textshaping_1.0.1 jquerylib_0.1.4 cli_3.6.5
## [22] rlang_1.1.6 withr_3.0.2 cachem_1.1.0
## [25] yaml_2.3.10 tools_4.5.3 tzdb_0.5.0
## [28] broom_1.0.9 vctrs_0.6.5 R6_2.6.1
## [31] lifecycle_1.0.4 pkgconfig_2.0.3 pillar_1.11.0
## [34] bslib_0.9.0 gtable_0.3.6 glue_1.8.0
## [37] Rcpp_1.1.0 systemfonts_1.3.2 xfun_0.52
## [40] tidyselect_1.2.1 rstudioapi_0.17.1 farver_2.1.2
## [43] htmltools_0.5.8.1 rmarkdown_2.29 svglite_2.2.2
## [46] compiler_4.5.3 distributional_0.5.0