library(tidyverse)
library(readxl)
library(pwr)
library(ggdist)
library(knitr)
library(kableExtra)

Background

This report presents a statistical analysis of sperm motility data collected from Atelopus sp. specimens at Fundación Jambatu as part of an ongoing amphibian cryopreservation research programme. The dataset used here is a simulated representation of pilot experimental findings, produced to protect confidential project information while preserving the statistical characteristics of the original data.

Two cryoprotectant formulations (CPA1 and CPA2) were evaluated across four time lapses (5, 10, 15, and 20 minutes post-activation). The primary outcome measure is the percentage of motile sperm at each time point.

Data

xlr <- read_xlsx("prd.xlsx", range = "A1:C40", col_names = FALSE)
colnames(xlr) <- c("Treatment", "Time_Lapse", "Motility")

xlr$Time_Lapse <- factor(xlr$Time_Lapse,
                         levels = c(5, 10, 15, 20))

cpa1r <- xlr %>% filter(Treatment == "CPA1")
cpa2r <- xlr %>% filter(Treatment == "CPA2")

cpa1_summary <- cpa1r %>%
  group_by(Time_Lapse) %>%
  summarize(Mean = mean(Motility, na.rm = TRUE),
            SD   = sd(Motility,   na.rm = TRUE),
            N    = n())

cpa2_summary <- cpa2r %>%
  group_by(Time_Lapse) %>%
  summarize(Mean = mean(Motility, na.rm = TRUE),
            SD   = sd(Motility,   na.rm = TRUE),
            N    = n())

Descriptive statistics

kable(cpa1_summary,
      digits  = 2,
      caption = "Table 1. Mean motility (%) per time lapse — CPA1") %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE)
Table 1. Mean motility (%) per time lapse — CPA1
Time_Lapse Mean SD N
5 71.0 18.40 5
10 62.6 20.70 5
15 52.6 7.27 5
20 58.0 27.09 5
kable(cpa2_summary,
      digits  = 2,
      caption = "Table 2. Mean motility (%) per time lapse — CPA2") %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE)
Table 2. Mean motility (%) per time lapse — CPA2
Time_Lapse Mean SD N
5 79.2 4.02 5
10 69.6 3.97 5
15 55.6 17.43 5
20 61.8 10.13 5

Visualisation

Raincloud plots combine a half-violin (distribution shape), a box plot (median and IQR), and the raw jittered data points. This format is recommended for small samples because it communicates distribution shape and individual variation that bar plots hide (Allen et al., 2019; Weissgerber et al., 2015).

pal <- c("CPA1" = "#2E6F8E", "CPA2" = "#1D6A52")

ggplot(xlr, aes(x = Time_Lapse, y = Motility,
                fill = Treatment, colour = Treatment)) +
  stat_halfeye(adjust = 0.5, width = 0.4, justification = -0.3,
               .width = 0, point_colour = NA, alpha = 0.7) +
  geom_boxplot(width = 0.15, outlier.shape = NA,
               alpha = 0.5, colour = "grey30") +
  geom_jitter(width = 0.05, alpha = 0.6, size = 1.8) +
  scale_fill_manual(values = pal) +
  scale_colour_manual(values = pal) +
  facet_wrap(~ Treatment, ncol = 2) +
  scale_y_continuous(limits = c(40, 110), breaks = seq(40, 100, 20)) +
  labs(x = "Time lapse (minutes)", y = "Motility (%)") +
  theme_minimal(base_size = 12) +
  theme(legend.position  = "none",
        strip.text        = element_text(face = "bold", size = 11),
        panel.grid.minor  = element_blank())
Figure 1. Sperm motility across time lapses for CPA1 and CPA2. Half-violins show the distribution; box plots show the median and IQR; points show individual observations.

Figure 1. Sperm motility across time lapses for CPA1 and CPA2. Half-violins show the distribution; box plots show the median and IQR; points show individual observations.

Statistical analysis

Two-way ANOVA

A two-way ANOVA was fitted with Treatment and Time_Lapse as fixed factors and their interaction term, using Motility as the continuous dependent variable. The null hypotheses are: (1) Treatment has no effect on mean motility; (2) Time_Lapse has no effect on mean motility; (3) the effect of time does not differ between treatments.

anova_two_way <- aov(Motility ~ Treatment * Time_Lapse, data = xlr)

kable(broom::tidy(anova_two_way),
      digits  = 4,
      caption = "Table 3. Two-way ANOVA results.") %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE)
Table 3. Two-way ANOVA results.
term df sumsq meansq statistic p.value
Treatment 1 302.5 302.5000 1.2148 0.2786
Time_Lapse 3 2422.8 807.6000 3.2432 0.0347
Treatment:Time_Lapse 3 46.7 15.5667 0.0625 0.9792
Residuals 32 7968.4 249.0125 NA NA

Treatment alone did not produce a significant difference in mean motility between CPA1 and CPA2. Time_Lapse was significant, indicating motility changes over time regardless of cryoprotectant. The non-significant interaction term suggests both formulations follow a similar time-dependent trajectory.

One-way ANOVA per treatment

To determine whether motility differs significantly across time lapses within each formulation independently, one-way ANOVAs were fitted separately for CPA1 and CPA2.

anova_cpa1 <- aov(Motility ~ factor(Time_Lapse), data = cpa1r)
anova_cpa2 <- aov(Motility ~ factor(Time_Lapse), data = cpa2r)

kable(broom::tidy(anova_cpa1),
      digits  = 4,
      caption = "Table 4. One-way ANOVA — CPA1.") %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE)
Table 4. One-way ANOVA — CPA1.
term df sumsq meansq statistic p.value
factor(Time_Lapse) 3 910.55 303.5167 0.7815 0.5215
Residuals 16 6214.40 388.4000 NA NA
kable(broom::tidy(anova_cpa2),
      digits  = 4,
      caption = "Table 5. One-way ANOVA — CPA2.") %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE)
Table 5. One-way ANOVA — CPA2.
term df sumsq meansq statistic p.value
factor(Time_Lapse) 3 1558.95 519.650 4.7403 0.015
Residuals 16 1754.00 109.625 NA NA

CPA1 showed no significant time effect (p > 0.05). CPA2 showed a significant time effect (p < 0.05), warranting post-hoc testing.

Tukey HSD post-hoc test — CPA2

model_cpa2 <- aov(Motility ~ factor(Time_Lapse), data = cpa2r)

tukey_res <- TukeyHSD(model_cpa2)

kable(broom::tidy(tukey_res),
      digits  = 4,
      caption = "Table 6. Tukey HSD pairwise comparisons — CPA2.") %>%
  kable_styling(bootstrap_options = c("striped", "hover"),
                full_width = FALSE) %>%
  row_spec(which(broom::tidy(tukey_res)$adj.p.value < 0.05),
           bold = TRUE, color = "white", background = "#1D6A52")
Table 6. Tukey HSD pairwise comparisons — CPA2.
term contrast null.value estimate conf.low conf.high adj.p.value
factor(Time_Lapse) 10-5 0 -9.6 -28.5455 9.3455 0.4886
factor(Time_Lapse) 15-5 0 -23.6 -42.5455 -4.6545 0.0124
factor(Time_Lapse) 20-5 0 -17.4 -36.3455 1.5455 0.0776
factor(Time_Lapse) 15-10 0 -14.0 -32.9455 4.9455 0.1907
factor(Time_Lapse) 20-10 0 -7.8 -26.7455 11.1455 0.6487
factor(Time_Lapse) 20-15 0 6.2 -12.7455 25.1455 0.7861

Significant declines in motility were observed between Time 5 and Time 15, and between Time 5 and Time 20 (p adj < 0.05, highlighted above). All other pairwise comparisons were non-significant, suggesting motility stabilises after the initial drop.

Sample size determination

The current dataset derives from a pilot sample. A power analysis estimates the minimum observations per group required to detect a medium effect in a future full study with adequate statistical confidence.

pwr_result <- pwr.anova.test(k         = 4,
                             f         = 0.25,
                             sig.level = 0.05,
                             power     = 0.80)
pwr_result
## 
##      Balanced one-way analysis of variance power calculation 
## 
##               k = 4
##               n = 44.59927
##               f = 0.25
##       sig.level = 0.05
##           power = 0.8
## 
## NOTE: n is number in each group

With k = 4 groups, Cohen’s f = 0.25 (medium effect), α = 0.05, and power = 0.80, approximately 45 observations per group are required. The current pilot sample falls below this threshold, which explains why the two-way ANOVA found no significant treatment effect — the study is underpowered for detecting between-CPA differences at this sample size.

Conclusions

  • Neither CPA1 nor CPA2 produced a significantly different overall motility outcome relative to the other under the current sample size.
  • Time_Lapse is a significant predictor of motility for both formulations.
  • CPA2 shows a significant drop in motility at minutes 15 and 20 relative to minute 5 (Tukey HSD, p adj < 0.05); CPA1 does not.
  • A minimum of 45 replicates per time group are recommended for a powered confirmatory study.

Session information

sessionInfo()
## R version 4.5.3 (2026-03-11)
## Platform: x86_64-pc-linux-gnu
## Running under: Ubuntu 24.04.4 LTS
## 
## Matrix products: default
## BLAS:   /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.12.0 
## LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.12.0  LAPACK version 3.12.0
## 
## locale:
##  [1] LC_CTYPE=es_ES.UTF-8       LC_NUMERIC=C              
##  [3] LC_TIME=es_EC.UTF-8        LC_COLLATE=es_ES.UTF-8    
##  [5] LC_MONETARY=es_EC.UTF-8    LC_MESSAGES=es_ES.UTF-8   
##  [7] LC_PAPER=es_EC.UTF-8       LC_NAME=C                 
##  [9] LC_ADDRESS=C               LC_TELEPHONE=C            
## [11] LC_MEASUREMENT=es_EC.UTF-8 LC_IDENTIFICATION=C       
## 
## time zone: America/Guayaquil
## tzcode source: system (glibc)
## 
## attached base packages:
## [1] stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] kableExtra_1.4.0 knitr_1.50       ggdist_3.3.3     pwr_1.3-0       
##  [5] readxl_1.4.5     lubridate_1.9.4  forcats_1.0.0    stringr_1.5.1   
##  [9] dplyr_1.1.4      purrr_1.1.0      readr_2.1.5      tidyr_1.3.1     
## [13] tibble_3.3.0     ggplot2_3.5.2    tidyverse_2.0.0 
## 
## loaded via a namespace (and not attached):
##  [1] sass_0.4.10          generics_0.1.4       xml2_1.3.8          
##  [4] stringi_1.8.7        rematch_2.0.0        hms_1.1.3           
##  [7] digest_0.6.37        magrittr_2.0.3       evaluate_1.0.4      
## [10] grid_4.5.3           timechange_0.3.0     RColorBrewer_1.1-3  
## [13] fastmap_1.2.0        cellranger_1.1.0     jsonlite_2.0.0      
## [16] backports_1.5.0      viridisLite_0.4.2    scales_1.4.0        
## [19] textshaping_1.0.1    jquerylib_0.1.4      cli_3.6.5           
## [22] rlang_1.1.6          withr_3.0.2          cachem_1.1.0        
## [25] yaml_2.3.10          tools_4.5.3          tzdb_0.5.0          
## [28] broom_1.0.9          vctrs_0.6.5          R6_2.6.1            
## [31] lifecycle_1.0.4      pkgconfig_2.0.3      pillar_1.11.0       
## [34] bslib_0.9.0          gtable_0.3.6         glue_1.8.0          
## [37] Rcpp_1.1.0           systemfonts_1.3.2    xfun_0.52           
## [40] tidyselect_1.2.1     rstudioapi_0.17.1    farver_2.1.2        
## [43] htmltools_0.5.8.1    rmarkdown_2.29       svglite_2.2.2       
## [46] compiler_4.5.3       distributional_0.5.0