Here's a cheat sheet of quick statistical tests for bioinformatics:

| Test | Type | Purpose/Usage | Key Assumptions/Notes | Recommended Graph/Visualization | R/Python Packages or Functions |
|---|---|---|---|---|---|
| One Sample t-test | Parametric | Test whether a sample mean equals a specified value | Data are continuous and approximately normally distributed | Histogram, Q-Q plot (for normality), Box plot for central tendency | R: t.test(x, mu=mu0) Python: scipy.stats.ttest_1samp(x, popmean) |
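| Independent (Two Sample) t-test | Parametric | Compare means between two independent groups | Normality, independence; equal variances for Student's version (Welch's version drops that assumption) | Box plot with group comparisons, Error bar plot | R: t.test(x, y) (Welch's by default; var.equal=TRUE for Student's) Python: scipy.stats.ttest_ind(x, y) (Student's by default; equal_var=False for Welch's) |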
| Paired t-test | Parametric | Compare means of paired/related samples | Differences are normally distributed; paired observations | Paired difference plot, Line plot connecting paired observations, Box plot of differences | R: t.test(x, y, paired=TRUE) Python: scipy.stats.ttest_rel(x, y) |
| ANOVA (Analysis of Variance) | Parametric | Compare means across three or more groups | Normality, homogeneity of variances, independence | Box plot for group comparisons, Means plot with error bars, Bar plot | R: aov() or lm() + anova() Python: scipy.stats.f_oneway(x, y, …) or statsmodels |
| Pearson’s Correlation | Parametric | Measure the linear relationship between two continuous variables | Linearity, continuous data; bivariate normality for the significance test | Scatter plot with best-fit regression line | R: cor(x, y, method="pearson") for the coefficient, cor.test(x, y) for the p-value Python: scipy.stats.pearsonr(x, y) |
| Linear Regression | Parametric | Model relationships between a dependent variable and predictors | Linearity, independence, homoscedasticity, normally distributed residuals | Scatter plot with regression line, Residual plots | R: lm() Python: statsmodels.api.OLS or sklearn.linear_model.LinearRegression |
| Mann-Whitney U Test | Nonparametric | Compare the distributions of two independent groups (a comparison of medians only when both distributions have a similar shape) | Ordinal or continuous data; independent groups; does not assume normality | Box plot, Violin plot | R: wilcox.test(x, y) Python: scipy.stats.mannwhitneyu(x, y) |
| Wilcoxon Signed-Rank Test | Nonparametric | Compare medians of paired samples | Paired observations; ordinal or continuous data; differences roughly symmetric about their median | Box plot of differences, Scatter plot with lines connecting paired samples | R: wilcox.test(x, y, paired=TRUE) Python: scipy.stats.wilcoxon(x, y) |
| Kruskal-Wallis Test | Nonparametric | Compare the distributions of three or more independent groups (a comparison of medians only when the distributions have a similar shape) | Ordinal or continuous data; independent groups | Box plot or Violin plot per group | R: kruskal.test() Python: scipy.stats.kruskal(x, y, z, …) |
| Spearman’s Rank Correlation | Nonparametric | Assess monotonic relationship between two variables | Ordinal or continuous data; can handle non-linear relationships | Scatter plot (often with a trend line based on ranks) | R: cor(x, y, method="spearman") Python: scipy.stats.spearmanr(x, y) |
| Chi-Square Test | Nonparametric | Test for association between categorical variables | Expected frequencies sufficiently large (common rule of thumb: at least 5 per cell); independence of observations | Bar chart, Mosaic plot | R: chisq.test() Python: scipy.stats.chi2_contingency() |
| Friedman Test | Nonparametric | Compare three or more repeated measures or matched groups | Ordinal or continuous data; repeated measures design with the same subjects in every condition | Box plots for each condition, Line plot showing trends for individual subjects | R: friedman.test() Python: scipy.stats.friedmanchisquare() |
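
To show how the parametric and nonparametric rows pair up in practice, here is a minimal Python sketch for the two-independent-groups case. It is an illustration under stated assumptions, not a recommended pipeline: the data are simulated, the helper name `compare_two_groups` and the 0.05 cutoff are hypothetical, and using a Shapiro-Wilk check (`scipy.stats.shapiro`, which is not in the table) to choose between tests is only one possible heuristic.

```python
import numpy as np
from scipy import stats

# Hypothetical data: simulated measurements for two independent groups.
rng = np.random.default_rng(seed=0)
control = rng.normal(loc=5.0, scale=1.0, size=30)
treated = rng.normal(loc=6.0, scale=1.0, size=30)


def compare_two_groups(x, y, alpha=0.05):
    """Choose a two-sample test from a Shapiro-Wilk normality check.

    Returns (test_name, statistic, p_value). The selection rule is an
    illustrative assumption, not a prescribed procedure.
    """
    # Shapiro-Wilk: a small p-value suggests the sample is not normal.
    looks_normal = all(stats.shapiro(g).pvalue > alpha for g in (x, y))
    if looks_normal:
        # Welch's t-test: does not assume equal variances.
        res = stats.ttest_ind(x, y, equal_var=False)
        name = "Welch t-test"
    else:
        # Rank-based alternative from the nonparametric rows.
        res = stats.mannwhitneyu(x, y, alternative="two-sided")
        name = "Mann-Whitney U"
    return name, float(res.statistic), float(res.pvalue)


name, stat, p = compare_two_groups(control, treated)
print(f"{name}: statistic={stat:.3f}, p={p:.3g}")
```

The same pattern carries over to the other pairs in the table, for example `scipy.stats.ttest_rel` versus `scipy.stats.wilcoxon` for paired samples, or `scipy.stats.f_oneway` versus `scipy.stats.kruskal` for three or more groups.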