Published: October 2016

Last updated: October 2025

Hypothesis testing

A statistical hypothesis test is a method of statistical inference used to compare two datasets obtained through sampling, or to compare a dataset against a synthetic dataset derived from an idealised model (data distribution) describing a population. The test starts from a null hypothesis (H₀), which states that there is no relationship or difference between the datasets. The alternative hypothesis (H₁) states that a relationship or difference does exist.

During a hypothesis test, a test statistic is calculated from the observed data and compared against a pre-defined critical value, which is determined by the chosen significance level (α) and by the distribution of the test statistic under the null hypothesis. If the calculated test statistic exceeds this critical value (that is, falls into the rejection region), the null hypothesis is rejected. This indicates that data at least as extreme as those observed would be unlikely to occur by chance if the null hypothesis were true (we can never say that the null hypothesis is certainly ‘false’). By historical convention, a significance level (α) of 5% (0.05) is commonly used.
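The comparison of a test statistic with a critical value can be sketched in a few lines of Python. This is a minimal illustration only, assuming a two-sided one-sample z-test with a known population standard deviation; the function name `z_test`, the sample values, the hypothesised mean `mu0` and the standard deviation `sigma` are all invented for the example and do not come from any real study.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test(sample, mu0, sigma, alpha=0.05):
    """Two-sided one-sample z-test, assuming the population sd (sigma) is known."""
    n = len(sample)
    # Test statistic: distance of the sample mean from mu0 in standard errors.
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # Critical value: the standard normal quantile that leaves alpha/2 in each tail.
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # p-value: probability of a statistic at least this extreme if H0 is true.
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    # Reject H0 when the statistic falls in the rejection region.
    return z, z_crit, p_value, abs(z) > z_crit


# Hypothetical measurements, tested against H0: population mean = 5.0
sample = [5.1, 4.9, 5.6, 5.8, 5.4, 5.7, 5.2, 5.9]
z, z_crit, p_value, reject = z_test(sample, mu0=5.0, sigma=0.5)
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```

Note that α itself is not compared with the test statistic. It fixes the critical value (about 1.96 for a two-sided test at α = 0.05), or equivalently it is the threshold against which the p-value is compared.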

A common example is the use of Student’s t-test to compare the means of two samples, such as the treatment outcomes from two arms of a clinical trial. Here, the null hypothesis states that there is no difference between the mean outcomes of the two populations from which the samples are drawn (i.e. no treatment effect). The test typically assumes that the distribution of the treatment outcome is normal and has the same variance in each study arm. If the absolute value of the t-value calculated from the observed samples falls below the predetermined critical value (for a given α and degrees of freedom, which depend on the sample sizes), then the null hypothesis cannot be rejected, and it cannot be concluded that a treatment effect is likely.
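As a rough sketch of the t-test described above, the pooled (equal-variance) t-statistic can be computed with the Python standard library alone. The outcome values for the two arms are invented for illustration, and the helper name `pooled_t_statistic` is hypothetical. The critical value 2.228 is the standard t-table entry for a two-sided test at α = 0.05 with 10 degrees of freedom; it is hard-coded here as an assumption so that the example needs no third-party packages.

```python
from math import sqrt
from statistics import mean, variance


def pooled_t_statistic(a, b):
    """Student's two-sample t-statistic, assuming equal variances in both groups."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    # Pooled variance: sample variances weighted by their degrees of freedom.
    pooled_var = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    # Standard error of the difference between the two sample means.
    se = sqrt(pooled_var * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / se, df


# Hypothetical outcomes from two trial arms (illustrative numbers only)
treatment = [12.1, 11.4, 13.0, 12.6, 11.9, 12.4]
control = [11.8, 11.2, 12.5, 12.0, 11.5, 12.2]

t_stat, df = pooled_t_statistic(treatment, control)

# Assumed critical value from a standard t-table: two-sided, alpha = 0.05, df = 10
T_CRIT = 2.228
reject = abs(t_stat) > T_CRIT
print(f"t = {t_stat:.3f}, df = {df}, reject H0: {reject}")
```

With these made-up numbers the absolute t-value stays below the critical value, so the null hypothesis of no treatment effect is not rejected. In practice a statistics library such as SciPy (`scipy.stats.ttest_ind`) would normally be used, which also reports a p-value.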
