Sample Ratio Mismatch Calculator

Check if your A/B test traffic was correctly distributed

Observed counts

Enter numbers for each variant, separated by commas, spaces, or any non-numeric character.

Expected distribution

Equal split / Custom ratio

Enter the expected ratio for each variant.

Sample Ratio Mismatch (SRM) occurs when the distribution of visitors between variants significantly differs from the expected distribution. A p-value less than 0.05 typically indicates a mismatch worth investigating.

Context

Picture this: you set up your experiment to have 50% of users on variant A and 50% on variant B. The experiment is over and you have 44,000 users in A, and 45,000 in B. Is this a problem?

Most likely. If your assignment worked properly, the probability of seeing this imbalance (or a larger one) is less than 0.1%.

When the observed split differs significantly from the expected split, you have a Sample Ratio Mismatch (SRM). Unless you understand why it happened, you should not analyse the results of the experiment; your setup may be flawed, invalidating any conclusions.

SRM can be detected using a statistical test: the chi-squared goodness-of-fit test. It compares the observed frequencies (44,000 and 45,000 in the example) to the expected frequencies (usually an even distribution; here, 44,500 and 44,500).

When you run a chi-squared test, you get a chi-square value. The chi-square value is a non-negative number that indicates how large the mismatch between reality and expectations is; it is 0 when the observed counts match the expected counts exactly.
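As a worked illustration for the 44,000 vs 45,000 example, using the standard goodness-of-fit formula (squared difference between observed and expected, divided by expected, summed over variants), each variant is 500 users away from its expected 44,500:

$\chi^{2} = \frac{(44{,}000 - 44{,}500)^{2}}{44{,}500} + \frac{(45{,}000 - 44{,}500)^{2}}{44{,}500} = \frac{2 \times 250{,}000}{44{,}500} \approx 11.24$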

We can use the chi-square value to get a p-value.

The p-value is a number between 0 and 1 that indicates how surprised you should be to see your distribution (or one even more extreme) if your experiment setup were working correctly. The smaller the p-value, the more surprised you should be.

A perfectly even distribution gets a p-value of 1: you shouldn't be surprised at all. A p-value of 0.50 means an imbalance at least this large would happen about 50% of the time by chance alone; nothing to worry about. A p-value of 0.0008 (like in our 44,000 vs 45,000 example) means this would happen less than 0.1% of the time; it's unlikely to be random.
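The 44,000 vs 45,000 figure can be reproduced with a few lines of Python. This is only a sketch, not the calculator's own code: it relies on a shortcut that holds for two variants only (one degree of freedom), where the chi-square p-value equals `erfc(sqrt(chi2 / 2))`. The variable names are illustrative.

```python
import math

observed = [44_000, 45_000]

# Equal split: every variant is expected to receive total / number of variants.
expected = [sum(observed) / len(observed)] * len(observed)

# Chi-square statistic: sum of (O - E)^2 / E over the variants.
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Shortcut valid for one degree of freedom (two variants) only:
# the upper tail of the chi-square distribution equals erfc(sqrt(chi2 / 2)).
p = math.erfc(math.sqrt(chi2 / 2))

print(f"chi-square = {chi2:.3f}, p-value = {p:.4f}")
```

The printed p-value should come out at roughly 0.0008, in line with the figure quoted above. For three or more variants this shortcut no longer applies and the incomplete gamma function is needed.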

p-values below 0.05 are evidence of a potential SRM issue, and values below 0.01 are strong evidence that something is wrong with your experiment setup.

Calculations

When you enter two or more observed frequencies (i.e. how many users/samples landed on each variant), the calculator does the following:

1. Calculates expected frequencies based on your distribution choice:
   - For equal distribution: each variant gets the same expected count (total samples ÷ number of variants)
   - For custom ratio: your specified ratio is normalised (so it sums to 1) and multiplied by the total sample count
2. Computes the chi-square statistic using the formula:
   $\chi^{2} = \sum \frac{(O - E)^{2}}{E}$
   where $O$ is the observed count for each variant and $E$ is the expected count
3. Determines degrees of freedom as $n - 1$, where $n$ is the number of variants
4. Calculates the p-value from the chi-square statistic and degrees of freedom using the regularised incomplete gamma function:
   $p\text{-value} = 1 - P\left(\frac{df}{2}, \frac{\chi^{2}}{2}\right)$
   This involves two numerical approximation techniques:
   - For smaller chi-square values: series expansion method
   - For larger values: continued fraction method
5. Interprets the result:
   - p < 0.01: strong evidence of sample ratio mismatch
   - 0.01 ≤ p < 0.05: possible sample ratio mismatch
   - p ≥ 0.05: no evidence of sample ratio mismatch
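The steps above can be sketched in plain Python with only the standard library. This is an illustration under stated assumptions, not the calculator's actual source: all function names are hypothetical, and the switch point between the series and the continued fraction (`x < a + 1`) is the conventional textbook choice, since the page does not say where the calculator switches. The continued fraction yields the upper tail $1 - P$ directly.

```python
import math

def expected_counts(observed, ratios=None):
    """Step 1: expected count per variant (equal split if ratios is None)."""
    if ratios is None:
        ratios = [1.0] * len(observed)
    if len(ratios) != len(observed):
        raise ValueError("number of ratios must match number of observed counts")
    total, ratio_sum = sum(observed), sum(ratios)
    # Normalise the ratio so it sums to 1, then scale by the total sample count.
    return [total * r / ratio_sum for r in ratios]

def chi_square(observed, expected):
    """Step 2: sum of (O - E)^2 / E over all variants."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

def _lower_gamma_series(a, x, max_iter=1000, eps=1e-14):
    """Regularised lower incomplete gamma P(a, x) via series expansion."""
    denom = a
    term = total = 1.0 / a
    for _ in range(max_iter):
        denom += 1.0
        term *= x / denom
        total += term
        if abs(term) < abs(total) * eps:
            break
    return total * math.exp(-x + a * math.log(x) - math.lgamma(a))

def _upper_gamma_cf(a, x, max_iter=1000, eps=1e-14):
    """Regularised upper incomplete gamma Q(a, x) = 1 - P(a, x) via a
    continued fraction, evaluated with the modified Lentz method."""
    tiny = 1e-300
    b = x + 1.0 - a
    c = 1.0 / tiny
    d = 1.0 / b
    h = d
    for i in range(1, max_iter + 1):
        an = -i * (i - a)
        b += 2.0
        d = an * d + b
        if abs(d) < tiny:
            d = tiny
        c = b + an / c
        if abs(c) < tiny:
            c = tiny
        d = 1.0 / d
        delta = d * c
        h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h * math.exp(-x + a * math.log(x) - math.lgamma(a))

def p_value(chi2, df):
    """Step 4: p-value = 1 - P(df / 2, chi2 / 2)."""
    if chi2 <= 0:
        return 1.0
    a, x = df / 2.0, chi2 / 2.0
    if x < a + 1.0:  # assumed switch point: series for smaller values
        return 1.0 - _lower_gamma_series(a, x)
    return _upper_gamma_cf(a, x)  # continued fraction for larger values

def interpret(p):
    """Step 5: map the p-value to the three verdicts listed above."""
    if p < 0.01:
        return "strong evidence of sample ratio mismatch"
    if p < 0.05:
        return "possible sample ratio mismatch"
    return "no evidence of sample ratio mismatch"

observed = [44_000, 45_000]
expected = expected_counts(observed)
chi2 = chi_square(observed, expected)
df = len(observed) - 1  # Step 3: degrees of freedom
p = p_value(chi2, df)
print(f"chi-square = {chi2:.3f}, df = {df}, p = {p:.4f}: {interpret(p)}")
```

Passing a second argument such as `expected_counts([100, 300], [1, 3])` covers the custom-ratio case. The closed forms `erfc(sqrt(chi2 / 2))` for one degree of freedom and `exp(-chi2 / 2)` for two give a quick way to sanity-check both numerical branches.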

All calculations are done locally; your data never leaves your device.

The calculator has been tested by comparing its results to those of an established statistical package in Python (the stats module from scipy). You can verify that your device returns the expected numbers, within a 0.01% margin of error, by adding ?test to this calculator's URL.

Óscar’s A/B Testing Toolkit
