A/B Test Sample Size Calculator

Calculate the sample size needed for statistically significant A/B test results


Conversion Metrics

  • Baseline conversion rate (%): your current conversion rate
  • Minimum detectable effect (%): relative improvement you want to detect
  • Daily visitors: traffic to your test page per day

Statistical Settings

  • Statistical significance: confidence that the result isn't due to chance
  • Statistical power: probability of detecting a real effect
  • Number of variations: control + treatment variations

Results

Sample per Variation: 31.2K
Total Sample: 62.4K
Test Duration: 63 days
Expected Lift: 5.00% → 5.50%

Test Interpretation

What This Means
You need 31.2K visitors per variation to detect a 10% relative improvement (from 5% to 5.50%) at 95% confidence and 80% power.
Long Duration Warning
This test will take 63 days (9 weeks). Consider increasing your MDE, focusing on higher-traffic pages, or running the test at a lower confidence level (e.g., 80%).
Important Caveats
  • Don't stop the test early even if results look significant
  • Avoid testing during holidays or unusual traffic periods
  • Run tests for at least one full business cycle (usually 1-2 weeks)

MDE Sensitivity Analysis

How sample size changes with different minimum detectable effects:

MDE              Sample per Variation   Total Sample   Duration
5%               122.0K                 244.0K         244 days
10% (selected)   31.2K                  62.4K          63 days
15%              14.2K                  28.4K          29 days
20%              8.2K                   16.3K          17 days
25%              5.3K                   10.7K          11 days

Smaller MDEs require larger samples but detect subtler improvements.
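
If you want to sanity-check numbers like these yourself, here is a minimal Python sketch of the standard two-proportion sample-size calculation. It is not the calculator's source code; it assumes a two-sided test at 95% confidence and 80% power, a 5% baseline, and roughly 1,000 visitors per day, which is consistent with the durations shown above.

```python
# Minimal sketch of the two-proportion z-test sample-size formula
# (two-sided test). Not the calculator's actual implementation.
import math
from scipy.stats import norm

def sample_size_per_variation(baseline, relative_mde, alpha=0.05, power=0.80):
    p1 = baseline
    p2 = baseline * (1 + relative_mde)        # treatment rate implied by the MDE
    z_alpha = norm.ppf(1 - alpha / 2)         # 1.96 at 95% confidence
    z_beta = norm.ppf(power)                  # 0.84 at 80% power
    p_bar = (p1 + p2) / 2
    numerator = (
        z_alpha * math.sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * math.sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return math.ceil(numerator / (p2 - p1) ** 2)

daily_traffic = 1_000  # assumed visitors/day reaching the test page
for mde in (0.05, 0.10, 0.15, 0.20, 0.25):
    n = sample_size_per_variation(0.05, mde)
    total = 2 * n
    print(f"MDE {mde:.0%}: {n:,} per variation, {total:,} total, "
          f"~{math.ceil(total / daily_traffic)} days")
```

Rounding aside, this reproduces the table above.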

Understanding the Numbers

Statistical Significance (95%)

How confident you can be that an observed effect isn't just random chance. A 95% significance level means accepting only a 5% chance of a false positive (Type I error): declaring a winner when there is no real difference.

Statistical Power (80%)

The probability of detecting a real effect when it exists. 80% power means 20% chance of missing a real improvement (Type II error).

Minimum Detectable Effect

The smallest relative improvement your test can reliably detect. A 10% MDE means detecting a change from 5% to 5.50%.
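
In other words, a relative MDE scales the baseline rate; using the example above:

```latex
p_{\text{treatment}} = p_{\text{baseline}} \times (1 + \text{MDE}) = 0.05 \times (1 + 0.10) = 0.055
```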

Why Sample Size Matters

Too small = unreliable results with high error rates. Too large = wasted time and resources. This calculator finds the optimal balance.

Need More Traffic?

Your test would take over 2 months. Learn strategies to accelerate experimentation.


What is A/B testing sample size?

Sample size in A/B testing is the number of visitors (or users) needed in each test variation to achieve statistically reliable results. Running a test with too few visitors leads to inconclusive or misleading results, while too many visitors wastes time that could be spent running additional tests.

The required sample size depends on four key factors: your baseline conversion rate, the minimum effect you want to detect (MDE), your desired statistical significance (typically 95%), and statistical power (typically 80%). A 5% baseline conversion with 10% MDE requires roughly 31,000 visitors per variation at standard settings.

This calculator uses the two-proportion z-test formula, the standard method for comparing conversion rates between two independent groups. It accounts for the variance in both control and treatment groups to determine the sample needed to detect your specified effect.
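
In standard notation, one common closed-form version of that per-variation sample size (for a two-sided test) is the textbook expression below; the calculator's exact implementation may differ in minor details:

```latex
n \;=\; \frac{\left( z_{1-\alpha/2}\,\sqrt{2\bar{p}(1-\bar{p})} \;+\; z_{1-\beta}\,\sqrt{p_1(1-p_1) + p_2(1-p_2)} \right)^{2}}{(p_2 - p_1)^{2}},
\qquad \bar{p} = \frac{p_1 + p_2}{2}
```

Here p1 is the baseline rate, p2 = p1 × (1 + MDE), and the z terms are normal quantiles (1.96 for 95% confidence, 0.84 for 80% power). Plugging in p1 = 0.05 and p2 = 0.055 gives roughly 31,000 per variation, matching the figure above.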

How to choose your minimum detectable effect

Start with business impact. Ask yourself: what's the smallest improvement that would matter? If a 5% lift in conversion translates to $10,000/month in revenue, that's worth detecting. If it only means $100/month, you probably need to detect larger effects to justify the testing time.

Consider your traffic constraints. Lower MDEs require exponentially more traffic. Detecting a 5% relative improvement needs about 4x the sample size of detecting a 10% improvement. If you only have 1,000 daily visitors, targeting 20-30% MDEs is more realistic than 5%.
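
The rough intuition: the absolute difference you are trying to detect sits squared in the denominator of the sample-size formula, so approximately (ignoring the weaker dependence of the numerator on the treatment rate):

```latex
n \;\propto\; \frac{1}{\left(p_{\text{baseline}} \cdot \text{MDE}\right)^{2}}
\quad\Longrightarrow\quad
\frac{n_{5\%\ \text{MDE}}}{n_{10\%\ \text{MDE}}} \approx \left(\frac{0.10}{0.05}\right)^{2} = 4
```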

The 10-20% rule of thumb. Most practitioners use 10-20% relative MDE as a reasonable balance. This catches meaningful improvements without requiring months of testing. For high-traffic sites, 5-10% MDE becomes feasible. For low-traffic sites, 20-30% may be necessary.

Common A/B testing mistakes to avoid

Peeking at results early. The single most common mistake. Looking at results before reaching your calculated sample size and stopping when you see “significance” inflates your false positive rate from 5% to as high as 30%. Wait for your full sample or use sequential testing methods.
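
If you want to convince yourself (or a stakeholder), a quick Monte Carlo sketch like the one below shows the effect. The numbers are illustrative, not taken from this calculator: both arms share the same 5% conversion rate, so every "significant" result is a false positive, yet checking 20 times along the way flags a winner far more often than the nominal 5%.

```python
# Rough simulation of A/A tests: compare the false positive rate of a single
# fixed-sample z-test against "peeking" at 20 interim checkpoints.
import numpy as np

rng = np.random.default_rng(42)

def z_test_significant(conv_a, n_a, conv_b, n_b):
    """Two-proportion z-test; True if significant at 95% (two-sided)."""
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = np.sqrt(p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b))
    if se == 0:
        return False
    z = abs(conv_a / n_a - conv_b / n_b) / se
    return z > 1.96

def run_aa_test(n_per_arm=31_200, peeks=20, p=0.05):
    """One A/A test; returns (significant at end, significant at any peek)."""
    a = rng.random(n_per_arm) < p
    b = rng.random(n_per_arm) < p
    checkpoints = np.linspace(n_per_arm // peeks, n_per_arm, peeks, dtype=int)
    peeked = any(
        z_test_significant(a[:n].sum(), n, b[:n].sum(), n) for n in checkpoints
    )
    final = z_test_significant(a.sum(), n_per_arm, b.sum(), n_per_arm)
    return final, peeked

trials = 1_000
results = [run_aa_test() for _ in range(trials)]
print("False positives, fixed sample:", sum(r[0] for r in results) / trials)
print("False positives, peeking 20x :", sum(r[1] for r in results) / trials)
```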

Running tests for calendar time, not sample size. “Let's run this for two weeks” ignores whether you've reached an adequate sample. Use this calculator to determine how many visitors you need, then calculate how long that takes at your traffic levels. The test ends when you hit your sample size, not a date.
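
A back-of-the-envelope version of that calculation, with illustrative numbers matching the example above:

```python
# Convert a required sample into a test duration at your own traffic level.
import math

total_sample = 62_400    # from the calculator: both variations combined
daily_visitors = 1_000   # visitors reaching the test page per day (assumed)

days_needed = math.ceil(total_sample / daily_visitors)
print(f"Run the test for ~{days_needed} days "
      f"(~{math.ceil(days_needed / 7)} weeks), then evaluate.")
```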

Testing too many variations. Each additional variation is another group that needs the full per-variation sample, so a 4-variation test splits your traffic four ways instead of two, roughly doubling the total sample required before you even correct for multiple comparisons. Start with simple A/B tests; only add variations when you have the traffic to support them.
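
One common (if conservative) way to handle the multiple-comparison problem is a Bonferroni correction. The sketch below, with illustrative numbers, shows how it tightens the significance threshold for each treatment-vs-control comparison, which in turn raises the required sample per variation.

```python
# Bonferroni correction sketch: 3 treatments compared against one control,
# with an overall 5% false positive budget split across the comparisons.
from scipy.stats import norm

alpha_overall = 0.05
comparisons = 3                               # e.g. an A/B/C/D test
alpha_per_test = alpha_overall / comparisons

print(f"Per-comparison alpha: {alpha_per_test:.4f}")
print(f"z-critical rises from {norm.ppf(1 - alpha_overall / 2):.2f} "
      f"to {norm.ppf(1 - alpha_per_test / 2):.2f}")
```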

Ignoring novelty effects. New designs often win initially because they're different, not better. Run tests for at least one full business cycle (typically 1-2 weeks minimum) to ensure you're not just measuring the excitement of change.

What to do when you lack traffic

Focus on high-impact pages. Test your highest-traffic pages first: homepage, pricing page, main landing pages. These give you the fastest path to significant results. Save low-traffic pages for after you've optimized the high-traffic ones.

Test bigger changes. Small tweaks like button colors need huge samples to detect. Test substantial changes - different value propositions, page layouts, pricing structures - that could produce 20-50% lifts. These require smaller samples and teach you more about your customers.

Use Bayesian methods for faster decisions. Traditional frequentist statistics (what this calculator uses) require fixed sample sizes. Bayesian methods let you monitor results continuously and stop when you have sufficient confidence. Tools like Optimizely use this approach for low-traffic scenarios.
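
As a flavor of what that looks like, here is a minimal Beta-Binomial sketch (not what this calculator, or any particular tool, actually does) that estimates the probability the treatment beats the control, given hypothetical counts:

```python
# Bayesian sketch: Beta(1, 1) prior + binomial likelihood gives a Beta
# posterior over each variation's conversion rate; compare them by sampling.
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical observed data so far
control = dict(visitors=4_000, conversions=200)    # 5.0%
treatment = dict(visitors=4_000, conversions=228)  # 5.7%

post_c = rng.beta(1 + control["conversions"],
                  1 + control["visitors"] - control["conversions"], 100_000)
post_t = rng.beta(1 + treatment["conversions"],
                  1 + treatment["visitors"] - treatment["conversions"], 100_000)

print(f"P(treatment > control) ≈ {(post_t > post_c).mean():.1%}")
```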

Consider qualitative research instead. If a test would take 3+ months, user interviews and usability testing might give you faster insights. Talk to 10 customers about their experience, watch session recordings, and make informed decisions without waiting for statistical significance.

Built by StartVest — Free tools for startup founders
