A/B Test Sample Size Calculator
Calculate the sample size needed for statistically significant A/B test results
Conversion Metrics
- Baseline conversion rate: your current conversion rate
- Minimum detectable effect (MDE): the relative improvement you want to detect
- Daily traffic: visitors to your test page per day
Statistical Settings
- Statistical significance: confidence that the result isn't due to chance
- Statistical power: the probability of detecting a real effect
- Number of variations: control plus treatment variations
Test Interpretation
- Don't stop the test early even if results look significant
- Avoid testing during holidays or unusual traffic periods
- Run tests for at least one full business cycle (usually 1-2 weeks)
MDE Sensitivity Analysis
How sample size changes with different minimum detectable effects:
| MDE | Sample per Variation | Total Sample | Duration |
|---|---|---|---|
| 5% | 122.0K | 244.0K | 244 days |
| 10% (selected) | 31.2K | 62.4K | 63 days |
| 15% | 14.2K | 28.4K | 29 days |
| 20% | 8.2K | 16.3K | 17 days |
| 25% | 5.3K | 10.7K | 11 days |
Smaller MDEs require larger samples but detect subtler improvements. The durations above assume 1,000 total visitors per day split across both variations.
Understanding the Numbers
Statistical Significance (95%)
The standard of evidence that an observed effect isn't random chance. Testing at 95% significance accepts a 5% chance of a false positive (Type I error): declaring a winner when no real difference exists.
Statistical Power (80%)
The probability of detecting a real effect when it exists. 80% power means 20% chance of missing a real improvement (Type II error).
Minimum Detectable Effect
The smallest relative improvement your test can reliably detect. A 10% MDE on a 5% baseline means detecting a change from 5.0% to 5.5%.
Why Sample Size Matters
Too small = unreliable results with high error rates. Too large = wasted time and resources. This calculator finds the optimal balance.
What is A/B testing sample size?
Sample size in A/B testing is the number of visitors (or users) needed in each test variation to achieve statistically reliable results. Running a test with too few visitors leads to inconclusive or misleading results, while too many visitors wastes time that could be spent running additional tests.
The required sample size depends on four key factors: your baseline conversion rate, the minimum effect you want to detect (MDE), your desired statistical significance (typically 95%), and statistical power (typically 80%). A 5% baseline conversion with 10% MDE requires roughly 31,000 visitors per variation at standard settings.
This calculator uses the two-proportion z-test formula, the standard method for comparing conversion rates between two independent groups. It accounts for the variance in both control and treatment groups to determine the sample needed to detect your specified effect.
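For readers who want to check the math, here is a minimal Python sketch of that formula. The function name and defaults are ours, not the calculator's internals, but the arithmetic is the standard two-proportion z-test:

```python
from math import ceil, sqrt
from scipy.stats import norm

def sample_size_per_variation(baseline, mde_relative, alpha=0.05, power=0.80):
    """Per-variation sample size for a two-sided two-proportion z-test."""
    p1 = baseline
    p2 = baseline * (1 + mde_relative)      # treatment rate implied by the MDE
    z_alpha = norm.ppf(1 - alpha / 2)       # 1.96 at 95% significance
    z_power = norm.ppf(power)               # 0.84 at 80% power
    p_bar = (p1 + p2) / 2                   # pooled rate under the null
    numerator = (z_alpha * sqrt(2 * p_bar * (1 - p_bar))
                 + z_power * sqrt(p1 * (1 - p1) + p2 * (1 - p2))) ** 2
    return ceil(numerator / (p2 - p1) ** 2)

# 5% baseline, 10% relative MDE -> roughly 31,200 visitors per variation
print(sample_size_per_variation(0.05, 0.10))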
How to choose your minimum detectable effect
Start with business impact. Ask yourself: what's the smallest improvement that would matter? If a 5% lift in conversion translates to $10,000/month in revenue, that's worth detecting. If it only means $100/month, you probably need to detect larger effects to justify the testing time.
Consider your traffic constraints. Lower MDEs require exponentially more traffic. Detecting a 5% relative improvement needs about 4x the sample size of detecting a 10% improvement. If you only have 1,000 daily visitors, targeting 20-30% MDEs is more realistic than 5%.
The 10-20% rule of thumb. Most practitioners use 10-20% relative MDE as a reasonable balance. This catches meaningful improvements without requiring months of testing. For high-traffic sites, 5-10% MDE becomes feasible. For low-traffic sites, 20-30% may be necessary.
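To see the traffic penalty of a smaller MDE directly, you can sweep MDEs with the sample_size_per_variation sketch above (same assumed settings: 5% baseline, 95% significance, 80% power); it reproduces the sensitivity table earlier on this page:

```python
# Requires sample_size_per_variation from the sketch above
for mde in (0.05, 0.10, 0.15, 0.20, 0.25):
    n = sample_size_per_variation(0.05, mde)
    print(f"MDE {mde:.0%}: {n:,} per variation, {2 * n:,} total")
```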
Common A/B testing mistakes to avoid
Peeking at results early. The single most common mistake. Looking at results before reaching your calculated sample size and stopping when you see “significance” inflates your false positive rate from 5% to as high as 30%. Wait for your full sample or use sequential testing methods.
Running tests for calendar time, not sample size. “Let's run this for two weeks” ignores whether you've reached an adequate sample. Use this calculator to determine how many visitors you need, then calculate how long that takes at your traffic levels. The test ends when you hit the sample, not a date.
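In code, the stopping rule is a visitor count that you translate into days (the traffic figure is hypothetical, and the snippet reuses the sketch above):

```python
daily_traffic = 1_000                                # assumed visitors entering the test per day
total = 2 * sample_size_per_variation(0.05, 0.10)    # control + treatment
days = ceil(total / daily_traffic)                   # ceil imported in the sketch above
print(f"Stop after {total:,} visitors, roughly {days} days at this traffic")
```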
Testing too many variations. Each additional variation adds another group that needs the full per-variation sample, and comparing several treatments against control requires a multiple-comparisons correction, which raises the per-variation requirement further. A 4-variation test can easily need two to three times the total traffic of a simple A/B test. Start with simple A/B tests; only add variations when you have the traffic to support them.
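One common, if conservative, correction is Bonferroni: divide your significance threshold by the number of comparisons. A sketch of how that inflates the requirement, again reusing the function above:

```python
variations = 4                     # control + three treatments
comparisons = variations - 1       # each treatment compared against control
alpha_adj = 0.05 / comparisons     # Bonferroni-adjusted significance level
n = sample_size_per_variation(0.05, 0.10, alpha=alpha_adj)
print(f"{n:,} per variation, {variations * n:,} total")  # vs. ~62,400 total for A/B
```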
Ignoring novelty effects. New designs often win initially because they're different, not better. Run tests for at least one full business cycle (typically 1-2 weeks minimum) to ensure you're not just measuring the excitement of change.
What to do when you lack traffic
Focus on high-impact pages. Test your highest-traffic pages first: homepage, pricing page, main landing pages. These give you the fastest path to significant results. Save low-traffic pages for after you've optimized the high-traffic ones.
Test bigger changes. Small tweaks like button colors need huge samples to detect. Test substantial changes (different value propositions, page layouts, pricing structures) that could produce 20-50% lifts. These require smaller samples and teach you more about your customers.
Use Bayesian methods for faster decisions. Traditional frequentist statistics (what this calculator uses) require fixed sample sizes. Bayesian methods let you monitor results continuously and stop when you have sufficient confidence; several commercial testing platforms offer Bayesian engines for exactly this reason.
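For intuition, here is a minimal sketch of one common Bayesian approach, a beta-binomial model with a flat prior; the visitor and conversion counts are made up for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)
draws = 100_000
# Posterior for each conversion rate: Beta(1 + conversions, 1 + non-conversions)
control = rng.beta(1 + 60, 1 + 1_140, draws)    # 60 of 1,200 visitors converted
treatment = rng.beta(1 + 75, 1 + 1_125, draws)  # 75 of 1,200 visitors converted
print(f"P(treatment beats control) = {(treatment > control).mean():.1%}")
```

A team would stop when this probability (or an expected-loss metric) crosses a pre-chosen threshold, rather than at a fixed sample size.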
Consider qualitative research instead. If a test would take 3+ months, user interviews and usability testing might give you faster insights. Talk to 10 customers about their experience, watch session recordings, and make informed decisions without waiting for statistical significance.