Statistic Settings

Choose the Statistical Method & Confidence Level

Ablyft supports two scientifically grounded methods for evaluating experiment results: Frequentist and Bayesian. Each comes with unique strengths and interpretations.

1. Overview: Two Statistical Approaches

A/B testing is fundamentally about comparing groups. There are two ways to analyze these comparisons:

  • Frequentist: Based on long-term repeatability and p-values
  • Bayesian: Based on probability distributions over outcomes

2. Key Differences

Aspect                                   | Frequentist                                     | Bayesian
Focus                                    | How likely are the results assuming no effect?  | How likely is the effect given the results?
Main metric                              | p-value, confidence interval                    | Posterior probability, credible interval
Can interpret probability of hypothesis? | ❌ No                                           | ✅ Yes
Can peek at results early?               | ❌ No (can bias results)                        | ✅ Yes (with caution)

3. Frequentist Statistics

In frequentist analysis, the confidence level describes how often, under repeated sampling, the computed confidence interval would contain the true effect.
A p-value is the probability of observing an effect at least this extreme if there were no real difference (the null hypothesis). The confidence reported for a result is derived from it:
Confidence = 100% − p-value
  • ✅ Good for strict statistical control
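
As an illustration, the sketch below runs a two-sided two-proportion z-test on hypothetical visitor and conversion counts. The function name and the numbers are assumptions for this example only, not Ablyft's internal implementation.

```python
# Minimal sketch of a frequentist two-proportion z-test.
# The counts and the helper name are hypothetical; Ablyft's internal
# calculation may differ in detail.
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(conversions_a, visitors_a, conversions_b, visitors_b):
    """Return (observed difference, two-sided p-value) for B vs. A."""
    rate_a = conversions_a / visitors_a
    rate_b = conversions_b / visitors_b
    pooled = (conversions_a + conversions_b) / (visitors_a + visitors_b)
    se = sqrt(pooled * (1 - pooled) * (1 / visitors_a + 1 / visitors_b))
    z = (rate_b - rate_a) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # probability under the null hypothesis
    return rate_b - rate_a, p_value

diff, p = two_proportion_z_test(240, 2000, 290, 2000)
print(f"difference: {diff:+.2%}, p-value: {p:.3f}, confidence: {1 - p:.1%}")
```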

4. Bayesian Statistics

In Bayesian analysis, we ask directly:
“What is the probability that B is better than A?”
A credible interval tells you the range within which the true effect lies with X% probability, where X is the credibility level you choose.
  • ✅ Allows probability statements about improvements
  • ✅ Naturally handles uncertainty and early decisions
  • ✅ Provides posterior probability (e.g. "There’s a 93% chance B is better than A.")
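
As a rough sketch of this idea, the snippet below estimates P(B > A) by sampling from Beta posteriors (a Beta–Binomial model with uniform priors). The counts, the prior choice, and the function name are assumptions for this illustration, not Ablyft's actual model.

```python
# Sketch of a Bayesian A/B comparison using Beta posteriors with
# uniform Beta(1, 1) priors. Counts are hypothetical; Ablyft's model
# and priors may differ.
import random

def prob_b_beats_a(conv_a, n_a, conv_b, n_b, draws=100_000, seed=42):
    """Estimate P(rate_B > rate_A) by sampling from the two posteriors."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(draws):
        # Posterior for each variant: Beta(1 + conversions, 1 + non-conversions)
        sample_a = rng.betavariate(1 + conv_a, 1 + n_a - conv_a)
        sample_b = rng.betavariate(1 + conv_b, 1 + n_b - conv_b)
        wins += sample_b > sample_a
    return wins / draws

print(f"P(B > A) ≈ {prob_b_beats_a(240, 2000, 264, 2000):.1%}")
```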

5. Setting the Confidence / Credibility Level

In Ablyft, you can define the level of certainty you want in your conclusions, whether using frequentist or Bayesian analysis.

  • For Frequentist: This controls the confidence interval width and the p-value threshold.
  • For Bayesian: This controls the width of the credible interval and the interpretation of certainty.

You can set a value between 50% and 99.9% depending on your use case:

  • Lower levels → faster results, more risk
  • Higher levels → more certainty, longer run time
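
To make this trade-off concrete, the snippet below shows how a chosen level translates into the two-sided significance threshold and the z multiplier that determines interval width, using a standard normal approximation. The helper is illustrative only, not an Ablyft API.

```python
# How the chosen confidence / credibility level maps to a decision
# threshold and an interval width (normal approximation). Illustrative only.
from statistics import NormalDist

def level_to_thresholds(level_percent):
    """Return (alpha, two-sided z multiplier) for a confidence level in %."""
    alpha = 1 - level_percent / 100          # allowed false-positive rate
    z = NormalDist().inv_cdf(1 - alpha / 2)  # half-width multiplier for the interval
    return alpha, z

for level in (80, 90, 95, 99):
    alpha, z = level_to_thresholds(level)
    print(f"{level}% level -> significant if p < {alpha:.2f}, interval = estimate ± {z:.2f}·SE")
```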

6. Example Comparison

Variant A conversion rate: 12.0%
Variant B conversion rate: 13.2%
Metric                         | Frequentist     | Bayesian
Difference                     | +1.2%           | +1.2%
Confidence / credible interval | [+0.4%, +2.0%]  | [+0.3%, +2.1%]
p-value                        | 0.03            | –
Posterior probability          | –               | 93% chance B is better

7. Minimum Conversions

To avoid false positives early in an experiment, we recommend setting a minimum conversion threshold.
Reasonable values might be 100 or 1000 conversions.
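
A minimal sketch of how such a gate could be applied before any significance call is made; the threshold value and function name are placeholders, not Ablyft settings.

```python
# Sketch of gating an analysis on a minimum number of conversions
# before any winner is declared. Threshold and names are placeholders.
def ready_for_evaluation(conversions_a, conversions_b, minimum=100):
    """Only evaluate once every variant has reached the minimum conversions."""
    return min(conversions_a, conversions_b) >= minimum

if ready_for_evaluation(conversions_a=85, conversions_b=112, minimum=100):
    print("Enough data: run the statistical evaluation.")
else:
    print("Below the minimum conversion threshold: keep collecting data.")
```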

8. Summary

  • Frequentist: well-established, precise, but limited in interpretation
  • Bayesian: intuitive, flexible, and great for decision-making under uncertainty

Choose the method that best fits your team’s mindset or combine both to gain a fuller picture.

Further Reading