Statistics for ML #8 — Bayes Theorem

Bayes’ Theorem is the mathematical foundation of rational belief updating: it says how to revise the probability of a hypothesis once new evidence arrives. It is arguably the most important equation in statistics and modern ML.

The Formula

\[P(H|E) = \frac{P(E|H) \cdot P(H)}{P(E)}\]

Term     Name                 Meaning
P(H|E)   Posterior            Updated belief after seeing evidence
P(E|H)   Likelihood           How probable is the evidence if H is true
P(H)     Prior                Initial belief before seeing evidence
P(E)     Marginal likelihood  Normalising constant (often intractable)
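
The formula follows in one step from the definition of conditional probability. As a short derivation, assuming P(E) > 0 and P(H) > 0 so that both conditionals are defined:

\[P(H|E) \cdot P(E) = P(H \cap E) = P(E|H) \cdot P(H) \quad\Longrightarrow\quad P(H|E) = \frac{P(E|H) \cdot P(H)}{P(E)}\]

Both products are just two ways of writing the joint probability of H and E; dividing by P(E) isolates the posterior.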

Medical Diagnosis Example

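A disease affects 1% of the population. A test has 99% sensitivity, P(Test+ | Disease) = 0.99, and 95% specificity, P(Test- | No disease) = 0.95, so its false positive rate is P(Test+ | No disease) = 0.05.
If a patient tests positive, what is the probability they have the disease?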

\[P(\text{Disease}|\text{Test+}) = \frac{0.99 \times 0.01}{0.99 \times 0.01 + 0.05 \times 0.99} = \frac{0.0099}{0.0594} \approx 16.7\%\]

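Despite a 99% sensitive test, only about 1 in 6 positives actually have the disease when prevalence is low: the 5% false positive rate applies to the 99% of people who are healthy, so false positives (0.0495) outnumber true positives (0.0099) five to one. Ignoring the prior in this way is the base rate fallacy.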
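
The arithmetic above can be checked with a few lines of Python. This is a minimal sketch; the function name `posterior_positive` and its parameter names are illustrative choices, not from any library.

```python
def posterior_positive(prevalence: float, sensitivity: float, specificity: float) -> float:
    """Return P(Disease | Test+) via Bayes' theorem."""
    # Numerator: P(Test+ | Disease) * P(Disease)
    true_pos = sensitivity * prevalence
    # False positives: P(Test+ | No disease) * P(No disease)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    # Denominator P(Test+) is the sum of both ways of testing positive
    return true_pos / (true_pos + false_pos)


p = posterior_positive(prevalence=0.01, sensitivity=0.99, specificity=0.95)
print(f"P(Disease | Test+) = {p:.3f}")  # prints 0.167
```

Changing only the prior shows how much the base rate matters: with the same test and a prevalence of 10%, `posterior_positive(0.10, 0.99, 0.95)` returns about 0.69.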

Extended Form (Law of Total Probability in denominator)

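\[P(H|E) = \frac{P(E|H) \cdot P(H)}{\sum_k P(E|H_k) \cdot P(H_k)}\]

Here the H_k are mutually exclusive and exhaustive hypotheses (H is one of them), so the denominator equals P(E). In the diagnosis example the two hypotheses are Disease and No disease.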
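
With more than two hypotheses the same computation is a normalised elementwise product. The sketch below assumes the hypotheses are mutually exclusive and exhaustive; the function name `posterior` and the three-machine numbers are made up purely for illustration.

```python
def posterior(priors, likelihoods):
    """Return [P(H_k | E)] given [P(H_k)] and [P(E | H_k)]."""
    # Joint terms P(E | H_k) * P(H_k), one per hypothesis
    joint = [lik * pri for lik, pri in zip(likelihoods, priors)]
    # Law of total probability: P(E) is the sum of the joint terms
    evidence = sum(joint)
    return [j / evidence for j in joint]


# Hypothetical setup: three machines make 50%, 30%, 20% of all parts,
# with defect rates of 1%, 2%, 5%. Which machine made a defective part?
post = posterior(priors=[0.5, 0.3, 0.2], likelihoods=[0.01, 0.02, 0.05])
print([round(p, 3) for p in post])  # [0.238, 0.286, 0.476]
```

The machine with the smallest prior ends up most probable because its likelihood is five times that of the first machine, which more than offsets its lower share of production.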

Bayesian Updating: Sequential Learning

Start with prior → observe data → compute posterior → use posterior as new prior → observe more data → …

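This is the idea behind Bayesian online learning: each new batch of data updates the current posterior instead of refitting from scratch. Bayesian neural networks apply the same rule to distributions over weights, usually with approximate posteriors because the exact one is intractable.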
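
A conjugate Beta-Binomial model makes the prior-to-posterior loop easy to see, because each update only adds counts. The sketch below is illustrative: the helper `update_beta` is a hypothetical name, and the split of 14 successes in 20 trials into two batches is an assumption made for the example.

```python
def update_beta(alpha, beta, successes, failures):
    """Conjugate update: Beta(alpha, beta) prior + binomial data -> Beta posterior."""
    return alpha + successes, beta + failures


prior = (2, 5)  # Beta(2, 5)

# Sequential: the posterior after batch 1 becomes the prior for batch 2
after_batch1 = update_beta(*prior, successes=6, failures=4)
after_batch2 = update_beta(*after_batch1, successes=8, failures=2)

# One-shot: all 20 trials (14 successes, 6 failures) at once
all_at_once = update_beta(*prior, successes=14, failures=6)

print(after_batch2, all_at_once)  # (16, 11) (16, 11)
```

Both routes end at Beta(16, 11): for this model the order and batching of the observations do not change the final posterior.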

Bayes in ML

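  • Naive Bayes classifier: Applies Bayes’ theorem assuming features are conditionally independent given the class
  • Bayesian optimisation: Used for hyperparameter tuning; a surrogate model of the objective is updated after each trial
  • Bayesian neural networks: Distributions over weights, not point estimates
  • MAP estimation: Maximum A Posteriori maximises log-likelihood + log-prior, i.e. MLE plus a regulariser supplied by the prior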
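
# Python: Gaussian Naive Bayes for classification
from sklearn.naive_bayes import GaussianNB

# X_train, y_train and X_test are assumed to be existing arrays with two classes
gnb = GaussianNB(priors=[0.3, 0.7])  # fixed class priors, ordered as gnb.classes_, must sum to 1
gnb.fit(X_train, y_train)
probs = gnb.predict_proba(X_test)    # posterior P(class | x) for each test row

# R: Bayesian updating with a Beta-Binomial model
library(bayesrules)
# Prior: Beta(2, 5); likelihood: Binomial with y successes in n trials
# Posterior: Beta(2 + y, 5 + n - y) = Beta(16, 11) for y = 14, n = 20
plot_beta_binomial(alpha = 2, beta = 5, y = 14, n = 20)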
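
The MAP bullet in the list above can be made concrete with the same Beta(2, 5) prior and 14 successes in 20 trials used in the R snippet. The sketch below compares the MLE with the MAP estimate, taken here as the mode of the Beta posterior; the function names are illustrative, and the closed-form mode only holds when both posterior parameters exceed 1.

```python
def mle(k, n):
    """Maximum likelihood estimate of a binomial success probability."""
    return k / n


def map_estimate(k, n, alpha, beta):
    """Mode of the Beta(alpha + k, beta + n - k) posterior."""
    a, b = alpha + k, beta + n - k
    if a <= 1 or b <= 1:
        raise ValueError("closed-form mode needs both posterior parameters > 1")
    return (a - 1) / (a + b - 2)


print(mle(14, 20))                 # 0.7
print(map_estimate(14, 20, 2, 5))  # 0.6
print(map_estimate(14, 20, 1, 1))  # 0.7
```

The Beta(2, 5) prior pulls the estimate from 0.7 towards smaller values, which is the regularising effect of the prior. With a flat Beta(1, 1) prior the MAP estimate coincides with the MLE.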

Previous: #7 Conditional Probability | Next: #9 Random Variables