Statistics for ML #9 — Random Variables


A random variable is a function that maps outcomes of a random experiment to numbers. It is the bridge between probability theory and data.

Definition

Formally: X : Ω → ℝ, where Ω is the sample space.

  • Discrete RV: Takes countable values — {0, 1, 2, …}
    • Number of ANC visits, Number of children ever born, Deaths per district
  • Continuous RV: Takes any value in an interval
    • Birth weight (grams), Height (cm), Income (USD), Temperature
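Both kinds are easy to simulate. A minimal sketch with NumPy, where the distributions and parameters (Poisson with mean 4, normal with mean 3200 g) are purely illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Discrete RV: e.g. number of ANC visits, modelled here as Poisson (illustrative)
visits = rng.poisson(lam=4, size=5)

# Continuous RV: e.g. birth weight in grams, modelled here as normal (illustrative)
weights = rng.normal(loc=3200, scale=500, size=5)

print(visits)   # non-negative integers only
print(weights)  # real values anywhere in an interval
```

Note the dtypes: the Poisson draws are integers, the normal draws are floats, mirroring the discrete/continuous split.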

Notation

  • X (capital) = the random variable (the process)
  • x (lowercase) = a specific realised value
  • P(X = x) = probability that X takes value x
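The notation can be made concrete with a toy PMF. A sketch using a fair six-sided die (not from the original post, just an illustration):

```python
from fractions import Fraction

# X = the random variable (outcome of a fair die roll)
# x = a specific realised value, e.g. x = 3
pmf = {x: Fraction(1, 6) for x in range(1, 7)}

print(pmf[3])             # P(X = 3) = 1/6
print(sum(pmf.values()))  # probabilities over all x sum to 1
```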

Functions of Random Variables

If X follows some known distribution, what is the distribution of Y = g(X)?

  • Y = 2X + 3 (linear transformation)
  • Y = X² (non-linear)
  • Y = ln(X) (log transformation — often used for right-skewed RVs)

Key property: for constants a and b, E[aX + b] = aE[X] + b and Var[aX + b] = a²Var[X].
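The linearity property can be checked by simulation. A sketch with arbitrary illustrative values (a = 2, b = 3, X normal with mean 10):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(loc=10, scale=2, size=100_000)

a, b = 2, 3
Y = a * X + b  # linear transformation

# Sample mean and variance obey the same identities as E and Var
print(Y.mean(), a * X.mean() + b)  # equal
print(Y.var(), a**2 * X.var())     # equal: the shift b drops out of the variance
```

Note that b shifts the mean but leaves the variance untouched, while a scales the variance by a².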

Joint Random Variables

Two RVs X and Y are jointly distributed when they are described together by a joint PMF (discrete) or joint PDF (continuous), written f(x, y).

Marginal distribution of X: f_X(x) = Σ_y f(x,y) (discrete) or f_X(x) = ∫ f(x,y) dy (continuous) — sum or integrate Y out.
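Marginalisation is just summing over the other variable. A sketch with a made-up 2×3 joint PMF table (values chosen only so they sum to 1):

```python
import numpy as np

# Toy joint PMF f(x, y) for X in {0, 1} (rows) and Y in {0, 1, 2} (columns)
joint = np.array([[0.10, 0.20, 0.10],
                  [0.15, 0.25, 0.20]])

f_X = joint.sum(axis=1)  # marginal of X: sum over y (row sums)
f_Y = joint.sum(axis=0)  # marginal of Y: sum over x (column sums)

print(f_X)  # ≈ [0.4, 0.6]
print(f_Y)  # each marginal still sums to 1
```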

Why This Matters for ML

  • Loss function minimisation: We minimise E[L(Y, Ŷ)] over the distribution of (X,Y)
  • Regularisation: Adding a prior on weights amounts to treating them as random variables
  • Generative models: Learn the joint P(X,Y) and sample new data from it
  • Uncertainty quantification: Treat predictions as distributions, not point estimates
A quick simulation, modelling birth weight as approximately normal, ties these pieces together:

import numpy as np

# Simulate a random variable
rng = np.random.default_rng(42)
X = rng.normal(loc=3200, scale=500, size=10000)  # birth weight distribution

print(f"E[X] = {X.mean():.1f}g")
print(f"Var[X] = {X.var():.1f}")
print(f"P(X < 2500) = {(X < 2500).mean():.3f}")  # low birth weight probability

Previous: #8 Bayes Theorem | Next: #10 PMF