Statistics for ML #9 — Random Variables
Published:
A random variable is a function that maps outcomes of a random experiment to numbers. It is the bridge between probability theory and data.
Definition
Formally: X : Ω → ℝ, where Ω is the sample space.
- Discrete RV: takes countably many values, e.g. {0, 1, 2, …}
  - Number of ANC visits, number of children ever born, deaths per district
- Continuous RV: takes any value in an interval
  - Birth weight (grams), height (cm), income (USD), temperature
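The contrast is easy to see in simulation. A minimal sketch, using a Poisson model for a discrete count and a normal model for a continuous measurement (the parameters here are illustrative assumptions, not estimates):

```python
import numpy as np

rng = np.random.default_rng(0)

# Discrete RV: a count such as ANC visits, modelled here as Poisson(4)
visits = rng.poisson(lam=4, size=5)

# Continuous RV: birth weight in grams, modelled as Normal(3200, 500)
weights = rng.normal(loc=3200, scale=500, size=5)

print(visits)   # integers only
print(weights)  # arbitrary real values in an interval
```

The discrete draws land on whole numbers; the continuous draws can take any real value in their range.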
Notation
- X (capital) = the random variable (the process)
- x (lowercase) = a specific realised value
- P(X = x) = probability that X takes value x
Functions of Random Variables
If X ~ some distribution, what is the distribution of Y = g(X)?
- Y = 2X + 3 (linear transformation)
- Y = X² (non-linear)
- Y = ln(X) (log transformation — often used for right-skewed RVs)
Key property: E[aX + b] = aE[X] + b ; Var[aX + b] = a²Var[X]
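The linearity property can be checked numerically. A quick sketch with an arbitrary normal distribution and hypothetical constants a = 2, b = 3:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(loc=10, scale=2, size=100_000)  # example distribution

a, b = 2.0, 3.0
Y = a * X + b  # linear transformation

# E[aX + b] = aE[X] + b ; Var[aX + b] = a^2 Var[X]
print(Y.mean(), a * X.mean() + b)   # agree up to floating-point error
print(Y.var(), a**2 * X.var())      # agree up to floating-point error
```

Note that the shift b moves the mean but leaves the variance untouched, while the scale a enters the variance squared.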
Joint Random Variables
Two RVs X and Y are jointly distributed — described by their joint PMF/PDF f(x,y).
Marginal distribution of X: f_X(x) = Σ_y f(x,y) or ∫f(x,y)dy
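For a discrete joint PMF, marginalising is just summing the table along one axis. A small sketch with a made-up joint PMF over x ∈ {0, 1} and y ∈ {0, 1, 2}:

```python
import numpy as np

# Hypothetical joint PMF f(x, y): rows index x, columns index y
f = np.array([[0.10, 0.20, 0.10],
              [0.25, 0.25, 0.10]])
assert np.isclose(f.sum(), 1.0)  # a valid PMF sums to 1

# Marginal of X: sum over y (columns); marginal of Y: sum over x (rows)
f_X = f.sum(axis=1)
f_Y = f.sum(axis=0)
print(f_X)  # [0.4 0.6]
print(f_Y)  # [0.35 0.45 0.2]
```

The continuous case replaces the sum with an integral over y, but the idea is identical.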
Why This Matters for ML
- Loss function minimisation: We minimise E[L(Y, Ŷ)] over the distribution of (X,Y)
- Regularisation: Adding a prior on weights is treating them as random variables
- Generative models: Learn the joint P(X,Y) and sample new data from it
- Uncertainty quantification: Treat predictions as distributions, not point estimates
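The first bullet can be made concrete: in practice the expectation E[L(Y, Ŷ)] is approximated by the average loss over a sample (the empirical risk). A minimal sketch with squared-error loss, synthetic data, and a one-parameter model Ŷ = wX (all names and parameters here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.uniform(0, 1, size=1000)
Y = 2 * X + rng.normal(scale=0.1, size=1000)  # synthetic (X, Y) pairs, true w = 2

def empirical_risk(w):
    """Average squared-error loss over the sample — an estimate of E[(Y - wX)^2]."""
    return np.mean((Y - w * X) ** 2)

# Closed-form minimiser of the empirical risk for this one-parameter model
w_hat = (X @ Y) / (X @ X)
print(w_hat)                   # close to the true slope 2
print(empirical_risk(w_hat))   # close to the noise variance
```

Minimising the sample average stands in for minimising the expectation over the unknown joint distribution of (X, Y), which is the random-variable view of model fitting.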
import numpy as np
from scipy import stats

# Simulate draws from a continuous random variable
rng = np.random.default_rng(42)
X = rng.normal(loc=3200, scale=500, size=10_000)  # birth weight distribution (grams)

print(f"E[X] = {X.mean():.1f}g")
print(f"Var[X] = {X.var():.1f}")
print(f"P(X < 2500) = {(X < 2500).mean():.3f}")  # empirical low-birth-weight probability

# Exact probability under the assumed normal model, for comparison
print(f"Exact:  {stats.norm(loc=3200, scale=500).cdf(2500):.3f}")
Previous: #8 Bayes' Theorem | Next: #10 PMF
