Statistics for ML #9 — Random Variables
Published:
A random variable is a function that maps outcomes of a random experiment to numbers. It is the bridge between probability theory and data.
Definition
Formally: X : Ω → ℝ, where Ω is the sample space.
- Discrete RV: takes countably many values, e.g. {0, 1, 2, …}
  - Number of ANC visits, number of children ever born, deaths per district
- Continuous RV: takes any value in an interval
  - Birth weight (grams), height (cm), income (USD), temperature
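The contrast is easy to see in simulation. A minimal sketch, using a Poisson model for a discrete count and a normal model for a continuous measurement (the parameters here are illustrative assumptions, not estimates):

```python
import numpy as np

rng = np.random.default_rng(0)

# Discrete RV: a count such as ANC visits, modelled here as Poisson(4)
visits = rng.poisson(lam=4, size=5)

# Continuous RV: birth weight in grams, modelled as Normal(3200, 500)
weights = rng.normal(loc=3200, scale=500, size=5)

print(visits)   # integers only
print(weights)  # arbitrary real values in an interval
```

The discrete draws land on whole numbers; the continuous draws can take any real value in their range.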
Notation
- X (capital) = the random variable (the process)
- x (lowercase) = a specific realised value
- P(X = x) = probability that X takes value x
Functions of Random Variables
If X ~ some distribution, what is the distribution of Y = g(X)?
- Y = 2X + 3 (linear transformation)
- Y = X² (non-linear)
- Y = ln(X) (log transformation — often used for right-skewed RVs)
Key property: E[aX + b] = aE[X] + b ; Var[aX + b] = a²Var[X]
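The linearity property can be checked numerically. A quick sketch with an arbitrary normal distribution and hypothetical constants a = 2, b = 3:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.normal(loc=10, scale=2, size=100_000)  # example distribution

a, b = 2.0, 3.0
Y = a * X + b  # linear transformation

# E[aX + b] = aE[X] + b ; Var[aX + b] = a^2 Var[X]
print(Y.mean(), a * X.mean() + b)   # agree up to floating-point error
print(Y.var(), a**2 * X.var())      # agree up to floating-point error
```

Note that the shift b moves the mean but leaves the variance untouched, while the scale a enters the variance squared.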
Joint Random Variables
Two RVs X and Y are jointly distributed — described by their joint PMF/PDF f(x,y).
Marginal distribution of X: f_X(x) = Σ_y f(x,y) or ∫f(x,y)dy
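For a discrete joint PMF, marginalising is just summing the table along one axis. A small sketch with a made-up joint PMF over x ∈ {0, 1} and y ∈ {0, 1, 2}:

```python
import numpy as np

# Hypothetical joint PMF f(x, y): rows index x, columns index y
f = np.array([[0.10, 0.20, 0.10],
              [0.25, 0.25, 0.10]])
assert np.isclose(f.sum(), 1.0)  # a valid PMF sums to 1

# Marginal of X: sum over y (columns); marginal of Y: sum over x (rows)
f_X = f.sum(axis=1)
f_Y = f.sum(axis=0)
print(f_X)  # [0.4 0.6]
print(f_Y)  # [0.35 0.45 0.2]
```

The continuous case replaces the sum with an integral over y, but the idea is identical.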
Why This Matters for ML
- Loss function minimisation: We minimise E[L(Y, Ŷ)] over the distribution of (X,Y)
- Regularisation: Adding a prior on weights is treating them as random variables
- Generative models: Learn the joint P(X,Y) and sample new data from it
- Uncertainty quantification: Treat predictions as distributions, not point estimates
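The first bullet can be made concrete: in practice the expectation E[L(Y, Ŷ)] is approximated by the average loss over a sample (the empirical risk). A minimal sketch with squared-error loss, synthetic data, and a one-parameter model Ŷ = wX (all names and parameters here are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
X = rng.uniform(0, 1, size=1000)
Y = 2 * X + rng.normal(scale=0.1, size=1000)  # synthetic (X, Y) pairs, true w = 2

def empirical_risk(w):
    """Average squared-error loss over the sample — an estimate of E[(Y - wX)^2]."""
    return np.mean((Y - w * X) ** 2)

# Closed-form minimiser of the empirical risk for this one-parameter model
w_hat = (X @ Y) / (X @ X)
print(w_hat)                   # close to the true slope 2
print(empirical_risk(w_hat))   # close to the noise variance
```

Minimising the sample average stands in for minimising the expectation over the unknown joint distribution of (X, Y), which is the random-variable view of model fitting.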
import numpy as np
from scipy import stats

# Simulate draws from a continuous random variable
rng = np.random.default_rng(42)
X = rng.normal(loc=3200, scale=500, size=10_000)  # birth weight distribution (grams)

print(f"E[X] = {X.mean():.1f}g")
print(f"Var[X] = {X.var():.1f}")
print(f"P(X < 2500) = {(X < 2500).mean():.3f}")  # empirical low-birth-weight probability

# Exact probability under the assumed normal model, for comparison
print(f"Exact:  {stats.norm(loc=3200, scale=500).cdf(2500):.3f}")
Previous: #8 Bayes' Theorem | Next: #10 PMF
