Statistics for ML #71 — Singular Value Decomposition (SVD)

Post #71/100 in the Statistics for ML series — Md Salek Miah, Statistician & ML Researcher, SUST, Bangladesh.

Singular Value Decomposition (SVD) factorizes any real \(m \times n\) matrix A as \(A = U\Sigma V^T\), where the columns of U are the left singular vectors, Σ is a diagonal matrix of non-negative singular values (conventionally in decreasing order), and the columns of V are the right singular vectors.
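As a quick numerical check of this identity (the matrix below is an arbitrary example), NumPy's `np.linalg.svd` returns the three factors directly:

```python
import numpy as np

# Any real matrix works, square or not
A = np.array([[3.0, 1.0, 2.0],
              [1.0, 4.0, 0.0]])

# full_matrices=False gives the compact ("economy") SVD
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Reconstruct A = U @ diag(s) @ V^T
A_rebuilt = U @ np.diag(s) @ Vt
print(np.allclose(A, A_rebuilt))  # True
print(s)  # singular values, in decreasing order
```

Note that NumPy returns \(V^T\) (here `Vt`), not V itself.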

SVD underlies PCA, classical recommender systems, and NLP methods such as Latent Semantic Analysis.

Applications in ML & Public Health

  • Dimensionality reduction: Keep top-k singular values → low-rank approximation
  • Missing data imputation: Matrix completion via SVD (e.g., imputing missing DHS variables)
  • Recommender systems: My MovieLens project uses SVD for collaborative filtering
  • Latent Semantic Analysis: Document-term matrix decomposition
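The first bullet — keeping only the top-k singular values — can be sketched directly in NumPy (the matrix here is synthetic, purely for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Rank-k approximation: keep only the top-k singular triplets
k = 5
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

# By the Eckart-Young theorem this is the best rank-k approximation in
# the Frobenius norm; the error equals the norm of the dropped values
err = np.linalg.norm(A - A_k)
print(f'rank-{k} reconstruction error: {err:.3f}')
print(np.isclose(err, np.linalg.norm(s[k:])))  # True
```

This is exactly what `TruncatedSVD` in scikit-learn does under the hood, as in the snippet below.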
For example, applied to a DHS district-by-indicator matrix (`df` and `health_indicators` come from my project's data):

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# SVD on DHS health indicator matrix (districts × indicators)
X = df[health_indicators].fillna(0).values  # naive zero-fill for missing values
svd = TruncatedSVD(n_components=5, random_state=42)
X_reduced = svd.fit_transform(X)

explained_var = svd.explained_variance_ratio_.cumsum()
print(f'Top 5 components explain {explained_var[-1]*100:.1f}% of variance')
```
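The `fillna(0)` above is a crude placeholder. For the matrix-completion use case mentioned earlier, a more principled approach alternates between filling the missing cells and re-fitting a low-rank SVD. This is a generic sketch of iterative SVD imputation (function name, rank, and iteration count are my own choices, not from any specific post):

```python
import numpy as np

def svd_impute(X, rank=3, n_iter=50):
    """Iterative SVD imputation: initialize NaNs with column means,
    then repeatedly replace them with a rank-k reconstruction."""
    X = X.astype(float).copy()
    mask = np.isnan(X)
    # Initialize missing entries with their column means
    col_means = np.nanmean(X, axis=0)
    X[mask] = np.take(col_means, np.where(mask)[1])
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X_low = U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank, :]
        X[mask] = X_low[mask]  # update only the missing cells
    return X

# Demo on a synthetic rank-3 matrix with ~10% of entries missing
rng = np.random.default_rng(42)
true = rng.standard_normal((30, 3)) @ rng.standard_normal((3, 10))
obs = true.copy()
obs[rng.random(obs.shape) < 0.1] = np.nan
filled = svd_impute(obs, rank=3)
print(np.abs(filled - true)[np.isnan(obs)].max())
```

Observed entries are left untouched; only the missing cells are updated each round, so the imputation converges toward a low-rank fit of the observed data.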

Series Index | Post #71/100 | Md Salek Miah | saleksta@gmail.com