Statistics for ML #71 — Singular Value Decomposition (SVD)
Post #71/100 in the Statistics for ML series — Md Salek Miah | Statistician & ML Researcher | SUST, Bangladesh
Singular Value Decomposition (SVD) factorizes any real m×n matrix A as \(A = U\Sigma V^T\), where U is an orthogonal matrix whose columns are the left singular vectors, Σ is a diagonal matrix of non-negative singular values sorted in decreasing order, and V is an orthogonal matrix whose columns are the right singular vectors.
SVD is the foundation of PCA, recommender systems, and NLP.
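The factorization above can be checked directly with NumPy's `np.linalg.svd` (the matrix here is just an illustrative example):

```python
import numpy as np

# Any real matrix factors as A = U @ diag(s) @ Vt — here a 3x2 example
A = np.array([[3.0, 1.0],
              [1.0, 3.0],
              [0.0, 2.0]])

# full_matrices=False gives the "thin" SVD: U is 3x2, s has 2 entries, Vt is 2x2
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Singular values come back non-negative and in decreasing order
print(s)

# Reconstructing U @ diag(s) @ Vt recovers A up to floating-point error
A_rec = U @ np.diag(s) @ Vt
print(np.allclose(A, A_rec))
```

The thin SVD is usually what you want in ML: it skips the columns of U and V that are multiplied by zero anyway.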
Applications in ML & Public Health
- Dimensionality reduction: Keep top-k singular values → low-rank approximation
- Missing data imputation: Matrix completion via SVD (e.g., imputing missing DHS variables)
- Recommender systems: My MovieLens project uses SVD for collaborative filtering
- Latent Semantic Analysis: Document-term matrix decomposition
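The first bullet is the Eckart–Young theorem in action: keeping the top-k singular values gives the best rank-k approximation in Frobenius norm. A small sketch on a synthetic matrix (the data here is made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic 50x20 matrix: rank-2 signal plus a little noise
A = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 20)) \
    + 0.01 * rng.standard_normal((50, 20))

U, s, Vt = np.linalg.svd(A, full_matrices=False)

# Keep only the top-k singular triplets -> best rank-k approximation
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]

rel_err = np.linalg.norm(A - A_k) / np.linalg.norm(A)
print(f"rank-{k} relative error: {rel_err:.4f}")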
import numpy as np
from sklearn.decomposition import TruncatedSVD
# SVD on DHS health indicator matrix (districts × indicators)
X = df[health_indicators].fillna(0).values
svd = TruncatedSVD(n_components=5, random_state=42)
X_reduced = svd.fit_transform(X)
explained_var = svd.explained_variance_ratio_.cumsum()
print(f'Top 5 components explain {explained_var[-1]*100:.1f}% of variance')
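The `fillna(0)` above is only a crude placeholder. The matrix-completion idea from the imputation bullet can be sketched as an alternating low-rank fill — initialize missing cells, truncate via SVD, refill, repeat. `svd_impute` is a hypothetical helper for illustration, not the author's DHS pipeline:

```python
import numpy as np

def svd_impute(X, rank=2, n_iter=50):
    """Fill NaNs by alternating column-mean fill and rank-truncated SVD.
    A minimal sketch of matrix completion, not a production imputer."""
    X = X.astype(float).copy()
    mask = np.isnan(X)
    # Initialize missing entries with their column means
    col_means = np.nanmean(X, axis=0)
    X[mask] = np.take(col_means, np.where(mask)[1])
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(X, full_matrices=False)
        X_low = U[:, :rank] @ np.diag(s[:rank]) @ Vt[:rank, :]
        X[mask] = X_low[mask]  # overwrite only the originally missing cells
    return X
```

On data with genuine low-rank structure this converges quickly; in practice the rank would be chosen by cross-validation rather than fixed in advance.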
Series Index | Post #71/100 | Md Salek Miah | saleksta@gmail.com
