Statistics for ML #99 — Causal Inference: DAGs, Do-Calculus

Causal Inference: DAGs, Do-Calculus

Post #99/100 in the Statistics for ML series. Md Salek Miah, Statistician & ML Researcher, SUST, Bangladesh.

Causal inference moves beyond correlation to answer the question “Does X cause Y?” This is a central question in public health policy.

DAGs (Directed Acyclic Graphs)

A DAG encodes causal assumptions: each arrow represents a direct causal effect of one variable on another, and the graph contains no directed cycles. DAGs are used to identify confounders, mediators, and colliders.

Tools: Dagitty (web-based), dagitty R package — used in our DHS research for confounder selection.

Do-Calculus (Pearl’s Framework)

\(P(Y | do(X=x)) \neq P(Y | X=x)\)

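The do-operator represents intervening to set X to x, as opposed to passively observing X = x. The two distributions generally differ when X and Y share a common cause (a confounder).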
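
As a worked illustration of how the two quantities relate, suppose a set of observed covariates \(Z\) satisfies the backdoor criterion relative to X and Y: it blocks every backdoor path from X to Y and contains no descendant of X. The symbol \(Z\) is introduced here for illustration and is not defined elsewhere in this post. Under that assumption, the interventional distribution is given by the backdoor adjustment formula:

\[P(Y = y \mid do(X = x)) = \sum_{z} P(Y = y \mid X = x, Z = z)\, P(Z = z)\]

For comparison, ordinary conditioning weights the same terms by \(P(Z = z \mid X = x)\) instead of \(P(Z = z)\):

\[P(Y = y \mid X = x) = \sum_{z} P(Y = y \mid X = x, Z = z)\, P(Z = z \mid X = x)\]

The two expressions agree when Z is independent of X, which is one way to see why confounding makes them differ.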

Key Concepts

| Concept | Description | Example |
|---|---|---|
| Confounder | Causes both X and Y | Wealth → ANC visits AND delivery location |
| Mediator | On causal path X → M → Y | ANC → knowledge → SBA |
| Collider | Caused by both X and Y | Conditioning on collider opens non-causal path |

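
The collider row can be made concrete with a small simulation. This is a minimal sketch in base R using entirely synthetic data: `x`, `y`, and `k` are hypothetical variables, with `x` and `y` generated independently and `k` caused by both. It is meant only to illustrate how conditioning on a collider can induce an association, not to reproduce any analysis from our research.

```r
# Simulated illustration of collider bias (all variables are synthetic).
set.seed(99)
n <- 20000

x <- rnorm(n)            # exposure
y <- rnorm(n)            # outcome, generated independently of x
k <- x + y + rnorm(n)    # collider: caused by both x and y

# Without conditioning on k, the coefficient on x should be close to zero
coef_marginal <- unname(coef(lm(y ~ x))["x"])

# Conditioning on k (adding it as a regressor) opens the path x -> k <- y
coef_conditional <- unname(coef(lm(y ~ x + k))["x"])

round(c(marginal = coef_marginal, conditional_on_k = coef_conditional), 3)
```

In this linear setup the coefficient on `x` after conditioning on `k` is about -0.5 in expectation, even though `x` has no effect on `y` at all.
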
library(dagitty)
library(ggdag)

# DAG for SBA research
dag <- dagitty('dag {
  Wealth -> ANC_visits
  Wealth -> SBA
  ANC_visits -> SBA
  Education -> Wealth
  Education -> SBA
  Rural -> ANC_visits
  Rural -> SBA
}')

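# Find the minimal adjustment set for the total effect of ANC_visits on SBA.
# In this DAG the backdoor paths run through Wealth and Rural, so the
# expected result is the set { Rural, Wealth }.
adjustmentSets(dag, exposure = "ANC_visits", outcome = "SBA")

# Plot the DAG in a circular layout with ggdag's DAG theme
ggdag(dag, layout = "circle") + theme_dag()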
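
To see what adjustment buys in practice, the sketch below simulates data that follows the structure of the DAG above and compares a naive regression with one adjusted for Wealth and Rural, the set that blocks the backdoor paths from ANC_visits to SBA in this graph. Everything here is an assumption made for illustration: the variables are continuous stand-ins (apart from a binary Rural), the coefficients are hypothetical, and the true effect of ANC_visits on SBA is fixed at 0.5. It uses base R only and is not the model from our DHS analysis.

```r
# Simulated data following the structure of the DAG above.
# All coefficients are hypothetical; the true ANC_visits -> SBA effect is 0.5.
set.seed(99)
n <- 20000

Education  <- rnorm(n)
Rural      <- rbinom(n, 1, 0.5)
Wealth     <- 0.8 * Education + rnorm(n)
ANC_visits <- 0.7 * Wealth - 0.6 * Rural + rnorm(n)
SBA        <- 0.5 * ANC_visits + 0.6 * Wealth + 0.4 * Education -
              0.5 * Rural + rnorm(n)

# Naive estimate: ignores the confounders, so backdoor paths stay open
naive <- unname(coef(lm(SBA ~ ANC_visits))["ANC_visits"])

# Adjusted estimate: conditions on Wealth and Rural to block the backdoor paths
adjusted <- unname(coef(lm(SBA ~ ANC_visits + Wealth + Rural))["ANC_visits"])

round(c(naive = naive, adjusted = adjusted), 3)
```

With these settings the naive coefficient is inflated well above 0.5, while the adjusted coefficient should land close to the true value. Education does not need to be in the model because its only backdoor route to ANC_visits passes through Wealth.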

Series Index | Post #99/100 | Md Salek Miah | saleksta@gmail.com