The Cliff Edge

Discovering Causal Effects at the Threshold

Health Economics and Policy

Spring 2026

Life on the Edge

Why Cutoffs Can Reveal Causality

The Day You Turn 65, You Become Healthier

The Puzzle

On the day an American turns 65, their probability of having a doctor visit jumps by roughly 10 percentage points, their out-of-pocket costs drop, and their health outcomes measurably improve.

But biologically, nothing changes overnight. A 64-year-old and a 65-year-old are nearly identical.

So what happened?

Medicare happened. And this simple fact — that a bureaucratic rule creates an abrupt change in treatment at an arbitrary age — is the engine behind one of the most credible causal inference designs in all of social science.

“Nature doesn’t create cliff edges. Policy does. And at those edges, we find causal effects.”

Policy Creates Cliff Edges Everywhere

Medicare at 65 is just one example. Policy creates cliff edges everywhere:

  • Age 21: Legally purchase alcohol
  • Income below $X: Qualify for Medicaid
  • GPA ≥ 3.0: Make the Dean’s List
  • Birth weight < 1500g: Admission to NICU
  • Vote share > 50%: Win the election

The Key Insight

When rules force treatment assignment at a cutoff, we get a natural experiment. Units just above and below the threshold are essentially comparable—except one group gets treated.

Pictures Reveal What Statistics Summarize

Each of the following slides presents a different RDD scenario.

  • What do you observe in each case?
  • What questions arise about the identification strategy?
  • How do these patterns inform whether an RDD assumption is valid?

RDD in Pictures: The Alcohol Example

Question: What is age doing here? What’s the story being told?

RDD in Pictures: Binned Means

Question: What do you think each dot represents? What do you think each line represents? What’s the story being told?

RDD in Pictures: The Treatment Effect

Question: Where is the treatment effect on the graph? Treatment effect of what? Give a rough approximation of the treatment effect.

The Jump Tells the Causal Story

Show R Code
# Generate RDD example data
set.seed(6320)  # fix the seed so the simulated example is reproducible
n <- 500
rdd_example <- tibble(
  x = runif(n, -1, 1),
  treated = ifelse(x >= 0, 1, 0),
  # True effect of 3
  y = 2 + 0.8 * x + 3 * treated + rnorm(n, 0, 0.5)
)

ggplot(rdd_example, aes(x = x, y = y, color = factor(treated))) +
  geom_point(alpha = 0.5, size = 2) +
  geom_vline(xintercept = 0, linetype = "dashed", color = slate_gray, linewidth = 1) +
  geom_smooth(method = "lm", se = FALSE, linewidth = 1.5) +
  scale_color_manual(
    values = c("0" = accent_coral, "1" = secondary_teal),
    labels = c("Below cutoff (Control)", "Above cutoff (Treated)")
  ) +
  annotate("segment", x = 0, xend = 0, y = 2, yend = 5,
           arrow = arrow(ends = "both", length = unit(0.1, "inches")),
           color = primary_blue, linewidth = 1.5) +
  annotate("text", x = 0.12, y = 3.5, label = "Treatment\nEffect",
           color = primary_blue, fontface = "bold", size = 5, hjust = 0) +
  labs(
    title = "Regression Discontinuity in Action",
    subtitle = "The jump at the cutoff identifies the causal effect",
    x = "Running Variable (centered at cutoff)",
    y = "Outcome",
    color = NULL
  ) +
  theme_health_econ() +
  theme(legend.position = "bottom") +
  coord_cartesian(ylim = c(0, 7))

Figure 1: The core intuition: a jump at the cutoff reveals the treatment effect

RDD Was Invented, Forgotten, and Reborn

Invented (1960)

  • Donald Campbell, educational psychologist
  • Used to evaluate scholarship programs
  • Then… largely forgotten

Rediscovered (1999-2000)

  • Josh Angrist & Victor Lavy (class size)
  • Sandra Black (school quality)
  • Now one of the most credible quasi-experimental designs

Why the revival?

  1. Data availability: Administrative records with precise running variables
  2. Computing power: Modern bandwidth selection methods
  3. Credibility: More believable than IV or matching for many applications
  4. Visual appeal: You can literally see the treatment effect

Every Design Needs Its Language

Term | Definition | Example
Running variable | The continuous score determining treatment | Test score, vote share, age
Cutoff (threshold) | The value where treatment switches | 65 for Medicare, 50% for elections
Discontinuity (jump) | The break in outcomes at the cutoff | Sudden change in mortality at age 65
Bandwidth | Window around the cutoff used for estimation | ±5 points from the threshold

Graduate Students: Local vs. Global Estimation

RDD can use data close to the cutoff (local polynomial) or across the entire running variable range (global polynomial). Local methods are now preferred because they’re more robust to functional form assumptions.

The Twins Test

The RDD Golden Rule

Ask yourself: “If two people score just above and just below the cutoff, are they like twins — identical in every way except that one crossed the line?”

If the answer is yes, you have a good RDD. The cutoff acts like a coin flip for people near it.

The Twins Test checks three things:

  1. Can people choose which side they land on? (If yes, the twins aren’t random.)
  2. Does anything else change at the cutoff? (If yes, the jump isn’t just treatment.)
  3. Are the twins actually close enough to compare? (If no, you need a narrower bandwidth.)

Just as the IV “Grandma Test” asks “Would this instrument puzzle a non-expert?”, the Twins Test asks “Would these two people be indistinguishable without the cutoff rule?”

From Concept to Classification

Now that we understand why cliff edges create natural experiments, we need to distinguish between two kinds. Sometimes the edge is absolute — cross the line and treatment is guaranteed. Other times, the edge is softer — crossing makes treatment more likely but not certain.

This distinction shapes everything about how we estimate the effect.

From Intuition to Design Types

Figure 2

Two Kinds of Cliff Edge

Sharp and Fuzzy Assignment Rules

Two Flavors of Discontinuity

Show R Code
# Sharp RDD
sharp_data <- tibble(
  x = seq(-1, 1, length.out = 200),
  treatment_prob = ifelse(x >= 0, 1, 0)
)

p_sharp <- ggplot(sharp_data, aes(x = x, y = treatment_prob)) +
  geom_line(linewidth = 2, color = secondary_teal) +
  geom_vline(xintercept = 0, linetype = "dashed", color = slate_gray) +
  labs(
    title = "Sharp RDD",
    subtitle = "Treatment is deterministic at cutoff",
    x = "Running Variable",
    y = "P(Treatment)"
  ) +
  theme_health_econ() +
  scale_y_continuous(limits = c(0, 1), labels = scales::percent) +
  annotate("text", x = -0.5, y = 0.1, label = "No one treated",
           color = accent_coral, fontface = "bold", size = 4) +
  annotate("text", x = 0.5, y = 0.9, label = "Everyone treated",
           color = secondary_teal, fontface = "bold", size = 4)

# Fuzzy RDD
fuzzy_prob <- function(x) {
  plogis(3 * x) * 0.6 + ifelse(x >= 0, 0.3, 0)
}

fuzzy_data <- tibble(
  x = seq(-1, 1, length.out = 200),
  treatment_prob = fuzzy_prob(x)
)

fuzzy_jump_left <- fuzzy_prob(-1e-6)
fuzzy_jump_right <- fuzzy_prob(1e-6)

p_fuzzy <- ggplot(fuzzy_data, aes(x = x, y = treatment_prob)) +
  geom_line(linewidth = 2, color = secondary_teal) +
  geom_vline(xintercept = 0, linetype = "dashed", color = slate_gray) +
  labs(
    title = "Fuzzy RDD",
    subtitle = "Treatment probability jumps but isn't deterministic",
    x = "Running Variable",
    y = "P(Treatment)"
  ) +
  theme_health_econ() +
  scale_y_continuous(limits = c(0, 1), labels = scales::percent) +
  annotate("segment", x = 0, xend = 0, y = fuzzy_jump_left, yend = fuzzy_jump_right,
           arrow = arrow(ends = "both", length = unit(0.1, "inches")),
           color = primary_blue, linewidth = 1.5) +
  annotate("text", x = 0.2, y = 0.5, label = "Jump",
           color = primary_blue, fontface = "bold", size = 4)

p_sharp + p_fuzzy

Figure 3: Sharp RDD: treatment jumps from 0 to 1. Fuzzy RDD: treatment probability jumps but not completely.

Sharp RDD Means the Rule Is Absolute

Definition: Treatment is a deterministic function of the running variable

\[ D_i = \begin{cases} 1 & \text{if } X_i \ge c_0 \\ 0 & \text{if } X_i < c_0 \end{cases} \]

Examples in Health Economics:

  • Medicare eligibility at age 65
  • Medicaid income thresholds
  • Hospital quality ratings with discrete cutoffs

Note

In sharp RDD, knowing \(X_i\) tells you treatment status with certainty. No IV needed—just compare outcomes above vs. below the cutoff.

Fuzzy RDD Means the Rule Leaks

Definition: Treatment probability jumps at the cutoff, but not from 0 to 1

\[ \lim_{x \downarrow c_0} P(D_i = 1 | X_i = x) \neq \lim_{x \uparrow c_0} P(D_i = 1 | X_i = x) \]

Examples:

  • SAT cutoff for college admission (some below get in, some above don’t)
  • Union certification elections (not all certified workplaces unionize)
  • Recommendation for surgery based on BMI threshold

Graduate Students: Fuzzy RDD as IV

Fuzzy RDD uses the cutoff as an instrument. The estimand is: \[\tau_{\text{Fuzzy}} = \frac{\text{Jump in outcome at cutoff}}{\text{Jump in treatment probability at cutoff}}\] This is a local average treatment effect (LATE) for compliers at the cutoff.
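As a back-of-the-envelope illustration of this ratio (a simulation, not data from any real study), both jumps can be estimated with simple means in a narrow window around the cutoff:

```r
# Illustrative fuzzy RDD: the Wald ratio of two local jumps
set.seed(1)
n <- 10000
x <- runif(n, -1, 1)                    # running variable, cutoff at 0
p_treat <- ifelse(x >= 0, 0.8, 0.2)     # compliance is imperfect on both sides
d <- rbinom(n, 1, p_treat)              # realized treatment
y <- 1 + 0.5 * x + 2 * d + rnorm(n)     # true treatment effect = 2

h <- 0.1                                # narrow window around the cutoff
above <- x >= 0 & x < h
below <- x < 0 & x >= -h

jump_y <- mean(y[above]) - mean(y[below])   # jump in the outcome
jump_d <- mean(d[above]) - mean(d[below])   # jump in treatment probability
tau_fuzzy <- jump_y / jump_d                # LATE for compliers at the cutoff
```

The appendix shows the same estimator computed properly with rdrobust's fuzzy option.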

From Description to Identification

We can now recognize an RDD when we see one, and we know whether it’s sharp or fuzzy. We can apply the Twins Test to check if the design is credible.

But why does comparing twins at the edge give us a causal effect? The answer lies in a single, elegant assumption.

From Design to Identification

Figure 4

Why the Jump Is Causal

Continuity and Local Comparability

Everything Except Treatment Must Be Smooth

What we assume:

The potential outcomes \(E[Y^0 | X]\) and \(E[Y^1 | X]\) are continuous (smooth) at the cutoff.

What this means:

  • Without treatment, outcomes would change gradually across the threshold
  • Any jump in observed outcomes must be due to treatment
  • Confounders also vary smoothly (no sudden changes at the cutoff)

Graduate Students: Formal Smoothness Condition

\(\lim_{x \uparrow c_0} E[Y^j | X = x] = \lim_{x \downarrow c_0} E[Y^j | X = x]\) for \(j \in \{0, 1\}\). Both conditional expectation functions must be continuous at \(c_0\).

Show R Code
smooth_data <- tibble(
  x = seq(-1, 1, length.out = 200),
  y0 = 2 + 0.8 * x,
  y1 = 2 + 0.8 * x + 2.5
)

ggplot(smooth_data) +
  geom_line(aes(x = x, y = y0), color = accent_coral, linewidth = 1.5, linetype = "dashed") +
  geom_line(aes(x = x, y = y1), color = secondary_teal, linewidth = 1.5, linetype = "dashed") +
  geom_vline(xintercept = 0, linetype = "dotted", color = slate_gray) +
  annotate("text", x = -0.7, y = 1.5, label = "E[Y⁰|X]",
           color = accent_coral, size = 5) +
  annotate("text", x = 0.7, y = 5, label = "E[Y¹|X]",
           color = secondary_teal, size = 5) +
  labs(
    title = "Smoothness Assumption",
    subtitle = "Both potential outcomes are continuous at the cutoff",
    x = "Running Variable", y = "Outcome"
  ) +
  theme_health_econ()
Figure 5

Smoothness vs. Treatment: A Visual Comparison

Show R Code
set.seed(6320)
n_sv <- 300
sv_data <- tibble(
  x = runif(n_sv, -1, 1),
  y0 = 2 + 0.8 * x + rnorm(n_sv, 0, 0.4),
  y1 = 2 + 0.8 * x + 2.5 + rnorm(n_sv, 0, 0.4),
  treated = ifelse(x >= 0, 1, 0),
  y_obs = ifelse(x >= 0, y1, y0)
)

# Left panel: Potential outcomes (both visible, with scatter)
p_left <- ggplot(sv_data) +
  geom_point(aes(x = x, y = y0), color = accent_coral, alpha = 0.3, size = 1.5) +
  geom_point(aes(x = x, y = y1), color = secondary_teal, alpha = 0.3, size = 1.5) +
  geom_smooth(aes(x = x, y = y0), method = "lm", se = FALSE,
              color = accent_coral, linewidth = 1.5) +
  geom_smooth(aes(x = x, y = y1), method = "lm", se = FALSE,
              color = secondary_teal, linewidth = 1.5) +
  geom_vline(xintercept = 0, linetype = "dashed", color = slate_gray, linewidth = 1) +
  annotate("text", x = -0.7, y = 1, label = "Y\u2070 (untreated)",
           color = accent_coral, fontface = "bold", size = 4.5) +
  annotate("text", x = 0.7, y = 5.5, label = "Y\u00b9 (treated)",
           color = secondary_teal, fontface = "bold", size = 4.5) +
  annotate("text", x = 0, y = 6.8, label = "Cutoff",
           color = slate_gray, size = 3.5) +
  labs(title = "Potential Outcomes",
       subtitle = "Both are smooth everywhere",
       x = "Running Variable", y = "Potential Outcomes") +
  theme_health_econ() +
  coord_cartesian(ylim = c(0, 7))

# Right panel: Realized outcomes (only observed arm, with LATE annotation)
p_right <- ggplot(sv_data, aes(x = x, y = y_obs, color = factor(treated))) +
  geom_point(alpha = 0.4, size = 1.5) +
  geom_smooth(method = "lm", se = FALSE, linewidth = 1.5) +
  geom_vline(xintercept = 0, linetype = "dashed", color = slate_gray, linewidth = 1) +
  scale_color_manual(values = c("0" = accent_coral, "1" = secondary_teal),
                     guide = "none") +
  annotate("segment", x = 0, xend = 0, y = 2, yend = 4.5,
           arrow = arrow(ends = "both", length = unit(0.1, "inches")),
           color = primary_blue, linewidth = 2) +
  annotate("label", x = 0.18, y = 3.25, label = "LATE",
           color = "white", fill = primary_blue, fontface = "bold",
           size = 5, label.size = 0) +
  annotate("text", x = 0, y = 6.8, label = "Cutoff",
           color = slate_gray, size = 3.5) +
  labs(title = "Actual Outcomes",
       subtitle = "Treatment switches which outcome we observe",
       x = "Running Variable", y = "Observed Outcome") +
  theme_health_econ() +
  coord_cartesian(ylim = c(0, 7))

p_left + p_right

Figure 6: Left: Both potential outcomes are smooth. Right: Treatment assignment creates the observable jump.

Question: Why is the left panel different from the right? In the right panel, where did the second line go?

RDD Identifies a Local Treatment Effect

The treatment effect at the cutoff:

\[ \delta_{\text{SRD}} = \lim_{x \downarrow c_0} E[Y_i | X_i = x] - \lim_{x \uparrow c_0} E[Y_i | X_i = x] \]

This is a Local Effect

The RDD treatment effect is identified only at the cutoff. It tells us what happens when people just barely cross the threshold—not what would happen far from it.

Estimating the Treatment Effect with Regression

In practice, we estimate this with regression:

\[ Y_i = \alpha + \tau D_i + \beta_1 (X_i - c_0) + \beta_2 D_i \cdot (X_i - c_0) + \varepsilon_i \]

where \(D_i = \mathbf{1}(X_i \ge c_0)\) and \(\tau\) is the treatment effect.

Graduate Students: Why Center at the Cutoff?

Using \(\tilde{X}_i = X_i - c_0\) ensures the intercept \(\alpha\) gives the conditional mean just below the cutoff, and \(\tau\) gives the jump. Without centering, the intercept has no useful interpretation.
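A minimal base-R sketch of this regression on simulated data (the age cutoff at 65 and the effect size are illustrative, not estimates from any study):

```r
# Sharp RDD via OLS with the running variable centered at the cutoff
set.seed(1)
n <- 2000
x <- runif(n, 40, 90)                        # e.g., age as the running variable
c0 <- 65                                     # cutoff
d <- as.numeric(x >= c0)
y <- 10 + 0.1 * (x - c0) + 3 * d + rnorm(n)  # true jump (tau) = 3

x_c <- x - c0                                # center at the cutoff
fit <- lm(y ~ d + x_c + d:x_c)               # separate slopes on each side
coef(fit)[c("(Intercept)", "d")]             # alpha: mean just below; tau: the jump
```

Because x_c is centered, the intercept estimates the conditional mean just below the cutoff and the coefficient on d estimates the jump \(\tau\), exactly as described above.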

From Theory to Practice

The smoothness assumption tells us why the jump is causal. Now we need to decide how to measure it. This turns out to be harder than it looks — and the answer involves a fundamental trade-off.

From Identification to Estimation

Figure 7

Measuring the Drop

Bandwidths, Fits, and Precision

The Bandwidth Trade-off

Narrow bandwidth:

  • Units closer to cutoff → more comparable
  • Less bias from functional form
  • But: fewer observations → more variance

Wide bandwidth:

  • More observations → more precision
  • But: units farther from cutoff may differ
  • More sensitive to functional form assumptions
Show R Code
# Show different bandwidths
bw_data <- rdd_example %>%
  mutate(
    in_narrow = abs(x) <= 0.3,
    in_wide = abs(x) <= 0.7
  )

ggplot(bw_data, aes(x = x, y = y)) +
  # Wide bandwidth region
  annotate("rect", xmin = -0.7, xmax = 0.7, ymin = -Inf, ymax = Inf,
           fill = primary_blue, alpha = 0.1) +
  # Narrow bandwidth region
  annotate("rect", xmin = -0.3, xmax = 0.3, ymin = -Inf, ymax = Inf,
           fill = secondary_teal, alpha = 0.2) +
  geom_point(aes(color = factor(treated)), alpha = 0.5, size = 2) +
  geom_vline(xintercept = 0, linetype = "dashed", color = slate_gray) +
  scale_color_manual(values = c("0" = accent_coral, "1" = secondary_teal), guide = "none") +
  annotate("text", x = 0, y = 6.5, label = "Narrow BW",
           color = secondary_teal, fontface = "bold", size = 4) +
  annotate("text", x = 0.5, y = 6.5, label = "Wide BW",
           color = primary_blue, fontface = "bold", size = 4) +
  labs(
    title = "Bandwidth Selection",
    subtitle = "The eternal bias-variance trade-off",
    x = "Running Variable", y = "Outcome"
  ) +
  theme_health_econ() +
  coord_cartesian(ylim = c(0, 7))
Figure 8

Modern Bandwidth Selection Methods

Table 1: Common approaches to bandwidth selection in RDD
Method | Approach | Key Feature
IK (Imbens-Kalyanaraman) | Minimizes MSE of treatment effect | Widely used baseline
CCT (Calonico-Cattaneo-Titiunik) | MSE-optimal + bias correction | Robust inference (recommended)
Cross-Validation | Leave-one-out prediction error | Data-driven, no formula
Rule of Thumb | Based on sample size and variance | Simple but crude

Graduate Students: The CCT Approach

Calonico, Cattaneo, and Titiunik (2014) showed that conventional RDD confidence intervals are too narrow. Their approach uses bias correction and robust standard errors. The rdrobust package in R implements this.

Graduate Students: MSE-Optimal Bandwidth

The IK and CCT bandwidths minimize the mean squared error: \(\text{MSE}(h) = \text{Bias}^2(h) + \text{Variance}(h)\). As \(h \to 0\), bias vanishes but variance explodes. The optimal \(h\) balances these two forces.
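The trade-off can be seen directly by re-estimating the jump at several bandwidths. This is a manual scan on simulated data with a deliberately curved conditional mean, not the CCT selector itself:

```r
# Bias-variance trade-off: re-estimate the jump at several bandwidths
set.seed(1)
n <- 5000
x <- runif(n, -1, 1)
d <- as.numeric(x >= 0)
y <- 2 + 0.8 * x + 1.5 * x^3 + 3 * d + rnorm(n, 0, 0.5)  # true jump = 3, curved CEF
df <- data.frame(x, d, y)

est_at_bw <- function(h) {
  sub <- df[abs(df$x) <= h, ]             # keep only data within the bandwidth
  fit <- lm(y ~ d + x + d:x, data = sub)  # local linear fit on each side
  c(h = h,
    tau = unname(coef(fit)["d"]),
    se  = unname(sqrt(diag(vcov(fit))["d"])))
}

round(t(sapply(c(0.1, 0.25, 0.5, 1), est_at_bw)), 3)
# Narrow h: nearly unbiased but noisy. Wide h: precise but biased by the curvature.
```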

Flexible Regression Specifications

The saturated model allows different slopes on each side of the cutoff:

\[ Y_i = \alpha + \tau D_i + \beta_1 (X_i - c_0) + \beta_2 D_i \cdot (X_i - c_0) + \varepsilon_i \]

For more flexibility, add polynomial terms (where \(\tilde{X}_i = X_i - c_0\)):

\[ \begin{aligned} Y_i = \alpha + \tau D_i &+ \beta_1 \tilde{X}_i + \beta_2 \tilde{X}_i^2 \\ &+ \gamma_1 D_i \cdot \tilde{X}_i + \gamma_2 D_i \cdot \tilde{X}_i^2 + \varepsilon_i \end{aligned} \]

Best Practice

Modern practice favors local linear regression over high-order global polynomials. High-order polynomials can create spurious jumps and are sensitive to data at the boundaries.

From Estimation to Skepticism

We now have the tools to estimate the jump. But a good researcher asks: What could go wrong? The most dangerous threat is the one that undermines the entire design.

From Estimation to Threats

Figure 9

When the Edge Crumbles

Manipulation, Heaping, and Falsification

Manipulation Destroys the Design

If people can manipulate the running variable…

  • Units sort themselves above/below the cutoff
  • Those just above differ systematically from those just below
  • The “as-if-random” assumption fails

Examples of manipulation:

  • Students retaking exams to cross a scholarship threshold
  • Hospitals recoding diagnoses to avoid quality penalties
  • Birth weight recorded at exactly 1500g to qualify for NICU
Show R Code
# Simulated manipulation - bunching above cutoff
set.seed(123)
manip_data <- tibble(
  x = c(rnorm(300, -0.3, 0.25), rnorm(200, 0.1, 0.15))
)

ggplot(manip_data, aes(x = x)) +
  geom_histogram(binwidth = 0.05, fill = primary_blue, color = "white", alpha = 0.8) +
  geom_vline(xintercept = 0, linetype = "dashed", color = accent_coral, linewidth = 1.5) +
  annotate("text", x = 0.25, y = 50, label = "Suspicious\nbunching!",
           color = accent_coral, fontface = "bold", size = 5) +
  labs(
    title = "Evidence of Manipulation",
    subtitle = "Excess mass just above the cutoff",
    x = "Running Variable", y = "Count"
  ) +
  theme_health_econ()
Figure 10

Detecting Manipulation: The McCrary Test

Idea: If there’s no manipulation, the density of the running variable should be continuous at the cutoff.

Show R Code
# Valid density
valid_density <- tibble(
  x = seq(-2, 2, length.out = 200),
  density = dnorm(x, 0, 0.8)
)

p_valid <- ggplot(valid_density, aes(x = x, y = density)) +
  geom_area(fill = secondary_teal, alpha = 0.3) +
  geom_line(color = secondary_teal, linewidth = 1.5) +
  geom_vline(xintercept = 0, linetype = "dashed", color = slate_gray) +
  labs(title = "No Manipulation", subtitle = "Density is smooth at cutoff",
       x = "Running Variable", y = "Density") +
  theme_health_econ() +
  annotate("text", x = 0, y = 0.55, label = "Continuous",
           color = secondary_teal, fontface = "bold", size = 5)

# Manipulated density
manip_density <- tibble(
  x = seq(-2, 2, length.out = 200),
  density = ifelse(x < 0, dnorm(x, 0, 0.8) * 0.7, dnorm(x, 0, 0.8) * 1.3)
)

p_manip <- ggplot(manip_density, aes(x = x, y = density)) +
  geom_area(fill = accent_coral, alpha = 0.3) +
  geom_line(color = accent_coral, linewidth = 1.5) +
  geom_vline(xintercept = 0, linetype = "dashed", color = slate_gray) +
  labs(title = "Manipulation Detected", subtitle = "Jump in density at cutoff",
       x = "Running Variable", y = "Density") +
  theme_health_econ() +
  annotate("segment", x = 0, xend = 0, y = 0.35, yend = 0.45,
           arrow = arrow(ends = "both", length = unit(0.08, "inches")),
           color = primary_blue, linewidth = 1.5) +
  annotate("text", x = 0.3, y = 0.4, label = "Gap!",
           color = accent_coral, fontface = "bold", size = 5)

p_valid + p_manip

Figure 11: Left: Valid RDD with continuous density. Right: Manipulation creates a discontinuity in the density.

Graduate Students: McCrary Test Mechanics

McCrary (2008) estimates the density of \(X\) using local polynomial regression on a finely-binned histogram, separately on each side of \(c_0\). The test statistic is \(\hat{f}^+(c_0) - \hat{f}^-(c_0)\) divided by its standard error.
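A back-of-the-envelope version of the same logic on simulated data (the full test's local polynomial machinery is implemented in R packages such as rddensity; this two-bin comparison is only a sketch):

```r
# Crude density check at the cutoff (cutoff = 0)
set.seed(42)
x <- c(runif(2000, -1, 0), runif(3500, 0, 1))  # density jumps at 0: manipulation

h <- 0.1
n_left  <- sum(x >= -h & x < 0)
n_right <- sum(x >= 0 & x < h)

# With a continuous density, a point within +/- h of the cutoff is roughly
# equally likely to land on either side, so n_right ~ Binomial(n_left + n_right, 0.5)
binom.test(n_right, n_left + n_right, p = 0.5)$p.value  # tiny p-value: density jumps
```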

Density Test: Visual Evidence from McCrary (2008)

Density estimates with and without a discontinuity at the cutoff (reproduced from McCrary, 2008).

NICU Saves Lives — But Can RDD Prove It?

  • What is the causal effect of heightened medical care for premature babies on infant mortality?
  • The challenge: babies admitted to the NICU differ systematically from other babies, so a naive comparison of mortality is confounded

RDD Solution: Use 1500 grams as the cutoff with large administrative hospital data

Heaping: Visual Evidence

Illustration of heaping on the running variable (reproduced from Almond, 2010)

Why Do So Many Babies Weigh Exactly 1500 Grams?

Think about it: birth weight is a continuous biological measurement. You would not expect a spike at any particular number.

So why do we see piles of babies recorded at round numbers?

Two possible stories:

  1. The Scale Story: Some hospital scales are less precise, so nurses round to the nearest 100g. This is measurement error — and if rounding is random, it might be okay.

  2. The Strategic Story: Staff know that babies under 1500g get extra NICU resources. A baby weighing 1510g might get “rounded down” to qualify. This is manipulation — and it breaks RDD.

Why This Matters for Pre-Med Students

As future clinicians, you will generate the data that researchers use. How you record a measurement — whether you round, truncate, or estimate — has direct consequences for whether causal studies produce valid results.

When Rounding Breaks the Design

  • Standard density tests (McCrary) can miss heaping because they lack statistical power right at the cutoff
  • The heaped babies at exactly 1500g tend to be sicker than babies on either side — they are outliers that distort the estimated treatment effect
  • The bottom line: Always plot the raw data. Statistical tests are necessary but not sufficient. Your eyes can catch patterns that formal tests miss.

Robustness Check: Donut Hole Regression

Drop observations near the cutoff to test if results are robust to heaping. Note: this changes the parameter being estimated.
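A sketch of the donut-hole check on clean simulated data (illustrative; with no heaping here, the two estimates should agree):

```r
# Donut-hole regression: drop observations within +/- delta of the cutoff
set.seed(7)
n <- 4000
x <- runif(n, -1, 1)
d <- as.numeric(x >= 0)
y <- 2 + 0.8 * x + 3 * d + rnorm(n, 0, 0.5)    # true jump = 3
df <- data.frame(x, d, y)

delta <- 0.05                                  # donut radius (a judgment call)
fit_full  <- lm(y ~ d + x + d:x, data = df)
fit_donut <- lm(y ~ d + x + d:x, data = subset(df, abs(x) > delta))
c(full = unname(coef(fit_full)["d"]), donut = unname(coef(fit_donut)["d"]))
# With heaped, unrepresentative observations at the cutoff, these would diverge.
```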

Robustness Checks for RDD

Table 2: Essential validity checks for RDD studies
Check | Purpose | Red Flag If...
McCrary density test | Detect manipulation of running variable | Significant discontinuity in density
Covariate balance | Verify smoothness of observable confounders | Covariates jump at cutoff
Bandwidth sensitivity | Results stable across bandwidth choices? | Effect changes drastically with bandwidth
Placebo cutoffs | No effect at fake cutoffs? | Significant effects at placebo cutoffs
Donut hole regression | Results robust to dropping observations near cutoff? | Effect disappears when dropping heaped values
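A sketch of the placebo-cutoff check from the table above, on simulated data with the true cutoff at 0 and placebo cutoffs at ±0.5 (the window width of 0.25 is an arbitrary choice):

```r
# Placebo cutoffs: the jump should appear only at the true cutoff (0)
set.seed(11)
n <- 4000
x <- runif(n, -1, 1)
y <- 2 + 0.8 * x + 3 * (x >= 0) + rnorm(n, 0, 0.5)

jump_at <- function(c0, h = 0.25) {
  keep <- abs(x - c0) <= h                 # local window around the candidate cutoff
  d  <- as.numeric(x[keep] >= c0)
  xc <- x[keep] - c0
  unname(coef(lm(y[keep] ~ d + xc + d:xc))["d"])
}

round(sapply(c(-0.5, 0, 0.5), jump_at), 2)
# Only the true cutoff shows a jump near 3; the placebo estimates sit near 0.
```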

RDD Effects Are Local — Generalize with Caution

RDD Estimates Are Local

The treatment effect is identified only at the cutoff. Extrapolating to other values of the running variable requires strong assumptions.

When it might generalize:

  • Effects roughly constant across the running variable
  • Cutoff is at a policy-relevant point
  • Compliers are representative of the broader population

When it probably won’t:

  • Strong heterogeneity in treatment effects
  • Cutoff is at an extreme value
  • Selection into treatment differs away from cutoff

From Methods to Medicine

We’ve built the toolkit — now let’s use it. The next application puts every concept we’ve learned to work on a real health policy question.

From Threats to Application

Figure 12

From Methods to Practice

Applying RDD to Real Policy Questions

Sojourner et al. (2015)

Sojourner et al. (2015): Discussion Questions

Skim the paper and think about the following:

  1. What is the main objective or research question?
  2. What is the running variable and why is it a good choice?
  3. What methods are they using?
  4. What are the primary threats to validity and how do they address them?
  5. What are the main results and policy implications?

Context

Union certification in the US requires a majority vote. This creates a natural 50% threshold.

The Research Question

What is the causal effect of unionization on nursing home performance?

The challenge:

  • Unionized workplaces may differ systematically
  • Workers who vote to unionize may be different
  • Simple comparisons are confounded

The RDD solution:

  • Compare nursing homes where the union barely won vs. barely lost
  • At the 50% vote threshold, outcome is essentially random
Show R Code
# Conceptual union RDD
union_concept <- tibble(
  vote_share = seq(0.3, 0.7, length.out = 100),
  certified = ifelse(vote_share >= 0.5, 1, 0)
)

ggplot(union_concept, aes(x = vote_share, y = certified)) +
  geom_step(color = secondary_teal, linewidth = 2) +
  geom_vline(xintercept = 0.5, linetype = "dashed", color = accent_coral, linewidth = 1.5) +
  scale_x_continuous(labels = scales::percent) +
  scale_y_continuous(breaks = c(0, 1), labels = c("No", "Yes")) +
  labs(
    title = "Sharp RDD at 50%",
    subtitle = "Union certification determined by majority vote",
    x = "Vote Share for Union",
    y = "Union Certified"
  ) +
  theme_health_econ() +
  annotate("text", x = 0.35, y = 0.5, label = "Barely lost",
           color = accent_coral, fontface = "bold", size = 4) +
  annotate("text", x = 0.65, y = 0.5, label = "Barely won",
           color = secondary_teal, fontface = "bold", size = 4)
Figure 13

The First Stage: Union Certification

The probability of union certification jumps discretely at the 50% vote threshold

Note

This is essentially a sharp RDD: crossing the 50% threshold determines certification with near certainty.

Results: Staffing Levels

Key Findings:

  • Nurse aide hours: -0.31 hrs/resident-day (SE = 0.12), a ~15% reduction
  • RN hours: -0.21 hrs/resident-day (SE = 0.13), a ~40% reduction
  • LPN hours: small negative, not statistically significant

Interpretation:

CPS wage data show corresponding union premia (nurse aides: +14.8%, RNs: +8.5%) — unions raise labor costs, so firms reduce staffing

Effect of union certification on staffing levels per resident-day

Results: Quality of Care

Key Findings:

  • Total deficiency counts: point estimate near zero, statistically insignificant
  • Severe deficiency indicator: no significant change
  • Private-pay resident share (market proxy for quality): stable

Bottom line:

Despite 15-40% staffing reductions, quality of care is preserved — suggesting labor productivity increased

Effect of union certification on quality indicators

Validity Checks & Policy Implications

Robustness Checks Passed:

  • McCrary density test: no evidence of vote manipulation
  • Covariate balance: observables smooth across cutoff
  • Bandwidth sensitivity: results stable

Policy Implications:

  • Unionization improves staffing efficiency without harming quality
  • Productivity gains may offset higher labor costs

Caveats

  • Effects are local to nursing homes with close elections
  • May not generalize to all nursing homes
  • Union certification ≠ full unionization (some never fully organize)

From Application to Synthesis

Figure 14

Key Takeaways

What to Remember for Applied Work

Summary: RDD in Health Economics

Table 3: Key concepts from today’s lecture
Concept | Description
What RDD Does | Exploits forced treatment assignment at a threshold to estimate causal effects
Sharp vs. Fuzzy | Sharp: treatment deterministic at cutoff. Fuzzy: probability jumps but isn't 0/1
Key Assumption | Potential outcomes are smooth/continuous at the cutoff
Main Threats | Manipulation of running variable, heaping, selection on unobservables
External Validity | Effects are local to the cutoff—may not generalize

RDD Checklist for Applied Work

Before Using RDD, Verify:

  1. Clear cutoff: Is there a well-defined threshold?
  2. Treatment assignment: Does crossing the cutoff change treatment?
  3. No manipulation: Can units sort themselves around the cutoff?
  4. Sufficient data: Enough observations near the cutoff?
  5. Smooth relationship: Is the outcome-running variable relationship well-behaved?

Best Practices for RDD

  • Always plot the data
  • Run McCrary density tests
  • Check covariate balance at the cutoff
  • Test sensitivity to bandwidth and functional form
  • Be humble about external validity

Common RDD Settings in Health Economics

Setting | Running Variable | Cutoff | Application
Medicare eligibility | Age | 65 years | Insurance effects on health
Medicaid | Income | FPL threshold | Access to care
NICU admission | Birth weight | 1500g | Intensive care effects
Hospital quality ratings | Score | Star cutoffs | Consumer/provider behavior
Organ transplant | MELD score | Allocation threshold | Transplant effects

The Cliff Edge Principle

What to Remember

“Nature is smooth. When you see a sharp edge in the data, someone built it — and at that edge, you can find causation.”

RDD works because:

  • Policy creates discontinuities that nature would not
  • People near the edge are comparable — they just happened to land on different sides
  • The jump at the edge is causal — as long as nothing else jumps there too

This is the most visually transparent causal method we have. If you can see the jump, you can believe the effect.

Looking Ahead

Coming Up: Before and After

RDD exploits a spatial edge — comparing units on either side of a cutoff. But what if the treatment arrives at a specific moment in time?

  • Difference-in-Differences: Using the passage of time as a natural experiment
  • Parallel Trends: The assumption that makes “before and after” comparisons credible
  • Event Studies: Visualizing treatment effects dynamically

From the cliff edge to the time machine — our causal toolkit keeps growing.

Discussion Questions

  1. Smoothness violation: At age 65, both Medicare eligibility and Social Security retirement benefits kick in. How might this complicate an RDD using age 65 as the cutoff?

  2. Manipulation: A state uses a standardized test score cutoff to assign students to remedial courses. Students who score just below the cutoff know they can retake the test next week. Does the Twins Test pass? Why or why not?

  3. Sharp or fuzzy? A hospital uses BMI > 40 as the criterion for recommending bariatric surgery. Not all patients above 40 get the surgery, and a few below 40 do. What type of RDD is this? What is the “first stage”?

  4. External validity: The RDD tells us the effect of NICU admission for babies weighing right around 1500g. Should we use this estimate to make policy for all premature babies? Why might the effect differ for a baby weighing 1200g?

From Takeaways to Appendix

Figure 15

Appendix: Technical Details

Local Linear Regression Implementation

library(rdrobust)

# Basic RDD estimation with rdrobust
rdd_result <- rdrobust(
  y = data$outcome,           # Outcome variable
  x = data$running_var,       # Running variable
  c = 0,                      # Cutoff value
  kernel = "triangular",      # Kernel function
  bwselect = "mserd"          # CCT bandwidth selection
)

# Summary of results
summary(rdd_result)

# Plot the RDD
rdplot(
  y = data$outcome,
  x = data$running_var,
  c = 0,
  title = "RDD Plot",
  x.label = "Running Variable",
  y.label = "Outcome"
)

The Fuzzy RDD Estimator

For fuzzy RDD, we use a Wald estimator (similar to IV):

\[ \tau_{\text{Fuzzy}} = \frac{\lim_{x \downarrow c} E[Y|X=x] - \lim_{x \uparrow c} E[Y|X=x]}{\lim_{x \downarrow c} E[D|X=x] - \lim_{x \uparrow c} E[D|X=x]} \]

Interpretation: This is the LATE for compliers at the cutoff—units whose treatment status is changed by crossing the threshold.

In R:

# Fuzzy RDD with rdrobust
fuzzy_result <- rdrobust(
  y = data$outcome,
  x = data$running_var,
  fuzzy = data$treatment,  # Actual treatment (not deterministic)
  c = 0
)
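The Wald ratio can also be computed by hand from local means on either side of the cutoff. A minimal sketch on simulated (hypothetical) data, where the true effect is 2 and the probability of treatment jumps from 20% to 80% at the cutoff; note that raw local means retain a small trend bias that the local linear fits above would remove:

```r
set.seed(1)
n <- 5000
x <- runif(n, -1, 1)                    # running variable, cutoff at 0
above <- as.numeric(x >= 0)
d <- rbinom(n, 1, 0.2 + 0.6 * above)    # imperfect compliance: P(D=1) jumps 0.2 -> 0.8
y <- 1 + 2 * d + 0.5 * x + rnorm(n, sd = 0.5)

h <- 0.25                               # window around the cutoff
right <- x >= 0 & x < h
left  <- x < 0 & x >= -h
# Numerator: jump in the outcome; denominator: jump in treatment take-up
tau_fuzzy <- (mean(y[right]) - mean(y[left])) /
             (mean(d[right]) - mean(d[left]))
tau_fuzzy                               # close to the true LATE of 2
```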

Polynomial Order Selection

Global polynomial: \[ Y_i = \sum_{p=0}^{P} \beta_p X_i^p + \sum_{p=0}^{P} \gamma_p D_i \cdot X_i^p + \varepsilon_i \]

Local linear (preferred): \[ Y_i = \alpha + \tau D_i + \beta (X_i - c) + \gamma D_i(X_i - c) + \varepsilon_i \]
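The local linear specification maps directly onto `lm()`. A minimal sketch on simulated data (the running variable is generated already centered at the cutoff, so \(X_i - c = x\); the true jump is 2):

```r
set.seed(42)
n <- 1000
x <- runif(n, -1, 1)                # running variable, centered at cutoff 0
d <- as.numeric(x >= 0)             # sharp treatment assignment
y <- 1 + 2 * d + 0.5 * x + 0.3 * d * x + rnorm(n, sd = 0.5)

h <- 0.5                            # bandwidth
fit <- lm(y ~ d + x + d:x, subset = abs(x) <= h)
coef(fit)["d"]                      # estimate of tau (true value: 2)
```

The `d:x` interaction lets the slope differ on each side of the cutoff, exactly as in the equation above.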

Why Local Linear is Preferred

Gelman and Imbens (2019) show that high-order global polynomials:

  • Give noisy estimates
  • Have poor coverage of confidence intervals
  • Can create spurious discontinuities

Local linear regression with optimal bandwidth is more robust.

Data Requirements for RDD

  • Large sample sizes are typically required — RDD uses only data near the cutoff, discarding much of the sample
  • Strong trends in the running variable demand more data to distinguish the jump from the slope
  • Identification comes from observations just above and below the threshold, so sufficient density near the cutoff is essential
  • RDD is well suited to administrative data from government agencies, firms, or platforms, where cutoffs and running variables are recorded precisely
  • Advances in computing and data availability have fueled RDD’s rise since the 2000s
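One quick way to see the density requirement is to count observations in shrinking windows around the cutoff (a sketch on hypothetical data; the formal manipulation test is McCrary 2008, with implementations such as the rddensity package):

```r
set.seed(9)
x <- rnorm(5000)   # hypothetical running variable, cutoff at 0

# Count observations just below and just above the cutoff for shrinking windows
for (h in c(0.5, 0.25, 0.1)) {
  cat(sprintf("h = %.2f: %d obs below, %d obs above\n",
              h, sum(x >= -h & x < 0), sum(x >= 0 & x < h)))
}
```

If either count becomes small as the window tightens, the RDD estimate will be imprecise regardless of total sample size.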

RDD in the Causal Inference Toolkit

Design | Key Assumption | What It Restricts
RDD | Smoothness of potential outcomes at cutoff | \(E[Y^0 \| X]\) and \(E[Y^1 \| X]\) continuous at \(c_0\)
IV | Independence + Exclusion | Instrument affects outcome only through treatment
DiD | Parallel trends | \(E[Y^0 \| D=1]\) and \(E[Y^0 \| D=0]\) share same time trend

Each design restricts potential outcomes differently. RDD restricts spatial smoothness; DiD restricts temporal patterns; IV restricts the instrument’s pathways.

Bias-Variance Decomposition in RDD

The MSE-optimal bandwidth minimizes:

\[\text{MSE}(h) = \underbrace{\text{Bias}^2(h)}_{\text{grows with } h} + \underbrace{\text{Variance}(h)}_{\text{shrinks with } h}\]
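The tradeoff is easy to see by re-estimating the jump at several bandwidths. A sketch on simulated data with a nonlinear trend (true jump of 2): narrow bandwidths give noisy estimates, wide ones pick up bias from the trend.

```r
set.seed(5)
n <- 4000
x <- runif(n, -1, 1)
d <- as.numeric(x >= 0)
y <- 1 + 2 * d + 0.5 * x + 1.5 * x^3 + rnorm(n, sd = 0.5)  # cubic trend

# Local linear estimate of the jump at four bandwidths
est <- sapply(c(0.1, 0.25, 0.5, 1), function(h) {
  f <- lm(y ~ d + x + d:x, subset = abs(x) <= h)
  c(h = h, tau = unname(coef(f)["d"]),
    se = summary(f)$coefficients["d", 2])
})
t(est)  # small h: larger se; large h: bias from the cubic trend
```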

Kernel choice determines how observations within the bandwidth are weighted:

Kernel | Weighting | Default in
Uniform | Equal weight to all obs in bandwidth | Simple implementations
Triangular | Linear downweight away from cutoff | rdrobust (recommended)
Epanechnikov | Quadratic downweight | Some older implementations

Triangular kernel is preferred because it gives MSE-optimal properties at boundary points.
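Triangular weighting is straightforward to implement by hand: weights fall linearly from 1 at the cutoff to 0 at the bandwidth edge, and `lm()` accepts them directly. A sketch on simulated data (true jump of 2):

```r
set.seed(3)
n <- 2000
x <- runif(n, -1, 1)
d <- as.numeric(x >= 0)
y <- 1 + 2 * d + 0.5 * x + rnorm(n, sd = 0.5)

h <- 0.5
w_tri <- pmax(0, 1 - abs(x) / h)   # weight 1 at the cutoff, 0 at +/- h
fit <- lm(y ~ d + x + d:x, weights = w_tri, subset = w_tri > 0)
coef(fit)["d"]                     # estimate of tau (true value: 2)
```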

RDD with Covariates

Adding covariates to an RDD improves precision but is not needed for identification — unlike in selection-on-observables (OLS) designs, where covariates carry the identifying burden:

\[Y_i = \alpha + \tau D_i + \beta_1 (X_i - c_0) + \beta_2 D_i \cdot (X_i - c_0) + \gamma' W_i + \varepsilon_i\]

Why add covariates?

  • Reduces residual variance \(\Rightarrow\) tighter confidence intervals
  • Point estimates should remain stable (if RDD is valid)
  • Stability across conditioning sets is itself a validity check
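A minimal simulated illustration of the precision gain (hypothetical data; the true effect is 2, and the covariate `w` strongly predicts the outcome but is unrelated to which side of the cutoff a unit lands on):

```r
set.seed(7)
n <- 2000
x <- runif(n, -1, 1)
d <- as.numeric(x >= 0)
w <- rnorm(n)                       # pre-determined covariate
y <- 1 + 2 * d + 0.5 * x + 1.5 * w + rnorm(n)

keep <- abs(x) <= 0.5               # bandwidth restriction
f1 <- lm(y ~ d + x + d:x, subset = keep)       # baseline
f2 <- lm(y ~ d + x + d:x + w, subset = keep)   # covariate-adjusted
c(se_base = summary(f1)$coefficients["d", 2],
  se_adj  = summary(f2)$coefficients["d", 2])  # se_adj is smaller
```

The point estimate of the jump should barely move between the two fits; only the standard error shrinks.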

Sojourner et al. (2015) example: Their Spec A conditions only on vote share. Spec B adds pre-election staffing. Spec C adds the full vector of pre-election characteristics. Point estimates are remarkably stable across all three — evidence supporting the design’s credibility.

References

  • Angrist, J. D., & Pischke, J.-S. (2009). Mostly Harmless Econometrics. Princeton University Press.
  • Calonico, S., Cattaneo, M. D., & Titiunik, R. (2014). Robust nonparametric confidence intervals for regression-discontinuity designs. Econometrica, 82(6), 2295-2326.
  • Cunningham, S. (2021). Causal Inference: The Mixtape. Yale University Press.
  • Gelman, A., & Imbens, G. (2019). Why high-order polynomials should not be used in regression discontinuity designs. Journal of Business & Economic Statistics, 37(3), 447-456.
  • Imbens, G., & Kalyanaraman, K. (2012). Optimal bandwidth choice for the regression discontinuity estimator. Review of Economic Studies, 79(3), 933-959.
  • Lee, D. S., & Lemieux, T. (2010). Regression discontinuity designs in economics. Journal of Economic Literature, 48(2), 281-355.
  • McCrary, J. (2008). Manipulation of the running variable in the regression discontinuity design. Journal of Econometrics, 142(2), 698-714.
  • Sojourner, A. J., Frandsen, B. R., Town, R. J., Grabowski, D. C., & Chen, M. M. (2015). Impacts of unionization on quality and productivity. ILR Review, 68(4), 771-806.