Panel Data Simulation Lab

Interactive exploration of cross-sectional vs panel data estimators. Understand when OLS fails and why Fixed Effects and Random Effects matter.

🎯 Learning Objective

Explore how different estimators recover the true causal effect (β) under various data-generating processes. Adjust parameters to see how selection bias, measurement error, and time trends affect each estimator's performance.

Actions

Generate a new random sample with current parameters, or download the simulated data for external analysis.

Scenario Presets

Click a preset to load parameter values demonstrating specific phenomena.

Data Generation Parameters

Adjust parameters to control the simulated panel data. Click headers to expand/collapse sections.

Sample Structure
More individuals → more precise estimates
More periods → better within-person variation for FE
True Causal Effect
The "ground truth" we're trying to estimate. Compare β̂ to this.
Selection into Treatment
Overall treatment prevalence when there's no selection.
Key parameter: ρ > 0 → treatment is correlated with unobserved individual factors → selection bias in OLS; FE corrects this.
Higher → more within-person variation in treatment → more precise FE estimates.
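
To make the selection mechanism concrete, here is a minimal sketch in Python (NumPy only). It is not the lab's actual code: the function names, the latent-index treatment rule, and the default values are all assumptions chosen for illustration. It simulates Y = βD + α + ε where treatment depends on the individual effect α through ρ, then compares pooled OLS with the within (FE) estimator.

```python
import numpy as np

def simulate_panel(n=2000, t=5, beta=2.0, rho=0.8, seed=0):
    """Simulate Y_it = beta*D_it + alpha_i + eps_it (hypothetical DGP).

    rho controls how strongly the unobserved effect alpha_i
    drives selection into treatment.
    """
    rng = np.random.default_rng(seed)
    alpha = rng.normal(size=(n, 1))                  # unobserved individual effect
    latent = rho * alpha + rng.normal(size=(n, t))   # propensity to be treated
    d = (latent > 0).astype(float)                   # treatment indicator D_it
    y = beta * d + alpha + rng.normal(size=(n, t))
    return d, y

def pooled_ols(d, y):
    """Slope of Y on D, ignoring the panel structure."""
    dc = d.ravel() - d.mean()
    yc = y.ravel() - y.mean()
    return float((dc * yc).sum() / (dc ** 2).sum())

def fixed_effects(d, y):
    """Within estimator: subtract each individual's own mean first."""
    d_w = d - d.mean(axis=1, keepdims=True)
    y_w = y - y.mean(axis=1, keepdims=True)
    return float((d_w * y_w).sum() / (d_w ** 2).sum())

d, y = simulate_panel()
b_ols = pooled_ols(d, y)
b_fe = fixed_effects(d, y)
print(f"true beta = 2.0, pooled OLS = {b_ols:.2f}, FE = {b_fe:.2f}")
```

With ρ > 0, treated observations have higher α on average, so pooled OLS attributes part of α to the treatment. Demeaning within person removes α and with it the bias.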
Variance Components
Unobserved individual factors. FE removes these; RE assumes they are uncorrelated with D.
Random shocks varying across time (transitory variation).
Classical measurement error in Y. Adds noise (larger standard errors) but does not bias β̂.
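
The claim about measurement error in Y can be checked with a short sketch. The setup below is an assumption for illustration (a single cross-section, randomly assigned D, made-up noise scale), not the lab's own data-generating code.

```python
import numpy as np

def ols_slope_and_se(x, y):
    """Bivariate OLS slope and its conventional standard error."""
    xc = x - x.mean()
    yc = y - y.mean()
    b = (xc * yc).sum() / (xc ** 2).sum()
    resid = yc - b * xc
    sigma2 = (resid ** 2).sum() / (len(y) - 2)
    se = np.sqrt(sigma2 / (xc ** 2).sum())
    return float(b), float(se)

rng = np.random.default_rng(1)
n, beta = 20000, 2.0
d = rng.integers(0, 2, size=n).astype(float)       # randomly assigned treatment
y_clean = beta * d + rng.normal(size=n)
y_noisy = y_clean + rng.normal(scale=2.0, size=n)  # classical error, independent of D

b_clean, se_clean = ols_slope_and_se(d, y_clean)
b_noisy, se_noisy = ols_slope_and_se(d, y_noisy)
print(f"clean: {b_clean:.2f} (SE {se_clean:.3f}); noisy: {b_noisy:.2f} (SE {se_noisy:.3f})")
```

Because the added error is independent of D, it only inflates the residual variance: both slopes center on β, but the noisy one has a larger standard error. The same logic does not carry over to measurement error in D, which this lab parameter does not cover.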
Heterogeneity & Groups
σβ > 0 means each person has their own effect around the mean.
Split sample into G groups for subgroup analysis.
Systematic heterogeneity: groups have different average effects.
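
One way to picture the two heterogeneity settings is the sketch below. How the lab actually parameterizes group differences is not stated here, so the evenly spaced, zero-centered group offsets and the names `group_gap` and `draw_individual_effects` are assumptions for illustration only.

```python
import numpy as np

def draw_individual_effects(n=6000, beta=2.0, sigma_beta=0.5,
                            n_groups=3, group_gap=1.0, seed=3):
    """Draw beta_i = beta + group offset + sigma_beta * noise (hypothetical)."""
    rng = np.random.default_rng(seed)
    group = rng.integers(0, n_groups, size=n)   # random group membership
    # Offsets centered on zero so the overall mean effect stays near beta.
    offsets = group_gap * (np.arange(n_groups) - (n_groups - 1) / 2)
    beta_i = beta + offsets[group] + sigma_beta * rng.normal(size=n)
    return group, beta_i

group, beta_i = draw_individual_effects()
group_means = [float(beta_i[group == g].mean()) for g in range(3)]
print("group mean effects:", [round(m, 2) for m in group_means])
print("spread of individual effects:", round(float(beta_i.std()), 2))
```

Here σβ produces idiosyncratic spread around each group's mean, while the group offsets shift those means apart systematically.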
Time Dynamics
Common time trend. Can bias estimates if treatment timing is correlated with the trend.
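
The time-trend problem can be sketched as follows. This is an illustrative setup, not the lab's implementation: treatment becomes more common in later waves while Y drifts upward, and all parameter values are made up. Individual FE alone then picks up part of the trend. Removing period means as well (two-way demeaning, which for a balanced panel is equivalent to adding wave dummies) is one standard remedy.

```python
import numpy as np

rng = np.random.default_rng(2)
n, t, beta, trend = 2000, 6, 2.0, 0.5
periods = np.arange(t)

alpha = rng.normal(size=(n, 1))
p_treat = 0.2 + 0.1 * periods                     # treatment more likely in later waves
d = (rng.random((n, t)) < p_treat).astype(float)
y = beta * d + alpha + trend * periods + rng.normal(size=(n, t))

def within(x):
    """Remove individual means only."""
    return x - x.mean(axis=1, keepdims=True)

def two_way_within(x):
    """Remove individual and period means (balanced panel)."""
    return (x - x.mean(axis=1, keepdims=True)
              - x.mean(axis=0, keepdims=True) + x.mean())

def slope(x, y):
    """No-intercept slope; inputs are already demeaned."""
    return float((x * y).sum() / (x ** 2).sum())

b_fe = slope(within(d), within(y))
b_twfe = slope(two_way_within(d), two_way_within(y))
print(f"true beta = 2.0, individual FE = {b_fe:.2f}, two-way FE = {b_twfe:.2f}")
```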

Visualization & Estimation

Visualization Options
View any wave as a cross-section.
More lines show more heterogeneity but can clutter.
Estimation Options
Clustered SEs are appropriate for panel data because errors are correlated within individuals.
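
As a rough illustration of why clustering matters, the sketch below computes conventional and cluster-robust (sandwich) standard errors by hand. The simulated data, the function name, and the omission of any small-sample correction are all assumptions made for brevity. Statistical packages typically apply a finite-sample adjustment on top of this formula.

```python
import numpy as np

def ols_with_cluster_se(x, y, cluster_ids):
    """OLS of y on [1, x] with conventional and cluster-robust SEs."""
    X = np.column_stack([np.ones_like(x), x])
    n, k = X.shape
    xtx_inv = np.linalg.inv(X.T @ X)
    b = xtx_inv @ X.T @ y
    u = y - X @ b
    naive_var = (u @ u) / (n - k) * xtx_inv
    # "Meat" of the sandwich: outer products of per-cluster score sums.
    meat = np.zeros((k, k))
    for g in np.unique(cluster_ids):
        idx = cluster_ids == g
        s = X[idx].T @ u[idx]
        meat += np.outer(s, s)
    cluster_var = xtx_inv @ meat @ xtx_inv   # no small-sample correction
    return b, np.sqrt(np.diag(naive_var)), np.sqrt(np.diag(cluster_var))

rng = np.random.default_rng(4)
n, t = 300, 10
ids = np.repeat(np.arange(n), t)             # matches row-major ravel of (n, t)
# Regressor and error both have a persistent individual component.
x = (rng.normal(size=(n, 1)) + 0.5 * rng.normal(size=(n, t))).ravel()
err = (rng.normal(size=(n, 1)) + rng.normal(size=(n, t))).ravel()
y = 1.0 * x + err

b, se_naive, se_cluster = ols_with_cluster_se(x, y, ids)
print(f"slope = {b[1]:.2f}, naive SE = {se_naive[1]:.3f}, clustered SE = {se_cluster[1]:.3f}")
```

When both the regressor and the error are persistent within a person, repeated observations carry less independent information than their count suggests, so the conventional SE is too small.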
Tabs: 📊 Data Overview | 📈 Estimator Comparison | 🔍 Model Diagnostics | 📚 Methodology Guide

Cross-Section: Selected Wave

Legend: observations, OLS fit

Panel Trajectories Over Time

Legend: individual paths, mean, median, IQR

Mean Outcome by Wave & Treatment

Legend: D=0 mean, D=1 mean

True Effect Heterogeneity (βi)

Legend: mean β, individuals

What You're Seeing

Left: Cross-sectional snapshot using the selected wave. This is what you'd have without panel data. The slope shows the raw association between treatment and outcome.

Right: Individual trajectories over time. Faint lines show 20 random individuals; bold lines show aggregate statistics. Look for: (1) individual heterogeneity in levels, (2) within-person variation, (3) time trends.

Current Simulation Summary