Panel Data Multiverse Lab

A real longitudinal wage panel meets the multiverse. Pick an effect, then watch the whole universe of estimates open up as you change estimator, controls, sample and inference — and learn why the answer moves. Data: NLSY · 545 men · 1980–1987. Every number is verified against R's plm.

🎯 The big idea

One dataset, one question, many defensible analyses. The spread of results across reasonable analytic choices is the “garden of forking paths.” Here you build that garden by hand on real wage data and see which conclusions are robust and which are an artifact of a single specification.


Research question

Focal effect (the estimand)

Union wage premium · Marriage wage premium · Returns to schooling

Outcome

Log wage: coefficient × 100 ≈ % change in the wage (accurate for small coefficients). Level: coefficient = change in dollars.
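How close the log approximation is can be checked directly. The sketch below takes the pooled union coefficient from the benchmark table (0.167) as input; the function name is illustrative and not part of the lab.

```python
import math

def pct_change_from_log_coef(beta: float) -> tuple[float, float]:
    """Return (approximate, exact) percent change implied by a log-outcome coefficient."""
    approx = 100.0 * beta                   # the usual "coefficient ≈ % change" reading
    exact = 100.0 * (math.exp(beta) - 1.0)  # exact change when a 0/1 regressor switches on
    return approx, exact

approx, exact = pct_change_from_log_coef(0.167)
print(f"approximate: {approx:.1f}%   exact: {exact:.1f}%")
```

The gap between the two readings grows with the size of the coefficient, so the shortcut is safest for the smaller within-person estimates.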

Standard errors (inference)

Clustering by person is the defensible default for panels: it allows arbitrary within-person correlation. It changes the confidence interval, not the point estimate.
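That clustering moves the interval but not the point estimate can be seen in a minimal sketch on synthetic data. This is not the lab's code: the small-sample factor below is one common CR1 convention and is an assumption, as are all variable names.

```python
import numpy as np

def ols_with_se(y, X, groups):
    """OLS coefficients with classical and cluster-robust standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta

    # Classical: sigma^2 (X'X)^-1, appropriate only for i.i.d. errors.
    se_classical = np.sqrt(np.diag(XtX_inv) * (resid @ resid) / (n - k))

    # Cluster-robust sandwich: sum the scores X_g' e_g within each person first.
    ids = np.unique(groups)
    meat = np.zeros((k, k))
    for g in ids:
        mask = groups == g
        score = X[mask].T @ resid[mask]
        meat += np.outer(score, score)
    G = len(ids)
    factor = (G / (G - 1)) * ((n - 1) / (n - k))  # assumed CR1-style correction
    se_cluster = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv) * factor)
    return beta, se_classical, se_cluster

# Synthetic panel: 200 persons x 8 years, person-level components in x and in the error.
rng = np.random.default_rng(0)
N, T = 200, 8
person = np.repeat(np.arange(N), T)
x = rng.normal(size=N)[person] + rng.normal(size=N * T)
y = 0.1 * x + rng.normal(size=N)[person] + rng.normal(size=N * T)
X = np.column_stack([np.ones(N * T), x])

beta, se_iid, se_cl = ols_with_se(y, X, person)
print("slope:", beta[1], "classical SE:", se_iid[1], "clustered SE:", se_cl[1])
```

Both standard errors are computed from the same `beta`; only the estimate of its sampling variance changes.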

Reproducibility seed

Seeds the cluster bootstrap and the randomization test. The same seed reproduces the same confidence intervals and p-values; the seed is encoded in the “Copy shareable link” URL.

The panel

What to look for

The union and marriage “premiums” look large in Pooled OLS, but shrink sharply once Fixed Effects removes stable person traits — evidence of selection: higher-wage men sort into unions and marriage. Meanwhile the return to schooling cannot be estimated by FE at all, because education is fixed within a person over these years. Open the Multiverse tab to see the full distribution.

📊 Data Overview 🌌 Specification Curve 🔬 Single-Spec Deep-Dive 💾 Export & Code 📚 Methodology

🎯 Objective: meet the data as people, and see where the variation lives. A regressor only has a within-person effect that Fixed Effects can estimate if it actually changes for the same person over time — count the switchers below.

Descriptive statistics

Who identifies the effect? (switchers)

Wage trajectories (real people)

Legend: 40 random workers · overall mean · IQR

Distribution of log wage

all person-years

Mean outcome by focal status & year

Legend: focal = 1 · focal = 0

Where is the variation?

Reading this tab

Trajectories: each faint line is one man's wage path over up to 8 years. Panel data lets us watch the same person change — the engine of Fixed Effects.

Where is the variation? For a regressor to be identified by Fixed Effects, it must vary within people over time. A bar that is almost all “between” (like schooling) cannot be estimated by FE.
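The within/between split and the switcher count described in this tab can be sketched in a few lines. The toy arrays and function names below are illustrative assumptions, not the lab's data or code.

```python
import numpy as np

def within_share(x, person):
    """Share of the total sum of squares of x that is within-person (0 = purely between)."""
    _, idx = np.unique(person, return_inverse=True)
    person_mean = np.bincount(idx, weights=x) / np.bincount(idx)
    within = x - person_mean[idx]              # deviation from own average
    between = person_mean[idx] - x.mean()      # own average vs grand mean
    ss_within, ss_between = (within ** 2).sum(), (between ** 2).sum()
    return ss_within / (ss_within + ss_between)

def count_switchers(x, person):
    """Number of persons whose value of x changes at least once."""
    return int(sum(np.ptp(x[person == g]) > 0 for g in np.unique(person)))

# Toy panel: 3 persons x 4 years.
person = np.repeat([0, 1, 2], 4)
union = np.array([0, 0, 1, 1,  0, 0, 0, 0,  1, 1, 1, 1], dtype=float)
educ = np.array([12] * 4 + [16] * 4 + [10] * 4, dtype=float)

print(within_share(union, person), count_switchers(union, person))
print(within_share(educ, person), count_switchers(educ, person))
```

In the toy data only person 0 switches union status, so that one person carries all of the within variation; `educ` has none, which is the situation in which Fixed Effects has nothing to work with.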

🎯 Objective: open up one point on the curve. Compare all estimators side by side, read the Hausman test, watch the live equation change with each choice, and check the residual diagnostics — then defend (or abandon) the specification.

Estimator

Controls

Year fixed effects

Sample

Bootstrap

Block-resamples persons to get a CI for the focal coefficient of the selected estimator, free of distributional assumptions.
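A minimal sketch of a seeded cluster bootstrap, assuming a percentile interval and a single-regressor within estimator. The lab's exact interval type and number of replications are not stated here, so both are assumptions, as are the function names.

```python
import numpy as np

def fe_slope(y, x, person):
    """Within (fixed-effects) slope for a single regressor."""
    _, idx = np.unique(person, return_inverse=True)
    counts = np.bincount(idx)
    yd = y - (np.bincount(idx, weights=y) / counts)[idx]
    xd = x - (np.bincount(idx, weights=x) / counts)[idx]
    return (xd @ yd) / (xd @ xd)

def cluster_bootstrap_ci(y, x, person, reps=200, seed=1, level=0.95):
    """Percentile CI from resampling whole persons with replacement."""
    rng = np.random.default_rng(seed)          # same seed, same interval
    ids = np.unique(person)
    rows = {g: np.flatnonzero(person == g) for g in ids}
    draws = np.empty(reps)
    for b in range(reps):
        sample = rng.choice(ids, size=len(ids), replace=True)
        take = np.concatenate([rows[g] for g in sample])
        # Relabel so a person drawn twice counts as two separate clusters.
        new_id = np.repeat(np.arange(len(sample)), [len(rows[g]) for g in sample])
        draws[b] = fe_slope(y[take], x[take], new_id)
    lo, hi = np.quantile(draws, [(1 - level) / 2, (1 + level) / 2])
    return lo, hi

# Synthetic panel with a person effect that is correlated with x.
rng = np.random.default_rng(42)
N, T = 100, 8
person = np.repeat(np.arange(N), T)
u = rng.normal(size=N)[person]
x = 0.5 * u + rng.normal(size=N * T)
y = 0.08 * x + u + 0.3 * rng.normal(size=N * T)

ci_a = cluster_bootstrap_ci(y, x, person, seed=7)
ci_b = cluster_bootstrap_ci(y, x, person, seed=7)
print(ci_a, ci_a == ci_b)
```

Resampling whole persons keeps each person's years together, so any within-person correlation in the errors is carried into every bootstrap sample.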

All five estimators, side by side

Legend: estimate (95% CI) · grey = not identified

Hausman test: FE vs RE

⚙️ Under the hood — the equation being estimated


Regression output

Residual diagnostics

Do the model's residuals behave? Look for no pattern or funnel shape (left), an approximately symmetric bell (centre), and points on the line (right).

Residuals vs fitted. A flat, patternless band supports linearity and constant variance; curvature or a funnel signals misspecification or heteroskedasticity.

Residual distribution. Should be roughly symmetric and bell-shaped; the dashed curve is a fitted normal density for comparison.

Normal Q–Q. Points on the reference line indicate normally distributed residuals; departures in the tails indicate skew or heavy tails.

The data: the NLSY wage panel

545 young men from the U.S. National Longitudinal Survey of Youth, each observed every year from 1980 to 1987 — a balanced panel of 4,360 person-years. The extract comes from Vella & Verbeek (1998) and is distributed in the wooldridge and plm R packages; this lab bundles a tidy copy.

Variable	Meaning	Varies within person?
lwage	log hourly wage (the outcome)	yes
union	covered by a union contract	yes (men join/leave)
married	currently married	yes
educ	years of schooling	no — fixed 1980–87
exper, expersq	labour-market experience & its square	yes
hours, poorhlth, industry, region	time-varying controls	yes
black, hisp	race / ethnicity	no

The estimators

Each estimator sits at a different point on an identification spectrum, trading bias against what it can estimate at all.

Estimator	What it uses	Identifying assumption	Removes time-invariant confounding?
Pooled OLS	all variation (within + between)	E[uᵢ | xᵢₜ] = 0	No
Between	person means only (cross-person)	E[uᵢ | x̄ᵢ] = 0	No
Random Effects	GLS weighting of within + between	E[uᵢ | xᵢₜ] = 0 (RE exogeneity)	No
Fixed Effects (within)	within-person deviations	E[εᵢₜ | xᵢₜ, uᵢ] = 0	Yes
First Differences	year-to-year changes	E[Δεᵢₜ | Δxᵢₜ] = 0	Yes
Correlated RE (Mundlak)	RE + person-means of regressors	person-mean captures the correlation	Yes (focal coef = FE)
Two-way FE	within-person + year effects	E[εᵢₜ | xᵢₜ, uᵢ, λₜ] = 0	Yes (+ common shocks)

yᵢₜ = α + β·xᵢₜ + uᵢ + εᵢₜ        (uᵢ = stable person trait, unobserved)

Fixed Effects subtracts each person's own average:

(yᵢₜ − ȳᵢ) = β·(xᵢₜ − x̄ᵢ) + (εᵢₜ − ε̄ᵢ)        ← uᵢ cancels

If xᵢₜ never changes for a person (e.g. educ), then xᵢₜ − x̄ᵢ = 0 for every observation, so β is not identified. This is why Fixed Effects and First Differences drop the return to schooling. The same algebra explains why the Mundlak device, which adds the person-mean x̄ᵢ to a random-effects model, recovers exactly the FE coefficient: once x̄ᵢ is held fixed, only the within-person deviation xᵢₜ − x̄ᵢ is left to identify β.
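The demeaning argument can be checked numerically on synthetic data. In this sketch the Mundlak regression is run by pooled OLS rather than random-effects GLS, a simplification that yields the same coefficient on x in a balanced panel. All names and parameter values are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(3)
N, T = 300, 8
person = np.repeat(np.arange(N), T)
u = rng.normal(size=N)                                      # stable person trait
educ = np.round(12 + 2 * u + rng.normal(size=N))[person]    # fixed within person
# Focal dummy whose take-up depends on u (selection on unobservables).
x = (0.8 * u[person] + rng.normal(size=N * T) > 0).astype(float)
y = 1.0 + 0.08 * x + 0.5 * u[person] + 0.3 * rng.normal(size=N * T)

def person_mean(v):
    """Each person's average, repeated for every one of their years."""
    return (np.bincount(person, weights=v) / T)[person]

def ols(y, X):
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(N * T)
b_pooled = ols(y, np.column_stack([const, x]))[1]

# Fixed Effects: regress demeaned y on demeaned x, so u drops out.
xd, yd = x - person_mean(x), y - person_mean(y)
b_fe = (xd @ yd) / (xd @ xd)

# Mundlak: keep x in levels but add its person mean as a control.
b_mundlak = ols(y, np.column_stack([const, x, person_mean(x)]))[1]

# A time-invariant regressor has no within variation left after demeaning.
educ_within = educ - person_mean(educ)

print("pooled:", b_pooled, "FE:", b_fe, "Mundlak:", b_mundlak)
print("max |educ deviation|:", np.abs(educ_within).max())
```

Because take-up of x depends on u, the pooled slope absorbs part of the person effect, while the FE and Mundlak slopes coincide up to floating-point error.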

Notation. Subscript i indexes a person (unit) and t a year (time), so xᵢₜ is “variable x for person i in year t.” y is the outcome, x the regressor of interest, α the intercept (baseline), β the effect being estimated, uᵢ the time-invariant person effect (stable unobserved traits), εᵢₜ the idiosyncratic error (random year-to-year variation), and λₜ a year effect common to everyone. An overbar (x̄ᵢ) denotes a person's average over their years. The identifying assumptions in the table are written as conditional expectations: E[ε | x] = 0 reads “the error has zero mean given the regressors,” which implies that the regressors are uncorrelated with what is left out.

Why a multiverse?

A single regression hides a decision tree: which estimator, which controls, which sample, which standard errors. Each branch is defensible, yet they can give different answers. A specification curve (Simonsohn, Simmons & Nelson, 2020) or multiverse analysis (Steegen et al., 2016) reports the whole distribution of estimates instead of one cherry-picked number, making analytic flexibility — and its limits — visible.
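The decision tree can be written down as a grid of forks. The fork names and levels below are hypothetical and much smaller than the lab's default multiverse; the sketch only shows how quickly the branches multiply.

```python
from itertools import product

# Hypothetical forks: each key is one analytic choice, each list its defensible options.
forks = {
    "estimator": ["pooled", "between", "re", "fe", "fd"],
    "controls": ["none", "experience", "experience+industry"],
    "year_effects": [False, True],
    "se": ["classical", "HC1", "cluster"],
}

# One specification = one option picked from every fork.
specs = [dict(zip(forks, combo)) for combo in product(*forks.values())]

print(len(specs))   # 5 * 3 * 2 * 3 = 90
print(specs[0])
```

A specification curve would then estimate the focal coefficient once per entry in `specs` and plot the sorted results.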

Teaching use. Ask students to predict the curve before building it: Will the union premium survive Fixed Effects? Does adding industry controls matter? Is the schooling return robust? Then test each belief against the universe.

Inference: from one CI to the whole curve

The lab offers three covariance estimators (classical i.i.d., heteroskedasticity-robust HC1, and cluster-robust by person CR1), a cluster bootstrap that block-resamples persons and so needs no distributional assumption on the errors, and a full Hausman test comparing Fixed and Random Effects jointly over their common coefficients. For the multiverse as a whole, a randomization test permutes the focal variable under the sharp null of no effect, recomputes the entire curve many times, and asks how often the null reproduces a median estimate (or count of significant results) as extreme as the one observed.
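The randomization logic can be sketched for a single specification. Two simplifications are assumptions of this sketch and not statements about the lab: the test statistic is one within slope, where the lab uses the median of the whole curve, and the permutation reassigns whole person trajectories of the focal variable across persons.

```python
import numpy as np

def within_slope(y, x):
    """FE slope for (N, T) arrays: demean each person's row, then regress."""
    yd = y - y.mean(axis=1, keepdims=True)
    xd = x - x.mean(axis=1, keepdims=True)
    return (xd * yd).sum() / (xd ** 2).sum()

def randomization_p(y, x, reps=999, seed=1):
    """Two-sided randomization p-value under the sharp null of no effect."""
    rng = np.random.default_rng(seed)
    observed = within_slope(y, x)
    # Under the sharp null y is unchanged whatever x is, so x can be reshuffled freely.
    null = np.array([within_slope(y, x[rng.permutation(len(x))]) for _ in range(reps)])
    # Count the observed statistic as one of the draws.
    return (1 + np.sum(np.abs(null) >= abs(observed))) / (reps + 1)

rng = np.random.default_rng(5)
N, T = 150, 8
x = rng.normal(size=(N, T))
y_effect = 0.3 * x + rng.normal(size=(N, T))   # true effect present
y_null = rng.normal(size=(N, T))               # no effect at all

p_effect = randomization_p(y_effect, x)
p_null = randomization_p(y_null, x)
print(p_effect, p_null)
```

The smallest p-value this procedure can return is 1 / (reps + 1), which is why the number of permutations bounds the reported precision.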

Verification — honest numerics

Every estimator is implemented from textbook formulas in plain JavaScript and independently reproduced in two languages: R (plm) and Python (linearmodels). Across all 1,440 specifications of the default multiverse, the pooled, between, fixed-effects and first-difference coefficients agree to better than 1×10⁻⁶ (R) and 5×10⁻⁷ (Python); random effects to 3×10⁻⁴. Benchmark focal coefficients (log wage on focal + experience, full sample):

Effect	Pooled	Between	RE	FE	FD
Union premium	0.167	0.257	0.102	0.083	0.043
Marriage premium	0.164	0.220	0.077	0.047	0.038
Return to schooling	0.102	0.099	0.103	—	—

The Hausman test in the Deep-Dive is the full joint test over all common time-varying coefficients (matching R's phtest), computed with classical covariance as the theory requires.
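The joint statistic has a compact form, H = d′(V_FE − V_RE)⁻¹d with d = b_FE − b_RE. The sketch below uses made-up coefficient vectors and covariance matrices for two common coefficients; none of the numbers come from the lab.

```python
import math
import numpy as np

def hausman(b_fe, b_re, V_fe, V_re):
    """Hausman statistic and its degrees of freedom over the common coefficients."""
    d = np.asarray(b_fe) - np.asarray(b_re)
    # V_fe - V_re should be positive definite when RE is efficient under the null.
    H = float(d @ np.linalg.solve(np.asarray(V_fe) - np.asarray(V_re), d))
    return H, len(d)

# Hypothetical inputs (illustrative only).
b_fe = [0.083, 0.060]
b_re = [0.102, 0.055]
V_fe = np.diag([4e-4, 1e-5])
V_re = np.diag([3e-4, 8e-6])

H, df = hausman(b_fe, b_re, V_fe, V_re)
# With exactly 2 degrees of freedom the chi-square tail probability is exp(-H/2).
p_value = math.exp(-H / 2)
print(H, df, p_value)
```

For other degrees of freedom the closed form no longer applies and a chi-square distribution function from a statistics library would be needed.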

Glossary

Within variation: How much a variable changes for the same person over time. Fixed Effects and First Differences use only this.
Between variation: Differences in person averages across people. The Between estimator uses only this; Fixed Effects discards it.
Time-invariant regressor: A variable that never changes within a person (e.g. years of schooling here). Within estimators cannot identify its effect.
Selection on unobservables: When who "gets treated" (e.g. joins a union) is correlated with stable unobserved traits, biasing estimators that use between variation.
θ (theta): The random-effects quasi-demeaning weight: θ=0 gives Pooled OLS, θ→1 approaches Fixed Effects.
Hausman test: Compares Fixed- and Random-Effects coefficients; a significant difference favours Fixed Effects (RE's exogeneity assumption fails).
Specification curve: The sorted set of estimates across all defensible analytic choices, plotted with the choices that produced each one.
Researcher degrees of freedom: The many defensible choices in an analysis whose combination can move the headline result.
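The θ entry in this glossary can be made concrete with the standard balanced-panel formula θ = 1 − √(σ²_ε / (σ²_ε + T·σ²_u)). The function name and inputs below are illustrative assumptions.

```python
import math

def re_theta(sigma2_eps: float, sigma2_u: float, T: int) -> float:
    """Quasi-demeaning weight for a balanced panel with T periods per person."""
    return 1.0 - math.sqrt(sigma2_eps / (sigma2_eps + T * sigma2_u))

# Random Effects estimates OLS on (y_it - theta * ybar_i) and (x_it - theta * xbar_i).
print(re_theta(1.0, 0.0, 8))   # no person effect: theta = 0, i.e. Pooled OLS
print(re_theta(1.0, 1.0, 8))   # equal variances, T = 8
print(re_theta(1.0, 1e6, 8))   # dominant person effect: theta close to 1, i.e. near FE
```

The weight rises with both the person-effect variance and the number of periods, which is why Random Effects moves toward Fixed Effects in long panels.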

How to cite

If you use this lab in teaching or research, please cite it and the underlying data:

Schoenholzer, K. (2026). Panel Data Multiverse Lab: a reproducible specification-curve
approach to panel-data estimation [software]. https://kevinschoenholzer.com/sim-paneldata/

Vella, F. & Verbeek, M. (1998). Whose wages do unions raise? Journal of Applied
Econometrics, 13(2), 163–183.   [the bundled NLSY wage-panel extract]

The estimators are verified against R's plm and Python's linearmodels; the data ships under the terms of the public-use NLSY extract distributed in the wooldridge and plm packages.

References

Vella, F. & Verbeek, M. (1998). Whose wages do unions raise? Journal of Applied Econometrics, 13(2), 163–183. (source of the NLSY extract)
Mundlak, Y. (1978). On the pooling of time series and cross section data. Econometrica, 46(1), 69–85.
Hausman, J. A. (1978). Specification tests in econometrics. Econometrica, 46(6), 1251–1271.
Wooldridge, J. M. (2010). Econometric Analysis of Cross Section and Panel Data (2nd ed.). MIT Press.
Croissant, Y. & Millo, G. (2008). Panel data econometrics in R: the plm package. Journal of Statistical Software, 27(2).
Simonsohn, U., Simmons, J. P. & Nelson, L. D. (2020). Specification curve analysis. Nature Human Behaviour, 4, 1208–1214.
Steegen, S., Tuerlinckx, F., Gelman, A. & Vanpaemel, W. (2016). Increasing transparency through a multiverse analysis. Perspectives on Psychological Science, 11(5), 702–712.
