Comprehensive guide to the PISA data underlying the Educational Stratification in PISA tool
This application is designed to study the intergenerational transmission of educational achievement: how parental characteristics (education, occupational status, and household wealth, summarized in the ESCS index) relate to children's academic performance at age 15 in mathematics, reading, and science across more than 100 countries and economies.
The Programme for International Student Assessment (PISA) is an international survey that the Organisation for Economic Co-operation and Development (OECD) has coordinated since 2000, normally on a three-year cycle. PISA assesses the extent to which 15-year-old students near the end of compulsory education have acquired the knowledge and skills essential for full participation in modern societies.
PISA has been conducted in the following years, with each cycle focusing on one major domain while assessing all three:
| Year | Major Domain | Countries/Economies | Students Assessed | Status |
|---|---|---|---|---|
| 2000 | Reading | 43 | ~265,000 | Available in App |
| 2003 | Mathematics | 41 | ~276,000 | Available in App |
| 2006 | Science | 57 | ~398,000 | Available in App |
| 2009 | Reading | 65 | ~475,000 | Available in App |
| 2012 | Mathematics | 65 | ~510,000 | Available in App |
| 2015 | Science | 72 | ~540,000 | Available in App |
| 2018 | Reading | 79 | ~600,000 | Available in App |
| 2022 | Mathematics | 81 | ~690,000 | Available in App |
This application includes data from all PISA cycles from 2000-2022 (8 assessment cycles: 2000, 2003, 2006, 2009, 2012, 2015, 2018, 2022), covering 513 country-year combinations across 101+ unique countries/economies.
PISA mathematics assesses students' capacity to formulate, employ, and interpret mathematics in a variety of contexts. It includes reasoning mathematically and using mathematical concepts, procedures, facts, and tools to describe, explain, and predict phenomena.
PISA reading literacy assesses students' capacity to understand, use, evaluate, reflect on, and engage with texts in order to achieve goals, develop knowledge and potential, and participate in society.
PISA science literacy assesses the ability to engage with science-related issues and with the ideas of science, as a reflective citizen. It includes understanding natural phenomena, designing scientific enquiry, and interpreting evidence.
PISA targets students who are between 15 years 3 months and 16 years 2 months at the time of assessment, regardless of their grade level. This age range was chosen because students at this age are approaching the end of compulsory schooling in most OECD countries.
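The age window can be expressed in months: 15 years 3 months is 183 months and 16 years 2 months is 194 months. The following is a minimal sketch of such a check, not PISA's official eligibility procedure. The function name `is_pisa_eligible` is hypothetical, and it assumes age is counted in completed months on the assessment date.

```r
# Hypothetical helper: is a student inside the PISA age window on the test date?
# Assumption: age is counted in completed months.
is_pisa_eligible <- function(birth_date, test_date) {
  b <- as.POSIXlt(as.Date(birth_date))
  t <- as.POSIXlt(as.Date(test_date))
  months <- (t$year - b$year) * 12 + (t$mon - b$mon)
  # Subtract one month if the day of the month has not been reached yet
  months <- months - as.integer(t$mday < b$mday)
  months >= 15 * 12 + 3 & months <= 16 * 12 + 2
}

is_pisa_eligible("2002-06-15", "2018-04-10")  # aged 15 years 9 months
is_pisa_eligible("2003-03-01", "2018-04-10")  # aged 15 years 1 month
```

Because the rule is based on age rather than grade, two students in the same classroom can differ in eligibility.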
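PISA employs a two-stage stratified sampling design: schools are sampled first, with probability proportional to their number of eligible students, and students are then sampled within each selected school.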
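To make the two stages concrete, the sketch below draws schools by systematic sampling with probability proportional to size and then takes a simple random sample of students inside each selected school. The sampling frame is made up, stratification is omitted for brevity, and the figures of 20 schools and 35 students per school are illustrative assumptions rather than values taken from this application.

```r
set.seed(1)

# Made-up sampling frame: 200 schools with their counts of eligible 15-year-olds
frame <- data.frame(school_id  = 1:200,
                    n_eligible = sample(20:400, 200, replace = TRUE))

# Stage 1: systematic sampling of schools, probability proportional to size
n_schools <- 20
interval  <- sum(frame$n_eligible) / n_schools
start     <- runif(1, 0, interval)
ticks     <- start + interval * (0:(n_schools - 1))
cum_size  <- cumsum(frame$n_eligible)
# A school is picked when a tick falls inside its slice of the cumulative size
picked    <- findInterval(ticks, cum_size, left.open = TRUE) + 1
schools   <- frame[unique(picked), ]

# Stage 2: simple random sample of up to 35 students in each selected school
students <- do.call(rbind, lapply(seq_len(nrow(schools)), function(i) {
  n_take <- min(35, schools$n_eligible[i])
  data.frame(school_id  = schools$school_id[i],
             student_no = sort(sample.int(schools$n_eligible[i], n_take)))
}))
```

Large schools are more likely to be selected in stage 1, while a student in a selected large school is less likely to be selected in stage 2, which is why weights are needed at the analysis stage.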
PISA provides sampling weights so that estimates are representative of the target population, including a final student weight and a set of replicate weights for variance estimation.
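The effect of weighting is easiest to see by comparing a weighted and an unweighted mean. The sketch below uses made-up records, and the column name `stu_wgt` is an assumption standing in for the final student weight.

```r
# Made-up records; `stu_wgt` stands in for the final student weight column
students <- data.frame(
  math    = c(520, 480, 455, 610, 390),
  stu_wgt = c(12.5, 80.0, 150.0, 10.0, 95.0)
)

# Each student counts equally
unweighted <- mean(students$math)

# Each student counts in proportion to the number of students they represent
weighted <- weighted.mean(students$math, w = students$stu_wgt)
```

In this toy data the high scorers carry small weights, so the weighted mean is lower than the unweighted one.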
This application uses data processed through the learningtower R package (Vaughan et al., 2021), which provides harmonized, analysis-ready PISA data in a consistent format.
The learningtower package is available on CRAN:
install.packages("learningtower")
library(learningtower)
The package provides easy access to PISA data:
# Load all student data for a specific year
data_2018 <- load_student(2018)
# Keep specific countries by filtering on the country column
data_usa <- subset(data_2018, country == "USA")
# Access codebook
codebook <- load_codebook()
If you use data from this application, please cite the learningtower package:
Vaughan, B., Stanke, L., Teng, T., Hyndman, R., & O'Hara-Wild, E. (2021). learningtower: OECD PISA datasets from 2000-2018 in an easy-to-use format (R package version 1.0.1). https://CRAN.R-project.org/package=learningtower
This application pre-generates 513 country-year specific JSON files (e.g., USA_2018.json) for efficient progressive loading. Each chunk contains:
{
  "country": "USA",
  "year": 2018,
  "n_students": 4838,
  "data_quality": {
    "missing_math": 0,
    "missing_reading": 0,
    "missing_science": 0,
    "missing_escs": 0,
    "complete_cases": 4838
  },
  "students": [
    {
      "student_id": "USA_2018_00001",
      "math": 498.5,
      "reading": 505.2,
      "science": 502.9,
      "escs": 0.23,
      "gender": "male",
      "age": 15.5,
      ...
    }
  ]
}
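A chunk with this shape can be read and sanity-checked in R. The sketch below assumes the jsonlite package is installed and parses a cut-down chunk with made-up values. The consistency checks are suggestions, not checks the application is known to run.

```r
library(jsonlite)

# A cut-down chunk with made-up values, shaped like the example above
chunk_json <- '{
  "country": "USA",
  "year": 2018,
  "n_students": 2,
  "data_quality": {"missing_math": 0, "missing_escs": 0, "complete_cases": 2},
  "students": [
    {"student_id": "USA_2018_00001", "math": 498.5, "escs": 0.23},
    {"student_id": "USA_2018_00002", "math": 471.0, "escs": -0.41}
  ]
}'

chunk    <- fromJSON(chunk_json)  # the students array becomes a data frame
students <- chunk$students

# Consistency checks between the header fields and the student records
stopifnot(nrow(students) == chunk$n_students)
stopifnot(sum(complete.cases(students)) == chunk$data_quality$complete_cases)

# File name following the COUNTRY_YEAR.json pattern
chunk_file <- sprintf("%s_%d.json", chunk$country, chunk$year)
```

For a file on disk, `fromJSON()` also accepts a file path in place of the JSON string.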
The application also includes a metadata.json file that catalogs all available data:
{
  "countries": ["ALB", "ARG", "AUS", ...],
  "years": [2000, 2003, 2006, 2009, 2012, 2015, 2018, 2022],
  "variables": {
    "math": "Mathematics achievement score",
    "reading": "Reading achievement score",
    "science": "Science achievement score",
    "escs": "PISA index of economic, social and cultural status",
    ...
  }
}
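Note: PISA reports plausible values rather than a single score per student, to account for measurement error. This application uses the first plausible value (PV1) for each domain for simplicity. Advanced analyses should use all available plausible values (5 per domain in cycles through 2012, 10 per domain from 2015 onward) and combine the resulting estimates.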
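For readers who want to go beyond PV1, the usual approach is to compute the statistic once per plausible value and then combine the results, adding the between-value variance to the sampling variance. The sketch below illustrates this on simulated data with five plausible values. The column names are hypothetical, and the sampling variance uses a simple-random-sampling formula for brevity, whereas a full PISA analysis would use the replicate weights.

```r
set.seed(42)
n <- 500  # made-up sample size
m <- 5    # number of plausible values in this toy example

# Simulated plausible values: a common ability plus value-specific noise
ability <- rnorm(n, mean = 500, sd = 90)
pv <- sapply(seq_len(m), function(j) ability + rnorm(n, sd = 30))
colnames(pv) <- paste0("pv", seq_len(m), "math")

# Step 1: estimate the mean and its sampling variance once per plausible value
est      <- colMeans(pv)
samp_var <- apply(pv, 2, var) / n  # simplification: simple random sampling

# Step 2: combine the estimates across plausible values
point_est   <- mean(est)           # final estimate
within_var  <- mean(samp_var)      # average sampling variance
between_var <- var(est)            # variance due to measurement error
total_se    <- sqrt(within_var + (1 + 1 / m) * between_var)
```

Using only PV1 gives an unbiased point estimate for population means but omits the `between_var` term, so standard errors come out too small.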
The official source for all PISA data is the OECD PISA Data Portal:
Main Portal: https://www.oecd.org/pisa/data/
PISA data contain missing values.
This application uses complete-case analysis by default: records with a missing value in any analysis variable are dropped. Advanced users should consider multiple imputation methods for handling missing data.
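Complete-case analysis amounts to dropping every record with a missing value in any analysis variable before fitting a model. The sketch below applies it to made-up records and then estimates a simple socioeconomic gradient, the slope of mathematics score on ESCS. It illustrates the idea and is not the application's exact pipeline.

```r
# Made-up records with some missing values
students <- data.frame(
  math = c(498.5, 512.0, NA, 430.2, 555.1, 470.8),
  escs = c(0.23, 0.80, -0.10, NA, 1.20, -0.65)
)

# Complete-case analysis: keep rows where every analysis variable is observed
keep     <- complete.cases(students[, c("math", "escs")])
analysis <- students[keep, ]

# Share of records dropped, worth reporting alongside any estimate
dropped_share <- mean(!keep)

# Socioeconomic gradient: expected change in score per one-unit change in ESCS
fit      <- lm(math ~ escs, data = analysis)
gradient <- unname(coef(fit)["escs"])
```

If `dropped_share` is large or differs sharply between countries, complete-case estimates may be biased, which is where multiple imputation becomes relevant.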
The following 38 OECD countries are available (availability varies by year):
Australia (AUS)
Austria (AUT)
Belgium (BEL)
Canada (CAN)
Chile (CHL)
Colombia (COL)
Costa Rica (CRI)
Czech Republic (CZE)
Denmark (DNK)
Estonia (EST)
Finland (FIN)
France (FRA)
Germany (DEU)
Greece (GRC)
Hungary (HUN)
Iceland (ISL)
Ireland (IRL)
Israel (ISR)
Italy (ITA)
Japan (JPN)
Korea (KOR)
Latvia (LVA)
Lithuania (LTU)
Luxembourg (LUX)
Mexico (MEX)
Netherlands (NLD)
New Zealand (NZL)
Norway (NOR)
Poland (POL)
Portugal (PRT)
Slovak Republic (SVK)
Slovenia (SVN)
Spain (ESP)
Sweden (SWE)
Switzerland (CHE)
Turkey (TUR)
United Kingdom (GBR)
United States (USA)
An additional 60+ partner countries and economies are available, including:
Albania (ALB)
Argentina (ARG)
Brazil (BRA)
Bulgaria (BGR)
China (CHN)*
Croatia (HRV)
Hong Kong (HKG)
India (IND)*
Indonesia (IDN)
Jordan (JOR)
Kazakhstan (KAZ)
Macao (MAC)
Malaysia (MYS)
Peru (PER)
Qatar (QAT)
Romania (ROU)
Russia (RUS)
Serbia (SRB)
Singapore (SGP)
Chinese Taipei (TWN)
Thailand (THA)
Uruguay (URY)
Vietnam (VNM)
* Note: Some countries participate through specific regions or provinces rather than nationally.