EHRSHOT
A benchmark for few-shot evaluation of EHR foundation models
15 clinical prediction tasks · longitudinal data · reproducible leaderboard
Leaderboard
6,739
Patients
Longitudinal, not restricted to the ICU
41.7M
Events
Structured EHR signals
921k
Visits
Real-world care trajectories
15
Tasks
Curated prediction suite
What is EHRSHOT?
New Longitudinal EHR Dataset
Our dataset contains de-identified structured data from the electronic health records of 6,739 patients from Stanford Medicine. Unlike MIMIC-III/IV, EHRSHOT is longitudinal and not restricted to ICU/ED settings.
Curated Tasks for Benchmarking
Evaluate machine learning models on 15 clinical tasks covering diagnostics, patient outcomes, and resource allocation. The tasks target the few-shot setting, where a model must adapt using only a few labeled examples.
Reproducible Leaderboard & Baselines
Compare your model against strong baselines and other submissions on our leaderboard to see how each performs across all 15 prediction tasks. Results are easy to update using reproducible evaluation scripts.