HealthBench

About

HealthBench is an AI evaluation benchmark for healthcare applications, developed by OpenAI with input from 250+ medical experts. It evaluates AI models in realistic healthcare scenarios, testing medical knowledge, clinical reasoning, and diagnostic capabilities across diverse healthcare domains and clinical decision-making contexts.

Evaluation Stats
Total Models: 2
Organizations: 1
Verified Results: 0
Self-Reported: 2
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

Models: 2
Top Score: 57.6%
Average Score: 50.0%
High Performers (80%+): 0
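
The summary figures above can be reproduced from the two leaderboard scores. The following is a minimal sketch, not the site's actual code: the function name `summarize` and the `high_threshold` parameter are hypothetical, and the 80% cutoff is taken from the "High Performers (80%+)" label. The plain mean of the two displayed scores is 50.05, which the page shows as 50.0%; the page's exact rounding rule is not stated, so the sketch leaves the average unrounded.

```python
# Scores as displayed on the leaderboard, in percent.
scores = [57.6, 42.5]


def summarize(scores, high_threshold=80.0):
    """Return the summary statistics shown in the Score Distribution card.

    `high_threshold` is an assumption based on the "80%+" label.
    """
    return {
        "models": len(scores),
        "top": max(scores),
        # Unrounded mean; the page's rounding rule is unknown.
        "average": sum(scores) / len(scores),
        "high_performers": sum(1 for s in scores if s >= high_threshold),
    }


summary = summarize(scores)
print(summary)
```

With only two entries, the average sits midway between them, and neither model reaches the 80% threshold, which matches the high-performer count of 0.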

Top Organizations

#1 OpenAI (2 models, 50.0%)
Leaderboard
2 models ranked by performance on HealthBench
Date          License      Score
Aug 5, 2025   Apache 2.0   57.6%
Aug 5, 2025   Apache 2.0   42.5%
Resources