MedXpertQA

multimodal

About

MedXpertQA is a challenging medical question-answering benchmark that evaluates expert-level medical knowledge and advanced clinical reasoning. Created by Tsinghua University, it tests whether AI models can handle complex medical scenarios that require specialist-level understanding, diagnostic reasoning, and evidence-based decision-making across medical specialties and clinical contexts.

Evaluation Stats

Total Models: 1

Organizations: 1

Verified Results: 0

Self-Reported: 1

Benchmark Details

Max Score: 1

Language

Performance Overview

Score distribution and top performers

Score Distribution

1 model

Top Score

18.8%

Average Score

18.8%

High Performers (80%+): 0

Top Organizations

#1 Google

1 model

18.8%

Leaderboard

1 model ranked by performance on MedXpertQA

Rank	Model	Organization	Date	License	Score
#01	MedGemma 4B IT	Google	May 20, 2025	Health AI Developer Foundations terms of use	18.8%
