CharXiv-D
multimodal
About
CharXiv-D is a chart-understanding benchmark built from charts collected from arXiv papers, each paired with human-curated questions. This variant covers the descriptive questions, which test whether a model can accurately read and describe the visual data representations found in academic publications. It measures the basic chart comprehension skills required for scientific document analysis.
Evaluation Stats
Total Models: 5
Organizations: 1
Verified Results: 0
Self-Reported: 5
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 5 models
Top Score: 90.0%
Average Score: 85.1%
High Performers (80%+): 4
Top Organizations
#1 OpenAI: 5 models, 85.1%
Leaderboard
5 models ranked by performance on CharXiv-D
Date | License | Score
---|---|---
Feb 27, 2025 | Proprietary | 90.0%
Apr 14, 2025 | Proprietary | 88.4%
Apr 14, 2025 | Proprietary | 87.9%
Aug 6, 2024 | Proprietary | 85.3%
Apr 14, 2025 | Proprietary | 73.9%
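
The overview figures (top score, average score, and the count of models at 80% or above) can be recomputed from the five leaderboard scores. The following is a minimal sketch, not the site's own method: the `summarize` function, its key names, and the rounding to one decimal place are assumptions made for illustration, and the 80% cutoff is taken from the "High Performers (80%+)" label.

```python
from statistics import mean

# Scores copied from the leaderboard rows, in percent.
scores = [90.0, 88.4, 87.9, 85.3, 73.9]

# Cutoff taken from the "High Performers (80%+)" label; assumed to be inclusive.
HIGH_PERFORMER_THRESHOLD = 80.0


def summarize(scores, threshold=HIGH_PERFORMER_THRESHOLD):
    """Return summary statistics for a list of percentage scores.

    Hypothetical helper: the key names are chosen for illustration only.
    """
    return {
        "total_models": len(scores),
        "top_score": max(scores),
        # Rounded to one decimal place to match how the page displays scores.
        "average_score": round(mean(scores), 1),
        "high_performers": sum(1 for s in scores if s >= threshold),
    }


summary = summarize(scores)
print(summary)
```

Under these assumptions the result agrees with the figures reported in the overview: five models, a top score of 90.0%, an average of 85.1%, and four models at or above 80%.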