CharXiv-R
multimodal
About
CharXiv-R is a reasoning-focused variant of the CharXiv benchmark that tests AI models' ability to reason over charts taken from academic papers. Where descriptive tasks ask a model to report what a chart shows, CharXiv-R requires it to analyze, compare, and draw conclusions from chart data through multi-step reasoning. These are the analytical skills needed to interpret scientific visualizations in data-driven research.
Evaluation Stats
Total Models: 8
Organizations: 1
Verified Results: 0
Self-Reported: 8
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 8 models
Top Score: 81.1%
Average Score: 62.5%
High Performers (80%+): 1
Top Organizations
#1 OpenAI: 8 models, 62.5% average
Leaderboard
8 models ranked by performance on CharXiv-R
Date | License | Score
---|---|---
Aug 7, 2025 | Proprietary | 81.1%
Apr 16, 2025 | Proprietary | 78.6%
Apr 16, 2025 | Proprietary | 72.0%
Aug 6, 2024 | Proprietary | 58.8%
Apr 14, 2025 | Proprietary | 56.8%
Apr 14, 2025 | Proprietary | 56.7%
Feb 27, 2025 | Proprietary | 55.4%
Apr 14, 2025 | Proprietary | 40.5%
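
The summary figures in the Performance Overview (top score, average score, and the count of models at 80% or above) can be recomputed from the eight leaderboard scores. The following is a minimal sketch, not the site's own method: it assumes the average is a plain unweighted mean rounded to one decimal place, and the variable names are illustrative.

```python
from statistics import mean

# Scores (percent) copied from the leaderboard rows, in rank order.
scores = [81.1, 78.6, 72.0, 58.8, 56.8, 56.7, 55.4, 40.5]

# Assumption: "High Performers (80%+)" counts scores at or above 80.0.
HIGH_PERFORMER_THRESHOLD = 80.0

top_score = max(scores)
# Assumption: the reported average is an unweighted mean, rounded to 0.1.
average_score = round(mean(scores), 1)
high_performers = sum(1 for s in scores if s >= HIGH_PERFORMER_THRESHOLD)

print(f"Models: {len(scores)}")
print(f"Top score: {top_score}%")
print(f"Average score: {average_score}%")
print(f"High performers (80%+): {high_performers}")
```

Under these assumptions the recomputed values match the reported 81.1% top score, 62.5% average, and single high performer.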