MGSM
Multilingual
text
About
MGSM (Multilingual Grade School Math) is a multilingual mathematical reasoning benchmark built by translating 250 grade-school math problems from GSM8K into ten typologically diverse languages: Bengali, Chinese, French, German, Japanese, Russian, Spanish, Swahili, Telugu, and Thai. It evaluates language models with chain-of-thought prompting, testing both mathematical reasoning and multilingual understanding on numerical word problems.
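As a rough illustration of how a chain-of-thought response might be scored on a benchmark like this, the sketch below takes the last number in the model's output as its answer and computes per-language accuracy. This page does not specify the official scoring procedure, so the extraction rule, the function names, and the sample records are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Assumed rule: the final number in the response is the model's answer.
# Matches integers and decimals, tolerating thousands separators ("1,000").
_NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_final_number(response):
    """Return the last number found in a response as a float, or None."""
    matches = _NUMBER.findall(response)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))


def accuracy_by_language(records):
    """Compute accuracy per language.

    `records` is an iterable of (language, response, gold_answer) tuples;
    this record layout is hypothetical, not a format defined by MGSM.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for lang, response, gold in records:
        total[lang] += 1
        predicted = extract_final_number(response)
        if predicted is not None and abs(predicted - gold) < 1e-6:
            correct[lang] += 1
    return {lang: correct[lang] / total[lang] for lang in total}


# Hypothetical sample responses, not taken from the benchmark.
sample = [
    ("es", "Vende 9 huevos a 2 cada uno. La respuesta es 18.", 18),
    ("es", "La respuesta es 20.", 18),
    ("de", "Die Antwort ist 1,000.", 1000),
]
scores = accuracy_by_language(sample)
```

Accuracy computed this way is a fraction between 0 and 1. Taking the last number is a simple heuristic that can misfire when a response keeps reasoning after stating its answer.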
Evaluation Stats
Total Models: 31
Organizations: 6
Verified Results: 0
Self-Reported: 30
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 31 models
Top Score: 92.3%
Average Score: 77.9%
High Performers (80%+): 19
Top Organizations
#1 Anthropic: 6 models, 86.4%
#2 OpenAI: 8 models, 83.6%
#3 Alibaba Cloud / Qwen Team: 1 model, 83.5%
#4 Meta: 6 models, 81.3%
#5 Google: 6 models, 67.3%
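The five organizations listed cover 27 of the 31 ranked models, so their combined average need not equal the overall 77.9%. The sketch below computes a model-count-weighted mean from the listed figures. It assumes each percentage is the mean score across that organization's models, which this page does not state explicitly.

```python
# (organization, number of models, listed score in %) from the ranking above.
# Assumption: each score is the average over that organization's models.
ORGS = [
    ("Anthropic", 6, 86.4),
    ("OpenAI", 8, 83.6),
    ("Alibaba Cloud / Qwen Team", 1, 83.5),
    ("Meta", 6, 81.3),
    ("Google", 6, 67.3),
]


def weighted_mean(rows):
    """Average the per-organization scores, weighted by model count."""
    total_models = sum(count for _, count, _ in rows)
    return sum(count * score for _, count, score in rows) / total_models


listed_models = sum(count for _, count, _ in ORGS)
print(listed_models)                  # 27
print(round(weighted_mean(ORGS), 1))  # 80.1
```

Under that assumption the listed organizations average about 80.1%, which sits above the overall 77.9% because the four models not shown in this ranking are excluded.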
Leaderboard
31 models ranked by performance on MGSM
Release Date | License | Score
---|---|---
Apr 5, 2025 | Llama 4 Community License Agreement | 92.3%
Jan 30, 2025 | Proprietary | 92.0%
Oct 22, 2024 | Proprietary | 91.6%
Jun 21, 2024 | Proprietary | 91.6%
Dec 6, 2024 | Llama 3.3 Community License Agreement | 91.1%
Sep 12, 2024 | Proprietary | 90.8%
Feb 29, 2024 | Proprietary | 90.7%
Apr 5, 2025 | Llama 4 Community License Agreement | 90.6%
May 13, 2024 | Proprietary | 90.5%
Dec 17, 2024 | Proprietary | 89.3%
Showing 1 to 10 of 31 models