MGSM

Multilingual
text
About

MGSM (Multilingual Grade School Math) is a multilingual mathematical reasoning benchmark built by translating 250 grade-school math problems from GSM8K into ten typologically diverse languages: Bengali, Chinese, French, German, Japanese, Russian, Spanish, Swahili, Telugu, and Thai. It evaluates language models' mathematical reasoning across these languages using chain-of-thought prompting, testing both mathematical competence and multilingual understanding in numerical problem solving.
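To make the evaluation described above concrete, here is a minimal Python sketch of how a chain-of-thought completion could be scored. It rests on assumptions that this page does not state: that the prediction is the last number in the model's output, that correctness is an exact numeric match against the gold answer, and that the overall score is the unweighted mean of per-language accuracies. The function names, the regular expression, and the sample records are all hypothetical and are not taken from any official MGSM harness.

```python
import re
from collections import defaultdict

# Assumption: the final answer is the last number in the completion.
# Thousands separators such as "1,000" are tolerated and stripped.
NUMBER = re.compile(r"-?\d[\d,]*(?:\.\d+)?")


def extract_final_number(completion):
    """Return the last number found in a completion, or None if there is none."""
    matches = NUMBER.findall(completion)
    if not matches:
        return None
    return float(matches[-1].replace(",", ""))


def accuracy_by_language(records):
    """Compute exact-match accuracy per language.

    records: iterable of (language, completion, gold_answer) tuples.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for lang, completion, gold in records:
        total[lang] += 1
        pred = extract_final_number(completion)
        if pred is not None and abs(pred - gold) < 1e-6:
            correct[lang] += 1
    return {lang: correct[lang] / total[lang] for lang in total}


def macro_average(per_language):
    """Unweighted mean over languages (an assumed aggregation rule)."""
    return sum(per_language.values()) / len(per_language)


# Hypothetical sample data, for illustration only.
records = [
    ("de", "16 - 3 - 4 = 9 Eier. 9 * 2 = 18. Die Antwort ist 18.", 18),
    ("de", "Die Antwort ist 7.", 8),
    ("fr", "La réponse est 42.", 42),
]

per_language = accuracy_by_language(records)
print(per_language)                 # {'de': 0.5, 'fr': 1.0}
print(macro_average(per_language))  # 0.75
```

Taking the last number is a common heuristic for chain-of-thought output, but it can misfire when the model keeps writing after stating its answer, so a real harness may prefer an explicit answer marker in the prompt format.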

Evaluation Stats
Total Models: 31
Organizations: 6
Verified Results: 0
Self-Reported: 30
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

31 models
Top Score: 92.3%
Average Score: 77.9%
High Performers (80%+): 19
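
As a rough illustration of how summary figures like these can be derived, the sketch below computes the same statistics over the ten scores visible in the leaderboard listing on this page. Because only 10 of the 31 models are shown, it cannot reproduce the 77.9% average or the count of 19 high performers; the `summarize` helper and the 80% default threshold argument are hypothetical names chosen for this example.

```python
from statistics import mean

# The ten scores visible in the leaderboard listing (top 10 of 31 models).
visible_scores = [92.3, 92.0, 91.6, 91.6, 91.1, 90.8, 90.7, 90.6, 90.5, 89.3]


def summarize(scores, threshold=80.0):
    """Return the count, top score, mean score, and number of scores at or above threshold."""
    return {
        "count": len(scores),
        "top": max(scores),
        "average": mean(scores),
        "high_performers": sum(s >= threshold for s in scores),
    }


summary = summarize(visible_scores)
print(f"Top Score: {summary['top']}%")
print(f"Average Score (visible rows only): {summary['average']:.2f}%")
print(f"High Performers (80%+): {summary['high_performers']} of {summary['count']}")
```

All ten visible models clear the 80% threshold, which is consistent with the page reporting 19 high performers across the full set of 31.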

Top Organizations

#1 Anthropic, 6 models, 86.4%
#2 OpenAI, 8 models, 83.6%
#3 Alibaba Cloud / Qwen Team, 1 model, 83.5%
#4 Meta, 6 models, 81.3%
#5 Google, 6 models, 67.3%
Leaderboard
31 models ranked by performance on MGSM
Date | License | Score
Apr 5, 2025 | Llama 4 Community License Agreement | 92.3%
Jan 30, 2025 | Proprietary | 92.0%
Oct 22, 2024 | Proprietary | 91.6%
Jun 21, 2024 | Proprietary | 91.6%
Dec 6, 2024 | Llama 3.3 Community License Agreement | 91.1%
Sep 12, 2024 | Proprietary | 90.8%
Feb 29, 2024 | Proprietary | 90.7%
Apr 5, 2025 | Llama 4 Community License Agreement | 90.6%
May 13, 2024 | Proprietary | 90.5%
Dec 17, 2024 | Proprietary | 89.3%
Showing 1 to 10 of 31 models
Resources