MMAU
multimodal
About
MMAU (Massive Multi-Task Audio Understanding and Reasoning Benchmark) is a comprehensive benchmark featuring 10,000 carefully curated audio clips with human-annotated questions and answers spanning speech, environmental sounds, and music. It evaluates multimodal audio understanding models on tasks requiring expert-level knowledge and complex reasoning across 27 distinct skills, challenging models to demonstrate advanced audio perception and domain-specific understanding.
Evaluation Stats
Total Models: 1
Organizations: 1
Verified Results: 0
Self-Reported: 1
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution: 1 model
Top Score: 65.6%
Average Score: 65.6%
High Performers (80%+): 0

Top Organizations
#1 Alibaba Cloud / Qwen Team (1 model, 65.6%)
Leaderboard
1 model ranked by performance on MMAU
Date | License | Score
---|---|---
Mar 27, 2025 | Apache 2.0 | 65.6%