MMAU

Tags: multimodal
About

MMAU (Massive Multi-Task Audio Understanding and Reasoning Benchmark) is a comprehensive benchmark featuring 10,000 carefully curated audio clips with human-annotated questions and answers spanning speech, environmental sounds, and music. It evaluates multimodal audio understanding models on tasks requiring expert-level knowledge and complex reasoning across 27 distinct skills, challenging models to demonstrate advanced audio perception and domain-specific understanding.
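Since MMAU questions are multiple-choice, a model's benchmark score reduces to answer accuracy, optionally broken down by skill. The sketch below illustrates that scoring scheme; the record fields (`answer`, `prediction`, `skill`) are illustrative assumptions, not the official MMAU schema or evaluation code.

```python
from collections import defaultdict

def score_benchmark(records):
    """Compute overall and per-skill accuracy (%) for multiple-choice records.

    Each record is assumed to carry the gold choice, the model's predicted
    choice, and a skill label -- field names here are hypothetical.
    """
    total, correct = 0, 0
    by_skill = defaultdict(lambda: [0, 0])  # skill -> [correct, total]
    for r in records:
        hit = r["prediction"].strip().lower() == r["answer"].strip().lower()
        total += 1
        correct += hit
        by_skill[r["skill"]][0] += hit
        by_skill[r["skill"]][1] += 1
    overall = 100.0 * correct / total if total else 0.0
    per_skill = {s: 100.0 * c / n for s, (c, n) in by_skill.items()}
    return overall, per_skill

# Tiny illustrative run on made-up records:
records = [
    {"skill": "music", "answer": "A", "prediction": "A"},
    {"skill": "music", "answer": "B", "prediction": "C"},
    {"skill": "speech", "answer": "D", "prediction": "D"},
]
overall, per_skill = score_benchmark(records)
```

A leaderboard figure such as 65.6% corresponds to `overall` computed this way over all 10,000 clips.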

Evaluation Stats
Total Models: 1
Organizations: 1
Verified Results: 0
Self-Reported: 1
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution: 1 model
Top Score: 65.6%
Average Score: 65.6%
High Performers (80%+): 0

Top Organizations

#1 Alibaba Cloud / Qwen Team: 1 model, 65.6%
Leaderboard
1 model ranked by performance on MMAU

Date: Mar 27, 2025
License: Apache 2.0
Score: 65.6%
Resources