MusicCaps
multimodal
About
MusicCaps is a music captioning dataset and benchmark for evaluating models' ability to generate descriptive text about musical audio. It features high-quality human-written captions describing musical characteristics, instruments, genres, and acoustic properties, enabling assessment of models' musical understanding and audio-to-text generation capabilities.
Evaluation Stats
Total Models: 1
Organizations: 1
Verified Results: 0
Self-Reported: 1
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 1 model
Top Score: 32.8%
Average Score: 32.8%
High Performers (80%+): 0
Top Organizations
#1 Alibaba Cloud / Qwen Team: 1 model, 32.8%
Leaderboard
1 model ranked by performance on MusicCaps
Date | License | Score |
---|---|---|
Mar 27, 2025 | Apache 2.0 | 32.8% |