VocalSound

audio
About

VocalSound is an audio classification benchmark of 21,024 crowdsourced recordings from 3,365 unique subjects, covering six classes of human vocal sounds: laughter, sighs, coughs, throat clearing, sneezes, and sniffs. It tests AI models' ability to recognize and classify non-verbal human vocalizations. The dataset also provides metadata on speaker demographics, health conditions, and acoustic characteristics.

Evaluation Stats
Total Models: 1
Organizations: 1
Verified Results: 0
Self-Reported: 1
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

1 model
Top Score
93.9%
Average Score
93.9%
High Performers (80%+)
1

Top Organizations

#1 Alibaba Cloud / Qwen Team
1 model
93.9%
Leaderboard
1 model ranked by performance on VocalSound
Date: Mar 27, 2025
License: Apache 2.0
Score: 93.9%
Resources