CharadesSTA

multimodal
About

CharadesSTA (also written Charades-STA) is a video understanding benchmark that evaluates a model's ability to temporally localize human activities in video. Given a natural-language description of an activity, the model must identify the start and end times of the matching segment in the video. The task is temporal grounding rather than spatial localization: the model returns a time span, not bounding boxes. The videos come from the Charades dataset and show everyday human activities in realistic indoor settings.
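This page does not say which metric produces the scores it reports. Temporal localization benchmarks are commonly scored by the temporal intersection-over-union (IoU) between the predicted and ground-truth segments, so the following is a minimal sketch under that assumption. The function names `temporal_iou` and `recall_at_iou` and the 0.5 threshold are illustrative choices, not taken from this page.

```python
def temporal_iou(pred, gt):
    """Temporal IoU of two (start, end) segments given in seconds."""
    pred_start, pred_end = pred
    gt_start, gt_end = gt
    # Overlap is zero when the segments do not intersect.
    intersection = max(0.0, min(pred_end, gt_end) - max(pred_start, gt_start))
    union = (pred_end - pred_start) + (gt_end - gt_start) - intersection
    return intersection / union if union > 0 else 0.0


def recall_at_iou(predictions, ground_truths, threshold=0.5):
    """Fraction of queries whose predicted segment reaches the IoU threshold.

    The threshold value is an assumption made for illustration.
    """
    hits = sum(
        temporal_iou(p, g) >= threshold
        for p, g in zip(predictions, ground_truths)
    )
    return hits / len(predictions)


# Hypothetical example: one overlapping prediction, one disjoint prediction.
preds = [(2.0, 8.0), (0.0, 1.0)]
truths = [(4.0, 10.0), (2.0, 3.0)]
print(temporal_iou(preds[0], truths[0]))  # 4 s overlap / 8 s union
print(recall_at_iou(preds, truths))
```

Averaging `temporal_iou` over all queries instead of thresholding it gives a mean-IoU style score, which is another common way to report results on this kind of task.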

Evaluation Stats
Total Models: 2
Organizations: 1
Verified Results: 0
Self-Reported: 2
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

2 models
Top Score: 54.2%
Average Score: 48.9%
High Performers (80%+): 0
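The summary figures above can be reproduced from the two leaderboard scores listed on this page. This is a small sketch of that arithmetic. The variable names are illustrative, and the 80% cutoff is taken from the "High Performers (80%+)" label.

```python
# The two scores reported in the leaderboard, in percent.
scores = [54.2, 43.6]

top_score = max(scores)
# Round to one decimal place to match the precision shown on the page.
average_score = round(sum(scores) / len(scores), 1)
# Count models at or above the 80% "high performer" cutoff.
high_performers = sum(score >= 80.0 for score in scores)

print(f"Top Score: {top_score}%")
print(f"Average Score: {average_score}%")
print(f"High Performers (80%+): {high_performers}")
```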

Top Organizations

#1 Alibaba Cloud / Qwen Team
2 models
48.9%
Leaderboard
2 models ranked by performance on CharadesSTA
#1: Feb 28, 2025, License: Apache 2.0, Score: 54.2%
#2: Jan 26, 2025, License: Apache 2.0, Score: 43.6%
Resources