ScreenSpot

multimodal
About

ScreenSpot is a GUI grounding benchmark that evaluates AI models' ability to locate and identify specific interface elements within screenshots of computer applications. This foundational evaluation tests visual understanding of graphical user interfaces, spatial reasoning for element localization, and the capacity to understand UI components across different software applications and operating systems.
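To make the task concrete, here is a minimal sketch of how a single grounding prediction might be scored. It assumes the common click-accuracy convention, in which a prediction counts as correct when the predicted point falls inside the target element's bounding box. This page does not state the metric, so treat that convention as an assumption; the names `BBox`, `is_hit`, and `grounding_accuracy` are hypothetical and not part of any official ScreenSpot tooling.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple


@dataclass(frozen=True)
class BBox:
    """Axis-aligned box in pixel coordinates (assumed layout: left, top, right, bottom)."""
    left: float
    top: float
    right: float
    bottom: float


def is_hit(point: Tuple[float, float], box: BBox) -> bool:
    """Return True when the predicted (x, y) point lies inside the box (edges included)."""
    x, y = point
    return box.left <= x <= box.right and box.top <= y <= box.bottom


def grounding_accuracy(
    points: Sequence[Tuple[float, float]], boxes: Sequence[BBox]
) -> float:
    """Fraction of predictions whose point lands inside the matching target box."""
    if len(points) != len(boxes):
        raise ValueError("points and boxes must have the same length")
    if not boxes:
        return 0.0
    hits = sum(is_hit(p, b) for p, b in zip(points, boxes))
    return hits / len(boxes)


# Illustrative data only: three predicted click points against three target boxes.
predictions = [(50.0, 20.0), (300.0, 410.0), (5.0, 5.0)]
targets = [
    BBox(40, 10, 120, 30),     # hit: point is inside
    BBox(280, 400, 360, 430),  # hit: point is inside
    BBox(100, 100, 150, 150),  # miss: point is outside
]
accuracy = grounding_accuracy(predictions, targets)
```

Under this convention a score is a fraction between 0 and 1, which is consistent with the maximum score of 1 listed for this benchmark.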

Evaluation Stats
Total Models: 3
Organizations: 1
Verified Results: 0
Self-Reported: 3
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

Models: 3
Top Score: 88.5%
Average Score: 86.8%
High Performers (80%+): 3

Top Organizations

#1 Alibaba Cloud / Qwen Team: 3 models, 86.8%
Leaderboard
3 models ranked by performance on ScreenSpot
Released | License | Score
Feb 28, 2025 | Apache 2.0 | 88.5%
Jan 26, 2025 | tongyi-qianwen | 87.1%
Jan 26, 2025 | Apache 2.0 | 84.7%
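
The summary figures under Performance Overview can be reproduced from the three leaderboard scores. The sketch below assumes that the average is a plain arithmetic mean rounded to one decimal place and that "High Performers" counts scores of at least 80%; the function name `summarize` is hypothetical.

```python
from statistics import mean
from typing import Dict, Sequence


def summarize(scores: Sequence[float], threshold: float = 80.0) -> Dict[str, float]:
    """Compute the top score, rounded mean, and count of scores at or above the threshold."""
    return {
        "top": max(scores),
        "average": round(mean(scores), 1),
        "high_performers": sum(s >= threshold for s in scores),
    }


# Scores (in percent) as listed on the leaderboard above.
scores = [88.5, 87.1, 84.7]
summary = summarize(scores)
print(summary)  # {'top': 88.5, 'average': 86.8, 'high_performers': 3}
```

The unrounded mean is 260.3 / 3, roughly 86.77, which rounds to the 86.8% shown in the overview.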
Resources