OSWorld Screenshot-only
multimodal
About
OSWorld Screenshot-only is a vision-focused variant of the OSWorld benchmark that evaluates multimodal agents using screenshots as their only input, with no additional contextual information. It tests whether an agent can understand and operate computer interfaces through visual perception alone, which requires strong visual and interface-understanding capabilities.
Evaluation Stats
Total Models: 1
Organizations: 1
Verified Results: 0
Self-Reported: 1
Benchmark Details
Max Score: 1
Language
en
Performance Overview
Score distribution and top performers
Score Distribution
1 model
Top Score
14.9%
Average Score
14.9%
High Performers (80%+)
0
Top Organizations
#1 Anthropic
1 model
14.9%
Leaderboard
1 model ranked by performance on OSWorld Screenshot-only
Date | License | Score | Links
---|---|---|---
Oct 22, 2024 | Proprietary | 14.9% |