Android Control Low_EM
multimodal
About
Android Control Low-EM is a more flexible evaluation setting of the AndroidControl benchmark that assesses AI agents' Android device control capabilities using relaxed exact-match scoring criteria. This benchmark focuses on task completion effectiveness rather than pixel-perfect precision, testing AI systems' ability to navigate mobile interfaces, understand app contexts, and accomplish user goals through natural language instruction following.
Evaluation Stats
Total Models: 3
Organizations: 1
Verified Results: 0
Self-Reported: 3
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution
3 models
Top Score
93.7%
Average Score
92.8%
High Performers (80%+)
3
Top Organizations
#1 Alibaba Cloud / Qwen Team
3 models
92.8%
Leaderboard
3 models ranked by performance on Android Control Low_EM
Date | License | Score | Links
---|---|---|---
Jan 26, 2025 | tongyi-qianwen | 93.7% |
Feb 28, 2025 | Apache 2.0 | 93.3% |
Jan 26, 2025 | Apache 2.0 | 91.4% |
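
The summary figures in the Performance Overview appear to follow directly from the three scores listed above. The sketch below reproduces them, assuming the average is a plain arithmetic mean rounded to one decimal place and that "High Performers" counts scores at or above 80%. The variable names are illustrative and are not taken from the leaderboard site.

```python
from statistics import mean

# Scores copied from the three leaderboard rows, in percent.
scores = [93.7, 93.3, 91.4]

# Assumed cutoff, taken from the "High Performers (80%+)" label.
HIGH_PERFORMER_THRESHOLD = 80.0

top_score = max(scores)
# Assumption: simple arithmetic mean, rounded to one decimal place.
average_score = round(mean(scores), 1)
high_performers = sum(1 for s in scores if s >= HIGH_PERFORMER_THRESHOLD)

print(f"Top score: {top_score}%")
print(f"Average score: {average_score}%")
print(f"High performers: {high_performers}")
```

Under these assumptions the computed values agree with the reported top score of 93.7%, average of 92.8%, and three high performers.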