Android Control Low_EM

multimodal
About

Android Control Low-EM is an evaluation setting of the AndroidControl benchmark that measures how well AI agents control Android devices, scored with relaxed exact-match criteria. It emphasizes task completion over pixel-perfect precision: agents must navigate mobile interfaces, understand app context, and accomplish user goals by following natural-language instructions.
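To make the idea of exact-match scoring concrete, here is a minimal sketch in Python. It is not the benchmark's official scorer: the `Action` type, its fields, and the normalization rule (ignoring case and surrounding whitespace in the argument) are assumptions chosen only for illustration. The score is the fraction of steps whose predicted action matches the reference, so it falls between 0 and 1.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    """Hypothetical action record; the real benchmark schema may differ."""
    kind: str            # e.g. "click", "type", "scroll"
    argument: str = ""   # e.g. a target label or the text to type


def step_matches(predicted: Action, reference: Action) -> bool:
    """Compare one predicted step with its reference.

    Assumption: the action type must match exactly, while the argument
    is compared after stripping whitespace and lowercasing.
    """
    same_kind = predicted.kind == reference.kind
    same_argument = (
        predicted.argument.strip().lower() == reference.argument.strip().lower()
    )
    return same_kind and same_argument


def exact_match_score(predictions: list[Action], references: list[Action]) -> float:
    """Return the fraction of steps that match, in the range [0, 1]."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    hits = sum(step_matches(p, r) for p, r in zip(predictions, references))
    return hits / len(references)


if __name__ == "__main__":
    predicted = [Action("click", "Settings"), Action("type", "hello"), Action("scroll", "down")]
    reference = [Action("click", "settings "), Action("type", "hello"), Action("scroll", "up")]
    print(exact_match_score(predicted, reference))
```

In this example two of the three steps match, because the first differs only in case and trailing whitespace while the third scrolls in the wrong direction.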

Evaluation Stats
Total Models: 3
Organizations: 1
Verified Results: 0
Self-Reported: 3
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

3 models
Top Score: 93.7%
Average Score: 92.8%
High Performers (80%+): 3
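The summary figures can be checked against the three leaderboard scores (93.7%, 93.3%, 91.4%) with a short Python sketch. The function name `summarize` and the default 80% threshold parameter are illustrative choices, not part of the leaderboard itself.

```python
def summarize(scores: list[float], high_threshold: float = 80.0) -> dict:
    """Compute top score, mean score, and the count of high performers."""
    if not scores:
        raise ValueError("scores must not be empty")
    return {
        "top": max(scores),
        "average": round(sum(scores) / len(scores), 1),
        "high_performers": sum(1 for s in scores if s >= high_threshold),
    }


if __name__ == "__main__":
    # Scores as listed on the leaderboard, in percent.
    print(summarize([93.7, 93.3, 91.4]))
```

The mean of the three scores is 278.4 / 3 = 92.8, which agrees with the reported average.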

Top Organizations

#1 Alibaba Cloud / Qwen Team: 3 models, 92.8%
Leaderboard
3 models ranked by performance on Android Control Low_EM
Date            License           Score
Jan 26, 2025    tongyi-qianwen    93.7%
Feb 28, 2025    Apache 2.0        93.3%
Jan 26, 2025    Apache 2.0        91.4%