Include

text
About

Include is a specialized benchmark that evaluates an AI model's ability to incorporate specified elements, requirements, or constraints into its output. It tests whether a model follows inclusion instructions, retains required components, and covers every specified element in its response.

Evaluation Stats
Total Models: 9
Organizations: 2
Verified Results: 0
Self-Reported: 9
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution

9 models
Top Score: 81.0%
Average Score: 64.8%
High Performers (80%+): 1

Top Organizations

#1 Alibaba Cloud / Qwen Team: 5 models, 78.4%
#2 Google: 4 models, 47.9%
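
As a quick consistency check, the overall average can be recovered from the two per-organization figures. The sketch below assumes the percentage listed for each organization is a simple mean over that organization's models (the page does not say so explicitly); the names `org_stats` and `weighted_average` are illustrative only.

```python
# Per-organization figures copied from the page: (model count, score in %).
# Assumption: each percentage is the mean score of that organization's models.
org_stats = {
    "Alibaba Cloud / Qwen Team": (5, 78.4),
    "Google": (4, 47.9),
}


def weighted_average(stats):
    """Combine per-group means into one mean, weighting by group size."""
    total_models = sum(count for count, _ in stats.values())
    total_score = sum(count * mean for count, mean in stats.values())
    return total_score / total_models


overall = weighted_average(org_stats)
print(f"{overall:.1f}%")  # prints 64.8%
```

Under that assumption, the result agrees with the reported Average Score of 64.8% across the 9 models.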
Leaderboard
9 models ranked by performance on Include