Creative Writing v3
text
About
Creative Writing v3 is an LLM-judged benchmark that evaluates the creative writing of large language models using rubric scoring and Elo ratings. A judge model such as Sonnet 4 assesses writing quality across several dimensions, including style, originality, coherence, and engagement. The benchmark measures a model's ability to produce compelling, creative content while avoiding repetition and maintaining a high literary standard.
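The description mentions Elo ratings without saying which variant is used. As an illustration only, the sketch below shows the textbook Elo update for a single pairwise comparison between two models' outputs. The K-factor of 32, the 400-point scale, the starting rating of 1500, and the function names are assumptions for this example, not the benchmark's documented settings.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Probability that A beats B under the standard Elo logistic model."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating_a: float, rating_b: float, outcome_a: float, k: float = 32.0):
    """Return updated (rating_a, rating_b) after one comparison.

    outcome_a is 1.0 if the judge prefers A, 0.0 if it prefers B, 0.5 for a tie.
    k is an assumed K-factor; the benchmark's real value is not given in the source.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (outcome_a - exp_a)
    new_b = rating_b + k * ((1.0 - outcome_a) - (1.0 - exp_a))
    return new_a, new_b


# Two hypothetical models start at the same rating; the judge prefers A.
new_a, new_b = elo_update(1500.0, 1500.0, outcome_a=1.0)
print(new_a, new_b)
```

With equal starting ratings the expected score is 0.5, so the preferred model gains half the K-factor and the other loses the same amount.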
Evaluation Stats
Total Models: 3
Organizations: 1
Verified Results: 0
Self-Reported: 3
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 3 models
Top Score: 87.5%
Average Score: 86.3%
High Performers (80%+): 3
Top Organizations
#1 Alibaba Cloud / Qwen Team: 3 models, 86.3%
Leaderboard
3 models ranked by performance on Creative Writing v3
Date | License | Score
---|---|---
Jul 22, 2025 | Apache 2.0 | 87.5%
Jul 25, 2025 | Apache 2.0 | 86.1%
Sep 10, 2025 | Apache 2.0 | 85.3%
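The summary figures in the Performance Overview can be reproduced from the three leaderboard scores. The following is a minimal Python sketch; the variable names are illustrative, and the 80% cutoff is taken from the page's "High Performers (80%+)" label.

```python
from statistics import mean

# Scores copied from the leaderboard rows, in percent.
scores = [87.5, 86.1, 85.3]

top_score = max(scores)
average_score = round(mean(scores), 1)  # rounded to one decimal, as on the page
high_performers = sum(1 for s in scores if s >= 80.0)

print(f"Top score: {top_score}%")
print(f"Average score: {average_score}%")
print(f"High performers (80%+): {high_performers}")
```

Because Max Score is 1, each percentage corresponds to a raw score on a 0 to 1 scale; for example, 87.5% is a raw score of 0.875.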