Tau2 Telecom
About
Tau2 Telecom (TAU2-telecom) is the telecommunications component of the τ²-Bench framework, which tests conversational agents in customer service scenarios. It evaluates an AI agent's ability to handle telecom-specific tasks, including service plans, technical support, billing inquiries, and network issues, while using tools accurately and following telecom industry protocols and policies.
Evaluation Stats
Total Models: 8
Organizations: 3
Verified Results: 0
Self-Reported: 8
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 8 models
Top Score: 96.7%
Average Score: 51.6%
High Performers (80%+): 1
Top Organizations
#1 Moonshot AI: 2 models, 65.8%
#2 OpenAI: 3 models, 59.5%
#3 Alibaba Cloud / Qwen Team: 3 models, 34.2%
Leaderboard
8 models ranked by performance on Tau2 Telecom
Date | License | Score
---|---|---
Aug 7, 2025 | Proprietary | 96.7%
Sep 5, 2025 | MIT | 65.8%
Jul 11, 2025 | MIT | 65.8%
Apr 16, 2025 | Proprietary | 58.2%
Jul 25, 2025 | Apache 2.0 | 45.6%
Sep 10, 2025 | Apache 2.0 | 43.9%
Aug 6, 2024 | Proprietary | 23.5%
Sep 10, 2025 | Apache 2.0 | 13.2%
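
The summary figures in the Performance Overview can be reproduced from the eight leaderboard scores. The following is a minimal sketch in Python, not the site's own method: the variable names are illustrative, the scores are copied from the table, and rounding to one decimal place is an assumption about how the page formats its numbers. The per-license grouping is likewise only an illustration, since the table rows do not state which organization each score belongs to.

```python
from collections import defaultdict
from statistics import mean

# (license, score in percent) pairs copied from the leaderboard rows.
rows = [
    ("Proprietary", 96.7),
    ("MIT", 65.8),
    ("MIT", 65.8),
    ("Proprietary", 58.2),
    ("Apache 2.0", 45.6),
    ("Apache 2.0", 43.9),
    ("Proprietary", 23.5),
    ("Apache 2.0", 13.2),
]

scores = [score for _, score in rows]

# Headline statistics; one-decimal rounding is an assumption.
top_score = max(scores)
average_score = round(mean(scores), 1)
high_performers = sum(1 for s in scores if s >= 80.0)

# Mean score per license type (illustrative grouping only).
by_license = defaultdict(list)
for license_name, score in rows:
    by_license[license_name].append(score)
license_means = {name: round(mean(vals), 1) for name, vals in by_license.items()}

print(f"Top score: {top_score}%")
print(f"Average score: {average_score}%")
print(f"High performers (80%+): {high_performers}")
print(license_means)
```

The computed top score, average, and high-performer count agree with the figures the page reports (96.7%, 51.6%, and 1).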