Nexus

About

Nexus is a comprehensive benchmark designed to evaluate models across multiple interconnected tasks and domains. It tests a system's ability to handle complex, multi-faceted problems that require integrating different capabilities, reasoning across task boundaries, and maintaining consistency across diverse evaluation scenarios within a unified assessment framework.

Evaluation Stats
Total Models: 4
Organizations: 1
Verified Results: 0
Self-Reported: 4

Benchmark Details
Max Score: 1
Language: en

Performance Overview
Score distribution and top performers

Score Distribution: 4 models
Top Score: 58.7%
Average Score: 47.0%
High Performers (80%+): 0

Top Organizations

#1Meta
4 models
47.0%
Leaderboard
4 models ranked by performance on Nexus

Date           License                        Score
Jul 23, 2024   Llama 3.1 Community License    58.7%
Jul 23, 2024   Llama 3.1 Community License    56.7%
Jul 23, 2024   Llama 3.1 Community License    38.5%
Sep 25, 2024   Llama 3.2 Community License    34.3%
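
For reference, the Performance Overview numbers above follow directly from the four self-reported scores in the leaderboard. The short Python sketch below is not the platform's own aggregation code; it simply recomputes the top score, average, and high-performer count from the listed values.

# Minimal sketch (assumed, not the platform's actual code): recompute the
# summary statistics from the four self-reported leaderboard scores.
scores = [58.7, 56.7, 38.5, 34.3]  # leaderboard scores, in percent

top_score = max(scores)                           # 58.7
average = sum(scores) / len(scores)               # 188.2 / 4 = 47.05
high_performers = sum(s >= 80.0 for s in scores)  # models scoring 80% or above

print(f"Top score:       {top_score:.1f}%")   # 58.7%
print(f"Average score:   {average:.2f}%")     # ~47.05%, reported above as 47.0%
print(f"High performers: {high_performers}")  # 0
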
Resources