Claude 3 Opus
by Anthropic

Multimodal · Zero-eval
Rankings: #1 HellaSwag · #2 ARC-C

About

Claude 3 Opus was developed as the most capable model in the Claude 3 family, designed to set new industry benchmarks across a wide range of cognitive tasks. Built to handle complex analysis and extended tasks requiring deep reasoning, it balanced frontier intelligence with careful safety considerations, representing the flagship tier of the Claude 3 generation.

Pricing
Input (per 1M tokens): $15.00
Output (per 1M tokens): $75.00
Providers: 3
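The listed rates translate directly into per-request cost. A minimal sketch, assuming the per-1M-token prices above (token counts in the example are illustrative, not from this page):

```python
# Estimate USD cost of one request at the listed Claude 3 Opus rates:
# $15.00 per 1M input tokens, $75.00 per 1M output tokens.
INPUT_PER_M = 15.00
OUTPUT_PER_M = 75.00

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost = input share + output share, each prorated per million tokens."""
    return (input_tokens / 1_000_000) * INPUT_PER_M \
         + (output_tokens / 1_000_000) * OUTPUT_PER_M

# e.g. a 10K-token prompt with a 1K-token completion:
print(round(request_cost(10_000, 1_000), 3))  # 0.225
```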
Timeline
Announced: Feb 29, 2024
Released: Feb 29, 2024
Specifications
Capabilities: Multimodal
License & Family
License: Proprietary
Performance Overview
Performance metrics and category breakdown

Overall Performance (11 benchmarks)
Average Score: 81.6%
Best Score: 96.4%
High Performers (80%+): 8

Performance Metrics
Max Context Window: 400.0K tokens
Avg Throughput: 87.3 tok/s
Avg Latency: 0 ms
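Throughput and latency give a rough wall-clock estimate for a completion. A minimal sketch, assuming the averages listed above (real numbers vary by provider and load; the 1,000-token output length is illustrative):

```python
# Rough time-to-last-token estimate from the listed averages:
# latency to first token, plus output tokens divided by throughput.
AVG_THROUGHPUT_TOK_S = 87.3
AVG_LATENCY_S = 0.0  # listed as 0 ms on this page

def estimated_seconds(output_tokens: int) -> float:
    """Wall-clock estimate: first-token latency + generation time."""
    return AVG_LATENCY_S + output_tokens / AVG_THROUGHPUT_TOK_S

# e.g. a 1,000-token completion:
print(round(estimated_seconds(1_000), 1))  # 11.5
```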
All Benchmark Results for Claude 3 Opus
Complete list of benchmark scores with detailed information
Benchmark        Modality  Score  Percent  Source
ARC-C            text      0.96   96.4%    Self-reported
HellaSwag        text      0.95   95.4%    Self-reported
GSM8k            text      0.95   95.0%    Self-reported
MGSM             text      0.91   90.7%    Self-reported
MMLU             text      0.87   86.8%    Self-reported
BIG-Bench Hard   text      0.87   86.8%    Self-reported
HumanEval        text      0.85   84.9%    Self-reported
DROP             text      0.83   83.1%    Self-reported
MMLU-Pro         text      0.69   68.5%    Self-reported
MATH             text      0.60   60.1%    Self-reported
Showing 10 of 11 benchmarks
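The overview figures can be cross-checked against the listed scores. A minimal sketch: best score and the 80%+ count match the overview exactly, while the mean over the ten shown scores differs from the page's 81.6% because that average covers all 11 benchmarks and one score is not listed here:

```python
# Recompute summary stats from the ten benchmark scores shown above.
scores = {
    "ARC-C": 96.4, "HellaSwag": 95.4, "GSM8k": 95.0, "MGSM": 90.7,
    "MMLU": 86.8, "BIG-Bench Hard": 86.8, "HumanEval": 84.9,
    "DROP": 83.1, "MMLU-Pro": 68.5, "MATH": 60.1,
}

best = max(scores.values())                              # 96.4, matches overview
high_performers = sum(s >= 80 for s in scores.values())  # 8, matches overview
avg_listed = sum(scores.values()) / len(scores)          # mean of the shown ten only

print(best, high_performers, round(avg_listed, 2))  # 96.4 8 84.77
```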