Finance Agent
Finance
About
Finance Agent benchmarks LLMs on realistic financial analysis tasks, including reading SEC filings, analyzing earnings, building financial models, and answering quantitative research queries.
Evaluation Stats
Total Models: 10
Organizations: 5
Verified Results: 0
Self-Reported: 0
Benchmark Details
Max Score: 100
Performance Overview
Score distribution and top performers
Score Distribution (10 models)
Top Score: 63.3%
Average Score: 57.4%
High Performers (80%+): 0
Top Organizations
| Rank | Organization | Models | Avg. Score |
|---|---|---|---|
| #1 | Anthropic | 4 | 59.2% |
| #2 | Google DeepMind | 2 | 57.5% |
| #3 | OpenAI | 2 | 57.2% |
| #4 | Alibaba / Qwen | 1 | 54.5% |
| #5 | xAI | 1 | 53.5% |
Leaderboard
10 models ranked by performance on Finance Agent
| Model | Date | License | Score |
|---|---|---|---|
| Claude Sonnet 4.6 | Feb 17, 2026 | Proprietary | 63.3% |
| Claude Opus 4.6 | Feb 1, 2026 | Proprietary | 60.1% |
| Gemini 3.1 Pro | Feb 19, 2026 | Proprietary | 59.7% |
| GPT-5.2 | Dec 11, 2025 | Proprietary | 59.0% |
| Claude Opus 4.5 | Nov 1, 2025 | Proprietary | 58.8% |
| GPT-5.1 | Nov 1, 2025 | Proprietary | 55.3% |
| Gemini 3 Pro | Nov 18, 2025 | Proprietary | 55.2% |
| Claude Sonnet 4.5 | Sep 29, 2025 | Proprietary | 54.5% |
| Qwen3.5-397B-A17B | Feb 16, 2026 | Apache 2.0 | 54.5% |
| Grok 4 | Jul 10, 2025 | Proprietary | 53.5% |
Additional Metrics
Extended metrics for top models on Finance Agent
| Model | Score | Cost/Test | Latency |
|---|---|---|---|
| Claude Sonnet 4.6 | 63.3 | $1.44 | 348.99s |
| Claude Opus 4.6 | 60.1 | $1.11 | 289.73s |
| Gemini 3.1 Pro | 59.7 | $0.87 | 265.72s |
| GPT-5.2 | 59.0 | $0.98 | 587.16s |
| Claude Opus 4.5 | 58.8 | $1.50 | 181.87s |
| GPT-5.1 | 55.3 | $0.47 | 578.06s |
| Gemini 3 Pro | 55.2 | $0.56 | 183.62s |
| Claude Sonnet 4.5 | 54.5 | $1.10 | 202.07s |
| Qwen3.5-397B-A17B | 54.5 | $0.24 | 360.48s |
| Grok 4 | 53.5 | $1.07 | 321.04s |
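
The summary figures in Performance Overview (average score, per-organization averages, high-performer count) appear to follow from the per-model scores in the table above. The sketch below shows one way to recompute them. It is not the site's actual method: the model-to-organization mapping is inferred from model names, and rounding to one decimal place with round-half-up is an assumption. `Decimal` is used so that values such as 57.45 round the same way on every platform.

```python
from collections import defaultdict
from decimal import Decimal, ROUND_HALF_UP

# (model, organization, score) from the table above.
# The organization column is an assumption inferred from model names.
RESULTS = [
    ("Claude Sonnet 4.6", "Anthropic", "63.3"),
    ("Claude Opus 4.6", "Anthropic", "60.1"),
    ("Gemini 3.1 Pro", "Google DeepMind", "59.7"),
    ("GPT-5.2", "OpenAI", "59.0"),
    ("Claude Opus 4.5", "Anthropic", "58.8"),
    ("GPT-5.1", "OpenAI", "55.3"),
    ("Gemini 3 Pro", "Google DeepMind", "55.2"),
    ("Claude Sonnet 4.5", "Anthropic", "54.5"),
    ("Qwen3.5-397B-A17B", "Alibaba / Qwen", "54.5"),
    ("Grok 4", "xAI", "53.5"),
]


def mean_1dp(values):
    """Arithmetic mean rounded to one decimal place (half-up, assumed)."""
    total = sum(values, Decimal(0))
    return (total / len(values)).quantize(Decimal("0.1"), rounding=ROUND_HALF_UP)


scores = [Decimal(score) for _, _, score in RESULTS]
top_score = max(scores)
overall_average = mean_1dp(scores)
high_performers = sum(1 for s in scores if s >= 80)  # the "80%+" bucket

# Group scores by organization, then average each group.
by_org = defaultdict(list)
for _, org, score in RESULTS:
    by_org[org].append(Decimal(score))
org_averages = {org: mean_1dp(vals) for org, vals in by_org.items()}

# Print organizations from highest to lowest average.
for org, avg in sorted(org_averages.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{org}: {len(by_org[org])} model(s), average {avg}%")
print(f"Overall average: {overall_average}%  Top score: {top_score}%")
```

Under these assumptions the recomputed values agree with the figures shown on the page, which suggests the organization ranking is a plain unweighted mean over each organization's listed models.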