BFCL v2
Multilingual
text
About
BFCL v2 (Berkeley Function Calling Leaderboard v2) is the second version of the Berkeley Function Calling Leaderboard, with improved evaluation criteria and expanded test coverage for function calling. Building on the original BFCL framework, it adds refined accuracy metrics, enhanced multi-turn scenarios, and additional real-world function calling challenges. The benchmark remains focused on evaluating tool use while giving a more comprehensive assessment of LLM function calling abilities.
Evaluation Stats
Total Models: 5
Organizations: 2
Verified Results: 0
Self-Reported: 5
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 5 models
Top Score: 77.3%
Average Score: 71.1%
High Performers (80%+): 0
Top Organizations
#1 Meta: 2 models, 72.2%
#2 NVIDIA: 3 models, 70.5%
Leaderboard
5 models ranked by performance on BFCL v2
Release Date | License | Score
---|---|---
Dec 6, 2024 | Llama 3.3 Community License Agreement | 77.3%
Apr 7, 2025 | Llama 3.1 Community License | 74.1%
Mar 18, 2025 | Llama 3.1 Community License | 73.7%
Sep 25, 2024 | Llama 3.2 Community License | 67.0%
Mar 18, 2025 | Llama 3.1 Community License | 63.6%
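The summary figures under Performance Overview can be cross-checked against the five leaderboard scores. The following is a minimal sketch in Python, assuming the scores are transcribed by hand from the table and that "High Performers" means a score of at least 80%; the variable names are illustrative and not part of any BFCL tooling.

```python
from statistics import mean

# Scores transcribed by hand from the leaderboard table, in percent.
scores = [77.3, 74.1, 73.7, 67.0, 63.6]

# Assumption: "High Performers (80%+)" counts scores of 80% or more.
HIGH_PERFORMER_THRESHOLD = 80.0

top_score = max(scores)
average_score = round(mean(scores), 1)
high_performers = sum(1 for s in scores if s >= HIGH_PERFORMER_THRESHOLD)

print(f"Top score: {top_score}%")
print(f"Average score: {average_score}%")
print(f"High performers (80%+): {high_performers}")
```

Rounded to one decimal place, the computed values agree with the Top Score, Average Score, and High Performers figures reported on the page.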