All Benchmarks

Explore all 342 benchmarks for evaluating language models across different capabilities and domains.

Properties  Links
text        en  126  14   88.4%  57.3%
text        en   80  15   92.5%  79.8%
text        en   68  12   85.0%  65.6%
text        en   64  11   97.9%  67.0%
text        en   63  12   94.5%  80.6%
multimodal  en   52  11   84.2%  64.1%
text        en   48  10   80.4%  46.5%
text        en   46  15   97.3%  87.8%
text        en   45  11   95.8%  73.2%
text        en   45  11  100.0%  67.6%
Showing 1 to 10 of 342 benchmarks
...