Cybersecurity CTFs
About
Cybersecurity CTFs is a benchmark built on Capture the Flag (CTF) challenges that evaluates AI models' cybersecurity problem-solving capabilities. It tests a model's ability to identify vulnerabilities, solve security puzzles, and perform penetration-testing tasks in realistic scenarios that require both technical expertise and creative problem-solving.
Evaluation Stats
Total Models: 1
Organizations: 1
Verified Results: 0
Self-Reported: 1
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 1 model
Top Score: 28.7%
Average Score: 28.7%
High Performers (80%+): 0
Top Organizations
#1 OpenAI: 1 model, 28.7%
Leaderboard
1 model ranked by performance on Cybersecurity CTFs
Date | License | Score
---|---|---
Sep 12, 2024 | Proprietary | 28.7%