XSTest
About
XSTest is a safety evaluation benchmark of 250 safe prompts and 200 unsafe prompts designed to identify exaggerated safety behaviors in large language models. The suite tests whether a model appropriately complies with safe requests while refusing unsafe ones, exposing systematic failure modes and the challenges of building safer, more reliable language models.
Evaluation Stats
Total Models: 3
Organizations: 1
Verified Results: 0
Self-Reported: 3
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 3 models
Top Score: 98.8%
Average Score: 96.1%
High Performers (80%+): 3

Top Organizations
#1 Google: 3 models, 96.1% average
Leaderboard
3 models ranked by performance on XSTest
Date | License | Score
---|---|---
May 1, 2024 | Proprietary | 98.8%
May 1, 2024 | Proprietary | 97.0%
Mar 15, 2024 | Proprietary | 92.6%
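As a sanity check, the summary statistics in the Performance Overview can be recomputed directly from the three self-reported scores listed above. This is a minimal sketch; the variable names are illustrative, not part of any XSTest tooling.

```python
# Recompute the overview stats from the three leaderboard scores (percentages).
scores = [98.8, 97.0, 92.6]

top_score = max(scores)
average_score = sum(scores) / len(scores)
high_performers = sum(1 for s in scores if s >= 80.0)

print(f"Top score: {top_score}%")                    # 98.8%
print(f"Average score: {average_score:.1f}%")        # 96.1%
print(f"High performers (80%+): {high_performers}")  # 3
```

The rounded average (96.1%) matches both the overall average score and Google's per-organization average, as expected when a single organization contributes all ranked models.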