MRCR

About

MRCR (Multi-Round Co-reference Resolution) is a long-context benchmark introduced with the Gemini 1.5 technical report. The model is given a long, synthetic multi-turn conversation containing several similar user requests (for example, multiple poems on different topics) and is then asked to reproduce the response to one specific earlier request. Because the distractors closely resemble the target, the benchmark measures how reliably a model can locate and disambiguate a particular item buried deep in its context.
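
To make the task concrete, the sketch below shows how a single MRCR-style answer might be graded. It is only an illustration: it assumes a string-similarity grader in the spirit of OpenAI's public MRCR release (a SequenceMatcher ratio, zeroed when a required random prefix is missing); the function name and example strings are hypothetical, not the official harness.

from difflib import SequenceMatcher

def grade_mrcr_response(response: str, answer: str, required_prefix: str) -> float:
    # The model is asked to reproduce one specific earlier response and to
    # begin its output with a given random prefix. Credit is the character-level
    # similarity ratio, zeroed out entirely if the prefix is missing.
    if not response.startswith(required_prefix):
        return 0.0
    return SequenceMatcher(None, response, answer).ratio()

# Illustrative usage (all strings are made up):
answer = "mZk3# Tapirs amble softly through the midnight ferns."
print(grade_mrcr_response(answer, answer, "mZk3#"))          # 1.0 for an exact reproduction
print(grade_mrcr_response("wrong prefix", answer, "mZk3#"))  # 0.0 when the prefix is missing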

Evaluation Stats
Total Models: 6
Organizations: 1
Verified Results: 0
Self-Reported: 6
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers

Score Distribution: 6 models
Top Score: 93.0%
Average Score: 67.2%
High Performers (80%+): 2

Top Organizations

#1 Google: 6 models, 67.2% average
Leaderboard
6 models ranked by performance on MRCR
Rank  Release Date   License      Score
#1    May 20, 2025   Proprietary  93.0%
#2    May 1, 2024    Proprietary  82.6%
#3    May 1, 2024    Proprietary  71.9%
#4    Dec 1, 2024    Proprietary  69.2%
#5    Mar 15, 2024   Proprietary  54.7%
#6    May 20, 2025   Proprietary  32.0%
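
As a quick sanity check, the aggregate figures in the Performance Overview follow directly from the six scores listed above; this small hypothetical snippet just reproduces that arithmetic.

scores = [93.0, 82.6, 71.9, 69.2, 54.7, 32.0]    # the six MRCR scores from the leaderboard
average = sum(scores) / len(scores)               # 67.23... -> 67.2% when rounded
high_performers = sum(s >= 80.0 for s in scores)  # 2 models score 80% or above
print(f"Average: {average:.1f}%  High performers (80%+): {high_performers}")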
Resources