MuSR

text

About

MuSR (Multistep Soft Reasoning) is a benchmark designed to evaluate language models' multistep reasoning over natural-language narratives. Its tasks span three domains (murder mysteries, object placements, and team allocation), each requiring a model to extract facts from a long free-text story and combine them with commonsense knowledge across several inference steps.

Evaluation Stats

Total Models

1

Organizations

1

Verified Results

0

Self-Reported

1

Benchmark Details

Max Score

1

Language

Performance Overview

Score distribution and top performers

Score Distribution

1 model

Top Score

76.4%

Average Score

76.4%

High Performers (80%+)

0

Top Organizations

#1 Moonshot AI

1 model

76.4%

Leaderboard

1 model ranked by performance on MuSR

Rank	Model	Organization	Date	License	Score
#01	Kimi K2 Instruct	Moonshot AI	Jul 11, 2025	MIT	76.4%
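
The summary figures under Performance Overview follow directly from the leaderboard rows. The sketch below shows one way to derive them; the `rows` layout and the `summarize` helper are illustrative assumptions, not part of any published tooling for this leaderboard, and the 80% cutoff is taken from the "High Performers (80%+)" label.

```python
from statistics import mean

# Leaderboard rows as listed on this page: (model, organization, score in percent).
# The tuple layout is an assumption made for this example.
rows = [("Kimi K2 Instruct", "Moonshot AI", 76.4)]

# Cutoff taken from the "High Performers (80%+)" label.
HIGH_PERFORMER_THRESHOLD = 80.0


def summarize(rows, threshold=HIGH_PERFORMER_THRESHOLD):
    """Compute the overview statistics from leaderboard rows (hypothetical helper)."""
    scores = [score for _, _, score in rows]
    return {
        "total_models": len(rows),
        "organizations": len({org for _, org, _ in rows}),
        "top_score": max(scores),
        "average_score": round(mean(scores), 1),
        "high_performers": sum(score >= threshold for score in scores),
    }


summary = summarize(rows)
print(summary)
```

With a single entry, the top score and the average coincide, and no model reaches the 80% threshold.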

Resources

Research Paper