PointGrounding

multimodal

About

PointGrounding is a multimodal benchmark that evaluates how well AI models ground language descriptions to specific spatial locations in visual scenes through pointing tasks. It tests visual-linguistic understanding, spatial reasoning, and precise localization: given a natural-language description, a model must point to the matching object or region in a complex visual scene.
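This page does not state how PointGrounding scores a prediction. As an illustration only, the sketch below assumes one common convention for pointing tasks: a predicted point counts as correct if it falls inside the target's bounding box, and the score is the fraction of correct points. The names `Box` and `pointing_accuracy` are hypothetical and do not come from the benchmark.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

Point = Tuple[float, float]


@dataclass(frozen=True)
class Box:
    """Axis-aligned target region (hypothetical annotation format)."""
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def contains(self, point: Point) -> bool:
        # A point on the box edge is treated as inside.
        x, y = point
        return self.x_min <= x <= self.x_max and self.y_min <= y <= self.y_max


def pointing_accuracy(predictions: Sequence[Point], targets: Sequence[Box]) -> float:
    """Fraction of predicted points that land inside their target box.

    Assumed metric for illustration; the benchmark's real scoring rule
    may differ (for example, it could use segmentation masks).
    """
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(box.contains(p) for p, box in zip(predictions, targets))
    return hits / len(targets)


if __name__ == "__main__":
    boxes = [Box(0, 0, 10, 10), Box(20, 20, 40, 40)]
    points = [(5.0, 5.0), (50.0, 50.0)]  # first hits, second misses
    print(f"accuracy = {pointing_accuracy(points, boxes):.1%}")
```

Under this assumed rule, a reported score such as 66.5% would mean roughly two out of three predicted points landed inside their target regions.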

Evaluation Stats

Total Models: 1

Organizations: 1

Verified Results: 0

Self-Reported: 1

Benchmark Details

Max Score: 1

Language

Performance Overview

Score distribution and top performers

Score Distribution

1 model

Top Score

66.5%

Average Score

66.5%

High Performers (80%+): 0

Top Organizations

#1 Alibaba Cloud / Qwen Team

1 model

66.5%

Leaderboard

1 model ranked by performance on PointGrounding

Rank	Model	Organization	Date	License	Score
#01	Qwen2.5-Omni-7B	Alibaba Cloud / Qwen Team	Mar 27, 2025	Apache 2.0	66.5%

Resources

Research Paper