LVBench

multimodal

About

LVBench is a multimodal benchmark that tests how well AI models understand and reason about visual content over extended contexts. It evaluates whether a model can process long visual sequences, retain visual information across the input, and perform reasoning tasks that depend on details spread throughout a complex visual narrative.

Evaluation Stats

Total Models: 5

Organizations: 2

Verified Results: 0

Self-Reported: 5

Benchmark Details

Max Score: 1

Language

Performance Overview

Score distribution and top performers

Score Distribution

5 models

Top Score: 49.0%

Average Score: 44.7%

High Performers (80%+): 0

Top Organizations

#1 Alibaba Cloud / Qwen Team: 3 models, 47.2% average

#2 Amazon: 2 models, 41.0% average

Leaderboard

5 models ranked by performance on LVBench

Rank	Model	Organization	Date	License	Score
#01	Qwen2.5 VL 32B Instruct	Alibaba Cloud / Qwen Team	Feb 28, 2025	Apache 2.0	49.0%
#02	Qwen2.5 VL 72B Instruct	Alibaba Cloud / Qwen Team	Jan 26, 2025	tongyi-qianwen	47.3%
#03	Qwen2.5 VL 7B Instruct	Alibaba Cloud / Qwen Team	Jan 26, 2025	Apache 2.0	45.3%
#04	Nova Pro	Amazon	Nov 20, 2024	Proprietary	41.6%
#05	Nova Lite	Amazon	Nov 20, 2024	Proprietary	40.4%
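
The summary figures in the Performance Overview can be recomputed from the leaderboard rows. The sketch below assumes they are simple unweighted means rounded to one decimal place; the page does not state how it aggregates, and the names `LEADERBOARD` and `summarize` are hypothetical, introduced only for this illustration.

```python
from collections import defaultdict

# Rows copied from the leaderboard: (model, organization, score in percent).
LEADERBOARD = [
    ("Qwen2.5 VL 32B Instruct", "Alibaba Cloud / Qwen Team", 49.0),
    ("Qwen2.5 VL 72B Instruct", "Alibaba Cloud / Qwen Team", 47.3),
    ("Qwen2.5 VL 7B Instruct", "Alibaba Cloud / Qwen Team", 45.3),
    ("Nova Pro", "Amazon", 41.6),
    ("Nova Lite", "Amazon", 40.4),
]


def summarize(rows):
    """Recompute overview statistics, assuming unweighted means (an assumption)."""
    scores = [score for _, _, score in rows]

    # Group scores by organization so each one can be averaged separately.
    by_org = defaultdict(list)
    for _, org, score in rows:
        by_org[org].append(score)

    return {
        "top_score": max(scores),
        "average_score": round(sum(scores) / len(scores), 1),
        "org_averages": {
            org: round(sum(vals) / len(vals), 1) for org, vals in by_org.items()
        },
        # Count of models at or above the 80% "high performer" threshold.
        "high_performers": sum(1 for score in scores if score >= 80.0),
    }


summary = summarize(LEADERBOARD)
print(summary)
```

Under that assumption the recomputed values agree with the page: a top score of 49.0%, an overall average of 44.7%, organization averages of 47.2% and 41.0%, and no models at or above 80%.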
