MMVetGPT4Turbo

multimodal

About

MMVet-GPT4Turbo is a variant of the MM-Vet benchmark that uses GPT-4 Turbo as the grading model. It evaluates large multimodal models on the same six core vision-language capabilities as MM-Vet (recognition, OCR, knowledge, language generation, spatial awareness, and math) and on their integrations, with GPT-4 Turbo intended to give more accurate and reliable scoring.
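As a rough illustration of how model-graded scores of this kind are typically rolled up, the sketch below assumes this variant follows the original MM-Vet convention: the grader assigns each response a score between 0 and 1, and results are reported as mean percentages overall and per capability. The sample records and the `aggregate` helper are hypothetical and are not taken from the benchmark's own tooling.

```python
from collections import defaultdict

# Hypothetical per-sample records: a grader score in [0, 1] and the
# capabilities each question requires. Values are illustrative only.
samples = [
    {"score": 1.0, "capabilities": ["recognition", "ocr"]},
    {"score": 0.5, "capabilities": ["ocr", "math"]},
    {"score": 0.0, "capabilities": ["recognition", "knowledge"]},
    {"score": 1.0, "capabilities": ["spatial awareness"]},
]


def aggregate(samples):
    """Return (overall, per_capability) mean scores as percentages."""
    if not samples:
        raise ValueError("at least one sample is required")
    overall = 100.0 * sum(s["score"] for s in samples) / len(samples)

    # A sample counts toward every capability it requires.
    buckets = defaultdict(list)
    for s in samples:
        for cap in s["capabilities"]:
            buckets[cap].append(s["score"])
    per_capability = {
        cap: 100.0 * sum(scores) / len(scores) for cap, scores in buckets.items()
    }
    return overall, per_capability


overall, per_capability = aggregate(samples)
print(f"overall: {overall:.1f}%")
for cap, value in sorted(per_capability.items()):
    print(f"{cap}: {value:.1f}%")
```

Under this assumption, a leaderboard entry such as 74.0% corresponds to a mean per-sample score of 0.74 against a maximum of 1.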

Evaluation Stats

Total Models: 1

Organizations: 1

Verified Results: 0

Self-Reported: 1

Benchmark Details

Max Score: 1

Language

Performance Overview

Score distribution and top performers

Score Distribution

1 model

Top Score

74.0%

Average Score

74.0%

High Performers (80%+): 0

Top Organizations

#1 Alibaba Cloud / Qwen Team

1 model

74.0%

Leaderboard

1 model ranked by performance on MMVetGPT4Turbo

Rank	Model	Organization	Date	License	Score
#01	Qwen2-VL-72B-Instruct	Alibaba Cloud / Qwen Team	Aug 29, 2024	tongyi-qianwen	74.0%

Resources

Research Paper