MultiLF
About
MultiLF is a multilingual logical form benchmark that evaluates language models' ability to understand and generate logical representations across multiple languages. It tests cross-lingual semantic parsing, logical reasoning, and formal language understanding in diverse linguistic contexts.
Evaluation Stats
Total Models: 2
Organizations: 1
Verified Results: 0
Self-Reported: 2
Benchmark Details
Max Score: 1
Language: en
Performance Overview
Score distribution and top performers
Score Distribution: 2 models
Top Score: 73.0%
Average Score: 72.4%
High Performers (80%+): 0

Top Organizations
#1 Alibaba Cloud / Qwen Team — 2 models, 72.4% average
Leaderboard
2 models ranked by performance on MultiLF
Release Date | License | Score
---|---|---
Apr 29, 2025 | Apache 2.0 | 73.0%
Apr 29, 2025 | Apache 2.0 | 71.9%