Terminus

text

About

Terminus is an AI agent designed for terminal environments. It serves as a reference implementation and test subject within the Terminal-Bench evaluation framework. Its tasks exercise command-line interaction and computational problem-solving in realistic terminal scenarios, including scientific computing tasks such as fitting peaks in Raman spectra.

Evaluation Stats

Total Models: 1

Organizations: 1

Verified Results: 0

Self-Reported: 1

Benchmark Details

Max Score: 1

Language

Performance Overview

Score distribution and top performers

Score Distribution

1 model

Top Score

25.0%

Average Score

25.0%

High Performers (80%+)

Top Organizations

#1 Moonshot AI

1 model

25.0%
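The summary figures above (top score, average score, and the count of high performers) can be derived directly from the leaderboard entries. The following is a minimal sketch in Python, assuming scores are stored as fractions of the max score of 1 and that a "high performer" is any model scoring at least 80%. The names `rows` and `summarize` are hypothetical and not part of the benchmark's own tooling.

```python
from statistics import mean

# Hypothetical representation of the leaderboard: (model, organization, score),
# with each score stored as a fraction of the max score of 1.
rows = [("Kimi K2 Instruct", "Moonshot AI", 0.25)]


def summarize(rows, high_threshold=0.80):
    """Return model count, top score, average score, and high-performer count."""
    scores = [score for _, _, score in rows]
    return {
        "models": len(scores),
        "top": max(scores),
        "average": mean(scores),
        # Assumption: "high performer" means score >= 80% of the max score.
        "high_performers": sum(s >= high_threshold for s in scores),
    }


summary = summarize(rows)
print(f"Top: {summary['top']:.1%}, Average: {summary['average']:.1%}, "
      f"High performers: {summary['high_performers']}")
```

With a single entry, the top and average scores coincide, which is why both are reported as 25.0%.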

Leaderboard

1 model ranked by performance on Terminus

Rank	Model	Organization	Date	License	Score
#01	Kimi K2 Instruct	Moonshot AI	Jul 11, 2025	MIT	25.0%
