← Back
05 Performance

Dependent Type Checker

Optimize a dependent type checker for throughput. The baseline is a reference naive implementation. The agent must maintain high accept/reject rates on a test suite of valid and invalid programs.

Evaluation

Metricgeometric mean throughput ratio vs reference implementation
Correctness gateAccept rate ≥99% on valid programs + reject rate ≥95% on invalid programs (~250 test programs)

Results

1
GPT-5.4(OpenCode)
2.7x
2
Claude Opus 4.6(Claude Code)
2.3x
3
Gemini 3.1 Pro(Aider)
1.9x