10 Performance

FFmpeg libswscale Re-implementation

Results

#  Model                          Correctness  Avg    Best
1  GPT-5.4 (Codex)                0/5          0.267  0.433
2  Claude Opus 4.6 (Claude Code)  0/5          0.147  0.200
3  Qwen3.6-Plus (Qwen Code)       0/5          0.060  0.167
4  Gemini 3.1 Pro (Gemini CLI)    0/5          0.000  0.000
5  Kimi K2.5 (Kimi CLI)           0/5          0.000  0.000

Background

libswscale is the FFmpeg component responsible for image resizing and pixel-format conversion. Callers reach it through a stable C API, so this task is about reproducing both the library interface and the resulting image behavior.

The reference build used in the task is FFmpeg's scalar C path rather than its assembly-optimized variants. This keeps the comparison fair to a from-scratch rewrite, which would otherwise be competing against architecture-specific hand tuning that already exists upstream.

Task

Starting from FFmpeg's libswscale reference (compiled without assembly optimizations), the agent must rebuild it as a C-compatible shared library in Zig or Rust. That includes both image scaling and pixel-format conversion, with enough fidelity that callers can treat the submission as a drop-in replacement.

  • Implement the public C API expected by the hidden harness.
  • Handle resizing and format conversion across a mix of image shapes and color layouts.
  • Beat FFmpeg's scalar C baseline once correctness thresholds are satisfied.
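As a rough illustration of what a single pixel-format conversion involves, the sketch below converts one 8-bit RGB pixel to limited-range YUV using the common integer approximation of the BT.601 matrix. The function name `rgb_to_yuv_bt601` is hypothetical, and the coefficients and rounding are an assumption chosen for illustration; the reference build's exact arithmetic may differ.

```python
def rgb_to_yuv_bt601(r: int, g: int, b: int) -> tuple[int, int, int]:
    """Convert one 8-bit RGB pixel to limited-range BT.601 YUV.

    Illustrative only: uses the widely known fixed-point approximation
    (coefficients scaled by 256), not libswscale's actual tables.
    """
    for channel in (r, g, b):
        if not 0 <= channel <= 255:
            raise ValueError("RGB channels must be in the range 0..255")

    # "+ 128" rounds to nearest before the shift; ">> 8" divides by 256.
    y = ((66 * r + 129 * g + 25 * b + 128) >> 8) + 16
    u = ((-38 * r - 74 * g + 112 * b + 128) >> 8) + 128
    v = ((112 * r - 94 * g - 18 * b + 128) >> 8) + 128
    return y, u, v
```

Under these assumptions, white maps to luma 235 and black to luma 16, with both chroma values at 128. A real conversion also has to deal with plane layout, strides, and chroma subsampling, which this sketch leaves out.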

Evaluation

The hidden verifier first runs correctness workloads covering format conversion and image scaling. Only submissions that clear per-plane PSNR thresholds are eligible for the benchmark phase. If any correctness workload fails, the score is zero; otherwise the verifier takes a geometric mean of hidden benchmark speedups against FFmpeg's scalar C baseline.
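The per-plane PSNR check can be sketched as follows. This is a minimal illustration assuming 8-bit samples (peak value 255) and planes passed as flat `bytes` objects; the helper names `plane_psnr` and `passes_psnr` are hypothetical, and the threshold is a caller-supplied parameter because the verifier's actual values are not published here.

```python
import math


def plane_psnr(reference: bytes, candidate: bytes, peak: int = 255) -> float:
    """Return the PSNR in dB between two equally sized 8-bit planes."""
    if len(reference) != len(candidate):
        raise ValueError("planes must have the same number of samples")
    if not reference:
        raise ValueError("planes must not be empty")

    # Mean squared error over every sample in the plane.
    mse = sum((r - c) ** 2 for r, c in zip(reference, candidate)) / len(reference)
    if mse == 0:
        return math.inf  # identical planes
    return 10.0 * math.log10(peak * peak / mse)


def passes_psnr(reference_planes, candidate_planes, threshold_db: float) -> bool:
    """True only if every plane meets the threshold (hypothetical gate)."""
    if len(reference_planes) != len(candidate_planes):
        return False
    return all(
        plane_psnr(ref, cand) >= threshold_db
        for ref, cand in zip(reference_planes, candidate_planes)
    )
```

Checking each plane separately means a single badly converted chroma plane fails the workload even when the luma plane is near-perfect.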

  • Format-only workloads use a stricter quality threshold than scaling workloads.
  • Random-seeded images are used so the task is about implementation fidelity, not memorizing test fixtures.
  • Crashes or failures on benchmark workloads are penalized heavily, because a geometric mean is pulled down sharply by its smallest terms.
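The scoring rule described in this section can be sketched as a correctness gate followed by a geometric mean. The function name `task_score` is hypothetical, and so is the handling of crashes: the sketch assumes a crashed benchmark is recorded as `None` and replaced by a small positive floor (`crash_speedup`), which is an assumption rather than the verifier's documented behavior.

```python
import math


def task_score(correctness_passed, speedups, crash_speedup: float = 0.01) -> float:
    """Score a submission: zero on any correctness failure, else a geometric mean.

    correctness_passed: iterable of booleans, one per correctness workload.
    speedups: baseline_time / submission_time per benchmark, or None for a crash.
    crash_speedup: assumed stand-in value for a crashed benchmark (hypothetical).
    """
    if not all(correctness_passed):
        return 0.0
    if not speedups:
        raise ValueError("at least one benchmark speedup is required")

    values = [crash_speedup if s is None else s for s in speedups]
    if any(v <= 0 for v in values):
        raise ValueError("speedups must be positive")

    # Geometric mean computed in log space to avoid overflow and underflow.
    return math.exp(sum(math.log(v) for v in values) / len(values))
```

With these assumptions, speedups of 2.0 and 8.0 give a score of about 4.0, while a 4.0 speedup paired with one crash gives about 0.2, which shows how one failed workload can outweigh a strong result elsewhere.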

Environment And Constraints

The task runs CPU-only: 8 CPUs, 64 GB RAM, no GPUs, and no internet access. The reference library is compiled without assembly so that the baseline is fair to a from-scratch rewrite, rather than pitting it against FFmpeg's most architecture-specific hand tuning.