24 February 2026
“This Kernel Was Faster Yesterday” — In Pursuit of High-Fidelity GPU Kernel Benchmarking
GPU timing is deceptively hard. Power limits, thermal state, clock behavior, caching, and measurement method all affect results in ways that aren't obvious. We explored the sources of this timing variation to make kernel benchmarks more reliable. Reliable timings are especially important for automated RL-based kernel optimization systems.