7-8 April, 2026
Paris, France
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference Europe 2026 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

Tuesday April 7, 2026 14:30 - 14:40 CEST


Combo kernels are a compiler optimization in PyTorch Inductor that horizontally fuses multiple independent operations into a single Triton kernel launch, reducing GPU kernel launch overhead and improving memory locality.

The Problem: Models generate many small, independent operations like weight preprocessing and tensor copies. Each one is launched as its own kernel, and every launch incurs overhead. For models with many such operations, the accumulated launch overhead becomes a bottleneck.

The Solution: Combo kernels combine multiple operations into one kernel using a dispatch mechanism. A single program ID routes execution to the appropriate subkernel based on cumulative block boundaries. This eliminates redundant launches while preserving correctness.
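
The routing step described above can be illustrated with a small pure-Python simulation. This is only a sketch of the idea, not Inductor's generated Triton code: the function names `build_boundaries` and `dispatch` are hypothetical, and the block counts are made up for illustration.

```python
from bisect import bisect_right
from itertools import accumulate


def build_boundaries(blocks_per_subkernel):
    """Return cumulative block counts, e.g. [3, 2, 4] -> [3, 5, 9]."""
    return list(accumulate(blocks_per_subkernel))


def dispatch(pid, boundaries):
    """Map a global program ID to (subkernel index, local block index).

    Subkernel i owns the program IDs in [boundaries[i-1], boundaries[i]),
    with an implicit lower bound of 0 for the first subkernel.
    """
    if not 0 <= pid < boundaries[-1]:
        raise ValueError(f"program ID {pid} is outside the combined grid")
    idx = bisect_right(boundaries, pid)
    start = boundaries[idx - 1] if idx > 0 else 0
    return idx, pid - start


# Hypothetical example: three subkernels needing 3, 2 and 4 blocks.
boundaries = build_boundaries([3, 2, 4])
routing = [dispatch(pid, boundaries) for pid in range(boundaries[-1])]
```

In this sketch one launch of nine programs replaces three separate launches, and each program does the same work it would have done in its original kernel, which is why correctness is preserved.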

Key Innovations:

Per-subkernel block dimensions: Each subkernel gets its own optimized block size instead of sharing one size across all, enabling better autotuning.

Flattened grid dispatch: We collapse the multi-dimensional block grid into a single dimension.
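
The two ideas above can be sketched together in plain Python. Everything here is an assumption made for illustration: the `SubKernel` class, its field names, the example sizes, and the row-major ordering used to recover 2D block coordinates are hypothetical and are not taken from the Inductor source.

```python
from dataclasses import dataclass


def cdiv(a, b):
    """Ceiling division for positive integers."""
    return -(-a // b)


@dataclass(frozen=True)
class SubKernel:
    name: str
    xnumel: int  # elements along x
    ynumel: int  # elements along y
    xblock: int  # block size along x, chosen per subkernel
    yblock: int  # block size along y, chosen per subkernel

    @property
    def grid(self):
        """Number of blocks along (x, y) for this subkernel alone."""
        return cdiv(self.xnumel, self.xblock), cdiv(self.ynumel, self.yblock)

    @property
    def num_blocks(self):
        gx, gy = self.grid
        return gx * gy


def flat_grid_size(subkernels):
    """Size of the single 1D launch grid covering every subkernel."""
    return sum(sk.num_blocks for sk in subkernels)


def unflatten(sk, local_pid):
    """Recover (x block, y block) from a subkernel-local 1D block index."""
    gx, _ = sk.grid
    return local_pid % gx, local_pid // gx


# Hypothetical subkernels, each with its own block sizes.
copy = SubKernel("copy", xnumel=1000, ynumel=1, xblock=256, yblock=1)
transpose = SubKernel("transpose", xnumel=64, ynumel=48, xblock=32, yblock=16)
total_programs = flat_grid_size([copy, transpose])
```

Because each subkernel carries its own block sizes, an autotuner could vary them independently, and the combined 1D grid is simply the sum of the per-subkernel block counts.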

Results: On H100 GPUs, combo kernels deliver geometric-mean speedups of +7.38% for HuggingFace and +5.97% for TorchBench. The optimization is enabled by default in the vLLM repository for LLM inference acceleration.
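
For readers unfamiliar with the metric, a geomean speedup is the geometric mean of per-model speedup ratios, reported as a percentage above 1.0. The sketch below shows the arithmetic only; the helper name `geomean_speedup_percent` is hypothetical, and the per-model ratios are invented placeholders, not the measurements behind the figures above.

```python
from statistics import geometric_mean


def geomean_speedup_percent(ratios):
    """Geometric mean of speedup ratios, as a percentage gain over baseline.

    Each ratio is baseline_time / optimized_time for one model, so 1.0 means
    no change and 1.10 means the optimized run is 10% faster.
    """
    return (geometric_mean(ratios) - 1.0) * 100.0


# Placeholder ratios for two imaginary models (not real benchmark data).
example_percent = geomean_speedup_percent([1.0, 1.21])
```

The geometric mean is the usual choice for averaging ratios because a single model with a large speedup skews it less than it would skew an arithmetic mean.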
Speakers
Elias Ellison

Software Engineer, Meta
Elias has been working on the PyTorch team for four years, most recently on the torch.compile stack.
Karthick Panner Selvam

Software Engineer, Meta
Karthick Panner Selvam is a software engineer at Meta Superintelligence Lab, working on the PyTorch compiler team to enhance performance and scalability for large models. He earned his PhD in Machine Learning for Systems at the University of Luxembourg, collaborating with Google DeepMind, ECMWF, and Frontier...
Master Stage
  Frameworks & Compilers
  • Audience Level Any
  • Slides Attached Yes
