arrow_back View All Dates
10:35 • Lightning Talk: Live Migration of PyTorch GPU Nodes From Azure To European Clouds - Mike Krom, Acf Cyber Solutions
10:50 • Lightning Talk: Step-Aligned Telemetry for Distributed PyTorch Training (Time & Memory Attribution Across Ranks) - Abhinav Srivastav, TraceOpt
11:05 • Lightning Talk: KV-Cache Centric Inference: Building a State-Aware Serving Platform With Llm-d and VLLM - Maroon Ayoub & Martin Hickey, IBM Research
11:20 • Lightning Talk: Not All Tokens Are Equal: Semantic KV-Cache for Agentic LLM Serving - Maroon Ayoub, IBM Research & Hyunkyun Moon, moreh
11:35 • Optimizing Large MoE Inference on NVIDIA Blackwell: NVFP4, ADP, and DualPipe Strategies - Julien Demouth, NVIDIA
13:30 • Lightning Talk: From Hugging Face To Handheld: Scaling LLM Deployment With LiteRT Generative API - Cormac Brick & Weiyi Wang, Google
13:45 • Lightning Talk: Slash LLM Cold-Start Times by Pre-distributing GPU Caches - Billy McFall & Maryam Tahhan, Red Hat
14:00 • Lightning Talk: Backpropagation-Free Optimization in PyTorch - Andrii Krutsylo, Polish Academy of Sciences
14:15 • Lightning Talk: Inside VLLM's KV Offloading Connector: Async Memory Transfers for Higher Inference Throughput - Nicolò Lucchesi, Red Hat
14:30 • Lightning Talk: Every Millisecond Counts: The Fine-tuning Journey of an Ultra-Efficient PyTorch Model for the Edge - Pavel Macenauer, NXP Semiconductors
14:45 • Lightning Talk: Full-Stack PyTorch Robotics VLA: From Data To Edge Via ExecuTorch/OpenVINO - Samet Akcay & Dmitriy Pastushenkov, Intel
15:25 • Beyond the Theory: What Actually Breaks When You Scale Your Disaggregated Pytorch Models - Ekin Karabulut & Ron Kahn, NVIDIA
15:55 • Lightning Talk: Why Logging Isn’t Enough: Making PyTorch Training Regressions Visible in Practice - Sahana Venkatesh, Wayve
16:10 • Lightning Talk: Ball Tracking and Detection in Soccer Videos - Comparison of VLMs and Traditional Pipelines - Maciej Szymkowski, Future Processing
16:25 • De-mystifying PyTorch for ASICs: When (and Why) To Move Your Development To AI Accelerators - Alpha Romer Coma, Kollab Philippines
10:35 • How To Write C++ Extensions in 2026 - Jane Xu, Meta & Mikayla Gawarecki, Meta
11:05 • Bringing PyTorch Monarch to AMD GPUs: Single-Controller Distributed Training on ROCm - Liz Li & Zachary Streeter, AMD
11:35 • Lightning Talk: Enabling the Audio Modality for Language Models - Eustache Le Bihan, Hugging Face
13:30 • Optimizing CPU LLM Inference in PyTorch: Lessons From VLLM - Crefeda Rodrigues, Arm Limited & Fadi Arafeh, Arm
14:00 • Lightning Talk: Debugging the Undebuggable: Introducing Torch.distributed.debug - Tristan Rice, Meta, PyTorch
14:15 • Lightning Talk: Scaling Recommendation Systems To 2K GPUs and Beyond - Zain Huda, Meta
14:30 • From Responses To Trajectories: Multi-Turn and Multi-Environment Reinforcement Learning - Kashif Rasul & Sergio Paniego Blanco, Hugging Face
15:25 • Lightning Talk: Trinity Large - Torchtitan on 2000+ B300s - Matej Sirovatka, Prime Intellect
15:40 • Lightning Talk: Faster Than SOTA Kernels in Torch.compile With Subgraph Fusions and Custom Op Autotuning - Elias Ellison & Paul Zhang, Meta
15:55 • DualPipe from Scratch: Implementing DeepSeek's 5D Parallelism in PyTorch - Dev Jadhav, ING Bank
16:25 • Lightning Talk: Bridging the Gap: Engineering Compliant "Glass Box" Medical AI With PyTorch - Muhammad Saqib Hussain, Neurosonic & Mohaddisa Maryam, Neurosonic Academy
10:35 • Beyond JSON-RPC: Scaling Model Context Protocols With gRPC in the PyTorch Ecosystem - Ashesh Vidyut & Madhav Bissa, Google
11:05 • Lightning Talk: Accelerating PyTorch Models With Torch.compile's C++ Wrapper Mode - Bin Bao, Meta
11:20 • Lightning Talk: Building AI That Ops Teams Actually Trust - Robert King, Chronosphere / Palo Alto Networks
11:35 • Accelerating Complex-Valued Tensors With Torch.compile - Hameer Abbasi, OpenTeams Inc.
13:30 • PyTorch on RISC-V: From Cross-Compilation To Native CI - Ludovic Henry, Meta
14:00 • Lightning Talk: Pluggable PyTorch LLM Inference Architecture With VLLM and AWS Neuron Backends - Yahav Biran, Annapurna Labs & Maen Suleiman, Amazon
14:15 • Lightning Talk: Distributed AI Without the Infrastructure Tax - Yahav Biran, Annapurna Labs & Maen Suleiman, Amazon
14:30 • Lightning Talk: Torch-Spyre: Compiling To a Multi-core Dataflow Accelerator With Inductor - David Grove & Olivier Tardieu, IBM
14:45 • Lightning Talk: Building a PyTorch‑native VLLM Plugin for IBM Spyre - Thomas Parnell, IBM Research & Thomas Ortner, IBM Research Europe - Zurich
15:25 • Building Trust for Users and Regulators Alike: A Cost-Efficient PyTorch Path To Compliance-as-Code - Raja Gopal Hari Vijay, Zoho Corporation
15:55 • Sponsored Session: Fault-Tolerant Training: How We Build Reliable Clusters for Distributed AI Workloads - Cyril Konkratenko & Maurits de Groot, Nebius
09:00 • Keynote: PyTorch CTO - Matt White, Global CTO of AI, Linux Foundation
09:10 • Keynote: vLLM & Ray Updates - Tyler Michael Smith, Chief Architect - Inference Engineering, Red Hat & Artur Niederfahrenhorst, Member of Technical Staff,Anyscale
09:25 • Keynote: The Hub as Infrastructure. From Open PyTorch Models, to a Safe and Performant Distribution Hub - Lysandre Debut, Chief Open-Source Officer, Hugging Face
09:45 • Sponsored Keynote: Open Source Infrastructure for the AI Native Era - Jonathan Bryce, Executive Director, Cloud Native Computing Foundation
09:50 • Keynote: Gemma 4: Compacting Intelligence for the Edge - Léonard Hussenot, Research Scientist, Google Deepmind
10:35 • Lightning Talk: Monarch: An API To Your Supercomputer - Marius Eriksen, Meta
10:50 • Lightning Talk: Achieving SOTA GEMM Performance: A CuTeDSL Backend for PyTorch Inductor - Nikhil Patel, Meta
11:05 • Fp8 Training From Hopper To Blackwell - Luca Wehrstedt, Meta
11:35 • Portable High‑Performance LLM Serving: A Triton Backend for VLLM - Burkhard Ringlein, IBM Research & Jan van Lunteren, IBM
13:30 • PyTorch Symmetric Memory + NCCL Device APIs: A New Path Towards Multi-GPU Kernels - Ke Wen & Sylvain Jeaugey, NVIDIA
14:00 • Deploying PyTorch Models To the Browser and Beyond With Transformers.js - Joshua Lochner, Hugging Face
14:30 • Seamless Integration: Custom Kernels in the Torch.compile Stack Without Graphbreaks - Kshiteej Kalambarkar, Masaki Kozuki & Pawel Gadzinski, NVIDIA
15:25 • Bridging the Hardware Gap With Code Harnesses on the Hugging Face Kernels Hub - Ben Burtenshaw, Hugging Face
15:55 • From Gradients To Governance: Making PyTorch Lineage-Aware - Kateryna Romashko & Clodagh Walsh, Red Hat