7-8 April, 2026
Paris, France
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference Europe 2026 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in CEST (UTC/GMT +2). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date."
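For anyone scripting against an exported copy of this schedule, the same CEST conversion can be done with Python's standard-library zoneinfo. This is an illustrative sketch: the 09:00 start time comes from this schedule, and the target timezone ("America/New_York") is an arbitrary example.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Convert a CEST session start time to a viewer's local timezone.
# 09:00 on 7 April 2026 is a session start from this schedule; the
# target timezone is an arbitrary example.
start = datetime(2026, 4, 7, 9, 0, tzinfo=ZoneInfo("Europe/Paris"))
local = start.astimezone(ZoneInfo("America/New_York"))
print(local.strftime("%H:%M %Z"))  # 03:00 EDT
```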
Tuesday, April 7
 

07:30 CEST

Registration & Badge Pick-Up
Tuesday April 7, 2026 07:30 - 18:00 CEST
Lobby

07:30 CEST

Community Expo
Tuesday April 7, 2026 07:30 - 18:35 CEST
Open Platform

09:00 CEST

Keynote: Co-Evolution: How the Open Source Intelligence Stack Compounds - Mark Collier, Executive Director, PyTorch Foundation, General Manager, AI & Infrastructure, Linux Foundation
Tuesday April 7, 2026 09:00 - 09:10 CEST
Agentic coding systems have crossed a threshold from experimentation to measurable economic impact. Their rapid adoption reveals a deeper shift: modern AI capability emerges from the co-evolution of models, training frameworks, inference engines, reinforcement systems, hardware, and cloud infrastructure, with open source enabling the flow of code, research, and operational knowledge across the...
Speakers
Mark Collier
Executive Director, PyTorch Foundation, General Manager, AI & Infrastructure, The Linux Foundation
Master Stage
  Keynote Sessions
  • Audience Level Any
  • Slides Attached Yes

09:10 CEST

Keynote: PyTorch Updates - Edward Yang, Research Engineer, Meta
Tuesday April 7, 2026 09:10 - 09:30 CEST

Speakers
Edward Yang
Research Engineer, Meta
Edward Yang has worked on PyTorch at Meta since nearly the very beginning. Currently, he works on all aspects of PT2, but with a particular focus on dynamic shapes support across the stack.
Master Stage
  Keynote Sessions
  • Audience Level Any
  • Slides Attached Yes

09:35 CEST

Keynote: Community Led Open Source RL - Joe Spisak, VP of Product & Head of Open Source, Reflection AI
Tuesday April 7, 2026 09:35 - 09:45 CEST

Speakers
Joe Spisak
VP of Product & Head of Open Source, Reflection AI
Joe Spisak is Product Director for AI at Meta with leadership roles in PyTorch, Llama and FAIR research. A veteran of the AI space with over 10 years of experience, Joe led product teams at Meta/Facebook, Google and Amazon where he focused on open source AI, building developer tools...
Master Stage
  Keynote Sessions
  • Audience Level Any

09:45 CEST

Sponsored Keynote: From One Node to Distributed Training and Inference. How the PyTorch Ecosystem Changed AI - Ramine Roane, Corporate Vice President of AI Product Management and Ecosystem Development, AMD
Tuesday April 7, 2026 09:45 - 09:50 CEST
PyTorch has evolved from a research framework into a distributed-first platform powering production AI at massive scale. As models grow to hundreds of billions of parameters, this talk explores the challenges of scaling inference across nodes and the emerging ecosystem, from Monarch and TorchTitan to open, hardware-agnostic systems, that makes it possible.
Speakers
Ramine Roane
Corporate Vice President of AI Product Management and Ecosystem Development, AMD
Ramine Roane is the Corporate Vice President of AI Product Management and ecosystem development at AMD, based in San Jose, California. Prior to this role, he served as Vice President of Data Center Acceleration within AMD’s Adaptive and Embedded Computing Group in 2022. Before the...
Master Stage
  Keynote Sessions
  • Audience Level Any

09:55 CEST

Keynote: Stream Everything - Moving From Request Input To Streaming Input - Patrick von Platen, Research Engineer, Mistral AI
Tuesday April 7, 2026 09:55 - 10:10 CEST

Speakers
Patrick von Platen
Research Engineer, Mistral AI
Patrick von Platen is a Research Engineer at Mistral AI, focused on natural language processing and scalable AI systems. Currently, he contributes to vLLM, is a former core maintainer of Transformers, and created Diffusers.
Master Stage
  Keynote Sessions
  • Audience Level Any
  • Slides Attached Yes

10:10 CEST

Sponsored Keynote: Any [ Agent | Model | Accelerator | Cloud ]. Open Source AI Unlocks the World's Potential - Maryam Tahhan, Principal Engineer & Nicolò Lucchesi, Senior Machine Learning Engineer, Red Hat
Tuesday April 7, 2026 10:10 - 10:15 CEST
Red Hat is shaping an open future for AI, delivering on the promise of 'Any Agent, Any Model, Any Accelerator, Any Cloud.' Discover the community advancements contributed within the PyTorch Foundation that empower enterprises to rapidly enable, test, and seamlessly scale AI workloads across their choice of infrastructure.
Speakers
Maryam Tahhan
Principal Engineer, Red Hat
Maryam is a Principal Engineer in Red Hat's Office of the CTO, where she focuses on standardising CPU inferencing performance evaluation to help effectively validate and scale ML workloads.
Nicolò Lucchesi
Senior Machine Learning Engineer, Red Hat
Nicolò is a Senior Machine Learning Engineer at Red Hat with a background in Deep Learning and Computer Vision. He works on Inference Optimization for vLLM, where he is a maintainer.
Master Stage
  Keynote Sessions
  • Audience Level Any

10:15 CEST

Keynote: The Unbearable Lightness of (Agentic) Evaluations - Besmira Nushi, Senior Manager, AI Research, NVIDIA
Tuesday April 7, 2026 10:15 - 10:25 CEST
The discipline of evaluating large language models underwent a major transformation with the rise of general AI capabilities. Today, the field is undergoing yet another challenging transformation following the groundbreaking improvements in agentic tasks, which expect models and systems to plan and take autonomous actions in the real world. Measuring how well models and systems perform in such...
Speakers
Besmira Nushi
Senior Manager - AI Research, NVIDIA
Besmira Nushi is a Senior AI Research Manager at NVIDIA in Zurich, where she leads research on LLM evaluation, model analysis and generalization, and real-world and agentic AI system measurements. Previously, she spent 7+ years at Microsoft Research advancing responsible AI, model...
Master Stage
  Keynote Sessions
  • Audience Level Any

10:30 CEST

Birds of A Feather: Engineering for the EU AI Act: What Should PyTorch Expose Natively? - Roy Saurabh, AffectLog
Tuesday April 7, 2026 10:30 - 11:00 CEST
The EU AI Act introduces concrete technical obligations for ML systems: traceability, risk management, monitoring, and auditability. Today, most of this burden is handled outside the ML framework—through ad-hoc tooling, documentation, or bespoke infrastructure. This Birds of a Feather session is an open, practitioner-driven discussion on a forward-looking question: What primitives, hooks, or...
Speakers
Roy Saurabh
President, AffectLog
Roy Saurabh is Founder & CEO of AffectLog and an applied researcher in AI governance, privacy engineering, and accountable ML systems. He has worked with UNESCO, the European Commission, and national governments on operationalising trustworthy AI, and leads EU-funded projects focused...
Open Platform
  Birds of A Feather
  • Audience Level Any

10:30 CEST

Coffee Break
Tuesday April 7, 2026 10:30 - 11:00 CEST
Menu: 
-Apple and pecan nut cake (Vegan, Vegetarian)
-Granola bar (Gluten Free, Vegetarian)
-Seasonal fruits (Vegan, GF, Vegetarian)
-Egg sandwich (Vegetarian)
-Dry fruits and dry grapes mix (Vegan, GF, Vegetarian)
Open Platform

10:30 CEST

Meet the Developers: PyTorch Module Maintainers
Tuesday April 7, 2026 10:30 - 11:00 CEST
These sessions give participants an opportunity to meet some of the developers leading PyTorch to foster collaboration, gather feedback, and inspire contributions. PyTorch core modules (e.g. torch.autograd, torch.optim, torch.nn) form the foundation for most AI research and development, either directly through PyTorch or indirectly via higher-level frameworks. The core libraries...
Speakers
Edward Yang
Research Engineer, Meta
Edward Yang has worked on PyTorch at Meta since nearly the very beginning. Currently, he works on all aspects of PT2, but with a particular focus on dynamic shapes support across the stack.
Alban Desmaison
Research Engineer, Meta
Driss Guessous
Machine Learning Engineer, Meta
I am currently a machine learning engineer working on core development of PyTorch. I received my Masters in Computer Science from the University of Illinois at Urbana-Champaign. I received a dual degree in Physics and Applied Mathematics from The Ohio State University. I also won...
Mergen Nachin
Software Engineer, Meta
Mergen Nachin is a Software Engineer specializing in creating rich AI experiences on low latency, high performance, and privacy-aware embedded systems. With a background in distributed systems, developer infrastructure, remote sensing, and localization, he brings a versatile skill...
Natalia Gimelshein
Software Engineer, Meta
Natalia Gimelshein is a software engineer at Meta. She is one of the PyTorch leads, and works on GPU performance and support, including low precision, distributed and symmetric memory.
Jason Ansel
Research Scientist, Meta
Jason Ansel is a Research Scientist at Meta AI and a technical lead for PyTorch compilers. He started the TorchDynamo and TorchInductor projects, which bring flexible graph capture and a high performance compiler to PyTorch 2. He received a Ph.D. from MIT and has over 15 years of...
Open Platform
  Meet the Developers
  • Audience Level Any

11:00 CEST

Lightning Talk: Why Your Forecasting Transformer Isn’t Working (And How To Fix It in Python) - Rosheen Naeem, Open Climate Fix
Tuesday April 7, 2026 11:00 - 11:10 CEST
Renewable energy is clean — but it’s also inherently variable. Solar PV generation can change dramatically within minutes due to cloud cover and weather conditions, making accurate short-term forecasts essential for grid stability, energy trading, and smart-home optimisation. Open Climate Fix builds open and high-impact forecasting tools to accelerate the transition to a low-carbon energy...
Speakers
Rosheen Naeem
Software Engineer, Miro
I am a Software Engineer at Miro and a community member at Open Climate Fix. I completed the Erasmus Mundus Master’s in Software Engineering for the Green Deal (SE4GD), a joint degree program across Vrije Universiteit Amsterdam (Netherlands), LUT University (Finland), and Universit...
Central Room
  Applications & Case Studies

11:00 CEST

Lightning Talk: Training Embedding Model Resiliently for Multimodal Model Inference Routing - Huamin Chen, Red Hat & Haichen Zhang, AMD
Tuesday April 7, 2026 11:00 - 11:10 CEST
LLM systems increasingly rely on intelligent routing to balance cost, latency, and quality tradeoffs. The vLLM Semantic Router, a vLLM Ecosystem project, provides both semantic and performance-level routing intelligence for Mixture-of-Multimodal Models (MoM) architectures, but its effectiveness depends on fast and accurate classifiers. This talk presents our end-to-end journey training...
Speakers
Huamin Chen
Technical Advisor, Microsoft
Dr. Huamin Chen is a passionate developer. He co-founded the Semantic Router project under vLLM community. His recent contributions to the CNCF ecosystem include Project Kepler, TAG Environmental Sustainability, and Cloud Native AI WG. He is also one of the founding members...
Haichen Zhang
Senior AI Software Engineer, AMD
Haichen is the Senior AI Engineer for AMD AI Group, specializing in accelerating training and inference for large language models, recommender systems, computer vision (CV), and natural language processing (NLP) tailored to internet customers. Before joining AMD, Haichen worked at...
Junior Stage

11:00 CEST

Helion 1.0: A High-Level DSL for Performance Portable Kernels - Oguz Ulgen, Meta
Tuesday April 7, 2026 11:00 - 11:25 CEST
ML practitioners increasingly author bespoke kernels, but achieving portable performance demands low-level expertise and repeated manual tuning for each accelerator generation and type. We introduce Helion, a Python-embedded DSL with a “PyTorch with tiles” programming model that preserves familiar PyTorch APIs while giving developers lower-level control over the generated kernels. Helion...
Speakers
Oguz Ulgen
Software Engineer, Meta
I'm a software engineer at Meta where I used to work on the Hack programming language and now work on PyTorch.
Master Stage

11:00 CEST

Lights, Camera, Inference! Video Generation as a Service With vLLM-Omni - Ricardo Noriega, Red Hat & Doug Smith, Red Hat, Inc
Tuesday April 7, 2026 11:00 - 11:25 CEST
LLMs made text generation as a service possible. What does it take to do the same for video? We built an experimental Video Generation as a Service stack using vLLM-Omni and the LTX-2 open-weights video model to explore how far an open, multimodal stack can go toward production use. We’ll share what worked, what broke, and what it takes to treat generative video as a first-class workload. vLLM is...
Speakers
Doug Smith
Principal Software Engineer, Red Hat
Doug Smith is a Principal MLOps Engineer at Red Hat, where he works on the AI Inference Server team and contributes upstream to the vLLM project through its CI Special Interest Group. Recently, he's also been looking into contributions to vLLM-Omni. He’s spent years bridging telecom...
Ricardo Noriega
Principal SW Engineer, Red Hat
Ricardo is a Principal Software Engineer in Red Hat's Office of the CTO in the Emerging Technologies organization. Ricardo is currently focused on AI multimodality and researching the benefits of Small Language Models.
He is a former member of the Akraino TSC and PTL of the Kubernetes-Native-Infrastructure blueprint family, and contributor to Kubernetes, OpenStack, OpenDaylight and OPNFV...
Founders Cafe
  GenAI & Multimodal
  • Audience Level Any
  • Slides Attached Yes

11:15 CEST

Lightning Talk: Deep Learning in the Wild: Embedded PyTorch for Real-World Conservation Bioacoustics - Taraqur Rahman & Owen O'Donnell, OWL Integrations
Tuesday April 7, 2026 11:15 - 11:25 CEST
Passive acoustic monitoring is a powerful tool for wildlife conservation, but deploying deep learning models in remote rainforest environments introduces strict constraints on power, memory, and compute. In this talk, we present an end-to-end PyTorch-based pipeline for detecting and analyzing the endangered three-wattled bellbird using embedded deep learning systems. We cover the full lifecycle...
Speakers
Owen O'Donnell
Embedded Systems and Machine Learning Engineer, OWL Integrations
Owen O'Donnell is a Machine Learning and Embedded Systems Engineer at OWL Integrations. He works on training ML models to deploy in remote locations that will be running on resource-constrained electronics. This introduces challenges such as needing smaller sized models and having...
Taraqur Rahman
Chief Data Scientist, OWL Integrations
Taraqur Rahman is Chief Data Scientist and Co-Founder at OWL Integrations and Organizer/Co-Founder of Biased Outliers, where he leads applied machine learning and data science initiatives with real-world impact. He combines deep technical expertise in Python with practical deployment...
Central Room
  Applications & Case Studies
  • Audience Level Any
  • Slides Attached Yes

11:15 CEST

Lightning Talk: Flexible Deployment of PyTorch Models on MCU-Class Devices Using ExecuTorch - Robert Kalmar & Martin Pavella, NXP
Tuesday April 7, 2026 11:15 - 11:25 CEST
ExecuTorch has recently matured into a production ready framework designed specifically for efficient edge deployment of PyTorch models. Its architecture supports a broad spectrum of hardware targets—from low power, bare metal or RTOS based microcontrollers (MCU) to higher performance Linux or Android based microprocessor platforms—while meeting the demanding constraints of memory, compute,...
Speakers
Robert Kalmar
Principal AI/ML Engineer, NXP Semiconductors
Robert Kalmar is a Principal Machine Learning Engineer at NXP Semiconductors. He received his master’s degree in machine learning and intelligent systems from Brno University of Technology. At NXP he focuses on machine learning solution enablement for embedded and mobile devices...
Martin Pavella
ML SW Engineer, NXP Semiconductors
I hold a Master’s degree in Machine Learning from the Brno University of Technology, graduating with distinction at both bachelor’s and master’s levels. I am a mid-level AI/ML Software Engineer at NXP Semiconductors with 2.5+ years of experience. I won the 2025 iGEM overgraduate...
Junior Stage
  Inference & Production

11:30 CEST

Lightning Talk: Coding Agents for Compiler Construction: Beyond the AI Assistant Paradigm - Reza Rahimi, yasp.ai & Stefan Krassin, yasp
Tuesday April 7, 2026 11:30 - 11:40 CEST
Modern ML compilers follow a familiar pattern: a frontend lowers models into an intermediate representation, while a backend applies graph and kernel optimizations before generating code for target accelerators. PyTorch provides strong foundations through nn.Module, FX, and graph capture, but implementing optimized backends remains challenging due to hardware diversity and kernel-level complexity....
Speakers
Reza Rahimi
CTO, yasp
Reza Rahimi is a seasoned technologist with a strong background in accelerating engineering software and scaling machine learning systems. With experience leading teams across embedded AI, compiler design, and model optimization, he now serves as CTO of yasp, where he is pioneering...
Stefan Krassin
CEO, yasp.ai
With a background in electrical engineering and a career spanning embedded systems to executive leadership, he combines technical expertise with a vision for scale. After 10+ years of leading companies to outstanding growth, he co-founded yasp in 2023. His mission is to eliminate...
Founders Cafe
  Agents & Interop

11:30 CEST

Lightning Talk: How DeepInverse Is Solving Imaging in Science and Healthcare With PyTorch - Andrew Wang, DeepInverse; Minh Hai Nguyen, Université de Toulouse
Tuesday April 7, 2026 11:30 - 11:40 CEST
Deep learning has revolutionised imaging, a foundation of science and healthcare. DeepInverse is the PyTorch library for solving imaging problems, unifying deep learning methods (e.g. diffusion models), physics (medical, optics) and modern tooling. In this talk, we’ll show how the PyTorch community can get involved in this exciting yet accessible application of open-source AI. AI methods in...
Speakers
Andrew Wang
CTO & Co-founder, Blur Labs
Andrew is a lead developer of DeepInverse as well as the CTO & co-founder of Blur Labs, a startup based in Paris building AI models for imaging. Andrew did his PhD at the University of Edinburgh in magnetic resonance image reconstruction.
Minh Hai Nguyen
PhD candidate, Toulouse University
Central Room
  Applications & Case Studies
  • Audience Level Any
  • Slides Attached Yes

11:30 CEST

Tour De Force: LLM Inference Optimization From Simple To Sophisticated - Christin Pohl, Microsoft
Tuesday April 7, 2026 11:30 - 11:55 CEST
Making your GPUs go brrr is complex. Efficient LLM inference requires navigating a maze of optimization techniques, each with different trade-offs. This session provides a practical journey through inference optimizations, clearly categorized by implementation effort. We'll explore techniques across three levels: - Model choices (start here): Model selection, quantization, smart routing -...
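Of the "model choices" named in the abstract, quantization is easy to sketch. The following is a hedged, stdlib-only illustration of symmetric int8 post-training quantization, not material from the talk; real inference stacks use library-provided quantizers.

```python
# Illustrative sketch of symmetric int8 quantization: one shared scale
# maps the largest-magnitude weight to +/-127, shrinking storage 4x
# versus float32 at the cost of a small rounding error per weight.

def quantize_int8(weights):
    scale = max(abs(w) for w in weights) / 127.0
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    return [v * scale for v in q]

weights = [0.5, -1.27, 0.003, 1.0]
q, scale = quantize_int8(weights)
restored = dequantize_int8(q, scale)
# Each restored weight lies within one quantization step of the original.
```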
Speakers
Christin Pohl
Global Black Belt Solution Engineer AI Infrastructure, Microsoft
Christin Pohl is a Global Black Belt Solution Engineer for AI Infrastructure at Microsoft (Switzerland), now in her third year. After building her first chatbot in 2018 and 5+ years at SAP, she helps enterprises worldwide choose the right GPU, run LLM training and inference end-to-end...
Master Stage

11:30 CEST

Why Classic IAM Collapses for Agents: Rethinking IAM for Agentic Systems - Parul Singh, Red Hat
Tuesday April 7, 2026 11:30 - 11:55 CEST
Autonomous AI agents increasingly reason, plan and act across tools, services and organizational boundaries. In these environments, traditional Identity and Access Management models begin to fail. Agents are not users and they are not static services. They act on behalf of others, change context during execution and operate with different levels of autonomy and risk. This talk examines why...
Speakers
Parul Singh
Principal Software Engineer, Red Hat
Parul is a Principal Software Engineer in Red Hat's Office of the CTO, working on agentic systems and security. Her work focuses on trust, identity, and observability for autonomous AI agents, including delegation, provenance, and zero trust architectures for agentic workflows. She...
Junior Stage
  Security & Privacy

11:45 CEST

Lightning Talk: ExecuTorch on Microcontrollers: Deploying PyTorch To the Smallest Edge - RJ Ascani & Matthias Cremon, Meta
Tuesday April 7, 2026 11:45 - 11:55 CEST
ExecuTorch extends PyTorch's reach to the most resource-constrained devices: microcontrollers, DSPs, and specialized neural processing units powering always-on sensors, wearables, and embedded systems. In this talk, we'll share the current state and roadmap for running ExecuTorch on platforms where every kilobyte of memory and milliwatt of power matters. What you'll learn: - How ExecuTorch's...
Speakers
Matthias Cremon
Software Engineering Manager, Meta
Matthias Cremon is a Software Engineering Manager at Meta in the Silicon AI Software Team, working on AI compilers for various edge devices. He focuses on the frontend, graph level optimization side, as well as the integration of low-level, vendor specific implementations to run on...
RJ Ascani
Software Engineer, Meta
RJ Ascani is an embedded software engineer on Meta’s PyTorch Edge team, focusing on advancing ExecuTorch for microcontroller platforms.
Central Room
  Inference & Production
  • Audience Level Any
  • Slides Attached Yes

11:45 CEST

Lightning Talk: TorchJD: Jacobian Descent in PyTorch - Pierre Quinton, EPFL & Valérian Rey, Simplex Lab
Tuesday April 7, 2026 11:45 - 11:55 CEST
Jacobian descent (JD) is an extension of gradient descent supporting the optimization of vector-valued functions. This algorithm can be used to train neural networks with multiple loss functions (e.g. multi-task learning). JD iteratively updates the parameters of the model using the Jacobian matrix of the vector of losses (the matrix stacking each individual loss' gradient). To support and extend...
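The iterative update described above can be sketched in a few lines. This is a hedged, stdlib-only toy with a hypothetical two-loss problem, hand-derived gradients, and a plain mean aggregator; it is not the TorchJD API, which also provides conflict-aware aggregators.

```python
# Toy Jacobian descent: stack one gradient row per loss, aggregate the
# rows into a single update direction, and step the parameters.

def jacobian(p):
    x, y = p
    # One row per loss: gradients of (x - 1)^2 and of (y + 2)^2 + x^2.
    return [
        [2.0 * (x - 1.0), 0.0],
        [2.0 * x, 2.0 * (y + 2.0)],
    ]

def aggregate_mean(jac):
    # Simplest aggregator: average the per-loss gradients column-wise.
    n = len(jac)
    return [sum(row[i] for row in jac) / n for i in range(len(jac[0]))]

def jd_step(p, lr=0.1):
    update = aggregate_mean(jacobian(p))
    return [pi - lr * ui for pi, ui in zip(p, update)]

p = [0.0, 0.0]
for _ in range(200):
    p = jd_step(p)
# p approaches (0.5, -2.0), the minimizer of the averaged losses
```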
Speakers
Pierre Quinton
Teacher, EPFL
PhD in Information Theory and Master in Data Science, specializing in fundamental math and multi-objective optimization (MOO). I am the co-author of TorchJD, a PyTorch library for Jacobian Descent developed with Valerian, currently at ~300 GitHub stars. My work aims to translate complex...
Valérian Rey
Research Engineer, Simplex Lab
I graduated from EPFL with a MSc in Data Science in 2021. Since then, I worked as a Data Scientist at Withings, and I worked on Jacobian descent, initially as a side project, but now as a full-time occupation. I now spend most of my time developing and maintaining TorchJD, and I love...
Founders Cafe
  Training Systems

12:00 CEST

Lightning Talk: Ethical, Privacy and Sustainability Considerations in PyTorch Systems - Paula Mesa Macias, Pau&Company
Tuesday April 7, 2026 12:00 - 12:10 CEST
PyTorch models are part of larger systems that handle data, logs, APIs and other services. Ethical, privacy, security and environmental considerations appear not only around the AI itself, but across the whole system. Using the Ethical Software Framework and the Ethical IT Audit, this session explores practical ways to think about these issues in real workflows. It highlights situations where...
Speakers
Paula Mesa Macias
Founder and Ethical Technology Consultant, Pau&Company
Founder of Pau&Company (https://pau.company/), which offers Ethical IT Audits (https://pau.company/ethical-it-audit/) based on the Ethical Software Framework (https://pau.company/ethical-software-framework/), Paula focuses on ethical considerations in technology. Through Pau&Company...
Founders Cafe

12:00 CEST

Lightning Talk: Bringing Google’s Colossus to PyTorch: Rapid Storage via fsspec to Keep GPUs Busy - Ankita Luthra & Trinadh Kotturu, Google
Tuesday April 7, 2026 12:00 - 12:10 CEST
As PyTorch models scale to billions of parameters, the bottleneck has quietly shifted from compute to storage. Modern GPU clusters often sit idle, "starving" for data while waiting on legacy REST-based protocols. This talk introduces Rapid Storage: a fundamental architectural shift bringing Google’s Colossus stateful protocol (which powers many of Google’s products) to PyTorch via fsspec, a...
Speakers
Ankita Luthra
Senior Software Engineer, Google
Ankita Luthra is a Software Developer at Google, focused on AI/ML infrastructure and scalable data pipelines. Her work with open-source tools like fsspec (gcsfs) and gcsfuse improves how frameworks such as PyTorch/JAX efficiently access data from Google Cloud Storage.
Trinadh Kotturu
Senior Product Manager, Google
Trinadh Kotturu is a Senior Product Manager specializing in AI/ML and analytics client strategy at Google. An alumnus of IIM Bangalore with 12 years of experience, he has a proven track record of shipping v1 products and scaling them into robust platform services. His expertise spans large-scale distributed storage systems, autonomous driving, and system resiliency...
Master Stage
  Training Systems
  • Audience Level Any
  • Slides Attached Yes

12:00 CEST

Parameterized CUDA Graph Launch in PyTorch: CUDA Graphs Without the Pain - Daniel Galvez, NVIDIA
Tuesday April 7, 2026 12:00 - 12:25 CEST
Modern GPUs are fast enough that CPU kernel launch overhead has become a real bottleneck. CUDA Graphs can eliminate this overhead, but in practice they are hard to use and easy to get wrong. When CUDA Graph capture fails, PyTorch users typically face two choices: fix the code that breaks capture—often with limited guidance—or capture only parts of the workload. Partial capture comes with...
Speakers
Daniel Galvez
Manager, NVIDIA
Daniel Galvez is an AI developer technology engineer working on speech recognition and natural language processing inference and training. He has contributed to software like PyTorch, NeMo, Megatron, ESPNet, vLLM, and TRT-LLM. He is currently working on reducing CPU overheads in CUDA...
Junior Stage

12:00 CEST

Write Once, Run Everywhere with PyTorch Transformers - Pedro Cuenca, Hugging Face
Tuesday April 7, 2026 12:00 - 12:25 CEST
The Hugging Face transformers library is built on pure PyTorch and can be succinctly described as a model-definition framework. It provides a unified, familiar, clear and concise interface to multiple machine learning architectures across modalities. Serving and inference optimizations are not its focus. However, transformers model definitions become the de-facto reference implementations multiple...
Speakers
Pedro Cuenca
ML Engineer, Hugging Face
Pedro Cuenca is a machine learning engineer at Hugging Face, working in developer advocacy and on-device ML. He has 20+ years of software development experience across internet applications and iOS. He worked on the technology behind Camera+, an iPhone app using custom ML for photography...
Central Room

12:15 CEST

Lightning Talk: FlexAttention + FlashAttention-4: Fast and Flexible - Driss Guessous, Meta
Tuesday April 7, 2026 12:15 - 12:25 CEST
FlexAttention democratized attention research by letting researchers prototype custom attention variants in PyTorch without hand-written CUDA. Over 1,000 repos have adopted it, and dozens of papers cite it. But flexibility came at a cost: FlexAttention achieved only ~60% of FlashAttention-3's throughput on Hopper, and the gap widened dramatically on Blackwell GPUs. We bridged this gap by...
Speakers
Driss Guessous
Machine Learning Engineer, Meta
I am currently a machine learning engineer working on core development of PyTorch. I received my Masters in Computer Science from the University of Illinois at Urbana-Champaign. I received a dual degree in Physics and Applied Mathematics from The Ohio State University. I also won...
Master Stage

12:25 CEST

Attendee Lunch
Tuesday April 7, 2026 12:25 - 13:55 CEST
Menu | Boxed Lunches:
Vegan (Vegetarian):
-Moroccan taboulé
-Indian vegetable wrap with sesame oil and tandoori spices
-Chocolate chip cookie
Gluten-Free:
-Bowl Niçoise salad (350 g): potatoes, green beans, cherry tomatoes, tuna, black olives, iceberg lettuce, eggs, chopped red onions
-Chocolate cookie
Classic:
-Bird's tongue pasta salad with baby vegetables (Vegetarian)
-Round baguette sandwich with sliced...
Open Platform

12:25 CEST

Women & Non-Binary in PyTorch Lunch
Tuesday April 7, 2026 12:25 - 13:55 CEST
We’d like to invite all attendees who identify as women or non-binary to join each other for a networking lunch at the event. We will begin with a brief introduction and then attendees will be free to enjoy lunch and mingle with one another. All attendees must identify as a woman or non-binary and must be registered for the conference to attend.
Menu:
-Burrata with basil pesto (Vegetarian, Gluten...
Biblioteca Room at La Felicità 5 Parv. Alan Turing, 75013 Paris, France

13:45 CEST

Lightning Talk: From Pretrained To Personal: Privacy-First Fine-Tuning on AI PCs - Daniel Holanda Noronha & Iswarya Alex, AMD
Tuesday April 7, 2026 13:45 - 13:55 CEST
PyTorch on AI PCs has crossed a threshold: local hardware can now support meaningful model fine-tuning, not just inference. This unlocks a new class of enterprise workflows in which sensitive data never leaves the device, yet models can still be personalized and adapted using PyTorch. In this session, we’ll show how to design on-device fine-tuning pipelines for AI PCs, focusing on enterprise scenarios...
See More →
Speakers
avatar for Daniel Holanda

Daniel Holanda

Solutions Architect & ML Engineer, AMD
Daniel is a Sr. ML Engineer at AMD, specializing in local AI. He leads the development of local fine-tuning workflows for AI PCs and co-leads several open-source projects where he designs production-grade LLM/VLM tooling to accelerate the AI development lifecycle.

Previously, he was a Machine Learning Engineer at Groq and a contributor to Microsoft’s Project Brainwave. Daniel holds a PhD in AI understanding and hardware architecture from UBC... Read More →
avatar for Iswarya Alex

Iswarya Alex

ML Engineer, AMD
I am an ML Engineer at AMD focused on enabling high-performance on-device AI experiences. I work on efficiently optimizing and deploying models on AMD's Ryzen AI powered devices with GPUs and NPUs.
Tuesday April 7, 2026 13:45 - 13:55 CEST
Founders Cafe
  Security & Privacy

13:45 CEST

Bringing ExecuTorch To the Next Frontiers of Edge AI - Mergen Nachin, Meta
Tuesday April 7, 2026 13:45 - 14:10 CEST
Since the General Availability release of ExecuTorch 1.0 in October 2025, our team has continued to advance the state of the on-device AI software stack. In this talk, we will share our upcoming roadmap and present demos that highlight ExecuTorch’s deployment across the next frontiers, such as AI PCs, robotics, TinyML devices, and the integration of AI agents to improve productivity for...
See More →
Speakers
avatar for Mergen Nachin

Mergen Nachin

Software Engineer, Meta
Mergen Nachin is a Software Engineer specializing in creating rich AI experiences on low latency, high performance, and privacy-aware embedded systems. With a background in distributed systems, developer infrastructure, remote sensing, and localization, he brings a versatile skill... Read More →
Tuesday April 7, 2026 13:45 - 14:10 CEST
Master Stage
  Applications & Case Studies

13:45 CEST

Teaching PyTorch To Read Your Worst PDFs With Docling - Mingxuan Zhao & Peter Staar, IBM & Carol Chen, Red Hat
Tuesday April 7, 2026 13:45 - 14:10 CEST
Building production RAG pipelines starts with a problem most teams underestimate: getting clean, structured data out of real-world documents. PDFs lose table structure, figures get separated from captions, and multi-column layouts become unreadable. Before your PyTorch models even see your data, crucial information is already lost. Docling is an open-source, MIT-licensed document parsing library...
See More →
Speakers
avatar for Carol Chen

Carol Chen

Principal AI Community Architect, Red Hat
Carol Chen is a Community Architect at Red Hat, having led several upstream communities including InstructLab, Ansible and ManageIQ. She has been actively involved in open source communities while working for Jolla and Nokia previously. In addition, she also has experiences in software... Read More →
avatar for Mingxuan Zhao

Mingxuan Zhao

Software Developer/Developer Advocate, IBM
Ming Zhao is an open source developer and Developer Advocate at IBM Research, where he helps IBM leverage open technologies while building impactful tools and growing vibrant open-source communities. He’s passionate about making open tech accessible to all and ensuring developers... Read More →
Tuesday April 7, 2026 13:45 - 14:10 CEST
Junior Stage

13:45 CEST

Why WideEP Inference Needs Data-Parallel-Aware Scheduling - Maroon Ayoub, IBM; Tyler Michael Smith, Red Hat
Tuesday April 7, 2026 13:45 - 14:10 CEST
WideEP (wide expert parallelism) fails not because experts are expensive, but because routing ignores where state already lives. In PyTorch LLM serving with vLLM, WideEP fans tokens out across many experts while KV caches accumulate unevenly across data-parallel replicas. When routing is unaware of KV placement and per-replica load, requests land on replicas that cannot reuse cache or make progress...
See More →
Speakers
avatar for Maroon Ayoub

Maroon Ayoub

Research Scientist & Architect, IBM Research
Maroon Ayoub is a systems engineer at IBM Research focused on distributed AI infrastructure. He co-leads development of llm-d and specializes in scaling LLM inference with Kubernetes-native architectures, performance efficiency, and open source integrations.
avatar for Tyler Michael Smith

Tyler Michael Smith

Chief Architect - Inference Engineering, Red Hat
Tyler received a PhD in Computer Science at The University of Texas at Austin, studying high-performance dense linear algebra: microkernels, parallelism, and theoretical lower bounds on data movement. After a postdoc at ETH Zürich, he joined Neural Magic, first working on a graph... Read More →
Tuesday April 7, 2026 13:45 - 14:10 CEST
Central Room

14:15 CEST

Lightning Talk: Accelerating On-Device ML Inference With ExecuTorch and Arm SME2 - Jason Zhu, Arm
Tuesday April 7, 2026 14:15 - 14:25 CEST
As on-device AI workloads grow in complexity, achieving low-latency inference within mobile power constraints remains a central challenge. We examine how ExecuTorch, combined with Arm’s Scalable Matrix Extension 2 (SME2), enables efficient CPU deployments of production AI workloads. We present a case study of SqueezeSAM, a segmentation model deployed in real-world mobile applications. Using...
See More →
Speakers
avatar for Jason Zhihuai Zhu

Jason Zhihuai Zhu

Senior Principal Engineer, Arm
Jason Zhu is a Senior Principal Engineer at Arm focused on hardware and software co-optimization for AI systems. With a background in quantum physics and experience spanning AI research and product engineering across major technology companies, he works across the full execution stack... Read More →
Tuesday April 7, 2026 14:15 - 14:25 CEST
Master Stage
  Inference & Production
  • Audience Level Any
  • Slides Attached Yes

14:15 CEST

Sponsored Session: TorchTPU: Expanding TPU Programmability To PyTorch - Kat Ko & Claudio Basile, Google; Jana van Greunen, Meta
Tuesday April 7, 2026 14:15 - 14:40 CEST
Google Tensor Processing Units (TPUs) are designed for ML at massive scale, offering significant benefits in performance, energy, and cost. While TPUs have historically been associated with the TensorFlow and JAX ecosystems, we introduce TorchTPU: a new Google effort to expand TPU programmability to PyTorch. This talk charts TorchTPU’s evolution, from the initial RFC to establishing a native,...
See More →
Speakers
avatar for Jana van Greunen

Jana van Greunen

Director of PyTorch Engineering, Meta
Jana van Greunen is the Director of PyTorch Engineering at Meta, where she leads efforts to ensure PyTorch remains the leading AI/ML framework for researchers and developers worldwide. With deep expertise in distributed systems, large-scale infrastructure, and over 15 years of experience... Read More →
avatar for Kat Ko

Kat Ko

Senior Eng Manager, Google
Kat Ko is a Senior Engineering Manager at Google and a lead on TorchTPU, where she drives the integration of PyTorch with TPU technology to enable high-performance computing at scale. An EECS graduate of UC Berkeley, she brings over 15 years of experience building large-scale systems... Read More →
avatar for Claudio Basile

Claudio Basile

Software Engineer, Google
Claudio Basile is a Google Software Engineer and the co-founder and technical lead of TorchTPU. During his tenure at Google, he also authored LiteRT, the company’s new on-device ML framework. With a Ph.D. in ECE from UIUC and over 15 years of experience spanning machine learning... Read More →
Tuesday April 7, 2026 14:15 - 14:40 CEST
Founders Cafe
  Frameworks & Compilers
  • Slides Attached Yes

14:15 CEST

The Token Slice: Implementing Preemptive Scheduling Via Chunked Decoding - Maroon Ayoub, IBM & Kellen Swain, Google
Tuesday April 7, 2026 14:15 - 14:40 CEST
Production LLM serving faces a critical trade-off: while continuous batching maximizes throughput, it often sacrifices SLAs due to Head-of-Line (HoL) blocking. When long-context requests hijack the engine, tail latencies spike. Without fine-grained preemption, guaranteeing priority or fairness remains nearly impossible. We propose a solution: Chunked Decoding. By treating a fixed number of tokens...
See More →
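The core idea can be illustrated with a toy scheduler (a sketch of my own, not the speakers' implementation): each request is granted a fixed token slice per turn, so a long request is preempted rather than monopolizing the engine.

```python
from collections import deque

def token_slice_schedule(requests, slice_size=4):
    """Round-robin over (name, tokens_remaining) pairs: each request
    decodes at most `slice_size` tokens per turn, so long requests
    cannot cause head-of-line blocking for short ones."""
    queue = deque(requests)
    order = []
    while queue:
        name, remaining = queue.popleft()
        step = min(slice_size, remaining)
        order.append((name, step))
        if remaining - step > 0:
            queue.append((name, remaining - step))
    return order

print(token_slice_schedule([("long", 10), ("short", 3)], slice_size=4))
# [('long', 4), ('short', 3), ('long', 4), ('long', 2)]
```

Note how the short request finishes after the long one's first slice instead of waiting for all ten of its tokens.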
Speakers
avatar for Maroon Ayoub

Maroon Ayoub

Research Scientist & Architect, IBM Research
Maroon Ayoub is a systems engineer at IBM Research focused on distributed AI infrastructure. He co-leads development of llm-d and specializes in scaling LLM inference with Kubernetes-native architectures, performance efficiency, and open source integrations.
avatar for Kellen Swain

Kellen Swain

Senior Software Engineer, Google
Kellen is a Senior Engineer at Google, and is a maintainer of both the llm-d and Inference Gateway projects.
Tuesday April 7, 2026 14:15 - 14:40 CEST
Central Room

14:30 CEST

Lightning Talk: Combo Kernels: Horizontal Fusion Optimization in Torch.compile - Karthick Panner Selvam & Elias Ellison, Meta
Tuesday April 7, 2026 14:30 - 14:40 CEST
Combo kernels are a compiler optimization in PyTorch Inductor that horizontally fuses multiple independent operations into a single Triton kernel launch, reducing GPU kernel launch overhead and improving memory locality. The Problem: Models generate many small, independent operations like weight preprocessing and tensor copies. Each launch incurs overhead. For models with many such operations,...
See More →
Speakers
avatar for Elias Ellison

Elias Ellison

Software Engineer, Meta
Elias has been working on the PyTorch team for four years, most recently on the torch.compile stack.
avatar for Karthick Panner Selvam

Karthick Panner Selvam

Software Engineer, Meta
Karthick Panner Selvam is a SWE at Meta Superintelligence Lab, working on the PyTorch compiler team to enhance performance and scalability for large models. He earned his PhD in Machine Learning for Systems at the University of Luxembourg, collaborating with Google DeepMind, ECMWF, and Frontier... Read More →
Tuesday April 7, 2026 14:30 - 14:40 CEST
Master Stage
  Frameworks & Compilers
  • Audience Level Any
  • Slides Attached Yes

14:45 CEST

Lightning Talk: Implementing Single-Dim Strategies With Sharding Validator - Anshul Sinha, Meta
Tuesday April 7, 2026 14:45 - 14:55 CEST
DTensor sharding propagation is a major bottleneck to full operator coverage: adding or fixing an op strategy is complex, bug‑prone, and gaps often surface as unexpected resharding and extra collectives. A key source of complexity is that today’s rules conflate (1) semantic correctness—valid input/output sharding combinations for an operator—with (2) search‑space pruning to avoid...
See More →
Speakers
avatar for Anshul Sinha

Anshul Sinha

Software Engineer, Meta
I graduated from the University of Michigan with a B.S. in Computer Science in December 2024. I joined Meta's PyTorch Distributed team as a SWE in June 2025.
Tuesday April 7, 2026 14:45 - 14:55 CEST
Founders Cafe
  Frameworks & Compilers

14:45 CEST

Brevitas Quantization Library - Pablo Monteagudo Lago, AMD
Tuesday April 7, 2026 14:45 - 15:10 CEST
Brevitas is an open‑source PyTorch library from AMD designed to support the research of state‑of‑the‑art quantization methods, including Qronos (ICLR 2026) and MixQuant (arXiv). Built for flexibility and composability, it offers modular components for exploring reduced‑precision data paths and accuracy‑preserving techniques. As generative models scale, post‑training quantization...
See More →
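To ground the terminology, here is what the simplest post-training quantization step looks like in plain PyTorch. This is a generic sketch, not Brevitas's API; Brevitas wraps logic like this in composable quantizer modules:

```python
import torch

def quantize_int8(w: torch.Tensor):
    # Symmetric per-tensor PTQ: choose a scale so the largest magnitude
    # maps to 127, then round onto the int8 grid.
    scale = w.abs().max() / 127.0
    q = torch.clamp(torch.round(w / scale), -127, 127).to(torch.int8)
    return q, scale

w = torch.randn(64, 64)
q, scale = quantize_int8(w)
dequant = q.float() * scale
# Rounding error is bounded by half a quantization step.
max_err = (w - dequant).abs().max().item()
```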
Speakers
avatar for Pablo Monteagudo Lago

Pablo Monteagudo Lago

Research Scientist, AMD
Pablo Monteagudo is a research scientist in AMD Research and Advanced Development, based in Dublin. He specialises in co-design of neural networks and accelerators, in particular, working on topics involving neural network quantization, sparsity and accelerator design.
Tuesday April 7, 2026 14:45 - 15:10 CEST
Junior Stage
  Frameworks & Compilers

14:45 CEST

Model-Changing Transforms With Torch.compile - Thomas Viehmann, Lightning AI
Tuesday April 7, 2026 14:45 - 15:10 CEST
torch.compile is the go-to mechanism for increasing the performance of PyTorch models of all shapes and forms. While it is widely understood how to change the computation by manipulating the FX trace representation, it becomes a much more general tool when it also transforms model and input expectations (the guards): this enables model-changing transformations like quantization and distributed without...
See More →
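A minimal illustration of the extension point involved (a sketch of mine, not the talk's code): a custom torch.compile backend receives the captured FX graph, and this is where a transform could rewrite the model before execution.

```python
import torch

def inspecting_backend(gm: torch.fx.GraphModule, example_inputs):
    # A model-changing transform would edit gm.graph here (e.g. swap
    # ops for quantized variants) before returning a callable.
    print("captured", len(list(gm.graph.nodes)), "FX nodes")
    return gm.forward

@torch.compile(backend=inspecting_backend)
def f(x):
    return torch.sin(x) + 1.0

y = f(torch.randn(4))
```

The guards Dynamo installs alongside this graph are what the talk proposes to transform as well, which plain FX manipulation does not touch.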
Speakers
avatar for Thomas Viehmann

Thomas Viehmann

Thunder, Lightning AI
Thomas Viehmann does PyTorch and optimization at Lightning AI. A PyTorch contributor since 2017, he founded MathInf GmbH in 2018 and co-authored “Deep Learning with PyTorch” in 2020.
Tuesday April 7, 2026 14:45 - 15:10 CEST
Master Stage

14:45 CEST

The Science and Practice of Open and Scalable LLM Evaluations - Grzegorz Chlebus, NVIDIA
Tuesday April 7, 2026 14:45 - 15:10 CEST
Rapid advances in AI have expanded the range of capabilities required for successful real-world deployment. Understanding where we are in this multi-dimensional frontier is essential for accelerating innovation through effective quality assurance. Rigorous evaluation is increasingly difficult to scale as development requires testing many checkpoints across numerous benchmarks. Model comparison is...
See More →
Speakers
avatar for Grzegorz Chlebus

Grzegorz Chlebus

Manager R&D, NVIDIA
Grzegorz Chlebus is a Manager at Frontier Model Evaluation at NVIDIA, where he leads tooling and infrastructure efforts for evaluating frontier AI models. He holds a PhD in Medical Sciences from Radboud University Nijmegen, focused on deep learning-based medical image segmentation... Read More →
Tuesday April 7, 2026 14:45 - 15:10 CEST
Central Room
  GenAI & Multimodal

15:00 CEST

Lightning Talk: Jigsaw: Domain and Tensor Parallelism for High-Resolution Input Training - Deifilia Kieckhefen, Karlsruhe Institute of Technology
Tuesday April 7, 2026 15:00 - 15:10 CEST
Distributed neural network training frameworks typically optimize for specific architectures while minimizing communication overhead. Transformer layers can be efficiently parallelized, but other operations such as convolutions often remain inefficient. This creates bottlenecks for complex model architectures. Moreover, existing tensor parallelism strategies typically replicate input data across...
See More →
Speakers
avatar for Deifilia Kieckhefen

Deifilia Kieckhefen

Doctoral Researcher, Karlsruhe Institute of Technology
Deifilia Kieckhefen is a doctoral researcher at the Karlsruhe Institute of Technology. She works on scalable and distributed training of neural network architectures.
Tuesday April 7, 2026 15:00 - 15:10 CEST
Founders Cafe
  Training Systems
  • Audience Level Any
  • Slides Attached Yes

15:10 CEST

Coffee Break
Tuesday April 7, 2026 15:10 - 15:40 CEST
Menu: 
-Chocolate cake
-Red fruits yogurt (and its wooden spoon) (Gluten Free, Vegetarian)
-Seasonal fruits (GF, Vegan)
-Hummus and vegetable brioche roll
-Dried fruit and raisin mix
-Chocolate Cookie (GF, Vegan)
Tuesday April 7, 2026 15:10 - 15:40 CEST
Open Platform

15:10 CEST

Meet the Developers of Helion
Tuesday April 7, 2026 15:10 - 15:40 CEST
This session offers a unique opportunity to connect with the core developers of Helion (https://github.com/pytorch/helion): ask questions, share feedback, and explore collaboration opportunities with the team. About Helion: At PTC 2025, we launched Helion (in Beta), a PyTorch-native kernel-authoring DSL designed to deliver portable performance across heterogeneous hardware. Since then, Helion has...
See More →
Speakers
avatar for Will Feng

Will Feng

Software Engineer, Meta
Will Feng is a Software Engineer in PyTorch Compiler team at Meta. He has been working in PyTorch core and ecosystem for the past 7 years. He is now working on and most excited about torch.compile for distributed training performance.
avatar for Oguz Ulgen

Oguz Ulgen

Software Engineer, Meta
I'm a software engineer at Meta where I used to work on the Hack programming language and now work on PyTorch.
avatar for Jason Ansel

Jason Ansel

Research Scientist, Meta
Jason Ansel is a Research Scientist at Meta AI and a technical lead for PyTorch compilers. He started the TorchDynamo and TorchInductor projects, which bring flexible graph capture and a high performance compiler to PyTorch 2. He received a Ph.D. from MIT and has over 15 years of... Read More →
Tuesday April 7, 2026 15:10 - 15:40 CEST
Open Platform
  Meet the Developers
  • Audience Level Any

15:40 CEST

Lightning Talk: Graph Based Pipeline Parallelism - Sanket Purandare & Simon Fan, Meta
Tuesday April 7, 2026 15:40 - 15:50 CEST
Pipeline parallelism is vital for large models, but advanced schedules for SOTA LLMs are difficult to express in current PyTorch. MoE communication dominates the critical path, making latency hiding essential. Leading systems use fw-bw overlapping; fw-fw and bw-bw overlapping further boost throughput. Schedules like ZeroBubbleV and DualPipeV rely on dI-dW backward splitting for fine-grained...
See More →
Speakers
avatar for Simon Fan

Simon Fan

Software Engineer, Meta
I work on the PyTorch team at Meta, focusing on distributed training efficiency.
avatar for Sanket Purandare

Sanket Purandare

Research Engineer, Meta
Currently, Sanket serves as a Research Engineer in Meta's SuperIntelligence Lab, on the PyTorch Distributed and Compiler team. He specializes in performance optimization of large-scale training of LLMs based on Mixture-of-Experts architectures.

Prior to this he obtained his PhD in A... Read More →
Tuesday April 7, 2026 15:40 - 15:50 CEST
Master Stage
  Frameworks & Compilers

15:40 CEST

Lightning Talk: Cross-Region Model Serving: PyTorch Inference, Observability & LLMOps - Suraj Muraleedharan, Amazon Web Services
Tuesday April 7, 2026 15:40 - 15:50 CEST
As PyTorch models move to production, organizations face a critical challenge: deploying, monitoring, and operating inference at scale across multiple regions. Single-region serving is well-understood, but multi-region LLMOps—model distribution, observability, failover, and cost management—remains ad-hoc and challenging for multiple customers. This session presents production-tested...
See More →
Speakers
avatar for Suraj Muraleedharan

Suraj Muraleedharan

Principal Platform Engineer, Amazon Web Services
Principal Engineer driving technical strategy and building mission-critical foundational platforms for AI, HPC, and distributed systems, bridging the gap between infrastructure, AI research, and product organizations.
Tuesday April 7, 2026 15:40 - 15:50 CEST
Founders Cafe
  Inference & Production

15:40 CEST

Enabling State-of-the-art Asynchronous Execution in Torch.compile With CUDA Streams - Michael Lazos, Meta
Tuesday April 7, 2026 15:40 - 16:05 CEST
CUDA streams are a widely-used method for parallelizing GPU computation on NVIDIA GPUs. They have long been requested by our users and enable multiple key capabilities - overlapping communication and compute kernels, training on multiple batches in parallel and parallelizing kernels, all of which are needed for achieving SOTA training performance. Another key capability is activation offloading -...
See More →
Speakers
avatar for Michael Lazos

Michael Lazos

Software Engineer, Meta
Michael Lazos is a software engineer at Meta where he contributes to torch.compile. His expertise spans both graph extraction with TorchDynamo and generating optimized kernels with the backend compiler TorchInductor. Previously, he was at Microsoft contributing to project Brainwave... Read More →
Tuesday April 7, 2026 15:40 - 16:05 CEST
Central Room
  Frameworks & Compilers

15:40 CEST

torch.compile and Diffusers: A Hands-On Guide to Peak Performance - Sayak Paul, Hugging Face
Tuesday April 7, 2026 15:40 - 16:05 CEST
This session shows how to use torch.compile with the Diffusers library to speed up diffusion models like Flux-1-Dev. You'll learn practical techniques for both model authors and users. For authors, we cover how to make models compiler-friendly using fullgraph=True. For users, we explain regional compilation (which cuts compile time by 7x while keeping the same runtime gains) and how to avoid...
See More →
Speakers
avatar for Sayak Paul

Sayak Paul

Research Engineer, Hugging Face
I am a Research Engineer at Hugging Face, working on image and video generation. My day-to-day includes maintaining the Diffusers library, training, and babysitting models. When I am not working, I can be found either watching Suits for the n-th time or playing the guitar.
Tuesday April 7, 2026 15:40 - 16:05 CEST
Junior Stage

15:55 CEST

Lightning Talk: Running ExecuTorch Applications With Silicon Acceleration, in Ultra-low Power - George Gekov, Arm; Aki Makkonen, Alif Semiconductor
Tuesday April 7, 2026 15:55 - 16:05 CEST
Efficient deployment of ML models on low-power embedded systems has been a significant challenge for a number of years. At the same time, these embedded SoCs are all around us—from everyday appliances to the latest smart glasses. ExecuTorch is a PyTorch-native framework for deploying neural networks on resource-constrained systems. In this session, we show how to build an end-to-end speech...
See More →
Speakers
avatar for George Gekov

George Gekov

ML Engineer, Arm
George Gekov is a Staff Software Engineer in Arm’s Machine Learning team, where he focuses on machine learning inference on embedded systems. He has extensive experience deploying neural networks on resource-constrained devices with Neural Processing Units (NPUs) to enable hardware-accelerated... Read More →
avatar for Aki Makkonen

Aki Makkonen

Senior Staff Application Engineer, Alif Semiconductor
Software engineer with background in telecommunication, medical imaging, robotics and embedded systems.
Tuesday April 7, 2026 15:55 - 16:05 CEST
Founders Cafe

15:55 CEST

Lightning Talk: Beyond Generic Spans: Distributed Tracing for Actionable LLM Observability - Sally O'Malley & Greg Pereira, Red Hat
Tuesday April 7, 2026 15:55 - 16:05 CEST
End-to-end observability is non-negotiable for production LLMs to track performance, attribute costs, and validate optimizations. Generating actionable traces from complex distributed inference remains a significant challenge. We implemented tracing for llm-d, a high-performance distributed LLM inference framework. Using manual OpenTelemetry instrumentation with carefully crafted spans at...
See More →
Speakers
avatar for Greg Pereira

Greg Pereira

Sr. Machine Learning Engineer, Red Hat
Greg began his career as an SRE focusing on CI/CD and automation in the Emerging Technologies org at Red Hat. After transferring to the Platform and Services team, he refocused from the ground up on AI-centric software development. Three years later, he has been involved in building... Read More →
avatar for Sally O'Malley

Sally O'Malley

Principal Software Engineer, Red Hat

Tuesday April 7, 2026 15:55 - 16:05 CEST
Master Stage

16:10 CEST

Build PyTorch to Understand PyTorch - Vijay Janapa Reddi, Harvard University; Andrea Mattia Garavagno, University of Genoa
Tuesday April 7, 2026 16:10 - 16:35 CEST
PyTorch's success depends on more than users—it needs engineers who understand what's inside. Engineers who can debug framework issues, optimize at the systems level, contribute upstream, and build what comes next. But ML education today produces practitioners who call APIs without understanding them. They train models without knowing why Adam needs 3× the memory of SGD, or what happens when...
See More →
Speakers
avatar for Vijay Janapa Reddi

Vijay Janapa Reddi

Professor, Harvard University
Vijay Janapa Reddi is a Professor at Harvard University, where he leads research at the intersection of machine learning and computer systems. He is the author of the open-source Machine Learning Systems textbook (mlsysbook.ai) and co-founder of MLCommons, the organization behind... Read More →
avatar for Andrea Mattia Garavagno

Andrea Mattia Garavagno

Research Fellow, University of Genoa & Scuola Superiore Sant'Anna
I am a Research Fellow holding a joint position at the University of Genoa and Scuola Superiore Sant'Anna. My research is centered on Edge AI, where I am currently working to automate the design of applications through Hardware-Aware Neural Architecture Search (NAS). By running these... Read More →
Tuesday April 7, 2026 16:10 - 16:35 CEST
Central Room
  Frameworks & Compilers
  • Audience Level Any
  • Slides Attached Yes

16:10 CEST

On-Device LLM Inference on Android With ExecuTorch and Qualcomm QNN - Shivay Lamba & Kartikey Rawat, Qualcomm
Tuesday April 7, 2026 16:10 - 16:35 CEST
Multimodal models like CLIP are typically deployed in the cloud due to their size and computational demands, limiting their use in latency-sensitive, privacy-preserving, and offline-first applications. This talk demonstrates how one can run fully on-device CLIP inference on Android using ExecuTorch with the Qualcomm QNN backend, enabling real-time vision–language understanding without server...
See More →
Speakers
avatar for Shivay Lamba

Shivay Lamba

Senior ML Engineer, Qualcomm
Shivay Lamba is a software developer specializing in DevOps, Machine Learning, and Full Stack Development.

He is an Open Source enthusiast and has been part of various programs like Google Code-in and Google Summer of Code as a mentor, and is currently an MLH Fellow. He has also worked at organizations like Amazon, EY, and Genpact. He is a TensorFlow.js SIG member and community lead from In... Read More →
avatar for Kartikey Rawat

Kartikey Rawat

Senior Developer Advocate, Qualcomm
Senior Developer Advocate at Qualcomm | Google Developer Expert in AI and Google Cloud
Tuesday April 7, 2026 16:10 - 16:35 CEST
Founders Cafe
  GenAI & Multimodal
  • Audience Level Any

16:10 CEST

Optimizing Reinforcement Learning at Trillion-Parameter Scale - Songlin Jiang, Aalto University & Mind Lab
Tuesday April 7, 2026 16:10 - 16:35 CEST
This talk will dive into how we implemented and optimized reinforcement learning on trillion-parameter Mixture-of-Experts reasoning models using veRL, Megatron-Bridge and vLLM. The session is useful to anyone building large-scale RL training systems. For the first part, I will walk through the system design required to make RL work at this scale using LoRA: how LoRA adapters are implemented for...
See More →
Speakers
avatar for Songlin Jiang

Songlin Jiang

Doctoral Researcher, Aalto University & Mind Lab
I am a doctoral researcher at Aalto University, focusing on reducing training and inference latency for Reinforcement Learning and Large Language Models (LLMs) on High-Performance Computing (HPC) clusters. I am also a passionate free software developer, a maintainer of VeRL, and a... Read More →
Tuesday April 7, 2026 16:10 - 16:35 CEST
Junior Stage
  Training Systems

16:10 CEST

TorchStore: What We Learned Building Distributed Storage Solutions for AsyncRL - Lucas Pasqualin, Danielle Pintz, Allen Wang & Amir Afzali, Meta
Tuesday April 7, 2026 16:10 - 16:35 CEST
Asynchronous Reinforcement Learning (AsyncRL) workloads have unique data sharing requirements: actors must efficiently exchange large tensors across processes and nodes, often with different sharding configurations—not just at checkpoint time, but continuously during training for live weight synchronization. This talk presents TorchStore, an open-source distributed tensor storage system built on...
See More →
Speakers
avatar for Lucas Pasqualin

Lucas Pasqualin

ML Engineer, PyTorch (Meta)
Lucas has been developing Machine Learning Applications and Machine Learning infrastructure at scale for years, and has recently been focused on extending the product offering of PyTorch's Distributed Checkpointing stack.
AW

Allen Wang

Software Engineer, Meta
avatar for Danielle Pintz

Danielle Pintz

Software Engineer, Meta
Danielle is a software engineer working on PyTorch, currently focused on TorchStore and Async RL. She previously worked on the Llama Research team.
avatar for Amir Afzali

Amir Afzali

Software Engineer, Meta
Software engineer working on PyTorch distributed infra and large-scale training.
Tuesday April 7, 2026 16:10 - 16:35 CEST
Master Stage

16:40 CEST

Lightning Talk: TerraKit: Standardising AI-Ready Geospatial Data Preparation for the TorchGeo Ecosystem - Rosie Lickorish & Romeo Kienzler, IBM
Tuesday April 7, 2026 16:40 - 16:50 CEST
With the advent of geospatial foundation models, unexplored use cases are emerging that require well-curated datasets. Currently, no standardised approach exists for creating such AI-ready geospatial datasets. In this session, we introduce TerraKit: a comprehensive open-source Python library for retrieving and processing geospatial data that seamlessly integrates with upstream geospatial model...
See More →
Speakers
avatar for Romeo Kienzler

Romeo Kienzler

AI Research Engineer, IBM
Romeo is a data scientist working for IBM Research and an advocate for ethical machine learning, transparency, and privacy.
avatar for Rosie Lickorish

Rosie Lickorish

Research Software Engineer, IBM
Rosie is a Research Software Engineer at IBM, specializing in the development of next-generation tools and technologies designed to drastically accelerate solutions for today’s most urgent global challenges. Her technical focus involves leveraging geospatial data, AI models... Read More →
Tuesday April 7, 2026 16:40 - 16:50 CEST
Central Room
  GenAI & Multimodal
  • Audience Level Any
  • Slides Attached Yes

16:40 CEST

Optimizing PyTorch on CPU-GPU Coherent Platforms - Matthias Jouanneaux, Nvidia
Tuesday April 7, 2026 16:40 - 17:05 CEST
In recent years, both Nvidia and AMD have introduced hardware-coherent platforms: GH200, GB200 and MI300A. These coherent platforms provide many new features, as well as challenges, for PyTorch applications attempting to make the most of the platform. This talk will focus on Nvidia's GB200 and walk through techniques to utilize the features of the coherent architecture in PyTorch, such as the high...
See More →
Speakers
avatar for Matthias Jouanneaux

Matthias Jouanneaux

Sr Software Engineer - PyTorch, NVIDIA
After his master’s degree, Matthias Jouanneaux worked for two years at Konica Minolta's European research lab on medical image analysis using deep learning.
He then joined Nvidia, focusing on optimizing application performance for Nvidia hardware as a Developer Technology enginee... Read More →
Tuesday April 7, 2026 16:40 - 17:05 CEST
Founders Cafe
  Frameworks & Compilers

16:40 CEST

Securing Agentic AI With PyTorch: Threat Modeling & LLM Red Teaming in Practice - Valeri Milke, VamiSec GmbH
Tuesday April 7, 2026 16:40 - 17:05 CEST
Agentic AI systems built with PyTorch introduce a new security paradigm: autonomous decision-making, tool usage, memory, and multi-step reasoning significantly expand the attack surface beyond traditional ML pipelines. This session presents a practical, security-first approach to building and testing agentic AI systems using PyTorch, combining AI threat modeling and hands-on LLM security testing....
See More →
Speakers
avatar for Valeri Milke

Valeri Milke

CEO, VamiSec GmbH
Valeri Milke is an AI security and cybersecurity specialist focusing on secure AI and agentic system design. He works at the intersection of PyTorch-based AI engineering, threat modeling and LLM security testing. His work includes AI red teaming, prompt injection analysis and the... Read More →
Tuesday April 7, 2026 16:40 - 17:05 CEST
Junior Stage

16:55 CEST

Lightning Talk: Bayesian Neural Networks With Variational Inference in PyTorch - Lars Heyen, Karlsruhe Institute of Technology, Scientific Computing Center
Tuesday April 7, 2026 16:55 - 17:05 CEST
Uncertainty quantification is becoming ever more important as neural networks are used for increasingly critical tasks. Bayesian neural networks (BNNs) inherently provide a measure of their own uncertainty, but can be either hard to implement or inflexible if one uses common frameworks. In this session I discuss how to efficiently implement BNNs using variational inference within PyTorch and...
See More →
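The core trick the session builds on can be sketched without any framework: a mean-field Gaussian variational posterior sampled via the reparameterization trick, plus its closed-form KL term against a Gaussian prior. A minimal pure-Python sketch; the function names are illustrative, not from the speaker's torch_blue library.

```python
import math
import random

def sample_weight(mu, rho, eps=None):
    """Reparameterization trick: w = mu + sigma * eps, with sigma = softplus(rho)."""
    sigma = math.log1p(math.exp(rho))  # softplus keeps sigma positive
    if eps is None:
        eps = random.gauss(0.0, 1.0)
    return mu + sigma * eps

def kl_gaussian(mu, rho, prior_sigma=1.0):
    """Closed-form KL(q || p) between N(mu, sigma^2) and N(0, prior_sigma^2)."""
    sigma = math.log1p(math.exp(rho))
    return (math.log(prior_sigma / sigma)
            + (sigma ** 2 + mu ** 2) / (2 * prior_sigma ** 2)
            - 0.5)

# With eps pinned to 0 the sample collapses to the variational mean,
# and the KL of the prior against itself is zero.
assert sample_weight(0.5, 0.0, eps=0.0) == 0.5
assert abs(kl_gaussian(0.0, math.log(math.e - 1.0))) < 1e-9
```

In a real PyTorch layer, mu and rho would become nn.Parameter tensors and the per-weight KL terms would be summed into the loss alongside the data likelihood.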
Speakers
avatar for Lars Heyen

Lars Heyen

PostDoc, Karlsruhe Institute of Technology
I am a postdoctoral researcher working on uncertainty quantification in the research group "Robust and Efficient AI" at the Scientific Computing Center of the Karlsruhe Institute of Technology. I also coauthored the PyTorch-based library torch_blue for implementing Bayesian neural... Read More →
Tuesday April 7, 2026 16:55 - 17:05 CEST
Central Room
  Frameworks & Compilers
  • Audience Level Any
  • Slides Attached Yes

17:05 CEST

Flare Party
Tuesday April 7, 2026 17:05 - 18:30 CEST
Wrap up Day 1 of PyTorch Conference Europe 2026 at our official Flare Party. It’s the perfect opportunity to unwind, network, and keep the day’s momentum going. Enjoy complimentary beer, wine, and appetizers as you connect with speakers, core contributors, and fellow developers. Throughout the evening, explore the Poster Sessions, where presenters will be available for live Q&A to spark deeper...
See More →
Tuesday April 7, 2026 17:05 - 18:30 CEST
Open Platform

17:05 CEST

Poster Presentations: Applications & Case Studies
Tuesday April 7, 2026 17:05 - 18:35 CEST
  1. LegoLoaderX: a PyTorch DataLoader for Sparse Spatio-Temporal Data - Michelle Audirac, Harvard University
  2. Stress State Estimation from Deformed Surface Images Using Deep Learning - Bakhtiyar Mammadli, NOMATEN Centre of Excellence, National Centre for Nuclear Research
Speakers
MA

Michelle Audirac

Senior Data Scientist, Harvard University

avatar for Bakhtiyar Mammadli

Bakhtiyar Mammadli

PhD Student, NOMATEN Centre of Excellence, National Centre for Nuclear Research
Bakhtiyar Mammadli is a PhD student in Mechanical Engineering at the NOMATEN Centre of Excellence (NCBJ, Poland). His research focuses on applying machine learning to experimental mechanics, particularly unsupervised methods for analyzing strain fields from Digital Image Correlation... Read More →
Tuesday April 7, 2026 17:05 - 18:35 CEST
Open Platform

17:05 CEST

Poster Presentations: Frameworks & Compilers
Tuesday April 7, 2026 17:05 - 18:35 CEST
  1. Automatic Comm-Compute Overlap and Bucketing in torch.compile - Elias Ellison & Ivan Kobzarev, Meta
  2. Flexible Custom Operators: custom ops with arbitrary inputs and outputs - Angela Yi & Richard Zou, Meta
  3. How Your Code Becomes a Kernel - Harshita Varma, Juspay; Nikita Verma, Individual
  4. TorchCodec: The Easy and Efficient Media Decoding Library for PyTorch - Daniel Flores & Molly Xu,...
See More →
Speakers
avatar for Angela Yi

Angela Yi

Software Engineer, Meta
Angela has been on the PyTorch Compiler team for the past 3 years, working on torch.export and AOTInductor.
avatar for Richard Zou

Richard Zou

Software Engineer, Meta
I work on PyTorch.
avatar for William Wen

William Wen

Software Engineer, Meta
William works on the torch.compile team, specializing in TorchDynamo.
avatar for Nikita Verma

Nikita Verma

Cloud Native Developer, Indian Institute of Technology Bhubaneswar
Nikita Verma is an active contributor to the open-source community with a strong focus on Kubernetes and cloud-native technologies. She worked on developing forest growth simulations, automating configuration generation, and integrating CI/CD workflows. Nikita has volunteered at KubeCon... Read More →
avatar for Harshita Varma

Harshita Varma

Product Manager, Juspay
Harshita Varma is a contributor to the Kubernetes project, actively involved in the SIG Contributor Experience community, with a focus on enhancing the contributor journey. She began her open-source journey by contributing to the Thanos project, sparking her passion for open source... Read More →
avatar for Daniel Flores

Daniel Flores

Software Engineer, Meta
Daniel is a Software Engineer at Meta working on TorchCodec. Previously, Daniel studied computer science at Brown University.
avatar for Ivan Kobzarev

Ivan Kobzarev

Software Engineer, Meta

avatar for Elias Ellison

Elias Ellison

Software Engineer, Meta
Elias has been working on the PyTorch team for four years, most recently on the torch.compile stack
avatar for Paul Zhang

Paul Zhang

Software Engineer, Meta
Paul Zhang is currently a software engineer working on PyTorch and Triton at Meta, ensuring that PyTorch and PT2 best utilize the hardware they run on. Prior to this, Paul did extensive work on recommendation systems for training and inference, optimizing performance and... Read More →
MX

Molly Xu

Software Engineer, Meta

avatar for Akash Agrawal

Akash Agrawal

Software Engineer - II, Fujitsu Research of India Private Limited
Akash is a Software Engineer in Fujitsu Research of India, working actively on AI Framework Software Stack Optimization and Open-Source Software developments for FUJITSU-MONAKA – a 2 nanometer Armv9-A architecture-based CPU, for handling AI/HPC workloads and energy efficient co... Read More →
Tuesday April 7, 2026 17:05 - 18:35 CEST
Open Platform

17:05 CEST

Poster Presentations: GenAI & Multimodal
Tuesday April 7, 2026 17:05 - 18:35 CEST
  1. Unifying Modalities: Building Efficient Video Flows with PyTorch and Diffusion Transformers - David Brewster, Red Hat

Speakers
DB

David Brewster

Principal Software Engineer, Red Hat

Tuesday April 7, 2026 17:05 - 18:35 CEST
Open Platform

17:05 CEST

Poster Presentations: Inference & Production
Tuesday April 7, 2026 17:05 - 18:35 CEST
  1. A Tale of Two DSLs: A Comparative Study of vLLM GPU Performance with cuTile and CuTe DSL - Anil Vishnoi & Matthew Odden, Red Hat
  2. Bringing BitNet to ExecuTorch via Vulkan - Marcus Edel & Vineet Suryan, Collabora
  3. Building Production-Grade PyTorch Inference Pipelines for 100K+ Heterogeneous Devices - Samaresh Kumar Singh, HP Inc.
  4. Feather: Software Emulated FP8 for Older GPUs - Suriyaa MM, Indian...
See More →
Speakers
avatar for Dave Grove

Dave Grove

Distinguished Research Scientist, IBM
David Grove is a Distinguished Research Scientist at IBM T.J. Watson, NY, USA. He has been a software systems researcher at IBM since 1998, specializing in programming language implementation and scalable runtime systems. He has authored more than sixty peer-reviewed publications... Read More →
avatar for Olivier Tardieu

Olivier Tardieu

Principal Research Scientist, Manager, IBM
Dr. Olivier Tardieu is a Principal Research Scientist and Manager at IBM T.J. Watson, NY, USA. He joined IBM Research in 2007. His current research focuses on cloud-related technologies, including Serverless Computing and Kubernetes, as well as their application to Machine Learning... Read More →
avatar for Shivay Lamba

Shivay Lamba

Senior ML Engineer, Qualcomm
Shivay Lamba is a software developer specializing in DevOps, Machine Learning and Full Stack Development.

He is an Open Source Enthusiast and has been part of various programs like Google Code In and Google Summer of Code as a Mentor and is currently a MLH Fellow. He has also worked at organizations like Amazon, EY, Genpact. He is a Tensorflow.JS SIG member and community lead from In... Read More →
avatar for Marcus Edel

Marcus Edel

Machine Learning Lead, Collabora
Marcus Edel is the machine-learning lead at Collabora, where he leads the effort to optimise and apply deep networks for inference, with a focus on embedded devices. Marcus completed his graduate studies in 2020 with a focus on fast algorithms for core machine learning tasks applied... Read More →
avatar for Anil Vishnoi

Anil Vishnoi

Principal Software Engineer, Red Hat Inc
Anil has been doing research, design and development of software networking products for more than 15 years at Red Hat and his prior employers. For most of his career he has worked in the Software Defined Networks, Data Center Networking, Network Virtualization and Cloud Networking domain... Read More →
avatar for Rudraksh Karpe

Rudraksh Karpe

Forward Deployed Engineer, Simplismart
Rudraksh is an FDE at Simplismart, where he builds solutions focused on high-performance AI inference. He previously worked as an AI Engineer at ZS Associates. He was a two-time Google Summer of Code participant with the openSUSE Project and

He has presented internationally at events including OpenSearch Korea, openSUSE Conference, Early Adopter Tech Summit, PyCon US, PyCon Japan, and openSUSE Asia Summit, focusing on GenAI, open source, and cloud-native technologies... Read More →
avatar for Samaresh Kumar Singh

Samaresh Kumar Singh

Principal Engineer, HP Inc.
Samaresh Kumar Singh is an engineering principal at HP Inc. with more than 21 years of experience in designing and implementing large-scale distributed systems, cloud native platform systems, and edge AI / ML systems. His expertise includes agentic AI systems, GenAI / LLMs, Edge AI... Read More →
SM

Suriyaa MM

Student, Indian Institute of Technology Tirupati

avatar for Daniil Lyakhov

Daniil Lyakhov

AI Research Engineer/Scientist, Intel Corporation

avatar for Felix Marty

Felix Marty

Senior Software Engineer, AMD
Felix Marty is a software engineer specialized in deep learning model compression, working on AMD Quark open-source model compression toolkit, and contributing to algorithms, evaluations, hardware deployment and open-source integrations. Prior to AMD, he used to work at Hugging Face... Read More →
avatar for Vineet Suryan

Vineet Suryan

Senior Software Engineer, Collabora


avatar for Aamir Nazir

Aamir Nazir

Research Engineer, Intel

MO

Mathew Odden

Principal Software Engineer, Red Hat

Tuesday April 7, 2026 17:05 - 18:35 CEST
Open Platform

17:05 CEST

Poster Presentations: Responsible AI & Compliance
Tuesday April 7, 2026 17:05 - 18:35 CEST
  1. When Models Collaborate but Data Cannot: Explainable Ensemble Learning Under Privacy Constraints - Pavani Rajula, NeuCorelytix Solutions LLP

Speakers
avatar for Pavani Rajula

Pavani Rajula

AI Developer, NeuCorelytix Solutions LLP

I’m Pavani Rajula, a Data Science and AI Developer at Data Migration International AG, currently working remotely from India. I have nearly six years of experience in data engineering, machine learning and artificial intelligence, including two years of professional experience... Read More →
Tuesday April 7, 2026 17:05 - 18:35 CEST
Open Platform

18:30 CEST

Open Source AI Soirée hosted by Label Studio and Docling
Tuesday April 7, 2026 18:30 - 21:00 CEST
Join Label Studio and Docling for an evening of conversation, connection, and community during PyTorch Conf EU. Whether you're working on training pipelines, document workflows, evaluation systems, or production AI infrastructure, this gathering is a chance to meet peers, exchange ideas, and connect with others building real-world AI. The evening will bring together the technical founder of Label...
See More →
Tuesday April 7, 2026 18:30 - 21:00 CEST
TBA
 
Wednesday, April 8
 

08:00 CEST

Registration & Badge Pick-Up
Wednesday April 8, 2026 08:00 - 15:25 CEST

Wednesday April 8, 2026 08:00 - 15:25 CEST
Lobby

08:00 CEST

Community Expo
Wednesday April 8, 2026 08:00 - 15:40 CEST

Wednesday April 8, 2026 08:00 - 15:40 CEST
Open Platform

09:00 CEST

Keynote: PyTorch CTO - Matt White, Global CTO of AI, Linux Foundation
Wednesday April 8, 2026 09:00 - 09:10 CEST
Matt White, Global CTO of AI at the Linux Foundation and CTO of the PyTorch Foundation, will provide an update on technical strategy, the ecosystem, and projects and working groups.
Speakers
avatar for Matt White

Matt White

Global CTO of AI, The Linux Foundation
Matt White is the Executive Director of the PyTorch Foundation and GM of AI at the Linux Foundation. He is also the Director of the Generative AI Commons. Matt has years of experience in applied research and standards in AI and data in telecom, media and gaming industries. Matt is... Read More →
Wednesday April 8, 2026 09:00 - 09:10 CEST
Master Stage
  Keynote Sessions
  • Audience Level Any
  • Slides Attached Yes

09:10 CEST

Keynote: vLLM & Ray Updates - Tyler Michael Smith, Chief Architect - Inference Engineering, Red Hat & Artur Niederfahrenhorst, Member of Technical Staff, Anyscale
Wednesday April 8, 2026 09:10 - 09:25 CEST

Speakers
avatar for Tyler Michael Smith

Tyler Michael Smith

Chief Architect - Inference Engineering, Red Hat
Tyler received a PhD in Computer Science at The University of Texas at Austin, studying high performance dense linear algebra - microkernels, parallelism, and theoretical lower bounds on data movement. After a postdoc at ETH Zürich, he joined Neural Magic, first working on a graph... Read More →
avatar for Artur Niederfahrenhorst

Artur Niederfahrenhorst

Member of Technical Staff, Anyscale
Artur is a member of the technical staff at Anyscale, the company that recently donated Ray to the Linux Foundation. He has been contributing to Ray since early 2022, where his main contributions have been in distributed reinforcement learning. Artur majored in Computer Science at... Read More →
Wednesday April 8, 2026 09:10 - 09:25 CEST
Master Stage
  Keynote Sessions
  • Audience Level Any
  • Slides Attached Yes

09:25 CEST

Keynote: The Hub as Infrastructure. From Open PyTorch Models, to a Safe and Performant Distribution Hub - Lysandre Debut, Chief Open-Source Officer, Hugging Face
Wednesday April 8, 2026 09:25 - 09:40 CEST

Speakers
avatar for Lysandre Debut

Lysandre Debut

Chief Open-Source Officer, Hugging Face
Lysandre is the Chief Open-Source Officer at Hugging Face, ensuring that the ecosystem is as well supported as possible across the ML lifecycle with open-source tools.

He has been at Hugging Face for the past six years and was the first open-source employee at Hugging Face; working on transformers and the entire stack of Hugging Face open-source libraries since then... Read More →
Wednesday April 8, 2026 09:25 - 09:40 CEST
Master Stage
  Keynote Sessions
  • Audience Level Any
  • Slides Attached Yes

09:45 CEST

Sponsored Keynote: Open Source Infrastructure for the AI Native Era - Jonathan Bryce, Executive Director, Cloud Native Computing Foundation
Wednesday April 8, 2026 09:45 - 09:50 CEST
AI adoption will not be limited by model ideas alone. It will be limited by how fast we can deploy, secure, observe, and scale AI systems in production. Inference is where AI becomes real for most organizations. As AI moves from frontier labs into mainstream production, the operational challenges start to look increasingly cloud native: orchestration, autoscaling, routing, security, policy, and...
See More →
Speakers
avatar for Jonathan Bryce

Jonathan Bryce

Executive Director, Cloud and Infrastructure, The Linux Foundation
Jonathan Bryce is the Executive Director of Cloud & Infrastructure at the Linux Foundation, where he leads both the Cloud Native Computing Foundation (CNCF) and the OpenInfra Foundation—two of the largest and most influential open source communities in the world. With over... Read More →
Wednesday April 8, 2026 09:45 - 09:50 CEST
Master Stage
  Keynote Sessions
  • Audience Level Any
  • Slides Attached Yes

09:50 CEST

Keynote: Gemma 4: Compacting Intelligence for the Edge - Léonard Hussenot, Research Scientist, Google Deepmind
Wednesday April 8, 2026 09:50 - 10:05 CEST
This talk explores the philosophy and engineering behind Gemma 4, arguing that the future of AI isn't only about size, but about "intelligence per byte." We will dive into why compacting intelligence—maximizing the reasoning and instruction-following ability of every single token—is the ultimate bottleneck for truly useful AI. By optimizing for token efficiency and memory footprints, we unlock...
See More →
Speakers
avatar for Leonard Hussenot

Leonard Hussenot

Research Scientist, Google Deepmind
I am a Research Scientist at Google DeepMind, where I lead the Gemma post-training team focused on developing the most useful compact models for on-device applications. Since joining Google Brain, I have contributed to the evolution of Bard, Gemini, and Gemma, specializing in scaling... Read More →
Wednesday April 8, 2026 09:50 - 10:05 CEST
Master Stage
  Keynote Sessions
  • Audience Level Any

10:05 CEST

Birds of A Feather: Disaggregated Tokenization: Building Toward Tokens-In-Tokens-Out LLM Inference - Maroon Ayoub, IBM Research; Hang Yin & Xi Ning Wang, Alibaba Cloud; Nili Guy, IBM; Hyunkyun Moon, Moreh
Wednesday April 8, 2026 10:05 - 10:35 CEST
LLMs are token-in, token-out - but our serving stacks aren't. Tokenization and preprocessing are still locked inside the inference engine, blocking the cache-aware routing and encode/prefill/decode (E/P/D) disaggregation that production deployments demand. To route smart, you need tokens before you reach the backend - and with multi-modal inputs requiring heavy encode-stage preprocessing, this is...
See More →
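The routing argument is easy to see in miniature: once the router sees token IDs rather than raw text, it can score each backend by how much of the prompt's prefix that backend already has cached. A toy sketch with a hypothetical longest-prefix scoring function, not llm-d's actual router:

```python
def longest_prefix_match(prompt_tokens, backend_caches):
    """Pick the backend whose cached token prefix overlaps the request the most."""
    def overlap(cached):
        n = 0
        for a, b in zip(prompt_tokens, cached):
            if a != b:
                break
            n += 1
        return n
    return max(range(len(backend_caches)), key=lambda i: overlap(backend_caches[i]))

# Two backends; backend 1 already holds the shared system-prompt tokens 1, 2, 3.
caches = [[7, 8], [1, 2, 3, 4]]
assert longest_prefix_match([1, 2, 3, 9], caches) == 1
assert longest_prefix_match([7, 8, 9], caches) == 0
```

The point of the session is that this scoring is impossible if tokenization only happens inside the engine: the router must receive tokens first.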
Speakers
avatar for Xi Ning Wang

Xi Ning Wang

Senior Technical Expert, Alibaba Cloud
Wang Xining, senior technical expert at Alibaba Cloud, focusing on MaaS/LLM, Kubernetes, service mesh and other advanced cloud-native technical strategies. He previously worked at IBM as a tech architect focusing on SOA/Cloud and served as the chairman of the Patent Technology Review... Read More →
avatar for Hang Yin

Hang Yin

Senior R&D Engineer, Alibaba Cloud
Hang Yin, senior engineer at Alibaba Cloud, focusing on Kubernetes, service mesh, Gateway API Inference Extension and other cloud-native fields. He currently serves on the Alibaba Cloud Container Service for Kubernetes (ACK) team, responsible for developing ACK Gateway with Inference... Read More →
avatar for Maroon Ayoub

Maroon Ayoub

Research Scientist & Architect, IBM Research
Maroon Ayoub is a systems engineer at IBM Research focused on distributed AI infrastructure. He co-leads development of llm-d and specializes in scaling LLM inference with Kubernetes-native architectures, performance efficiency, and open source integrations.
avatar for Nili Guy

Nili Guy

IBM Research, IBM
Nili is a Research Manager and Senior Technical Staff Member at IBM Research, co-creator of llm-d, and an expert in distributed inference and Kubernetes-native AI systems. She has led key open-source and productized inference initiatives across IBM’s AI platforms.
avatar for hyunkyun moon

hyunkyun moon

MLOps Engineer, Moreh
Hyunkyun Moon is an ML Platform Engineer at Moreh, focusing on building high-performance LLM inference platforms with llm-d. He is an active contributor to open-source projects, including llm-d and vLLM. With a strong background in large-scale Kubernetes-native infrastructure, he... Read More →
Wednesday April 8, 2026 10:05 - 10:35 CEST
Open Platform

10:05 CEST

Coffee Break
Wednesday April 8, 2026 10:05 - 10:35 CEST
Menu: 
-Brioche
-Granola bar (Gluten Free, Vegan)
-Seasonal fruits (Gluten Free, Vegan)
-Roasted pumpkin cake
-Dried fruits and raisins mix (Gluten Free, Vegan)
Wednesday April 8, 2026 10:05 - 10:35 CEST
Open Platform

10:05 CEST

Meet the vLLM Maintainers
Wednesday April 8, 2026 10:05 - 10:35 CEST
Meet the core maintainers of vLLM at this session! Come discuss use cases, features, and the roadmap with us, or just learn how vLLM development happens under the hood.
Speakers
avatar for Tyler Michael Smith

Tyler Michael Smith

Chief Architect - Inference Engineering, Red Hat
Tyler received a PhD in Computer Science at The University of Texas at Austin, studying high performance dense linear algebra - microkernels, parallelism, and theoretical lower bounds on data movement. After a postdoc at ETH Zürich, he joined Neural Magic, first working on a graph... Read More →
avatar for Nicolò Lucchesi

Nicolò Lucchesi

Senior Machine Learning Engineer, Red Hat
Nicolò is a Senior Machine Learning Engineer at Red Hat with a background in Deep Learning and Computer Vision. He works on Inference Optimization for vLLM, where he is a maintainer.
Wednesday April 8, 2026 10:05 - 10:35 CEST
Open Platform
  Meet the Developers
  • Audience Level Any

10:25 CEST

Sponsor Activity - Validating AI on CPUs: The vLLM 3-Phase Evaluation Framework
Wednesday April 8, 2026 10:25 - 10:40 CEST
Stop guessing your hardware capabilities. This automated test engine benchmarks vLLM on CPUs through controlled, realistic, and production phases, delivering precise metrics on throughput, latency, and optimal KV cache sizing. Join us for a demo!
Sponsor: Red Hat
Location: Red Hat within the Community Showcase
In order to facilitate networking and business relationships at the event, you may choose...
See More →
Wednesday April 8, 2026 10:25 - 10:40 CEST
Open Platform

10:35 CEST

Lightning Talk: Monarch: An API To Your Supercomputer - Marius Eriksen, Meta
Wednesday April 8, 2026 10:35 - 10:45 CEST
The training systems driving today’s most advanced AIs are distributed, dynamic, and complex. Pre-training relies on layered parallelism and careful fault isolation. Post-training RL spans thousands of GPUs while coordinating verifiers, compilers, and code execution. Systems complexity pulls focus away from the core algorithms: developers are forced to assemble systems from schedulers, RPC...
See More →
Speakers
avatar for Marius Eriksen

Marius Eriksen

Software Engineer, Meta
Marius Eriksen is a software engineer at Meta, where he works on infrastructure for large-scale training systems.
Wednesday April 8, 2026 10:35 - 10:45 CEST
Master Stage

10:35 CEST

Lightning Talk: Live Migration of PyTorch GPU Nodes From Azure To European Clouds - Mike Krom, Acf Cyber Solutions
Wednesday April 8, 2026 10:35 - 10:45 CEST
Many European PyTorch teams run their GPU workloads on hyperscalers like Azure, AWS, or GCP—often without realizing that this places their data and models under US jurisdiction. This lightning talk shows how PyTorch compute nodes can be migrated to European cloud providers while keeping the full ML environment intact. Through a live demo, we migrate a GPU-enabled PyTorch VM—including CUDA...
See More →
Speakers
avatar for Mike Krom

Mike Krom

Partner, ACF Cybersolutions
I am a software architect and lead developer of the open-source project DigitalNomadSky. I have extensive experience with Microsoft Azure from working at Microsoft and supporting large-scale cloud migrations. My work focuses on supporting data science and ML teams with cloud infrastructure... Read More →
Wednesday April 8, 2026 10:35 - 10:45 CEST
Central Room
  Security & Privacy

10:35 CEST

Beyond JSON-RPC: Scaling Model Context Protocols With gRPC in the PyTorch Ecosystem - Ashesh Vidyut & Madhav Bissa, Google
Wednesday April 8, 2026 10:35 - 11:00 CEST
Right now, MCP mostly relies on HTTP and STDIO. That works for simple scripts, but if you’re running high-performance PyTorch models in production, you’re going to hit a wall. When you’re moving large context windows or tensor metadata, the overhead of JSON-RPC starts to hurt. We’re introducing SEP-1352, which adds gRPC as a native transport for MCP. Since gRPC is already the standard for...
See More →
Speakers
avatar for Ashesh Vidyut

Ashesh Vidyut

Senior Software Engineer, Google

avatar for Madhav Bissa

Madhav Bissa

Senior Software Engineer, Google
member, grpc-Go
Wednesday April 8, 2026 10:35 - 11:00 CEST
Junior Stage
  Agents & Interop

10:35 CEST

How To Write C++ Extensions in 2026 - Jane Xu, Meta & Mikayla Gawarecki, Meta
Wednesday April 8, 2026 10:35 - 11:00 CEST
Are you writing a C++ custom op extension to PyTorch? It's 2026 and are you still shipping M x N wheels for M CPython versions and N libtorch versions? Did you know you can just ship 1 wheel that works across multiple CPythons and libtorches? If you're curious how, attend this talk to get the deets on py_limited_api, APIs like torch::stable::Tensor & TORCH_TARGET_VERSION, and generally the latest...
See More →
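The one-wheel story in this talk hinges on building against CPython's stable ABI. As a rough sketch of what that looks like (based on the public PyTorch documentation for recent releases; file names and the minimum CPython version are illustrative, and the talk is the authoritative source for current recommendations):

```python
# setup.py -- sketch of a stable-ABI (abi3) C++ extension build.
from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CppExtension

setup(
    name="my_extension",
    ext_modules=[
        CppExtension(
            "my_extension._C",
            ["csrc/my_op.cpp"],
            py_limited_api=True,  # build against the CPython stable ABI
            extra_compile_args={"cxx": ["-DPy_LIMITED_API=0x03090000"]},
        )
    ],
    cmdclass={"build_ext": BuildExtension},
    # Tag the wheel as cp39-abi3 so one wheel covers CPython >= 3.9.
    options={"bdist_wheel": {"py_limited_api": "cp39"}},
)
```

The libtorch half of the story (torch::stable::Tensor, TORCH_TARGET_VERSION) is what the session covers in depth.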
Speakers
avatar for Jane Xu

Jane Xu

PyTorch SWE, Meta
Hi, I'm Jane! Please don't hesitate to come talk to me about your favorite optimizer, fitting models in GPU memory, how to free C++ extensions from libtorch version, and anything that interests you.
avatar for Mikayla Gawarecki

Mikayla Gawarecki

Software Engineer, Meta
Software Engineer on PyTorch
Wednesday April 8, 2026 10:35 - 11:00 CEST
Founders Cafe
  Frameworks & Compilers

10:50 CEST

Lightning Talk: Achieving SOTA GEMM Performance: A CuTeDSL Backend for PyTorch Inductor - Nikhil Patel, Meta
Wednesday April 8, 2026 10:50 - 11:00 CEST
Matrix multiplication is a central compute primitive in modern deep learning, but achieving SOTA performance on novel architectures like NVIDIA Blackwell has become a bottleneck. Existing Triton-based kernels in torch.compile struggle to keep pace with rapid hardware evolution, often forcing users to hand-write custom, architecture-specific kernels - a growing gap as hardware feature velocity...
See More →
Speakers
avatar for Nikhil Patel

Nikhil Patel

Software Engineer, Meta
Nikhil is a software engineer on the PyTorch Inductor team at Meta Superintelligence Labs, where he works on Inductor’s CuTeDSL GEMM backend. His work sits at the boundary between compiler code generation and hardware-native GPU features, optimizing large-scale training and inference... Read More →
Wednesday April 8, 2026 10:50 - 11:00 CEST
Master Stage
  Frameworks & Compilers

10:50 CEST

Lightning Talk: Step-Aligned Telemetry for Distributed PyTorch Training (Time & Memory Attribution Across Ranks) - Abhinav Srivastav, TraceOpt
Wednesday April 8, 2026 10:50 - 11:00 CEST
Distributed PyTorch training often looks healthy in system dashboards; GPU utilization is high, memory is stable and yet throughput degrades, steps jitter, or GPUs go idle intermittently. The core issue is misalignment: most telemetry is sampled by time, while training progresses by "steps", and distributed behavior is dominated by the slowest rank rather than averages. In this talk I will...
See More →
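The "slowest rank wins" point is simple to demonstrate: once telemetry is aligned to step boundaries instead of wall-clock samples, per-step time across ranks reduces to a max, and stragglers become attributable. A minimal sketch; the function names are illustrative, not the speaker's TraceML API:

```python
def effective_step_times(per_rank_steps):
    """Per-step wall time across ranks is set by the slowest rank, not the mean."""
    return [max(step) for step in zip(*per_rank_steps)]

def straggler_per_step(per_rank_steps):
    """Attribute each step to the rank that dominated it (index of the slowest)."""
    return [max(range(len(step)), key=lambda r: step[r])
            for step in zip(*per_rank_steps)]

# Three ranks, two steps; rank 2 straggles on step 1.
rank_times = [
    [0.10, 0.11],  # rank 0
    [0.10, 0.10],  # rank 1
    [0.10, 0.25],  # rank 2
]
assert effective_step_times(rank_times) == [0.10, 0.25]
assert straggler_per_step(rank_times) == [0, 2]
```

Time-sampled dashboards would average these ranks and report a healthy 0.15s; step-aligned attribution surfaces the 0.25s straggler directly.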
Speakers
avatar for Abhinav Srivastav

Abhinav Srivastav

ML Scientist, TraceOpt
ML researcher with a PhD in Computer Science. Industry experience at IBM Research, Huawei Research, and Zalando. Currently building TraceML: an open source tool that shows you the step-level breakdown of your PyTorch training run while it's still running. I am particularly interested in... Read More →
Wednesday April 8, 2026 10:50 - 11:00 CEST
Central Room
  Training Systems

11:05 CEST

Lightning Talk: Accelerating PyTorch Models With Torch.compile's C++ Wrapper Mode - Bin Bao, Meta
Wednesday April 8, 2026 11:05 - 11:15 CEST
This lightning talk introduces torch.compile's C++ wrapper mode, a powerful feature that reduces CPU overhead and significantly improves model performance. As modern GPUs become increasingly powerful and compiler optimizations make GPU kernels run faster, CPU overhead has become more visible as the bottleneck. By generating optimized C++ code instead of Python, cpp-wrapper mode directly tackles...
See More →
Speakers
avatar for Bin Bao

Bin Bao

Software Engineer, Meta
Bin Bao is a software engineer working with the PyTorch Compiler team at Meta. He focuses on developing TorchInductor optimizations and AOTInductor for C++ deployment.
Wednesday April 8, 2026 11:05 - 11:15 CEST
Junior Stage
  Frameworks & Compilers

11:05 CEST

Lightning Talk: KV-Cache Centric Inference: Building a State-Aware Serving Platform With Llm-d and VLLM - Maroon Ayoub & Martin Hickey, IBM Research
Wednesday April 8, 2026 11:05 - 11:15 CEST
We’ve spent years optimizing LLM inference around compute - faster kernels, better batching, smarter parallelism. But in production, the bottleneck increasingly isn’t FLOPs. It’s state. Specifically, the KV-cache: the attention state that makes the difference between a 4-second prefill and a sub-second cache hit. Lose it to eviction, isolate it on a single node, or fail to route to it - and...
See More →
Speakers
avatar for Martin Hickey

Martin Hickey

Senior Technical Staff Member, IBM Research
Martin Hickey is a STSM at IBM Research, focused on Open Source, Cloud Native Computing, and AI. Martin has notable contributions to open source projects like vLLM, LMCache, Kubernetes, Helm, OpenTelemetry and OpenStack. Martin is a core maintainer for LMCache and an emeritus core... Read More →
avatar for Maroon Ayoub

Maroon Ayoub

Research Scientist & Architect, IBM Research
Maroon Ayoub is a systems engineer at IBM Research focused on distributed AI infrastructure. He co-leads development of llm-d and specializes in scaling LLM inference with Kubernetes-native architectures, performance efficiency, and open source integrations.
Wednesday April 8, 2026 11:05 - 11:15 CEST
Central Room

11:05 CEST

Bringing PyTorch Monarch to AMD GPUs: Single-Controller Distributed Training on ROCm - Liz Li & Zachary Streeter, AMD
Wednesday April 8, 2026 11:05 - 11:30 CEST
PyTorch Monarch introduces a new distributed programming paradigm that enables developers to orchestrate entire GPU clusters from a single Python program. With its actor-based runtime, process mesh abstraction, and asynchronous execution model, Monarch simplifies large-scale distributed training and enables complex workflows that combine training, evaluation, and reinforcement learning within one...
See More →
Speakers
avatar for Liz Li

Liz Li

Principal AI engineer, AMD
Liz Li is a Principal AI Engineer in the AMD AI group, specializing in enabling and optimizing cutting-edge AI models on AMD Instinct GPUs for both distributed inference and training. With over 10 years of experience in computer, graphics, and AI architecture, she has previously led... Read More →
avatar for Zachary Streeter

Zachary Streeter

Senior Member of Technical Staff, AMD
I'm a computational physicist who has been working in AI for the past 5 years. I have a wide range of expertise, from mathematics to performance optimization and systems engineering. Feel free to nerd out with me! Please connect with me on LinkedIn.
Wednesday April 8, 2026 11:05 - 11:30 CEST
Founders Cafe
  Training Systems
  • Audience Level Any

11:05 CEST

Fp8 Training From Hopper To Blackwell - Luca Wehrstedt, Meta
Wednesday April 8, 2026 11:05 - 11:30 CEST
The Hopper generation of NVIDIA GPUs first enabled the use of low-precision float8 data types for training via TensorCore acceleration. However, the recipe to best leverage it was far from settled. Practitioners had to find their way through many entangled decisions around accuracy-vs-efficiency, precision-vs-range, overflows-vs-underflows, and more. The frontier was pushed further forward by the...
See More →
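One of the entangled decisions the abstract alludes to is scaling: fp8 (e4m3) has a tiny dynamic range, so tensors are rescaled so that their observed absolute maximum lands at the format's largest finite value, 448 for e4m3. A simplified per-tensor sketch that ignores rounding to the fp8 grid:

```python
FP8_E4M3_MAX = 448.0  # largest finite value representable in e4m3

def fp8_scale(amax):
    """Per-tensor scale mapping the observed absolute max onto the fp8 range."""
    return FP8_E4M3_MAX / amax

def quantize(x, scale):
    # Clamp to the representable range; real kernels also round to the fp8 grid.
    return max(-FP8_E4M3_MAX, min(FP8_E4M3_MAX, x * scale))

scale = fp8_scale(2.0)                          # this tensor's amax was 2.0
assert quantize(2.0, scale) == FP8_E4M3_MAX     # amax lands on the fp8 max
assert quantize(4.0, scale) == FP8_E4M3_MAX     # overflow is clamped, not wrapped
assert quantize(-4.0, scale) == -FP8_E4M3_MAX
```

Whether amax is measured per tensor, per row, or per block, and how stale it is allowed to be, is exactly the recipe space the talk walks through.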
Speakers
avatar for Luca Wehrstedt

Luca Wehrstedt

Software Engineer, Meta
Research Engineer in Meta's Fundamental AI Research team (FAIR). At the intersection of research and infrastructure, Luca specialized in training efficiency and distributed communication. Regular contributor to PyTorch.
Wednesday April 8, 2026 11:05 - 11:30 CEST
Master Stage
  Training Systems

11:20 CEST

Lightning Talk: Building AI That Ops Teams Actually Trust - Robert King, Chronosphere / Palo Alto Networks
Wednesday April 8, 2026 11:20 - 11:30 CEST
You've built an AI that identifies root causes of incidents faster than any human could... but there's one problem: no one trusts it. Ops teams are skeptical by nature. They've been burned by noisy alerts, black-box tools, and "intelligent" systems that weren't. This talk covers what we learned building AI for incident response across enterprise environments: why technically correct...
See More →
Speakers
avatar for Robert King

Robert King

Senior Sales Engineer, Chronosphere
Robert is Lead Enterprise Solutions Engineer at Chronosphere and an OpenTelemetry contributor. He recently presented on AI Observability with OpenTelemetry at Cloud Native London https://www.youtube.com/live/qF4wz-pha1w?si=PFzjNcGkbD4pFKnA&t=625 and has spoken at AWS Summit, and other... Read More →
Wednesday April 8, 2026 11:20 - 11:30 CEST
Junior Stage
  Inference & Production

11:20 CEST

Lightning Talk: Not All Tokens Are Equal: Semantic KV-Cache for Agentic LLM Serving - Maroon Ayoub, IBM Research & Hyunkyun Moon, Moreh
Wednesday April 8, 2026 11:20 - 11:30 CEST
Agentic AI workloads - tree-of-thought exploration, ReAct loops, hierarchical swarms - expose a fundamental mismatch in how we serve PyTorch models. Today's inference stacks treat the KV-cache as a flat, anonymous tensor buffer with blind LRU eviction. This ignores the structural reality of agents: system prompts are durable, tool definitions are shared, and reasoning scratchpads are ephemeral. We...
See More →
Speakers
avatar for Maroon Ayoub

Maroon Ayoub

Research Scientist & Architect, IBM Research
Maroon Ayoub is a systems engineer at IBM Research focused on distributed AI infrastructure. He co-leads development of llm-d and specializes in scaling LLM inference with Kubernetes-native architectures, performance efficiency, and open source integrations.
avatar for Hyunkyun Moon

Hyunkyun Moon

MLOps Engineer, Moreh
Hyunkyun Moon is an ML Platform Engineer at Moreh, focusing on building high-performance LLM inference platforms with llm-d. He is an active contributor to open-source projects, including llm-d and vLLM. With a strong background in large-scale Kubernetes-native infrastructure, he... Read More →
Wednesday April 8, 2026 11:20 - 11:30 CEST
Central Room

11:35 CEST

Lightning Talk: Enabling the Audio Modality for Language Models - Eustache Le Bihan, Hugging Face
Wednesday April 8, 2026 11:35 - 11:45 CEST
As the maintainer of everything audio in the `transformers` library, I will share how audio is being integrated into large language models, grounded in what we observe from the open-source ecosystem. Beginning with a brief overview of the current landscape of Audio LMs, I'll then highlight emerging trends in how audio is incorporated into pretrained text backbones. In particular, we examine the growing...
See More →
Speakers
avatar for Eustache Le Bihan

Eustache Le Bihan

MLE, Hugging Face
A 2024 MVA graduate, I now work on open-source audio at Hugging Face. My current focus is on standardising audio in the transformers library and strengthening support across models.
Wednesday April 8, 2026 11:35 - 11:45 CEST
Founders Cafe

11:35 CEST

Accelerating Complex-Valued Tensors With Torch.compile - Hameer Abbasi, OpenTeams Inc.
Wednesday April 8, 2026 11:35 - 12:00 CEST
torch.compile has been invaluable in accelerating many machine learning and scientific computing workflows. It has become a one-shot way to get free performance for many kinds of programs and models. However, it comes with its own set of limitations. One of these limitations is that, for a long time, torch.compile didn't accept complex-valued tensors. These tensors have many uses, from quantum...
See More →
Speakers
avatar for Hameer Abbasi

Hameer Abbasi

Senior Software Engineer I, OpenTeams, Inc.
Hameer Abbasi is a Senior Software Developer at OpenTeams, Inc. As part of his day job and also as a hobby, he has contributed to various projects in the scientific computing space, including NumPy, SciPy and PyTorch. He is also the lead maintainer of PyData/Sparse, a library for... Read More →
Wednesday April 8, 2026 11:35 - 12:00 CEST
Junior Stage
  Frameworks & Compilers

11:35 CEST

Optimizing Large MoE Inference on NVIDIA Blackwell: NVFP4, ADP, and DualPipe Strategies - Julien Demouth, NVIDIA
Wednesday April 8, 2026 11:35 - 12:00 CEST
Deploying massive Mixture-of-Experts (MoE) architectures like DeepSeek-V3/R1 requires a co-designed approach leveraging NVIDIA Blackwell’s fifth-generation Tensor Cores. This session details the transition to NVFP4 precision for MoE weights to significantly reduce memory load, coupled with FP4/FP8 KV caching to minimize attention layer footprint and enable higher concurrency. We will analyze the...
See More →
Speakers
JD

Julien Demouth

Senior Distinguished Engineer - Eng. Lead for AI Labs & Models, NVIDIA
Wednesday April 8, 2026 11:35 - 12:00 CEST
Central Room

11:35 CEST

Portable High‑Performance LLM Serving: A Triton Backend for VLLM - Burkhard Ringlein, IBM Research & Jan van Lunteren, IBM
Wednesday April 8, 2026 11:35 - 12:00 CEST
Today, vLLM is the de-facto industry standard for serving Large Language Models and is widely adopted in production. However, for much of its history, vLLM’s state-of-the-art performance was largely dependent on hand-written CUDA or HIP kernels. These kernels have typically been carefully optimized for a specific GPU platform and may pose a serious obstacle to the portability of vLLM across...
See More →
Speakers
avatar for Jan van Lunteren

Jan van Lunteren

Senior Research Scientist, IBM Research
Jan van Lunteren is a Senior Research Scientist at IBM Research Zurich holding MSc and PhD degrees in Electrical Engineering. His research has covered a broad range of topics, including high‑speed networking, near‑memory computing, and high‑performance machine‑learning inference... Read More →
avatar for Burkhard Ringlein

Burkhard Ringlein

Research Staff Member, IBM Research
Dr. Burkhard Ringlein is a Research Staff Member in the AI Platform team of IBM Research, based in Zurich. He is an accomplished AI systems researcher and designs, builds, debugs, and optimizes practical systems for low-latency, high-throughput machine learning applications. Currently... Read More →
Wednesday April 8, 2026 11:35 - 12:00 CEST
Master Stage

12:00 CEST

Attendee Lunch
Wednesday April 8, 2026 12:00 - 13:30 CEST
Menu | Boxed Lunch
Vegan (Vegetarian):
-Organic green lentils from Beauce, lentil hummus, and red cabbage pickles
-Chocolate cookie
Gluten-Free (Vegetarian):
-Organic Beauce quinoa with dried fruit, coconut yogurt with herbs
-Yogurt to drink
Classic:
-Bulgur wheat and red lentil salad (Vegetarian)
-Cereal bread, poached salmon, and vegetables
Or
-Pastrami burger with vegetable caviar and tomato sauce
Or
-Round...
See More →
Wednesday April 8, 2026 12:00 - 13:30 CEST
Open Platform

13:00 CEST

Sponsor Activity - Lobster Trap: OpenClaw in Containers
Wednesday April 8, 2026 13:00 - 13:10 CEST
In this demo, we containerize OpenClaw with Docker/Podman, wire up HashiCorp Vault so secrets work identically on a laptop and in a cluster, and then deploy to K8s. With containers, one teammate's carefully built agent becomes a deployable team standard.
Sponsor: Red Hat
Location: Red Hat within the Community Showcase
In order to facilitate networking and business relationships at the event, you may...
See More →
Wednesday April 8, 2026 13:00 - 13:10 CEST
Open Platform

13:30 CEST

Lightning Talk: From Hugging Face To Handheld: Scaling LLM Deployment With LiteRT Generative API - Cormac Brick & Weiyi Wang, Google
Wednesday April 8, 2026 13:30 - 13:40 CEST
This session will demonstrate the E2E journey of bringing custom PyTorch-based open source LLMs to cross-platform devices using LiteRT. We will show developers how to take a custom Hugging Face Transformers checkpoint and convert it for on-device execution, including:
-Taking the PyTorch model from conversion to deployment.
-Automated Optimization: How LiteRT performs automated patching of...
See More →
Speakers
avatar for Cormac Brick

Cormac Brick

Principal Engineer, Google AI Edge, Google
Cormac Brick is a Principal Engineer on the Google AI Edge team, where he specializes in frameworks and on-device AI. He has over 10 years experience in AI software, silicon and systems, with work spanning AI frameworks and ecosystems and compilers down to silicon microarchitecture... Read More →
avatar for Weiyi Wang

Weiyi Wang

Software Engineer, Google
Weiyi is the lead software engineer on LiteRT/TFLite, focusing on the compiler, NPU, and GenAI stack.
Wednesday April 8, 2026 13:30 - 13:40 CEST
Central Room

13:30 CEST

PyTorch on RISC-V: From Cross-Compilation To Native CI - Ludovic Henry, Meta
Wednesday April 8, 2026 13:30 - 13:55 CEST
As RISC-V matures into a viable architecture for AI and data center workloads, bringing first-class PyTorch support to the ecosystem is a critical milestone. This session provides a technical deep dive into the ongoing efforts to port PyTorch natively to RISC-V, moving beyond experimental cross-compilation toward a stable, tested, and optimized environment. We detail the challenges of reconciling...
See More →
Speakers
avatar for Ludovic Henry

Ludovic Henry

Software Engineering Lead, Rivos
Ludovic works at the intersection of open-source software and emerging hardware. He is a key contributor to the RISC-V ecosystem, focusing on the performance and stability of the AI stack. His recent work involves optimizing native dependencies like OpenBLAS and oneDNN and establishing... Read More →
Wednesday April 8, 2026 13:30 - 13:55 CEST
Junior Stage

13:30 CEST

PyTorch Symmetric Memory + NCCL Device APIs: A New Path Towards Multi-GPU Kernels - Ke Wen & Sylvain Jeaugey, NVIDIA
Wednesday April 8, 2026 13:30 - 13:55 CEST
As large models shift toward inference and Mixture-of-Experts (MoE) architectures, small batch sizes and dynamic routing present new scaling challenges. Fused, customized multi-GPU kernels are emerging as the solution, but programming them for high performance remains difficult. This talk introduces a paradigm shift enabled by PyTorch Symmetric Memory and NCCL device APIs. PyTorch Symmetric...
See More →
Speakers
avatar for Ke Wen

Ke Wen

Principal Software Architect, NVIDIA
Ke Wen works on distributed features, including Symmetric Memory, multi-GPU kernels, Expert Parallelism, inference, pipelining and graph analysis.
avatar for Sylvain Jeaugey

Sylvain Jeaugey

Distinguished Engineer, NVIDIA
Sylvain has been developing the NCCL library since its inception in 2015. He has been working on optimizing communication libraries for large parallel systems for more than 20 years.
Wednesday April 8, 2026 13:30 - 13:55 CEST
Master Stage

13:30 CEST

Optimizing CPU LLM Inference in PyTorch: Lessons From VLLM - Crefeda Rodrigues, Arm Limited & Fadi Arafeh, Arm
Wednesday April 8, 2026 13:30 - 13:55 CEST
vLLM has emerged as a reference inference stack in the PyTorch ecosystem for high-throughput large language model serving. CPUs continue to play an important role in LLM inference, supporting cost-sensitive deployments, hybrid CPU/GPU serving, and batch or off-peak workloads on general-purpose infrastructure. In this talk, we examine CPU-based LLM inference through the lens of PyTorch internals,...
See More →
Speakers
avatar for Crefeda Rodrigues

Crefeda Rodrigues

Staff Software Engineer, Arm
Crefeda Rodrigues is a Staff Software Engineer at Arm, focusing on performance and scalability driven machine learning software optimization for Arm server CPUs. She previously worked on large-scale climate and weather model optimization as a postdoctoral researcher at the University... Read More →
avatar for Fadi Arafeh

Fadi Arafeh

Senior Machine Learning Engineer, Arm
Fadi is a Senior Machine Learning Engineer at Arm, working on optimizing PyTorch and vLLM for Arm Infrastructure cores. Prior to that, Fadi obtained a BSc in Artificial Intelligence from the University of Manchester.
Wednesday April 8, 2026 13:30 - 13:55 CEST
Founders Cafe
  Inference & Production

13:45 CEST

Lightning Talk: Slash LLM Cold-Start Times by Pre-distributing GPU Caches - Billy McFall & Maryam Tahhan, Red Hat
Wednesday April 8, 2026 13:45 - 13:55 CEST
Are your Large Language Model (LLM) deployments stuck waiting for GPU kernels to compile? If you are running distributed inference at scale, your infrastructure is likely wasting time rebuilding the same GPU kernel cache for every single instance. You may not even realize how much time and how many resources are being consumed by this rebuilding. This session is designed for platform engineers and ML...
See More →
Speakers
avatar for Billy McFall

Billy McFall

Sr. Principal Software Engineer, Red Hat
Billy McFall has been a software engineer in the Emerging Tech Networking Team within the Office of the CTO at Red Hat for 9+ years. Billy previously worked on Kubernetes/OpenShift networking, including the integration of the NVIDIA DPU into OpenShift. Billy has also been a maintainer of... Read More →
avatar for Maryam Tahhan

Maryam Tahhan

Principal Engineer, Red Hat
Maryam is a Principal Engineer in Red Hat's Office of the CTO, where she focuses on standardising CPU inferencing performance evaluation to help effectively validate and scale ML workloads.
Wednesday April 8, 2026 13:45 - 13:55 CEST
Central Room
  Inference & Production

14:00 CEST

Lightning Talk: Pluggable PyTorch LLM Inference Architecture With VLLM and AWS Neuron Backends - Yahav Biran, Annapurna Labs & Maen Suleiman, Amazon
Wednesday April 8, 2026 14:00 - 14:10 CEST
As PyTorch-based LLM serving matures, the challenge shifts from monolithic inference stacks to integrating diverse hardware accelerators efficiently. This session explores how modular plugin architectures enable PyTorch models to run optimally across backends—demonstrating AWS Trainium integration into vLLM through standardized interfaces. We'll examine how vLLM's Hardware Plugin architecture...
See More →
Speakers
MS

Maen Suleiman

Product Manager, Amazon
avatar for Yahav Biran

Yahav Biran

Principal Architect, Amazon
Yahav Biran is a Principal Architect at AWS, focusing on large-scale AI workloads. He contributes to open-source projects and publishes in AWS blogs and academic journals, including the AWS compute and AI blogs and the Journal of Systems Engineering. He frequently delivers technical... Read More →
Wednesday April 8, 2026 14:00 - 14:10 CEST
Junior Stage

14:00 CEST

Lightning Talk: Backpropagation-Free Optimization in PyTorch - Andrii Krutsylo, Polish Academy of Sciences
Wednesday April 8, 2026 14:00 - 14:10 CEST
Backpropagation is not the only mechanism for training deep networks. This talk presents a compact, implementation-driven map of backpropagation-free training methods, organized around representative algorithms that expose key design trade-offs. We focus on four families: Difference Target Propagation (target-based credit assignment), Direct Feedback Alignment (random feedback without weight...
See More →
Speakers
AK

Andrii Krutsylo

PhD Candidate, Institute of Computer Science, Polish Academy of Sciences
Andrii Krutsylo is a deep learning researcher focusing on continual learning and optimization dynamics. His work studies experience replay, gradient-free and local learning rules, and structured optimization for adaptive, resource-efficient systems.
Wednesday April 8, 2026 14:00 - 14:10 CEST
Central Room

14:00 CEST

Lightning Talk: Debugging the Undebuggable: Introducing Torch.distributed.debug - Tristan Rice, Meta, PyTorch
Wednesday April 8, 2026 14:00 - 14:10 CEST
Distributed training in PyTorch enables unprecedented scale, but it also introduces notoriously difficult debugging challenges. When a job with thousands of ranks hangs or slows down, identifying the root cause can feel like searching for a needle in a haystack. This lightning talk introduces the new PyTorch Distributed Debug Server, a powerful, interactive tool designed to bring clarity and...
See More →
Speakers
avatar for Tristan Rice

Tristan Rice

Software Engineer, PyTorch Distributed, Meta
Software engineer working on PyTorch Distributed and large scale training.
Wednesday April 8, 2026 14:00 - 14:10 CEST
Founders Cafe

14:00 CEST

Deploying PyTorch Models To the Browser and Beyond With Transformers.js - Joshua Lochner, Hugging Face
Wednesday April 8, 2026 14:00 - 14:25 CEST
This session presents a comprehensive engineering roadmap for running Hugging Face Transformers entirely locally in your web browser using Transformers.js. We will explore the end-to-end pipeline required to export, optimize, and deploy PyTorch models to the web, leveraging emerging web technologies like WebGPU for efficient, cross-platform inference. We will dive into the technical nuances of...
See More →
Speakers
avatar for Joshua Lochner

Joshua Lochner

Creator of Transformers.js, Hugging Face
Bringing the power of machine learning to the web. Currently working on Transformers.js (@huggingface 🤗)
Wednesday April 8, 2026 14:00 - 14:25 CEST
Master Stage

14:15 CEST

Lightning Talk: Distributed AI Without the Infrastructure Tax - Yahav Biran, Annapurna Labs & Maen Suleiman, Amazon
Wednesday April 8, 2026 14:15 - 14:25 CEST
Running distributed AI workloads in production requires solving three problems: package compatibility, hardware abstraction, and network configuration. AWS Neuron Deep Learning Containers (DLCs) address all three by providing open-source, production-ready images for Trainium and Inferentia. This lightning talk shows how DLCs eliminate common failure modes. We'll cover three layers: First, how DLCs...
See More →
Speakers
MS

Maen Suleiman

Product Manager, Amazon
avatar for Yahav Biran

Yahav Biran

Principal Architect, Amazon
Yahav Biran is a Principal Architect at AWS, focusing on large-scale AI workloads. He contributes to open-source projects and publishes in AWS blogs and academic journals, including the AWS compute and AI blogs and the Journal of Systems Engineering. He frequently delivers technical... Read More →
Wednesday April 8, 2026 14:15 - 14:25 CEST
Junior Stage

14:15 CEST

Lightning Talk: Inside VLLM's KV Offloading Connector: Async Memory Transfers for Higher Inference Throughput - Nicolò Lucchesi, Red Hat
Wednesday April 8, 2026 14:15 - 14:25 CEST
Every LLM request produces KV-cache state that is expensive to recompute. However, GPU memory is limited in size, and when it fills up, entries are discarded from the cache. A natural mitigation is expanding the KV cache to CPU DRAM, which is meaningfully larger than GPU memory. vLLM 0.11.0 introduced the Offloading Connector - an asynchronous, pluggable API for KV-cache offloading which is bundled...
See More →
Speakers
avatar for Nicolò Lucchesi

Nicolò Lucchesi

Senior Machine Learning Engineer, Red Hat
Nicolò is a Senior Machine Learning Engineer at Red Hat with a background in Deep Learning and Computer Vision. He works on Inference Optimization for vLLM, where he is a maintainer.
Wednesday April 8, 2026 14:15 - 14:25 CEST
Central Room
  Inference & Production
  • Audience Level Any
  • Slides Attached Yes

14:15 CEST

Lightning Talk: Scaling Recommendation Systems To 2K GPUs and Beyond - Zain Huda, Meta
Wednesday April 8, 2026 14:15 - 14:25 CEST
TLDR: In this session, we go over one of the key technologies behind Ads model scaling at Meta: 2D sparse parallelism, which scales sparse recommendation embedding tables beyond 1k GPUs to 8k GPUs, enabling the largest Ads model training runs in production at Meta. Scaling laws have dominated LLMs and shown the industry that we can achieve better model performance through scaling. The same scaling law...
See More →
Speakers
avatar for Zain Huda

Zain Huda

Software Engineer, Meta
Zain works on large scale training systems for recommender systems at Meta. He works on TorchRec, a library for distributed parallelism for sparse recommender models. He is also one of the authors of 2D sparse parallelism.
Wednesday April 8, 2026 14:15 - 14:25 CEST
Founders Cafe

14:30 CEST

Lightning Talk: Torch-Spyre: Compiling To a Multi-core Dataflow Accelerator With Inductor - David Grove & Olivier Tardieu, IBM
Wednesday April 8, 2026 14:30 - 14:40 CEST
Torch-Spyre (https://github.com/torch-spyre/torch-spyre) is an open source project that provides a PyTorch PrivateUse1 device with OpenReg, including an Inductor backend, for the IBM Spyre Accelerator. IBM Spyre is a high-performance energy-efficient AI accelerator featuring 32 AI-optimized compute cores each with on-chip interconnect and compiler-managed scratchpad memory. Our goal in this...
See More →
Speakers
avatar for Dave Grove

Dave Grove

Distinguished Research Scientist, IBM
David Grove is a Distinguished Research Scientist at IBM T.J. Watson, NY, USA. He has been a software systems researcher at IBM since 1998, specializing in programming language implementation and scalable runtime systems. He has authored more than sixty peer-reviewed publications... Read More →
avatar for Olivier Tardieu

Olivier Tardieu

Principal Research Scientist, Manager, IBM
Dr. Olivier Tardieu is a Principal Research Scientist and Manager at IBM T.J. Watson, NY, USA. He joined IBM Research in 2007. His current research focuses on cloud-related technologies, including Serverless Computing and Kubernetes, as well as their application to Machine Learning... Read More →
Wednesday April 8, 2026 14:30 - 14:40 CEST
Junior Stage
  Frameworks & Compilers

14:30 CEST

Lightning Talk: Every Millisecond Counts: The Fine-tuning Journey of an Ultra-Efficient PyTorch Model for the Edge - Pavel Macenauer, NXP Semiconductors
Wednesday April 8, 2026 14:30 - 14:40 CEST
From smart cameras that protect privacy by analyzing video on-device, to wearables that interpret voice and motion instantly, to industrial sensors that prevent failures before they happen, edge AI is shaping our everyday routines and transforming our lives. Eliminating cloud dependency and making connectivity optional is essential for keeping data local. Without the cloud, our options become...
See More →
Speakers
avatar for Pavel Macenauer

Pavel Macenauer

AI/ML R&D Software Lead, NXP Semiconductors
A software lead at NXP Semiconductors, leading teams that develop tools and runtime libraries and enable AI on edge-class devices. Both professionally and out of human curiosity, Pavel has developed software visualizing the world around us. Initially through the lens of a camera, then from... Read More →
Wednesday April 8, 2026 14:30 - 14:40 CEST
Central Room
  Inference & Production

14:30 CEST

Seamless Integration: Custom Kernels in the Torch.compile Stack Without Graphbreaks - Kshiteej Kalambarkar, Masaki Kozuki & Pawel Gadzinski, NVIDIA
Wednesday April 8, 2026 14:30 - 14:55 CEST
Custom kernels are essential for high-performance PyTorch workflows, but their integration often comes with a hidden cost. While torch.compile promises speedups, calling custom operations typically triggers graph-breaks: fallbacks to Eager mode that introduce overhead and negate your performance gains. In this session, we provide a practical roadmap for making your extensions "compiler-aware"....
See More →
Speakers
avatar for Kshiteej Kalambarkar

Kshiteej Kalambarkar

Software Engineer Frameworks, NVIDIA
Kshiteej Kalambarkar is a software engineer at NVIDIA specializing in PyTorch and compiler technologies, with experience in torch.compile and custom kernel integration
avatar for Masaki Kozuki

Masaki Kozuki

Software Engineer, NVIDIA
Masaki Kozuki is working at NVIDIA on PyTorch.
avatar for Pawel Gadzinski

Pawel Gadzinski

Senior Performance Engineer - Deep Learning, NVIDIA
Pawel Gadzinski is a Deep Learning Performance Engineer at NVIDIA, where he works on the Transformer Engine library, enabling state-of-the-art techniques for accelerating transformer models on NVIDIA GPUs, with a focus on low-precision training.
Wednesday April 8, 2026 14:30 - 14:55 CEST
Master Stage

14:30 CEST

From Responses To Trajectories: Multi-Turn and Multi-Environment Reinforcement Learning - Kashif Rasul & Sergio Paniego Blanco, Hugging Face
Wednesday April 8, 2026 14:30 - 14:55 CEST
Post-training of LLMs with reinforcement learning is increasingly moving beyond static prompt–response pairs and preference optimization methods such as DPO, toward trajectory-based optimization. This talk focuses on the latest advances in multi-turn and multi-environment GRPO training, enabling LLMs to learn from interactive, agent-like experiences, including interacting with simulated...
See More →
Speakers
avatar for Kashif Rasul

Kashif Rasul

Research Scientist, Hugging Face
Kashif has a PhD in Mathematics from the Freie Universität Berlin. He is passionate about high-performance computing and reinforcement learning, has presented at NVIDIA's GTC in 2009 and at StrangeLoop in 2012, and also contributes to a number of data science and deep learning... Read More →
avatar for Sergio Paniego Blanco

Sergio Paniego Blanco

Machine Learning Engineer, Hugging Face
Sergio has an extensive background in open source and artificial intelligence, the field in which he also earned his PhD. For more than eight years he has taken part in initiatives such as Google Summer of Code, contributing as both a developer and a mentor. Currently... Read More →
Wednesday April 8, 2026 14:30 - 14:55 CEST
Founders Cafe
  Training Systems

14:45 CEST

Lightning Talk: Building a PyTorch‑native VLLM Plugin for IBM Spyre - Thomas Parnell, IBM Research & Thomas Ortner, IBM Research Europe - Zurich
Wednesday April 8, 2026 14:45 - 14:55 CEST
IBM Spyre is an AI accelerator used across IBM Z and Power systems for agentic inference in production. Today, we serve models on Spyre using upstream vLLM together with an out-of-tree platform plugin. While the current plugin delivers crucial functionality for our business, it re-uses relatively little of upstream vLLM’s capabilities, and also carries a high maintenance cost. In this talk, we...
See More →
Speakers
avatar for Thomas Parnell

Thomas Parnell

Principal Research Scientist, IBM Research
Thomas received his B.Sc. and Ph.D. degrees in mathematics from the University of Warwick. U.K., in 2006 and 2011, respectively. He began his career in the field of EDA, working at Arithmatica and Siglead before joining IBM Research in 2013. During his time at IBM, Thomas has worked... Read More →
avatar for Thomas Ortner

Thomas Ortner

Research Scientist, IBM Research Europe - Zurich
Thomas Ortner is a Research Scientist at IBM Research Europe, Switzerland, in the group of Emerging Computing and Circuits. He holds a PhD and a MSc in Computer Science, a MSc degree in Technical Physics and a MSc degree in Software Engineering and Management from Graz University... Read More →
Wednesday April 8, 2026 14:45 - 14:55 CEST
Junior Stage

14:45 CEST

Lightning Talk: Full-Stack PyTorch Robotics VLA: From Data To Edge Via ExecuTorch/OpenVINO - Samet Akcay & Dmitriy Pastushenkov, Intel
Wednesday April 8, 2026 14:45 - 14:55 CEST
While research-centric tools have lowered the entry barrier for robotics data collection, transitioning Vision-Language-Action models to production remains challenging due to fragmented edge deployment paths. This session presents a unified, PyTorch-native workflow spanning the full robotics lifecycle, from data capture and curation to optimized edge execution. We introduce a modular Physical AI...
See More →
Speakers
avatar for Dmitriy Pastushenkov

Dmitriy Pastushenkov

AI Software Product Manager, Intel
Dmitriy Pastushenkov is a passionate Software Product Manager at Intel with more than 20 years of comprehensive and international experience in the industrial automation, industrial Internet of Things (IIoT) and real-time operating systems and AI. Dmitriy has held various roles in... Read More →
avatar for Samet Akcay

Samet Akcay

Principal AI Engineer, Intel
Samet Akcay is a Principal AI Engineer at Intel who leads ML R&D efforts across Open Edge Platform libraries, including Intel Geti, Datumaro, Anomalib, Training Extensions, and Inference libraries. His research specializes in self-supervised learning and multi-modal object detection... Read More →
Wednesday April 8, 2026 14:45 - 14:55 CEST
Central Room
  Inference & Production
  • Audience Level Any
  • Slides Attached Yes

14:55 CEST

Birds of A Feather: NCCL in the Wild: Scaling Communications To Thousands of GPUs - Jeff Hammond, Gabrielle Talavera, Ke Wen & Asma Farjallah, NVIDIA
Wednesday April 8, 2026 14:55 - 15:20 CEST
We will share the latest updates to NCCL and how they can be used in PyTorch. We invite the community to share their feedback on challenges using NCCL at scale and ways to improve integration of NCCL with PyTorch applications. Some of the important topics for community discussion include: - Symmetric memory support and GPU-initiated networking. - Copy-engine collectives and maximizing overlap of...
See More →
Speakers
avatar for Asma Farjallah

Asma Farjallah

AI DevTech, NVIDIA
Asma Farjallah is an AI Developer Technology Engineer at NVIDIA. Prior to her role as DevTech, she was part of the Solution Architect team at NVIDIA for 5 years and was part of the global energy team. Before joining NVIDIA, Asma worked for Intel for 4 years as an Application Engineer... Read More →
avatar for Gabrielle Talavera

Gabrielle Talavera

Product Manager, NVIDIA
Gabrielle Talavera is the Product Manager for NCCL at NVIDIA, focused on shaping the product roadmap and improving the experience of teams building on GPU‑accelerated software. She joined NVIDIA in 2021 as a Solutions Architect, helping customers adopt NVIDIA software and debug... Read More →
avatar for Jeff Hammond

Jeff Hammond

Distinguished Engineer, NVIDIA Helsinki Oy
Jeff Hammond is a Distinguished Engineer in the NCCL team at NVIDIA focused on user education and research outreach. His background is in parallel application and algorithm development, open-source software, and supercomputing architecture. Jeff has made significant contributions... Read More →
avatar for Ke Wen

Ke Wen

Principal Software Architect, NVIDIA
Ke Wen works on distributed features, including Symmetric Memory, multi-GPU kernels, Expert Parallelism, inference, pipelining and graph analysis.
Wednesday April 8, 2026 14:55 - 15:20 CEST
Open Platform

14:55 CEST

Coffee Break
Wednesday April 8, 2026 14:55 - 15:25 CEST
Menu:
-Lemon cake
-Caramelized arlette
-Seasonal fruits (GF, Vegan)
-Roasted pumpkin cake
-Dry fruits and dry grapes mix (GF, Vegan)
-Chocolate Cookie (GF, Vegan)
Wednesday April 8, 2026 14:55 - 15:25 CEST
Open Platform

14:55 CEST

Meet the Ray Maintainers
Wednesday April 8, 2026 14:55 - 15:25 CEST
Meet the core maintainers of Ray at this session! Come and discuss use cases, features, roadmap with us, or just learn how the Ray development happens under the hood.
Speakers
avatar for Artur Niederfahrenhorst

Artur Niederfahrenhorst

Member of Technical Staff, Anyscale
Artur is a member of the technical staff at Anyscale, the company that recently donated Ray to the Linux Foundation. He has been contributing to Ray since early 2022, where his main contributions have been in distributed reinforcement learning. Artur majored in Computer Science at... Read More →
Wednesday April 8, 2026 14:55 - 15:25 CEST
Open Platform
  Meet the Developers
  • Audience Level Any

15:25 CEST

Lightning Talk: Trinity Large - Torchtitan on 2000+ B300s - Matej Sirovatka, Prime Intellect
Wednesday April 8, 2026 15:25 - 15:35 CEST
In this talk, we'll cover how to use torchtitan to scale training of ultra-sparse mixture-of-experts models across over 2,000 GPUs. We'll walk through the pre-training of Trinity Large, a 400B mixture-of-experts model trained entirely using torchtitan, focusing on maximizing throughput and minimizing the impact of hardware-induced failures. Along the way, we'll discuss challenges like fault...
See More →
Speakers
avatar for Matej Sirovatka

Matej Sirovatka

Research Engineer, Prime Intellect
Research Engineer at Prime Intellect, mainly focusing on distributed training, performance and scaling.
Wednesday April 8, 2026 15:25 - 15:35 CEST
Founders Cafe
  Training Systems

15:25 CEST

Bridging the Hardware Gap With Code Harnesses on the Hugging Face Kernels Hub - Ben Burtenshaw, Hugging Face
Wednesday April 8, 2026 15:25 - 15:50 CEST
What: We share experiments and tooling to standardise kernel writing for agentic coding. We present an end-to-end experiment benchmarking 6 harnesses across 10 models on CUDA and Metal kernel writing. We compare agent cost, kernel latency, VRAM usage, and end inference performance, and show how the Kernels Hub enables distribution at scale. We demo two tools: Kernels Hub: Infrastructure for...
Speakers
Ben Burtenshaw
Community, Hugging Face
Ben Burtenshaw is an MLE in the Hugging Face open source community team, specializing in agents, LLMs, and fine-tuning. He leads the development of open-source educational initiatives like the Agents Course, the MCP Course, and the LLM Course, which bridge the gap between complex...
Wednesday April 8, 2026 15:25 - 15:50 CEST
Master Stage

15:25 CEST

Beyond the Theory: What Actually Breaks When You Scale Your Disaggregated Pytorch Models - Ekin Karabulut & Ron Kahn, NVIDIA
Wednesday April 8, 2026 15:25 - 15:50 CEST
As inference demand explodes, new techniques to optimize these deployments have emerged. One such technique is disaggregated inference, which splits inference into differently optimized workloads (e.g. prefill and decode) on separate workers. The theory is straightforward: better GPU utilization, inference performance, and tighter control over SLAs. The deployment in production is not. Scaling...
Speakers
Ekin Karabulut
AI/ML Developer Advocate, NVIDIA
Ekin is a Developer Advocate at NVIDIA, following the acquisition of Run:ai. Prior to that, as a data scientist she specialized in the privacy implications of federated learning systems with DNNs in distributed environments. Currently, she is exploring the efficient usage of large...
Ron Kahn
Senior Software Engineer, NVIDIA
Ron Kahn is a Senior Software Engineer in the NVIDIA Run:ai platform team. Ron works on the design and implementation of workload management systems that abstract Kubernetes complexity for AI practitioners. When not simplifying AI training jobs, Ron can be found cooking something...
Wednesday April 8, 2026 15:25 - 15:50 CEST
Central Room
  Inference & Production
  • Audience Level Any
  • Slides Attached Yes

15:25 CEST

Building Trust for Users and Regulators Alike: A Cost-Efficient PyTorch Path To Compliance-as-Code - Raja Gopal Hari Vijay, Zoho Corporation
Wednesday April 8, 2026 15:25 - 15:50 CEST
Traditional compliance relies on retroactive logs and manually stitched audit trails, while Opacus, CrypTen, and Captum address isolated concerns without providing end-to-end lifecycle traceability. Compliance-as-Code embeds regulatory controls as executable logic within training and inference pipelines, turning compliance into a continuous engineering function and reducing audit costs. ...
Speakers
Raja Gopal Hari Vijay
Member Leadership Staff, Zoho Corporation
At Zoho, Raja builds large-scale Video AI (CCTV analytics, edge inference, privacy-aware deployments) on PyTorch, drives green computing via custom accelerators and FPGAs, and owns a custom Linux distribution for Zoho products and agentic workflows with security reasoning across LSM...
Wednesday April 8, 2026 15:25 - 15:50 CEST
Junior Stage

15:40 CEST

Lightning Talk: Faster Than SOTA Kernels in Torch.compile With Subgraph Fusions and Custom Op Autotuning - Elias Ellison & Paul Zhang, Meta
Wednesday April 8, 2026 15:40 - 15:50 CEST
Unlocking state-of-the-art performance, this talk reveals how subgraph and custom operator autotuning in torch.compile deliver breakthrough speedups, surpassing previous SOTA for matmul and distributed collective ops. DecomposeK is a novel subgraph optimization in PyTorch, designed to accelerate matrix multiplication when the inner dimension (K) is very large. DecomposeK achieves this, delivering up...
Speakers
Elias Ellison
Software Engineer, Meta
Elias has been working on the PyTorch team for four years, most recently on the torch.compile stack.
Paul Zhang
Software Engineer, Meta
Paul Zhang is currently a software engineer working on PyTorch and Triton at Meta, ensuring that PyTorch and PT2 best utilize the hardware they run on. Prior to this, Paul did extensive work on recommendation systems for training and inference, optimizing performance and...
Wednesday April 8, 2026 15:40 - 15:50 CEST
Founders Cafe

15:55 CEST

Lightning Talk: Why Logging Isn’t Enough: Making PyTorch Training Regressions Visible in Practice - Sahana Venkatesh, Wayve
Wednesday April 8, 2026 15:55 - 16:05 CEST
PyTorch teams often log rich training metrics, yet still discover training regressions late, after significant developer time and GPU budget have already been spent. In this talk, I’ll share a practical pattern we used to turn PyTorch training metrics into an operational guardrail for large-model training. The approach combines scheduled short and long training runs, standardized performance and...
Speakers
Sahana Venkatesh
Software Engineer, Wayve
Wednesday April 8, 2026 15:55 - 16:05 CEST
Central Room
  Training Systems

15:55 CEST

From Gradients To Governance: Making PyTorch Lineage-Aware - Kateryna Romashko & Clodagh Walsh, Red Hat
Wednesday April 8, 2026 15:55 - 16:20 CEST
PyTorch was built to track how models learn, but not whether they should have. As AI systems increasingly operate on regulated, jurisdiction-bound, and sovereign data, lineage and policy can no longer live outside the runtime. This talk explores data sovereignty as a first-class constraint and argues that lineage is the missing primitive in modern ML frameworks. Building on PyTorch’s dynamic...
Speakers
Kateryna Romashko
Associate Software Engineer, Red Hat
Kateryna Romashko is a Software Engineer and a Master’s student in Computer Science, currently working in the Emerging Technology team at Red Hat. Her work focuses on ML systems, data lineage, and event-driven architectures, with hands-on experience across ML platforms, distributed...
Clodagh Walsh
Software Engineer, Red Hat
Clodagh is a software engineer at Red Hat working on the Emerging Technologies team under the office of the CTO. She has experience working with cloud native technologies. She is currently working on a range of AI related projects focused on topics such as MLOps and dLLMs.
Wednesday April 8, 2026 15:55 - 16:20 CEST
Master Stage
  Responsible AI & Compliance

15:55 CEST

DualPipe from Scratch: Implementing DeepSeek's 5D Parallelism in PyTorch - Dev Jadhav, ING Bank
Wednesday April 8, 2026 15:55 - 16:20 CEST
The DeepSeek-V3 paper describes 5D parallelism and DualPipe at a high level, but leaves critical implementation details undocumented. This session presents our open-source PyTorch reference implementation that fills those gaps, verified against the original architecture and designed for learning and extension. We'll share what we discovered building it from scratch: why K_pe is shared across heads...
Speakers
Dev Jadhav
Tech Lead ML Engineer, ING Bank
Dev Jadhav is a production AI/ML engineer with 10+ years building AI systems at scale. He currently leads ML engineering at Major Bank, developing financial-grade AI and large-scale model operations. Dev is the creator of DeepSeek From Scratch, an open-source implementation of DeepSe...
Wednesday April 8, 2026 15:55 - 16:20 CEST
Founders Cafe
  Training Systems

15:55 CEST

Sponsored Session: Fault-Tolerant Training: How We Build Reliable Clusters for Distributed AI Workloads - Cyril Kondratenko & Maurits de Groot, Nebius
Wednesday April 8, 2026 15:55 - 16:20 CEST
Large-scale distributed AI training is highly sensitive to infrastructure failures, where even a single node disruption can halt progress and waste substantial compute. This talk presents Nebius’s approach to fault-tolerant training, combining reliability metrics such as goodput, MTBF, and MTTR with automated infrastructure practices including health checks, workload isolation, node replacement,...
Speakers
Cyril Kondratenko
AI/ML Specialist Solutions Architect, Nebius
Maurits de Groot
AI/ML Specialist Solutions Architect, Nebius
Wednesday April 8, 2026 15:55 - 16:20 CEST
Junior Stage

16:10 CEST

Lightning Talk: Ball Tracking and Detection in Soccer Videos - Comparison of VLMs and Traditional Pipelines - Maciej Szymkowski, Future Processing
Wednesday April 8, 2026 16:10 - 16:20 CEST
Nowadays, Vision-Language Models (VLMs) have plenty of different applications. However, we cannot assume they are the most accurate and precise solution for every problem; their capabilities must be compared with other pipelines. In this presentation, we compare on-premise models (Qwen 3 and InternVL-3.5) and cloud-based...
Speakers
Maciej Szymkowski
AI Researcher and Senior Machine Learning Engineer, Future Processing
Maciej Szymkowski, PhD, is a Senior ML Engineer at Future Processing. Formerly Head of AI at Łukasiewicz PIT, his academic background spans BUT, WUT, and AGH. With 45+ publications, he specializes in Computer Vision (med/transport/sport), VLMs, and LLMs. His industry experience includes...
Wednesday April 8, 2026 16:10 - 16:20 CEST
Central Room
  Applications & Case Studies

16:25 CEST

Lightning Talk: Bridging the Gap: Engineering Compliant "Glass Box" Medical AI With PyTorch - Muhammad Saqib Hussain, Neurosonic & Mohaddisa Maryam, Neurosonic Academy
Wednesday April 8, 2026 16:25 - 16:35 CEST
While state-of-the-art models like NeuroBOLT demonstrate mathematical excellence in EEG-to-fMRI synthesis, they often remain clinically opaque. With the EU AI Act classifying medical AI as "high-risk," hospitals cannot deploy "black boxes"; they require systems that are transparent, auditable, and legally compliant. This session presents a "Clinical Auditing System" built within the PyTorch...
Speakers
Mohaddisa Maryam
Miss, Neurosonic Academy
I am a first-year student of Medicine in Italy.
Muhammad Saqib Hussain
Medical Student, AI Researcher and Neurotech Founder, ClinExplain
Muhammad Saqib is a 4th-year medical student at Comenius University Bratislava and Founder of Neurosonic Academy. His M.D. thesis explores AI for Sleep Medicine. Leveraging PyTorch and Captum, he builds "Glass Box" auditing frameworks to validate generative neuroimaging models against...
Wednesday April 8, 2026 16:25 - 16:35 CEST
Founders Cafe
  Applications & Case Studies

16:25 CEST

De-mystifying PyTorch for ASICs: When (and Why) To Move Your Development To AI Accelerators - Alpha Romer Coma, Kollab Philippines
Wednesday April 8, 2026 16:25 - 16:50 CEST
GPU availability and cost are squeezing ML teams, making ASICs like Google TPUs and AWS Trainium attractive alternatives. But does the software stack hold up? This session moves beyond the datasheets to provide a practical, code-first reality check on migrating PyTorch workloads to ASICs. We will de-mystify the underlying compiler stacks, comparing PyTorch/XLA (TPU) and PyTorch Neuron (Trainium),...
Speakers
Alpha Romer Coma
Associate Engineer, Cloud Development, Kollab Philippines
Alpha is an Associate Cloud Engineer at Kollab and a CS undergraduate at FEU Tech, Philippines. He specializes in multimodality with text, videos, and audio, and works on Accelerated Computing with Google TPUs and AWS Trainium.

For 5 months, he pushed Google Cloud TPU v4s to their limit to train vision-language models for use cases like internet brain rot recognition and detection of cognitively overloading content called sludge videos with 92% accuracy...
Wednesday April 8, 2026 16:25 - 16:50 CEST
Central Room
 
