Loading…
7-8 April, 2025
Paris, France
View More Details & Registration
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference Europe 2026 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in CEST (UTC/GMT +2). To see the schedule in your preferred timezone, please select from the drop-down menu to the right, above "Filter by Date."
Wednesday April 8, 2026 14:00 - 14:10 CEST


Distributed training in PyTorch enables unprecedented scale, but it also introduces notoriously difficult debugging challenges. When a job with thousands of ranks hangs or slows down, identifying the root cause can feel like searching for a needle in a haystack. This lightning talk introduces the new PyTorch Distributed Debug Server, a powerful, interactive tool designed to bring clarity and control to the chaos of distributed debugging. We will provide a high-level overview of its architecture and core features, demonstrating how it provides a unified interface to inspect stack traces, analyze performance, and diagnose hangs across all workers simultaneously. Attendees will learn how this extensible server can dramatically reduce debugging time and improve the reliability of large-scale training jobs.
Speakers
avatar for Tristan Rice

Tristan Rice

Software Engineer, PyTorch Distributed, Meta
Software engineer working on PyTorch Distributed and large scale training.
Wednesday April 8, 2026 14:00 - 14:10 CEST
Founders Cafe

Sign up or log in to save this to your schedule, view media, leave feedback and see who's attending!

Share Modal

Share this link via

Or copy link