7-8 April, 2026
Paris, France
Note: The schedule is subject to change.

The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference Europe 2026 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.

This schedule is automatically displayed in CEST (UTC/GMT +2).
Tuesday April 7, 2026 15:55 - 16:05 CEST


End-to-end observability is non-negotiable for production LLMs to track performance, attribute costs, and validate optimizations. Generating actionable traces from complex distributed inference remains a significant challenge.

We implemented tracing for llm-d, a high-performance distributed LLM inference framework. Using manual OpenTelemetry instrumentation with carefully crafted spans at critical paths, we expose insights that generic tooling can't capture.

This talk explores how distributed tracing illuminates requests through unique inference scenarios:

* Prefix cache-aware routing: Track cache hits and validate whether intelligent scheduling improves TTFT
* Prefill/decode disaggregation: Analyze why each request chose split vs unified processing based on cache locality
* Wide expert-parallelism: Profile MoE models across multi-node deployments
* Workload autoscaling: Correlate request patterns with scaling decisions

Attendees will learn why LLMOps requires a new approach to distributed tracing, contrasting it with traditional microservices, and how to instrument inference stacks effectively. Walk away ready to add meaningful observability to your own deployments.
Speakers
Greg Pereira

Sr. Machine Learning Engineer, Red Hat
Greg began his career as an SRE focusing on CI/CD and automation in the Emerging Technologies org at Red Hat. After transferring to the platform and services team, he started from the ground up, refocusing on AI-centric software development. Three years later he has been involved in building...
Sally O'Malley

Principal Software Engineer, Red Hat

Master Stage
