Name: Lightning Talk: Cross-Region Model Serving: PyTorch Inference, Observability & LLMOps - Suraj Muraleedharan, Amazon Web Services
Start: 2026-04-07T15:40:00+0200
End: 2026-04-07T15:50:00+0200

7-8 April, 2026
Paris, France
Note: The schedule is subject to change.

Lightning Talk: Cross-Region Model Serving: PyTorch Inference, Observability & LLMOps - Suraj Muraleedharan, Amazon Web Services

Tuesday April 7, 2026 15:40 - 15:50 CEST

Founders Cafe

As PyTorch models move to production, organizations face a critical challenge: deploying, monitoring, and operating inference at scale across multiple regions. Single-region serving is well understood, but multi-region LLMOps (model distribution, observability, failover, and cost management) remains ad hoc and difficult for many organizations.

This session presents production-tested architectures for multi-region PyTorch inference and LLMOps workflows. We cover:

Serving: Multi-region TorchServe/KServe on Kubernetes with latency-based routing, blue-green deployments, model versioning, and automated failover with circuit breakers.

Observability: OpenTelemetry distributed tracing and Prometheus/Grafana dashboards for latency, throughput, GPU utilization, and LLM-specific metrics such as time-to-first-token and KV-cache hit rate.

LLMOps: CI/CD pipelines for cross-region model deployment with automated rollback, drift detection, and SLO-based alerting.
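
To make the serving item above concrete, here is a minimal sketch of latency-based routing combined with a per-region circuit breaker. It is not taken from the talk: the class `RegionEndpoint`, the function `pick_region`, the thresholds, and the region names are all hypothetical, and a real deployment would more likely delegate this logic to a service mesh or a global load balancer.

```python
import time
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class RegionEndpoint:
    """Hypothetical per-region serving endpoint guarded by a circuit breaker."""
    name: str
    latency_ms: float              # latest observed latency (assumed to come from probes)
    failure_threshold: int = 3     # consecutive failures before the circuit opens
    cooldown_s: float = 30.0       # how long an open circuit rejects traffic
    failures: int = 0
    opened_at: Optional[float] = None

    def available(self, now: float) -> bool:
        # Closed circuit: the region can take traffic.
        if self.opened_at is None:
            return True
        # Open circuit: allow a trial request only once the cooldown has elapsed.
        return now - self.opened_at >= self.cooldown_s

    def record_success(self) -> None:
        # Any success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None

    def record_failure(self, now: float) -> None:
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = now   # open (or re-open) the circuit


def pick_region(endpoints: Iterable[RegionEndpoint],
                now: Optional[float] = None) -> Optional[RegionEndpoint]:
    """Return the lowest-latency endpoint whose circuit is not open, else None."""
    now = time.monotonic() if now is None else now
    candidates = [e for e in endpoints if e.available(now)]
    return min(candidates, key=lambda e: e.latency_ms, default=None)
```

In this sketch, failover is simply what happens when the fastest region's circuit opens: `pick_region` falls through to the next-lowest-latency region until the cooldown allows a trial request.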

Attendees leave with serving architectures, dashboards, and deployment pipelines using open-source tooling.
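
As an illustration of the LLM-specific metrics and SLO-based alerting mentioned in the abstract, the following sketch measures time-to-first-token over a token stream and checks a simple SLO. The helper names `measure_ttft` and `slo_violated`, the threshold, and the target ratio are assumptions made for this example; in practice these samples would typically be exported as a Prometheus histogram and evaluated by alerting rules rather than in application code.

```python
import time
from typing import Callable, Iterable, List, Optional, Sequence, Tuple


def measure_ttft(tokens: Iterable[str],
                 clock: Callable[[], float] = time.perf_counter
                 ) -> Tuple[Optional[float], List[str]]:
    """Consume a token stream; return (seconds until the first token, all tokens).

    The first element is None when the stream yields no tokens.
    """
    start = clock()
    ttft: Optional[float] = None
    collected: List[str] = []
    for tok in tokens:
        if ttft is None:
            ttft = clock() - start   # time elapsed before the first token arrived
        collected.append(tok)
    return ttft, collected


def slo_violated(samples: Sequence[float],
                 threshold_s: float,
                 target_ratio: float) -> bool:
    """True when the share of samples at or under threshold_s is below target_ratio."""
    if not samples:
        return False                 # no data: do not alert
    good = sum(1 for s in samples if s <= threshold_s)
    return good / len(samples) < target_ratio
```

For example, with hypothetical TTFT samples of 0.1, 0.2, 0.9, and 0.3 seconds, a 0.5-second threshold, and a 90% target, three of four requests meet the threshold, so `slo_violated` reports a violation.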

Speakers

Suraj Muraleedharan

Principal Platform Engineer, Amazon Web Services

Principal Engineer driving technical strategy and building mission-critical foundational platforms for AI, HPC, and distributed systems, bridging the gap between infrastructure, AI research, and product organizations.

Slides: PyTorchCon EU26 Cross Region Model Serving (PDF)

Inference & Production

Audience Level: Advanced
Slides Attached: Yes

PyTorch Conference Europe 2026
