The Sched app allows you to build your schedule but is not a substitute for your event registration. You must be registered for PyTorch Conference Europe 2026 to participate in the sessions. If you have not registered but would like to join us, please go to the event registration page to purchase a registration.
As the maintainer of everything audio in the `transformers` library, I'll share in this talk how audio is being integrated into large language models, grounded in what we observe across the open-source ecosystem.
Beginning with a brief overview of the current landscape of audio LMs, I'll highlight emerging trends in how audio is incorporated into pretrained text backbones. In particular, I'll examine the growing convergence of architectural choices, many inspired by VLMs, as well as newer concepts such as audio tokenization and streaming.
The core of the talk provides the audience with key technical insights: audio encoders versus audio tokenizers, and their respective advantages and limitations. It covers the motivations behind introducing concepts such as audio tokenizers and audio processors into `transformers`, shows how these design choices are reflected in the library, and explains how PyTorch tooling is leveraged to make audio a standardized modality for the open-source community.
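To make the encoder-versus-tokenizer distinction concrete, here is a minimal NumPy sketch; all shapes, names, and the random "codebook" are illustrative toys, not the `transformers` API. An audio encoder maps spectrogram frames to continuous embeddings, while an audio tokenizer quantizes each frame to a discrete id from a codebook, as neural audio codecs do.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "mel spectrogram": 20 frames of 80 mel bins (shapes are illustrative).
mel = rng.standard_normal((20, 80))

# Audio *encoder*: projects each frame to a continuous embedding vector
# (here just a random linear map standing in for a learned encoder).
proj = rng.standard_normal((80, 16))
embeddings = mel @ proj            # shape (20, 16), continuous values

# Audio *tokenizer*: assigns each frame the id of its nearest codebook
# entry (vector quantization), yielding discrete tokens an LM can consume.
codebook = rng.standard_normal((256, 80))
dists = ((mel[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=-1)
token_ids = dists.argmin(axis=1)   # shape (20,), ids in [0, 256)

print(embeddings.shape)   # continuous path: one vector per frame
print(token_ids[:5])      # discrete path: one integer id per frame
```

The continuous path preserves fine acoustic detail but requires the LM to accept non-text embeddings; the discrete path lets audio share the LM's token interface at the cost of quantization loss.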
A 2024 MVA graduate, I now work on open-source audio at Hugging Face. My current focus is on standardising audio in the transformers library and strengthening support across models.
Wednesday April 8, 2026 11:35 - 11:45 CEST Founders Cafe