Chicago Python User Group (Chipy) x mHUB Social

About This Event

Inference at Scale: Transcribing Millions of Insurance Calls with Whisper and Azure ML
Jimmy Scray | Intermediate | ~25 min

Ever wondered what it takes to run AI models at truly massive scale? This talk goes beyond the "hello world" of ML to show what production inference actually looks like. Jimmy walks through a real-world system built to transcribe millions of insurance call recordings using OpenAI's Whisper model on Azure ML, sharing the Python code, architecture decisions, and hard-won lessons along the way.

You'll learn about:
- Scaling from a notebook prototype to a distributed pipeline across thousands of GPU workers
- Benchmarking CPU vs. GPU workloads and maximizing throughput
- Orchestrating jobs with Azure Machine Learning
- Handling spot-instance interruptions gracefully
- Writing resilient Python that can recover from failures and resume automatically

This one is less about ML theory and more about inference engineering: the unglamorous but critical work of making models fast, reliable, and cost-effective in production.

Great for anyone interested in: Python, distributed

See the rest of the description and register on mHUB.

Date & Time

Thursday, July 9, 2026

2:30 PM - 4:35 PM

Location

mHUB

mHUB, 1623 West Fulton Street, Chicago, IL, USA