A Scalable Platform for Training and Inference Using Kubeflow at CERN - Philipp Gadow & Diana Gaponcic, CERN

At CERN, as in many other places, machine learning has been gaining popularity and in several cases is already deployed in production for simulation, anomaly detection, and physics analysis. This talk goes into the details of how a Kubeflow-based machine learning platform handles every step from data preparation and interactive analysis to distributed training and inference. It also lays out the requirements for model repositories, versioning, and metadata, with the aim of making the case for the evolution of platforms that address this use case. A real-world use case from the ATLAS experiment at CERN serves as an example to demonstrate the benefits and challenges of using cloud native technologies for modern ML and AI workloads.
