In the past decade, Machine Learning (ML), and especially Deep Learning (DL), has outperformed traditional rule-based algorithms on a wide variety of tasks, such as image recognition, object detection, and natural language processing. In CoE RAISE, we have additionally seen that ML can unlock new potential in fields such as high energy physics (HEP), remote sensing, seismic imaging, additive manufacturing, and acoustics. Training DL models, however, is no trivial task, especially if the model is large and has many hyperparameters (HP). To tackle this challenge, Hyperparameter Optimization (HPO) can be used to systematically explore the search space of possible HP configurations and, paired with the computing power of modern High Performance Computing (HPC) systems, can drastically speed up the process of improving DL models. The aim of this talk is to introduce HPO and the major challenges data scientists face when tuning their models, and to give some examples from a HEP use case where large-scale HPO on HPC systems was successfully applied.
Eric Wulff holds an MSc in Engineering Physics from Lund University and is a fellow in the IT department at CERN. He is the Task Leader for the use case on LHC collision event reconstruction at the European Center of Excellence in Exascale Computing (CoE RAISE). His experience includes large-scale distributed training and hyperparameter optimization of AI models on supercomputers, as well as using quantum computing for DL-based algorithms. Prior to joining CERN, Eric was a Machine Learning Engineer at Axis Communications, where he worked on object detection and video analytics using DL techniques.