Project Goal
CERN leads Work Package 4 (WP4), which aims to develop and expand AI methods along representative use cases from research and industry that have a strong focus on data-driven technologies, i.e., analysing data-rich descriptions of physical phenomena. The outcomes are applicable to intelligent workflows that include innovative AI methods and techniques optimized for HPC-to-Exascale systems. The tasks include the capability to evaluate prototype algorithms based on experimental and/or simulation data, code performance on Exascale HPC systems, and the quality of data models.
Background
WP4 contains four tasks: Event reconstruction and classification at the CERN HL-LHC, led by CERN; Seismic imaging with remote sensing for energy applications, led by The Cyprus Institute (CYI); Defect-free additive manufacturing, led by Flanders Make (FM); and Sound Engineering, led by the University of Iceland (UOI).
The task led by CERN consists of developing a GPU-native, AI-based algorithm for particle-flow reconstruction that can easily be accelerated by modern heterogeneous hardware. This algorithm, called Machine-Learned Particle-Flow (MLPF), is developed in collaboration with CMS and acts as a representative AI use case from high-energy physics (HEP). Some of the most important contributions from this task include the implementation and execution of distributed training and large-scale hyperparameter optimization on HPC systems, which significantly improved physics performance. Another area of work has been optimizing the developed algorithms on various heterogeneous architectures.
Progress
Significant progress has been made in terms of MLPF physics performance. One large contributor to this improvement has been the generation of new and larger datasets with a new ground truth definition. The use of large-scale distributed hyperparameter optimization has continued from previous years. Furthermore, model performance prediction based on Support Vector Regression (SVR) and Quantum-SVR has been implemented and applied successfully to the problem of tuning the hyperparameters of MLPF.
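To illustrate the idea behind performance prediction, the following is a minimal sketch and not the project's implementation. It assumes that a predictor is fitted on the early validation loss of finished trials to estimate their final validation loss, so that unpromising trials can be stopped early. A one-dimensional least-squares line stands in here for the SVR mentioned above, and all names (`fit_line`, `predict_final`, `should_stop`) and numbers are hypothetical.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for y = a * x + b (a simple stand-in for an SVR)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


# Hypothetical finished trials: (early validation loss, final validation loss).
history = [(1.0, 0.5), (2.0, 1.0), (3.0, 1.5), (4.0, 2.0)]
a, b = fit_line([h[0] for h in history], [h[1] for h in history])


def predict_final(early_loss):
    """Predicted final validation loss for a trial, given its early loss."""
    return a * early_loss + b


def should_stop(early_loss, best_final_so_far, tolerance=0.1):
    """Stop a trial whose predicted final loss is clearly worse than the best seen."""
    return predict_final(early_loss) > best_final_so_far + tolerance


# A running trial with early loss 3.0, compared against a best final loss of 0.5.
print(should_stop(3.0, best_final_so_far=0.5))
```

In a real search, the predictor would typically take several points of the partial learning curve and the hyperparameter values as features; the single feature here only keeps the sketch short.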
In 2023, the focus of the MLPF effort shifted from working on closed CMS data to an open dataset that we generated ourselves. The dataset consists of electron-positron collision events at a center-of-mass energy of 380 GeV with full GEANT4 simulation, suitable for detector reconstruction and made publicly available in the EDM4hep format. Using this dataset, a comparison between a graph neural network and a kernel-based transformer was carried out, demonstrating that both avoid quadratic memory allocation and computational cost while achieving realistic reconstruction. Furthermore, it was shown that hyperparameter tuning on HPC significantly enhanced the physics performance of the models, improving the jet transverse momentum resolution by up to 50% compared to the baseline hand-written algorithm. In addition, the resulting model is highly portable across a variety of hardware accelerators.
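As a rough illustration of how a resolution improvement of this kind can be quantified, the sketch below computes a jet response resolution as the interquartile range of the ratio of reconstructed to true transverse momentum, divided by its median. This definition is an assumption made for the example, since the text above does not specify the metric, and the function names and input values are made up.

```python
from statistics import median, quantiles


def response_resolution(pt_reco, pt_true):
    """Interquartile range divided by median of the per-jet response pt_reco / pt_true."""
    response = [r / t for r, t in zip(pt_reco, pt_true)]
    q1, _, q3 = quantiles(response, n=4, method="inclusive")
    return (q3 - q1) / median(response)


def relative_improvement(baseline, candidate):
    """Fractional reduction of the resolution with respect to the baseline."""
    return (baseline - candidate) / baseline


# Made-up jets, all with a true transverse momentum of 100 GeV.
pt_true = [100.0] * 5
pt_baseline = [80.0, 90.0, 100.0, 110.0, 120.0]  # hypothetical baseline reconstruction
pt_model = [90.0, 95.0, 100.0, 105.0, 110.0]     # hypothetical ML-based reconstruction

res_baseline = response_resolution(pt_baseline, pt_true)
res_model = response_resolution(pt_model, pt_true)
print(relative_improvement(res_baseline, res_model))
```

A smaller value of `response_resolution` means a narrower response distribution, so a positive `relative_improvement` indicates that the candidate reconstructs jet momenta more precisely than the baseline.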
Next Steps
In T4.1, the MLPF studies on the open electron-positron collision dataset have been completed, and the focus will shift back to simulated CMS-based datasets with proton-proton collisions. A strategic decision has been made to migrate the optimization code of MLPF from TensorFlow to PyTorch. The reason for this is the superior support for cutting-edge ML algorithms offered by PyTorch, as well as its suitability for easy and fast development of new algorithms. The work started in late 2023 and will continue in 2024.
The next steps in the data challenge line of work are to finalize 200G connectivity testing with FZJ and then to carry out XRootD/Rucio dataset testing this spring. RTU tests with GÉANT will investigate the UnicoreFTP transfer service in the same time period.
Project Coordinator: Maria Girone, Andreas Lintermann
Technical Team: Maria Girone, David Southwick, Eric Wulff
Collaboration Liaisons: Marcel Aach (Forschungszentrum Jülich), Naveed Akram (The Cyprus Institute), Gabriele Cavallaro (Forschungszentrum Jülich), Kurt de Grave (Flanders Make), Andreas Lintermann (Forschungszentrum Jülich), Arnis Lektauers (Riga Technical University), Morris Riedel (University of Iceland), Nikos Savva (The Cyprus Institute), Eric Michael Sumner (University of Iceland), Liang Tian (University of Iceland), Eric Verschuur (Delft University of Technology)
In partnership with: CoE RAISE (EC funded), discover more on coe-raise.eu