High-throughput computing collaboration
The high-throughput computing collaboration (HTCC) evaluates upcoming Intel technologies for potential use in the ‘online’ computing infrastructure (the ‘trigger’ and data-acquisition systems) of the LHC experiments. The investigations, which are being carried out with the LHCb experiment, are split into four main areas: (1) assessing future generations of Intel® Xeon Phi™ processors; (2) comparing the merits of different types of FPGA-based accelerators, especially the new Intel® Xeon®+FPGA prototypes; (3) understanding the potential benefits of Intel® QuickAssist Technology (QAT) as a hardware accelerator for encryption and compression; and (4) testing high-speed network fabrics, such as Intel® Omni-Path.
As the luminosity of the LHC is ramped up, the rate of particle collisions will significantly increase. This, coupled with upgrades to the LHC experiments, will lead to far higher data rates coming from the particle detectors. At the LHCb experiment, for example, the data throughput following upgrades made in 2021 is expected to be 30 times higher than it is today. Thus, to ensure that the necessary ‘trigger’ and data-acquisition systems can be put in place, it is vital to test different types of computing architecture, accelerators, and network fabrics. It is also important to consider how the algorithms used in these systems can be adapted to take advantage of modern hardware technologies.
In area 1, we developed a proto-application that implements a new Kalman filter algorithm, which is an important component in LHCb’s data-analysis framework. By fully taking advantage of the vectorisation technologies available and ensuring both optimal data alignment and operation scheduling, we were able to achieve close to a seven-fold speedup on an Intel® Xeon Phi™ 7210 (using a production dual-socket Haswell system as a baseline). Based on this work and the lessons learnt, we created a new Kalman filter module for the LHCb experiment’s current software framework. During 2017, we also explored ways of distributing workloads across the cores of modern NUMA architectures.
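The central idea behind the speedup is to process many tracks in parallel, with track-state data laid out as a structure of arrays (SoA) so that each field is contiguous in memory and the per-track loop maps cleanly onto SIMD lanes. The sketch below illustrates this layout with a deliberately simplified one-dimensional Kalman update; LHCb’s actual filter operates on higher-dimensional track states, and the function and variable names here are illustrative, not taken from the LHCb code base.

```python
# Illustrative sketch only: a batched 1-D Kalman filter update over many
# tracks, with state stored in structure-of-arrays (SoA) form. In the real,
# vectorised implementation the inner loop is executed across SIMD lanes;
# here plain Python shows the data layout and arithmetic.

def kalman_update_soa(x, p, measurements, r):
    """In-place measurement update applied across a batch of tracks.

    x, p          -- SoA lists: state estimate and covariance, one entry per track
    measurements  -- one measurement per track
    r             -- measurement-noise variance (shared by all tracks)
    """
    for i in range(len(x)):               # one SIMD lane per track, conceptually
        k = p[i] / (p[i] + r)             # Kalman gain
        x[i] = x[i] + k * (measurements[i] - x[i])   # corrected state
        p[i] = (1.0 - k) * p[i]           # reduced covariance
    return x, p

# Example: two tracks updated in one batched call.
x, p = kalman_update_soa([0.0, 0.0], [1.0, 1.0], [2.0, 4.0], 1.0)
```

Keeping `x`, `p`, and the measurements as separate contiguous arrays (rather than an array of per-track objects) is what allows the compiler, on real hardware, to load and update many tracks per instruction.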
In area 2, we ported a key particle-identification algorithm used by LHCb, called ‘RICH’, to a prototype hybrid system based on an Intel® Broadwell CPU and an Intel® Arria® 10 FPGA. Tests run on this system yielded positive results, both in terms of speed-up and energy efficiency. We also tested the algorithm used to decode raw data from LHCb’s calorimeter on this system; the decoding ran over 100 times faster than on a single Intel® Xeon® thread.
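Raw-data decoding of this kind suits FPGAs because it consists of fixed bit-field manipulations applied independently to a stream of words, which map directly onto a hardware pipeline. The sketch below shows the general pattern in software; the word layout used (a 12-bit channel identifier and a 20-bit ADC count packed into a 32-bit word) is hypothetical and is not LHCb’s actual calorimeter format.

```python
# Illustrative sketch only: unpacking fixed-format 32-bit raw words into
# (channel, adc) pairs. The bit layout below is a hypothetical example of
# the kind of decoding that maps well onto an FPGA pipeline; it is not the
# real LHCb calorimeter data format.

def decode_words(words):
    """Decode an iterable of 32-bit raw words into (channel, adc) tuples."""
    decoded = []
    for w in words:
        channel = (w >> 20) & 0xFFF   # upper 12 bits: channel identifier
        adc = w & 0xFFFFF             # lower 20 bits: ADC count
        decoded.append((channel, adc))
    return decoded

# Example: one word carrying channel 5 with an ADC count of 300.
words = [(5 << 20) | 300]
hits = decode_words(words)   # -> [(5, 300)]
```

On an FPGA, each such word can be decoded in a single pipeline stage, with many words in flight at once, which is where the large speed-up over a single CPU thread comes from.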
In area 3, we studied the use of Intel® QuickAssist Technology for ‘on the fly’ data compression in LHCb’s trigger and data-acquisition system, achieving an aggregate throughput of 120 Gb/s.
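QAT offloads DEFLATE-style compression to dedicated hardware; the figure of merit in a data-acquisition context is the sustained compression throughput in bits per second. The sketch below measures that metric using the standard-library `zlib` as a software stand-in for the hardware accelerator; the function name and the choice of compression level are illustrative assumptions, not part of the QAT API.

```python
# Illustrative sketch only: measuring 'on the fly' compression throughput
# in software, using zlib as a stand-in for the DEFLATE compression that
# Intel QAT performs in dedicated hardware.
import time
import zlib

def compression_throughput_gbps(payload: bytes, level: int = 1) -> float:
    """Compress one payload and return the achieved throughput in Gb/s.

    level=1 favours speed over ratio, as is typical for inline compression.
    """
    start = time.perf_counter()
    zlib.compress(payload, level)
    elapsed = time.perf_counter() - start
    return (len(payload) * 8) / elapsed / 1e9   # input bits per second, in Gb/s

# Example: throughput for a 1 MB highly compressible payload.
rate = compression_throughput_gbps(b"a" * 1_000_000)
```

In a real deployment the same measurement would be repeated per accelerator and summed across devices to obtain the aggregate system throughput.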
In area 4, we mostly focused on improving our understanding of the cabling and routing considerations related to the use of InfiniBand and Intel® Omni-Path. This work plays a key role in helping us to develop the most appropriate architecture for the LHCb experiment’s upgraded trigger and data-acquisition systems. We also began studying failure recovery.
Our work on the Kalman filter and RICH proto-applications has shown that the Intel® Knights Landing platform offers a compelling alternative to Intel® Xeon® processors. This needs to be benchmarked further against other large cycle consumers within LHCb’s framework. The results on QAT compression are also exciting and warrant further investigation. In addition, we will explore further use cases for the Intel® Xeon®+FPGA prototypes in particle identification and raw-data decoding for the detector, and will continue our evaluation of networking technologies.
- A. Amihalachioaei, Calorimeter RAW data decoding using Intel Xeon + FPGA computing platform, LHCb Online note, under review, 2017. http://cern.ch/go/T7q7
- D. H. Cámpora Pérez, O. Awile and C. Potterat, A high-throughput Kalman filter for modern SIMD architectures, Proc. Euro-Par 2017: Parallel Processing Workshops, (Lecture Notes in Computer Science, Springer, 2017). http://cern.ch/go/8tzM
- D. H. Cámpora Pérez, O. Awile and O. Bouizi, Cross-architecture Kalman filter benchmarks on modern hardware platforms, Proc. ACAT, (Journal of Physics: Conference Series, IOP, 2017). http://cern.ch/go/ht6P
- D. H. Cámpora Pérez and O. Awile, An Efficient Low-Rank Kalman Filter for Modern SIMD Architectures, Concurrency and Computation: Practice and Experience, 2017. http://cern.ch/go/PQS8
- C. Faerber et al., Particle identification on a FPGA accelerated compute platform for the LHCb Upgrade, 2017, DOI: 10.1109/TNS.2017.2715900. http://cern.ch/go/7mvK
- C. Faerber, Acceleration of a Particle Identification Algorithm used for the LHCb Upgrade with the new Intel® Xeon®+FPGA, TIPP17, in publication, 2017. http://cern.ch/go/9GPg
- P. Fernandez, D. del Rio Astorga, M. F. Dolz, J. Fernandez, O. Awile and J. D. Garcia, Parallelizing and optimizing LHCb-Kalman for Intel Xeon Phi KNL processors, Parallel, Distributed, and Network-Based Processing, 2018. http://cern.ch/go/RQ9L
- M. Manzali et al., Large-Scale DAQ Tests for the LHCb Upgrade, IEEE Transactions on Nuclear Science, vol. 64, no. 6, pp. 1486-1493, June 2017. http://cern.ch/go/6rPJ
- C. Quast, A. Pohl, B. Cosenza, B. Juurlink and R. Schwemmer, Accelerating the RICH Particle Detector Algorithm on Intel Xeon Phi, Parallel, Distributed, and Network-Based Processing, 2018, submitted. http://cern.ch/go/6h9q
- B. Vőneki, S. Valat, R. Schwemmer, N. Neufeld, J. Machen and D. H. Cámpora Pérez, Evaluation of 100 Gb/s LAN networks for the LHCb DAQ upgrade, 2016 IEEE-NPSS Real Time Conference (RT), Padua, 2016, pp. 1-3. http://cern.ch/go/Hp7c
- B. Vőneki, S. Valat, R. Schwemmer and N. Neufeld, RDMA optimizations on top of 100 Gbps Ethernet for the upgraded data acquisition system of LHCb, Technology and Instrumentation in Particle Physics 2017. http://cern.ch/go/NB8C
- C. Faerber, Acceleration of a particle identification algorithm used for the LHCb Upgrade with the new Intel® Xeon®+FPGA (25 May), Presented at TIPP17, Beijing, China, 2017. http://cern.ch/go/Mw8V
- C. Faerber, Acceleration of High Energy Physics Algorithms used for the LHCb Upgrade with the new Intel® Xeon®+FPGA (20 June), Presented at ISC17, Frankfurt, Germany, 2017. http://cern.ch/go/9Qnc
- O. Awile, Computing at CERN: Challenges and Opportunities (26 June), Presented at Platform for Advanced Scientific Computing, Lugano, 2017. http://cern.ch/go/7lBW
- D.H. Cámpora Pérez, High Performance Computing Meets High Energy Physics (26 June), Presented at Platform for Advanced Scientific Computing, Lugano, 2017. http://cern.ch/go/TpM9
- C. Faerber, FPGA Acceleration for High-Throughput Data Processing in High-Energy Physics Experiments (26 June), Presented at PASC17, Lugano, Switzerland, 2017. http://cern.ch/go/FF9q
- A. Carta, Connecting the dots, (15 August), Presented at CERN openlab summer students’ lightning talks, Geneva, 2017. http://cern.ch/go/Lg8q
- L. F. De Figueiredo, Evolutionary Optimization of LHCb software compilation (15 August), Presented at CERN openlab summer students’ lightning talks, Geneva, 2017. http://cern.ch/go/DvL7
- M. Hassan Zahraee, Implementing a libfabric provider for DPDK (15 August), Presented at CERN openlab summer students’ lightning talks, Geneva, 2017. http://cern.ch/go/z7PH
- N. Neufeld, Cross-architecture Kalman filter benchmarks on modern hardware platforms (21 August), Presented at ACAT, Seattle, 2017. http://cern.ch/go/9qgS
- D. H. Cámpora Pérez, A high-throughput Kalman filter for modern SIMD architectures (28 August), Presented at the HeteroPar workshop, Santiago de Compostela, 2017. http://cern.ch/go/N9qH
- L. Atzori, Intel High-Throughput Computing Collaboration (Platform) (21 September), Presented at CERN openlab Open Day, Geneva, 2017. http://cern.ch/go/Vh6Q
- S. Valat, Intel High-Throughput Computing Collaboration (Silicon) (21 September), Presented at CERN openlab Open Day, Geneva, 2017. http://cern.ch/go/6wxw
- C. Faerber, FPGA Compute Acceleration for High-Throughput Data Processing in High-Energy Physics Experiments (31 October), Presented at CERN Computing Seminar, Geneva, Switzerland, 2017. http://cern.ch/go/zpL6