Project Goal
The project aims to use Micron CXL-enabled memory devices in the ingestion and data processing chain of the L1 Scouting system at CMS. The devices are intended to give multiple processors and compute accelerators coherent, seamless access to buffered data, and to provide low-latency, short-term storage for both raw and processed data at scale.
Background
The Compute Express Link (CXL) protocol is a new alternative protocol that runs over the standard PCIe physical layer and dynamically multiplexes I/O, cache and memory protocols. It is designed to enable a new generation of heterogeneous and disaggregated computing through efficient resource sharing, shared memory pools, faster movement of operands and results between accelerators and target devices, and significantly reduced latency. CMS intends to benefit from these capabilities in the online processing solution for the L1 Scouting data, and in doing so will pave the way for the technology's use in the wider community.
Progress in 2024
2024 began with extensive benchmarking, which showed that the Micron CXL-enabled memory expansion modules (CZ120s with 256 GB capacity) could deliver unprecedented sustained memory bandwidth per channel at latencies approaching those of pure DRAM. Following these results, CXL support was added to the data acquisition software of the CMS L1 Scouting system, allowing data to be written to the memory lake prototype that hosts the Micron devices. To extend the capabilities of this system beyond pure direct-attached-memory addressing, we tested and deployed a new open-source CXL-enabled filesystem, the Fabric-Attached Memory File System (FAMFS), developed in tandem with Micron engineers for shared, disaggregated file management.
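As a rough illustration of file-based access to memory, the sketch below writes a record into a pre-sized file through a shared memory mapping and reads it back. It is not the CMS data acquisition code. The function names, the record contents and the use of a temporary file are all assumptions made for the example; on a real system the path would point to a file on the memory-backed filesystem mount.

```python
import mmap
import os
import tempfile


def write_record(path: str, offset: int, payload: bytes) -> None:
    """Copy payload into a pre-sized file at the given offset via a memory mapping."""
    with open(path, "r+b") as f:
        # Length 0 maps the whole file; the file must already have its final size.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_WRITE) as buf:
            buf[offset:offset + len(payload)] = payload
            buf.flush()


def read_record(path: str, offset: int, length: int) -> bytes:
    """Read length bytes at the given offset through a read-only mapping."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as buf:
            return bytes(buf[offset:offset + length])


# Stand-in for a file on a memory-backed mount (hypothetical; an ordinary
# temporary file is used here so the sketch runs anywhere).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.truncate(4096)  # pre-size the file, filled with zeros
    demo_path = tmp.name

write_record(demo_path, 128, b"orbit-0001")
result = read_record(demo_path, 128, 10)
os.remove(demo_path)
```

The writer and reader only share a path and an offset convention, which is the property that makes a file interface convenient when several hosts or processes need to reach the same buffered data.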
Three hackathons were hosted throughout the year at the CERN IdeaSquare, which accelerated and showcased work on the CXL implementation of the data acquisition software.
In November, two researchers on the project attended the SuperComputing conference in Atlanta, Georgia, USA. In addition to connecting with global leaders in high-performance computing and engaging with the forefront of computing innovation, the researchers visited the Micron offices in Atlanta for hands-on training on Micron's new xCiter platform, an FPGA-based near-memory compute platform built on CXL 1.1.
Next Steps
In early 2025, we will receive a full CXL memory-lake system with up to 5.5 TB of CXL-enabled storage (22 modules) in a single chassis, accessed over the first available CXL switch. This memory chassis will be connected to at least four host PCs, which will receive the physics data from the L1 Scouting system over multiple 100 Gb/s TCP/IP connections.
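The capacity figure can be checked with simple arithmetic, and the same numbers give a feel for how long the chassis could buffer data at full ingest rate. The sketch below assumes that all 22 modules are 256 GB CZ120s and that "5.5 TB" uses the binary prefix (1024 GB per TB). The link count of eight is purely hypothetical, since the report only says "multiple" connections, and protocol overhead is ignored.

```python
MODULES = 22           # from the report
MODULE_GB = 256        # CZ120 capacity quoted in the report
LINK_GBIT_PER_S = 100  # per-connection line rate from the report


def total_capacity_gb(modules: int = MODULES, module_gb: int = MODULE_GB) -> int:
    """Total chassis capacity in GB."""
    return modules * module_gb


def fill_time_s(capacity_gb: float, n_links: int,
                link_gbit_per_s: float = LINK_GBIT_PER_S) -> float:
    """Seconds needed to fill the capacity at full line rate on n_links links."""
    ingest_gb_per_s = n_links * link_gbit_per_s / 8  # 8 bits per byte
    return capacity_gb / ingest_gb_per_s


capacity_gb = total_capacity_gb()  # 22 * 256 = 5632 GB
capacity_tb = capacity_gb / 1024   # 5.5, matching the quoted figure

assumed_links = 8  # hypothetical, e.g. four hosts with two links each
seconds_to_fill = fill_time_s(capacity_gb, assumed_links)  # 5632 / 100 = 56.32 s
```

Under these assumptions the full chassis corresponds to somewhat under a minute of data at line rate; the real figure depends on the number of links actually deployed and on the achieved throughput.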
Project Coordinator: Emilio Meschi
Technical Team: Davide Cristoforetti, Thomas Owen James, Emilio Meschi, Giovanna Lazzari Miotto, Guilherme Paulino
Collaboration Liaisons from Micron: Jason Adlard, John Benavides, Robert Bredehoft, Tony Brewer, Emanuele Confalonieri, Henri Courtier, Glen Edwards, John Groves, Andrey Kudryavtsev, Skyler Windh
In partnership with: Micron, CMS Experiment
