Technical documents
Find the latest technical documents published by members of our collaboration below.
Title | Date | File |
---|---|---|
Validation of Deep Convolutional Generative Adversarial Networks for High Energy Physics Calorimeter Simulations
In particle physics the simulation of particle transport through detectors requires an enormous amount of computational resources, utilizing more than 50% of the resources of the CERN Worldwide Large Hadron Collider Grid. |
25-03-21 | |
Reduced Precision Strategies for Deep Learning: A High Energy Physics Generative Adversarial Network Use Case
Deep learning is finding its way into high energy physics by replacing traditional Monte Carlo simulations. However, deep learning still requires an excessive amount of computational resources. |
18-03-21 | |
The True Random Privacy Project
The True Random Privacy (TRP) project, developed during the Random Power hackathon 2020, aims to create a new differential privacy solution for images, embedding a state-of-the-art feature description technique. |
05-11-20 | |
Machine Learning applications on OpenStack log data analysis
A massive amount of data is generated by the OpenStack cloud services in the form of service logs. Besides timestamps and log level fields, these logs contain additional information useful for pattern analysis. |
31-08-19 | |
Automation Tools for Invenio
Invenio is an open source framework, initially developed at CERN, but with many external users and contributors at this moment and prospects of growing even more in the future. Its nature as a digital |
31-08-19 | |
Graph Neural Network Inference on FPGA
Graph neural networks show promise for track reconstruction in the Large Hadron Collider use case because the data are high-dimensional and sparse. |
31-08-20 | |
Summer-student report: Automation Tools for Invenio
Invenio is an open source framework, initially developed at CERN, but with many external users and contributors at this moment and prospects of growing even more in the future. Its nature as a digital |
31-08-19 | |
Summer-student report: Neuromorphic Computing in High Energy Physics
At particle colliders, more data are produced than what the experiments can store for further analysis. This is why the incoming collisions are processed in real time by a so-called trigger system. At the |
31-08-19 | |
Summer-student report: IMPLEMENTATION OF A QUANTUM PERCEPTRON IN INTEL-QS
With the pervasiveness of high-speed computers and processors, computer companies are looking for new technologies to incorporate into their products and use as a competitive advantage in the market. Two modern and rapidly growing techniques are quantum computing and the use |
31-08-19 | |
Summer-student report: Performance monitoring using intel performance counters for HEP applications
The HPC service at CERN provides Linux batch infrastructure to run high performance computing applications that require MPI clusters. The HPC cluster is therefore dedicated to running MPI programs. |
31-08-19 | |
Summer-student report: EOS Winston: Expert Systems for Automated Diagnosis and Remediation
This report describes EOS Winston, an event driven alerting and mitigation automation platform. Through the use of expert rules and online anomaly detection algorithms, it catches events which |
31-08-19 | |
Summer-student report: Portable Early Prediction of Sepsis from Clinical Data on Intel Myriad X
Sepsis is a life-threatening condition where microbes present in the blood stream cause an unregulated immune response from the body which can result in tissue damage, multi-organ failure |
31-08-19 | |
Summer-student-report: Deep I/O Performance Analysis of CernVM-FS using Modern Linux Tools
This report describes performance analysis of the CernVM-FS FUSE which is a software distribution service used in high-energy physics research. The performance analysis was conducted in both kernel |
31-08-19 | |
Summer-student report: EOS Integration into OpenStack Manila
The purpose of this report is to provide a brief overview of what OpenStack is, focusing on the advantages of the integration of its Manila component at CERN. Furthermore, this document briefly |
31-08-19 | |
Summer-student-report: Continuous integration for containerized scientific workflows
On this project, we decided to implement two solutions that integrate REANA and GitLab. They vary on two main points. The first one is the amount of configuration necessary to set up the integration, |
31-08-19 | |
Summer-student-report: Building effective Restful APIs with Oracle Rest Data Services 19
In 2005, the first installation of Oracle HTML DB went into production. Very soon the CERN developer community adopted the technology, using it in all areas of the organization, from administrative applications to accelerator control systems. |
31-08-19 | |
Summer-student-report: Web - UI development IoT Security Framework
The IoT security framework is a computer security platform designed to assess the risks of various heterogeneous IoT devices. The framework is currently being developed at CERN and analyses different IoT devices connected to CERN’s General Purpose Network (GPN). The GPN mostly |
31-08-19 | |
Calorimetry with Deep Learning: Particle Simulation and Reconstruction for Collider Physics
Using detailed simulations of calorimeter showers as training data, we investigate the use of deep learning algorithms for the simulation and reconstruction of particles produced in high-energy physics collisions. |
14-12-19 | |
Summer-student-report: Evaluation of Erasure Coding and other features of Hadoop 3
The Hadoop ecosystem is a distributed computing platform for Big Data solutions, comprising autonomous components such as HDFS, Spark, YARN, etc. HDFS is the Hadoop Distributed File System for data storage. Current HDFS supports 3x replication for data fault tolerance. When a |
31-08-19 | |
Summer-student-report: Big Data Analysis and Machine Learning at Scale with Oracle Cloud Infrastructure
This work has successfully deployed two different use cases of interest for High Energy Physics using cloud resources: CMS Big Data reduction: This use case consists of running data reduction workloads for |
31-08-19 | |
Summer-student-report: Function-as-a-Service on Kubernetes using Knative
The CERN Cloud Infrastructure team provides compute resources as a service to teams across CERN. Users can provision resources to process experiment data, host web applications, and accomplish other computing tasks. |
31-08-19 | |
Summer-student-report: Benchmarking and optimising large scale parallel workflows
The main idea of this project is to carry out performance analysis on the RDataFrame class within the ROOT operational framework. For this purpose, scalability analyses are performed on the execution |
31-08-19 | |
Summer-student-report: Anomaly Detection in the Elasticsearch Service
The Elasticsearch Service is a distributed search and analytics engine widely used across CERN. Currently, issues in the service are resolved manually after being detected through internal monitoring by service |
31-08-19 | |
Summer-student-report: Benchmarking tools for NextGen Archiver for WinCC OA
In this project we focused on benchmarking Influx against the Oracle database. One of the primary reasons is that ETM/Siemens were already working on an Influx database backend. To perform benchmarking using the Query Benchmark Tool we needed to have the same data |
31-08-19 | |
Summer-student-report: Performance study of parquet codecs
This report describes the work carried out to study and evaluate the performance and footprint of different Parquet compression codecs in data retrieval and analytics scenarios. Parquet is a de facto standard and the data format used to persist |
31-08-19 | |
Summer-student-report: Improving BioDynaMo build system
When developing new programs or scientific libraries most of the efforts are focused on providing efficient algorithms, the state-of-the-art techniques and maximum flexibility. However, in order for a |
31-08-19 | |
Summer-student-report: Evaluate ElastAlert for IT-DB use cases
The Database Services Group (IT-DB) is responsible for providing database and middleware services to the laboratory. For these services, it is necessary to provide proper monitoring solutions to different user |
31-08-19 | |
Summer-student-report: Real-Time Server Monitoring and CNN Inference on FPGA
Neutrinos are subatomic particles, very similar to an electron, but without any electrical charge and a very negligible rest mass. They are the most abundant and perhaps the most mysterious matter particles in the universe! |
31-08-19 | |
Summer-student-report: Using deep learning for particle identification and energy estimation in CMS HGCAL L1 trigger
In Run 4 of the LHC, the extremely high luminosity is expected to generate an enormous pileup of up to 200 proton-proton collisions for each bunch crossing. This has to be read out at 750 kHz with a maximum |
31-08-19 | |
Summer-student-report: Apache Spark on Hadoop YARN & Kubernetes for Scalable Physics Analysis
The popularity of Big Data technologies continues to increase each year. The vast amount of data produced at the LHC experiments, which will increase further after the upgrade to HL-LHC, makes the exploration of new ways to perform physics |
31-08-18 | |
Summer-student-report: HGCAL Fast Simulation with Deep Learning
This project uses Wasserstein Generative Adversarial Networks (WGANs) to supply the demand for large simulation samples in the event of the CMS Phase II Upgrade. The distributions of real |
31-08-18 | |
Summer-student-report: Achieve a 0-downtime CERN Database infrastructure
At CERN we have many systems which provide critical services and scheduling downtime for them is quite difficult. Live kernel patching is a technique which aims to update the system without |
31-08-18 | |
Summer-student-report: Introducing heterogeneous farms in the CMS framework
The High Luminosity upgrade scheduled for 2026 will greatly increase the number of events per collision. Moore’s law will optimistically get a factor 4 performance gain, not enough to handle the |
31-08-18 | |
Summer-student-report: Java Mission Control Evaluation
This report summarises the project I worked on during my internship with the IT-DB-IMS team. It details my efforts to configure various technologies to work with Java Mission Control, the |
31-08-18 | |
Summer-student-report: Efficient unpacking of required software from CERNVM-FS
In recent times a tool for efficient unpacking of software work-flows from CernVM File System (CVMFS) into standalone images has become necessary. There are two types of use cases for such images: On the one hand they can be used to deliver |
31-08-18 | |
Summer-student-report: Benchmarking Machine Learning in HEP
Interest in machine learning workloads in the HEP community has increased exponentially in recent years, making a thorough benchmarking of the most relevant/significant workloads that are going to run on the experiments more and more important. The purpose |
31-08-18 | |
Summer-student-report: Evaluating Ceph Deployments with Rook | 31-08-18 | |
Summer-student-report: Scanning Containers for Vulnerabilities on Kubernetes Clusters
On this project, we chose to work with Clair, the tool developed by CoreOS, which uses static analysis to find vulnerabilities in container images. To use Clair, we had to build a Python client, |
31-08-18 | |
Summer-student-report: Benchmarking Kudu and Oracle in typical WinCC OA historical data retrieval use cases
WinCC Open Architecture is a toolkit for creating Supervisory Control and Data Acquisition (SCADA) applications, which is widely used at CERN. Hundreds of controls applications, both in the accelerator complex and the experiments are based on it, |
31-08-18 | |
Summer-student-report: KPIs Dashboard for Invenio-Related Services
The purpose of this report is to document the project I was working on for nine weeks during the summer of 2018. As part of the CERN openlab Summer Student Program 2018 I had the opportunity to work with the Digital Repositories (IT-CDA-DR) section at CERN on developing a |
31-08-18 | |
Summer-student-report: Technical Network Validation Using Open-shift
The interest in using containers to package applications is constantly growing in the software development community, especially with new technologies such as Kubernetes and OpenShift being adopted more frequently. This project is also based on modularising the currently |
31-08-18 | |
Summer-student-report: Automated Shelter Recognition in Refugee Camps
In June 2018, more than 68.5 million people across the globe were reported to be fleeing war or persecution. Within the United Nations, UNOSAT is the body in charge of collecting demo- |
31-08-18 | |
Summer-student-report: Develop streaming pipelines and analytics solutions for CERN's IoT Platform
There are two very popular concepts that we hear in the world of technology: Big Data and the Internet of Things. Big Data refers to data whose size, complexity and velocity are so high that it is difficult to capture, pre-process and analyze with |
31-08-18 | |
Summer-student-report: Distributed BioDynaMo
Computer simulations have become a very powerful tool for scientific research. In order to facilitate research in computational biology, the BioDynaMo project aims at a general platform for |
31-08-18 | |
Summer-student-report: GPGPU Accelerated Beam Dynamics Interfacing PyHEADTAIL with SixTrackLib
Simulations of beam dynamics vastly profit from parallelisation with high performance computing techniques. The two simulation libraries SixTrackLib and PyHEADTAIL are GPGPU accelerated. The former |
31-08-18 | |
Summer-student-report: Optimization of Data Transfer for 100 Gb/s Ethernet
In 2019 the LHCb experiment will go through an important upgrade that will improve performance in many fields. One of these fields is the DAQ system: it consists of a big flow of data that comes |
31-08-18 | |
Summer-student-report: Employing HPC for Heterogeneous HEP Data Processing
One of the most time consuming algorithms that is currently employed for the reconstruction of High Energy Physics (HEP) workflows is the local energy reconstruction. The time spent to execute this algorithm constitutes 24% of the total processing time, thus achieving substantial |
31-08-18 | |
Summer-student-report: POSEIDON - Analyzing the secrets of the Trident Node monitoring
Improving the performance of an application is an important objective carried out from the application conception until its deprecation. Developers are constantly trying to improve the performance of their |
31-08-18 | |
Summer-student-report: PyXRootD PyPI distribution and new declarative file access API for XRootD Client
The project described in this report is related to XRootD framework development. It was divided into two parts. The first part was about publishing the XRootD Python bindings, called PyXRootD, to the Python Package Index. This makes PyXRootD installation much easier and resolves the problem |
31-08-18 | |
Summer-student-report: Parallel Task Execution
Puppet is a great tool for making changes on systems, and ensuring that those changes happen. But Puppet is not intended to make this happen on many systems at the same time. Puppet is intended for eventual compliance over time. Each agent checks in over a period of time, al- |
31-08-18 | |
Summer-student-report: Thin Element Comparison Between MAD-X and SixTrack
In this report, thin single elements were compared between MAD-X and SixTrack. A testing framework for efficient comparisons between the two tracking codes was developed. A few differences between the tracking codes were found and documented, and two bugs, one in the |
31-08-18 | |
Summer-student-report: OpenStack Infrastructure Optimization Service
CERN operates an OpenStack based private cloud to provide its users with resources on demand. It is one of the largest OpenStack deployments in the world, with more than 300,000 cores over 9,000 hypervisors [1]. |
31-08-18 | |
Summer-student-report: MPI Learn - distributed training
MPI Learn is a framework for the distributed training of neural networks. This platform is aimed at machine learning users, who can use it to train models faster, without dealing with the com- |
31-08-18 | |
Summer-student-report: Function as a Service
Function as a service (FaaS) is a category of cloud computing services that provides a platform allowing customers to develop, run, and manage application functionalities without the complexity of building and maintaining the infrastructure |
31-08-18 | |
Summer-student-report: Natural Language Processing for Scientific Research
The goal of this openlab project is to create a Smart Data Analytics Platform for Science that will host analytical tools, publish data, share resources, interact with bots, collaborate and build communities of researchers with various backgrounds in a single ecosystem. With |
31-08-18 | |
Summer-student-report: Deep Representation Learning for Trigger Monitoring
We propose a novel neural network architecture called Hierarchical Latent Autoencoder to exploit the underlying hierarchical nature of the CMS Trigger System for data quality monitoring. |
31-08-18 | |
Summer-student-report: Evaluation of Containers for HPC
Some of the main challenges in scientific computing today deal with performance-preserving portability of software and reproducibility of the final results; likewise, with the advent of modern |
31-08-18 | |
Summer-student-report: Information aggregation and analytics for ATLAS Frontier
The Squid-Frontier system [1] is currently used to manage access to the COOL database [2]. This system includes many widely distributed computing sites and applications. Clients presented by PanDA (Production ANd Distributed Analysis system, the ATLAS’ |
31-08-18 | |
Summer-student-report: Malware analysis management
Malware Analysis Management (M.A.M.) or the automated sandbox analysis of quarantined malware samples focuses on a detailed analysis of malware samples reaching CERN through email traffic. M.A.M. is a side process of the main email pipeline |
31-08-18 | |
Summer-student-report: REANA - user dashboard for reusable analysis platform
REANA is a reusable analysis platform which offers physicists the ability to structure their research data analysis and run their computational workflows in a containerized computing cloud. |
31-08-18 | |
A New Platform for Large-Scale Biological Simulation
Computer simulations have become a very powerful tool for scientific research. In order to facilitate research in computational biology, the BioDynaMo project aims at a general platform for biological computer simulations, which should be executable on hybrid cloud computing systems. |
28-11-16 | |
From Physics to industry: EOS outside HEP
In the competitive market for large-scale storage solutions, the current main disk storage system at CERN, EOS, has been showing its excellence in the multi-petabyte, high-concurrency regime. |
01-09-17 | |
Exploring RapidIO Technology within a DAQ System Event Building Network
RapidIO (http://rapidio.org/) technology is a packet-switched high-performance fabric, which has been under active development since 1997. The technology is used in all 4G/LTE base stations worldwide. |
10-05-17 | |
RapidIO as a multi-purpose interconnect
RapidIO (http://rapidio.org/) technology is a packet-switched high-performance fabric, which has been under active development since 1997. Originally meant to be a front-side bus, it developed into a system-level interconnect which is today used in all 4G/LTE base stations worldwide. |
20-06-17 | |
A Deep Learning tool for fast detector simulation
Machine Learning techniques have been used in different applications by the HEP community: in this talk, we discuss the case of detector simulation. |
27-06-18 | |
An optimization approach for agent-based computational models of biological development
Current research in the field of computational biology often involves simulations on high-performance computer clusters. It is crucial that the code of such simulations is efficient and correctly reflects the model specifications. |
10-03-18 | |
CERN openlab: Engaging industry for innovation in the LHC Run 3-4 R&D programme
LHC Run3 and Run4 represent an unprecedented challenge for HEP computing in terms of both data volume and complexity. New approaches are needed for how data is collected and filtered, processed, moved, stored and analysed if these challenges are to be met with a realistic budget. |
04-03-17 | |
Extending an asynchronous messaging library using an RDMA-enabled interconnect
As computing power and I/O performance increase at an aggressive rate, several RDMA-enabled interconnect technologies have been entering the market, promising low latency and high throughput. |
20-03-17 | |
1000 things you always want to know about SSO but you never dare to ask | 12-10-17 | |
CERN openlab White Paper: Future IT Challenges in Scientific Research
In this white paper, CERN openlab sets out challenges to tackle together through joint R&D projects with our industry collaborators over the coming years. This unique public-private partnership between research and leading ICT companies is ideally placed to tackle these challenges, producing r |
21-09-17 | |
Exploring RapidIO Technology Within a DAQ System Event Building Network
RapidIO technology is a packet-switched high-performance fabric, which has been under active development since 1997. The technology is used in all 4G/LTE base stations worldwide. |
09-09-17 |