By working with communities beyond high-energy physics, we are able to ensure maximum relevance for CERN openlab’s work, as well as to learn and share tools and best practices across scientific fields. Today, more and more research fields, such as medical research or space and Earth observation, are driven by large quantities of data, and thus experience ICT challenges comparable to those at CERN. CERN openlab’s mission rests on three pillars: technological investigation, education, and dissemination. Collaborating with research communities and laboratories outside the high-energy physics community brings together all these aspects. Challenges related to the life sciences, medicine, astrophysics, and urban/environmental planning are all covered in this section, as well as scientific platforms designed to foster open collaboration.


CERN Science for Open Data

Project goal

The main objective of the CERN Science for Open Data (CS4OD) project is to define and implement common principles, best practices, and tools for data management, data analysis and reproducibility of results. These are to be applied across different research communities and based on open-access policies.

CS4OD is building an integrated platform which will provide users with a broad catalogue of cutting-edge tools and services for data management, analysis and reproducibility. These tools have been developed either at CERN or through established open-source initiatives. Examples include Zenodo, REANA, SWAN, and JupyterLab.

The platform is designed to:

  1. Provide transparent and effective “data stewardship” for publicly accessible data in multi-domain fields.
  2. Adapt to different user profiles, enabling researchers with different backgrounds to benefit more easily from the latest technologies, enhancing reproducibility and contributing to open science.
  3. Enable participants to contribute, share, access, and manage data from multiple heterogeneous sources with permanent unique identifiers.
  4. Enable users to design and execute data-curation and data-analysis pipelines using integrated tools and services, on different (local or cloud) hardware resources.

The first release of the platform has been deployed at https://cs4od-platform.web.cern.ch/.

R&D topic
Applications in other disciplines
Project coordinator(s)
Alberto Di Meglio, Tim Smith
Team members
Alexander Ioannidis (project manager), Anna Ferrari, Ivan Knezevic, Ines Pinto Perreira Da Cruz, Nihal Ezgi Yuceturk, Jose Benito Gonzalez Lopez
Collaborator liaison(s)
Ilaria Capua, Luca Mantegazza, Elio Borgonovi, Benedetta Pongiglione, Claudio Buongiorno Sottoriva, Massimiliano Di Cagno, Massimo Pugliese, Vladimiro Guarnaccia, Peter Grübling

Collaborators

Project background

Global crises, like the COVID-19 pandemic, have highlighted the need to increase the pace at which data is collected, organised, analysed and shared at large scale. This is vital for supporting rapid, informed and accountable response mechanisms from governments and other organisations. Achieving this will play an important role in addressing critical and urgent medical, social, economic and educational challenges.

There is a recognised difficulty in implementing large-scale, cross-disciplinary investigations that are able to access large amounts of data from multiple sources. For such investigation to be effective, barriers related to data management, governance, access, scalability and reproducibility must be overcome.

Today, different research groups use different data, different assumptions, different models and different methods. This means they can come to conclusions that cannot be objectively challenged because other research teams do not have access to the same information and cannot reproduce the work. And, in the case of successful research, it can be difficult for other teams to build upon it further.

CERN has a long, proven track record for open science and for implementing and managing large-scale, data-driven operations. In collaboration with international initiatives and projects, CERN engineers and physicists have developed efficient strategies for managing data at scale, as well as tools for supporting such strategies. Optimised and efficient systems — combined with the experience of implementing distributed systems and a strong culture of openness and sharing ideas, software and data — make CERN an ideal partner for implementing multi-disciplinary data-driven research projects based on open-access data and open-source tools.

Project timeline

The project started in March 2021 and is set to run for three years.

Year 1: Analysis of use cases, requirements, technology and functional gaps. A minimum-viable-product prototype will be tested with early users.

Year 2: Iterative integration of functions, tools, and best practices.

Year 3: A public beta version will be released and extended to address additional use cases. It will be deployed on infrastructures outside CERN.

Recent progress

During 2021, experts from CERN openlab and the CERN IT department collaborated with researchers at the One Health Center of Excellence in Florida, US, as well as at Bocconi University and Milano-Bicocca University, both in Milan, Italy. Together, we have defined the initial requirements for the data-management and computing infrastructure. An initial set of pilot use cases is being investigated for the design of the first release of the CS4OD platform. These include: analysis of excess mortality related to COVID-19, resistance to antibiotics, taxonomy of plant diseases, analysis of cancer patients’ data, and classification of Parkinson’s disease.

Next steps

The integration with additional analysis frameworks (e.g. TensorFlow) and distributed computing frameworks (e.g. OpenFL) will be investigated in Q1 2022.

Publications

    D. Patsidis, A. Ferrari, Platform for Reproducible Analyses. Published on Zenodo, 2021. cern.ch/go/6lHw

Presentations

    A. Di Meglio, M. Manset, A Social-Technological Platform for Making Sense of (Medical) Data (23 January). Presented at CERN openlab Technical Workshop, Geneva, 2020. cern.ch/go/Mb8X
    A. Ferrari, I. Knezevic, A. Ioannidis, J. B. G. Lopez, CERN Science for Open Data – The CS4OD Project (10 March). Presented at CERN openlab Technical Workshop, Geneva, 2021. cern.ch/go/b8n8
    A. Ferrari, I. Knezevic, D. Patsidis, A. Di Meglio, A. Ioannidis, I. P. P. Da Cruz, N. E. Yuceturk, J. B. G. Lopez, T. Roun, T. Smith, CERN openlab / CERN Science 4 Open Data (CS4OD): use cases for the life science (18 October). Presented at ExaHealth, Geneva, 2021. cern.ch/go/QL9C
    D. Patsidis, Data Platform for collection, storage, integration, analysis and distribution (6 September). Presented at CERN openlab Summer Student Lightning Talk, Geneva, 2021. cern.ch/go/wt7v

Humanitarian AI applications for satellite imagery

Project goal

This project is making use of expertise in artificial intelligence (AI) technologies at CERN to support a UN agency. Specifically, we are working on AI approaches to help improve object recognition in the satellite imagery created to support humanitarian interventions. Such satellite imagery plays a vital role in helping humanitarian organisations plan and coordinate responses to natural disasters, population migrations, and conflicts.

R&D topic
Applications in other disciplines
Project coordinator(s)
Sofia Vallecorsa
Team members
Suren Thapa
Collaborator liaison(s)
Lars Bromley, Edoardo Nemni

Collaborators

Project background

Since 2002, CERN has hosted UNOSAT, the Operational Satellite Applications Programme of UNITAR (The United Nations Institute for Training and Research) on the laboratory’s premises. UNOSAT acquires and processes satellite data to produce and deliver information, analysis, and observations to be used by the UN or national entities for emergency response, to assess the impact of a disaster or a conflict, or to plan sustainable development in the face of climate change.

At the heart of this project lies the idea of developing machine-learning techniques that can help speed up analysis of satellite imagery. For example, predicting and understanding the movement of displaced persons by identifying refugee shelters can be a long, labour-intensive task. This project is working to develop machine-learning techniques that could greatly reduce the amount of time needed to complete such tasks.

Recent progress

In 2020, we focused on the challenge of simulating synthetic high-resolution satellite images. High-resolution satellite imagery is often licensed in a way that makes it difficult to share it across UN partners and academic organisations. This reduces the amount of image data available for training deep-learning models, thus hampering research in this area. We have developed a generative adversarial network (GAN) that is capable of generating realistic satellite images of refugee camps. Our tool was initially based on a progressive GAN approach developed by NVIDIA. We have now developed this further, such that it can combine multiple simulated images into a cohesive larger image of roughly 5 million pixels. The new model is built on a multi-network architecture combining several auto-encoders; their output is used to condition the image-generation step and to ensure each new image is consistent with previous ones. This method was tested on satellite images of a flooded area in Myanmar.
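The tiling idea can be illustrated with a toy sketch (this is not the project’s GAN code): independently generated tiles are combined into a larger image by cross-fading the overlapping strip, so that adjacent tiles remain visually consistent. The shapes and the linear blending scheme below are illustrative assumptions.

```python
import numpy as np

def stitch_horizontal(left, right, overlap):
    """Blend two tiles of equal height into one wider image.

    The rightmost `overlap` columns of `left` are cross-faded
    with the leftmost `overlap` columns of `right`.
    """
    assert left.shape[0] == right.shape[0]
    alpha = np.linspace(1.0, 0.0, overlap)          # fade-out weights per column
    blended = (left[:, -overlap:] * alpha
               + right[:, :overlap] * (1.0 - alpha))
    return np.concatenate(
        [left[:, :-overlap], blended, right[:, overlap:]], axis=1)

# Two toy 4x6 "tiles" with a 2-column overlap
a = np.ones((4, 6))
b = np.zeros((4, 6))
out = stitch_horizontal(a, b, overlap=2)
print(out.shape)  # (4, 10)
```

In a real pipeline the blending would operate on multi-channel image tensors, and, as described above, auto-encoder outputs would condition the generator so each new tile already matches its neighbours before any blending.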

Next steps

Next year will be dedicated to the optimisation of the progressive GAN model. In particular, we will implement a distributed approach for training our network in parallel across multiple nodes. This should help us to reduce training time for the model and increase the maximum image size.

Publications

    N. Lacroix, T. Aliyev, L. Bromley: Automated Shelter Recognition in Refugee Camps. CERN openlab Summer Student Report. Published on Zenodo, 2019. cern.ch/go/v6rn

Presentations

    Y. Boget, ProGAN on Satellite images (15 August). Presented at CERN openlab summer student lightning talk session, Geneva, 2019. cern.ch/go/P6NV
    Y. Boget, S. Vallecorsa, Deep Learning for Satellite Imagery (24 September). Presented at IXPUG Annual Conference, Geneva, 2019. cern.ch/go/m9n6

SmartANOMALY Spikefall

Project goal

This project was launched in the context of the fight against the COVID-19 pandemic, as a CERN openlab collaboration with the Italian Institute of Technology and CompBioMed. Through this project, we aim to propose a machine-learning model for simulating the enhanced molecular dynamics of SARS-CoV-2’s spike glycoprotein. A protein’s properties can be described by its secondary structures: three-dimensional arrangements of groups of atoms in which the protein’s residues (individual amino acids) can exist. The goal of SmartANOMALY Spikefall is to predict changes in this structure by analysing just a few moments, or ‘frames’.

R&D topic
Applications in other disciplines
Project coordinator(s)
Sofia Vallecorsa
Team members
Yann Donon
Collaborator liaison(s)
Sauro Succi, Walter Rocchia, Nicola Scafuri

Collaborators

Project background

The energy cost for simulating proteins is high. As such, a new approach for predicting the qualities of proteins would be a very powerful tool for the research community. Such an approach could be a particular boon to the fight against the COVID-19 pandemic, with researchers seeking to accelerate their investigations.

This project is being carried out in the context of CERN's strategy for knowledge transfer to medical applications, led by CERN's Knowledge Transfer group.

Recent progress

The project started in the fourth quarter of 2020, with several approaches for classification and behaviour prediction tested immediately. Currently, we are studying the behaviour of single atoms — in particular, alpha carbons — over a short period of time. Our first observations have led us to believe that understanding the instability of an atom can provide us with insight into the long-term behaviour of a protein’s secondary structure.
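As an illustration of the kind of per-atom instability measure involved, the sketch below computes the standard root-mean-square fluctuation (RMSF) of atom positions over a short trajectory window. The trajectory data is synthetic, and RMSF is a hedged stand-in for the project’s actual measure.

```python
import numpy as np

def rmsf(traj):
    """Per-atom root-mean-square fluctuation.

    traj: array of shape (n_frames, n_atoms, 3) with atom positions.
    Returns an (n_atoms,) array; larger values mean less stable atoms.
    """
    mean_pos = traj.mean(axis=0)                    # (n_atoms, 3) average position
    disp2 = ((traj - mean_pos) ** 2).sum(axis=2)    # squared displacement per frame
    return np.sqrt(disp2.mean(axis=0))

# Synthetic trajectory: atom 0 is static, atom 1 jitters around the origin
rng = np.random.default_rng(0)
traj = np.zeros((10, 2, 3))
traj[:, 1, :] = rng.normal(scale=0.5, size=(10, 3))
print(rmsf(traj))  # atom 1 shows a much larger fluctuation than atom 0
```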

Next steps

Our first results are promising, encouraging us to investigate further. Rather than continuing to look at single atoms, we will now study the behaviour of proteins’ secondary structures in full, increasing the complexity and precision of results.

Following our first steps in 2020, we will now work to expand the project collaboration, with a view to helping us achieve the full potential of this work.

SmartANOMALY

Project goal

The SmartANOMALY project is an evolution and broadening of the SmartLINAC project, which launched in June 2019. The main goal of the original project was to create a platform for anomaly detection and maintenance planning for linear accelerators, which are used widely in medicine and high-energy physics research.

R&D topic
Applications in other disciplines
Project coordinator(s)
Alberto Di Meglio
Team members
Yann Donon

Collaborators

Project background

Technologies related to artificial intelligence (AI) are opening up new possibilities for anomaly detection. Given the array of large particle accelerators at CERN, the Organization has significant expertise in detecting anomalies in highly complex systems. This expertise has the potential to be applied to a range of scientific and industrial activities including (but not limited to) other fields where particle accelerators are used, such as medicine. This project has been supported by CERN's Knowledge Transfer group.

Recent progress

After more than a year of development, promising results were achieved, demonstrating the potential of our innovative algorithms for detecting anomalies — as well as perhaps even predicting their effects to some extent. Today, the project’s primary focus is on medical accelerators. However, we see potential in training our solution on more sources, such as on compressor engines or complex industrial processes.

It is common practice to use alternative data sets when training anomaly-detection systems. In this context, the distinguishing aspect of our research is that several approaches, based on statistics and neural-network technologies, are being combined in order to offer a system that can be adapted to different sources.
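On the statistical side, one minimal example of such an approach is a rolling z-score detector, which flags points deviating strongly from a local baseline. The window size, threshold, and injected spike below are illustrative assumptions, not the project’s tuned values.

```python
import numpy as np

def zscore_anomalies(signal, window=20, threshold=4.0):
    """Flag indices where a point deviates strongly from a rolling baseline.

    A point is anomalous if |x - rolling_mean| > threshold * rolling_std,
    computed over the preceding `window` samples.
    """
    flags = []
    for i in range(window, len(signal)):
        ref = signal[i - window:i]
        mu, sigma = ref.mean(), ref.std()
        if sigma > 0 and abs(signal[i] - mu) > threshold * sigma:
            flags.append(i)
    return flags

# Steady sensor-like signal with one injected spike at index 150
rng = np.random.default_rng(1)
signal = rng.normal(loc=1.0, scale=0.05, size=300)
signal[150] += 2.0
print(zscore_anomalies(signal))  # the spike at index 150 is flagged
```

A neural-network component would complement this by learning normal behaviour from multiple sources, which is where the adaptability described above comes in.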

Given that demand for such tools is growing rapidly, we believe the time is right to formally enlarge the scope of the research started through SmartLINAC. Thus, we have created our new, broader SmartANOMALY project.  

Next steps

This new project incorporates our existing investigations with linear accelerators, and will also allow new actors to take part in the development of the anomaly detection tool for complex systems. We are currently discussing possible applications in the automotive and food-processing industries. Given that our research is now entering a new phase, we encourage actors from industry and academia to get in touch with us. We are keen both to develop the existing activities within this project and to explore new opportunities for enlarging its scope.

Presentations

    Y. Donon, Smart Anomaly Detection and Maintenance Planning Platform for Linear Accelerators (3 October). Presented at the 27th International Symposium Nuclear Electronics and Computing (NEC’2019), Montenegro, 2019. cern.ch/go/nb9z

Early detection of Parkinson's disease

Project goal

This project aims to apply machine-learning techniques to data from wearable devices for the early detection of Parkinson’s disease. Our work will be organised into three main areas:

  1. Identification of one or more suitable datasets from existing public data providers (e.g. the Michael J. Fox Foundation).
  2. Implementation of a supervised learning strategy to analyse labelled data and classify patients affected by Parkinson’s disease.
  3. Implementation of an unsupervised learning strategy to detect and diagnose potential Parkinson’s disease symptoms using anomaly-detection algorithms or other suitable approaches. This will then correlate relevant features (e.g. duration and intensity of the symptoms) with medical treatments and other factors.
R&D topic
Applications in other disciplines
Project coordinator(s)
Alberto Di Meglio, Sofia Vallecorsa
Team members
Anna Ferrari
Collaborator liaison(s)
Daniela Micucci, Paolo Napoletano

Collaborators

Project background

CERN openlab is currently running a project called CERN LivingLab to set up a distributed data-analysis platform providing specialised features to process data with sensitive content, such as personal or medical information. The platform is intended to be a technology demonstrator and a testbed for state-of-the-art functionalities, including advanced machine learning and deep-learning tools and algorithms, secure data transmission and storage, and encryption techniques.

This project is being carried out in the context of CERN's strategy for knowledge transfer to medical applications, led by CERN's Knowledge Transfer group.

Recent progress

Two public datasets released by the Michael J. Fox Foundation have been used for analysis. Both contained inertial data recorded from a smartphone and a smartwatch. Data was recorded during everyday life and was manually labelled by the user. The availability of labelled data enabled the use of supervised deep-learning and machine-learning techniques. Different strategies were selected and implemented: convolutional neural networks were applied both to the raw time series and to images extracted from them.

The dearth of data led to insufficient performance in terms of accuracy. Furthermore, the high variability of the data meant an ad-hoc pre-processing phase was required, followed by an additional feature-extraction procedure. Both were implemented and used to feed traditional machine-learning algorithms, which then performed better than the analysis based on deep learning.
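The kind of hand-crafted features that can feed traditional machine-learning algorithms can be sketched as follows. The project’s actual feature set is not specified here, so the features below (and the synthetic tremor-like signal) are illustrative assumptions.

```python
import numpy as np

def window_features(window):
    """Simple per-window features for a 1-D accelerometer signal."""
    centred = window - window.mean()
    fft_mag = np.abs(np.fft.rfft(centred))
    return {
        "mean": window.mean(),
        "std": window.std(),
        "range": window.max() - window.min(),
        "spectral_energy": (fft_mag ** 2).sum() / len(window),
        "zero_crossings": int(np.sum(np.diff(np.sign(centred)) != 0)),
    }

# A synthetic 2 Hz tremor-like oscillation sampled at 50 Hz for 2 seconds
t = np.arange(0, 2, 1 / 50)
window = 0.3 * np.sin(2 * np.pi * 2 * t) + 1.0
feats = window_features(window)
print(sorted(feats))  # feature names
```

Feature vectors like these, computed per sliding window, can be fed directly to classifiers such as random forests or support vector machines.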

Next steps

The pre-processing and feature-extraction procedures are crucial steps for the reliability and sustainability of the results. We will work to improve these two procedures in 2021, enhancing the performance of the machine-learning algorithm.


Presentations

    A. Ferrari, Deep Learning Analysis on Wearable Devices (23 January). Presented at CERN openlab Technical Workshop, Geneva, 2020. cern.ch/go/bKR6

CERN Living Lab 

Project goal

The project goal is to develop a big-data analytics platform and tools for large-scale studies of data under special constraints, such as information that is privacy-sensitive, or that has a varying level of quality, associated provenance information, or signal-to-noise ratio. Ethical constraints are also taken into account where necessary. This will serve as a proof-of-concept for federating and analysing heterogeneous data from diverse sources, in particular for medical and biological research, using ideas and expertise coming from CERN and the broader high-energy physics community.

R&D topic
Applications in other disciplines
Project coordinator(s)
Alberto Di Meglio
Team members
Jose Cabrero, Anna Ferrari, Sofia Vallecorsa
Collaborator liaison(s)
David Manset (be-studys), Marco Manca (SCImPULSE)

Collaborators

Project background

CERN is a living laboratory, with several thousand people coming to work at its main campuses every day. For operational purposes, CERN collects data related to health, safety, the environment, and other aspects of daily life at the lab. Creating a platform to collate and enable intelligent management and use of this data — while respecting privacy and other ethical and legal obligations — offers the potential to improve life at the lab. At the same time, such a platform provides an ideal testbed for exploring new data analytics technologies, algorithms and tools, including machine-learning (ML)/deep-learning (DL) methods, encryption schemes, or blockchain-based ledgers. It also provides a natural bridge to collaborate with other scientific research domains, such as medical research and biology.

This project is being carried out in the context of CERN's strategy for knowledge transfer to medical applications, led by CERN's Knowledge Transfer group.

Recent progress

In 2020, the project activities focused mainly on the investigation of privacy-preserving techniques for data analysis, particularly in cases where machine-learning or deep-learning models are used. A systematisation of the state-of-the-art was conducted looking at different methodologies, such as homomorphic encryption, secure multi-party computation and federated learning. The existing implementations and their capabilities were assessed against reference use cases, including the extraction of features from brain MRI scans and aggregated data classification for epidemiological research. In 2020, two new collaborators joined the initiative: the University of Madrid, Spain, and the Seoul National University Bundang Hospital (SNUBH), South Korea, sharing expertise in security and the analysis of medical data.
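Of the surveyed methods, federated learning is perhaps the simplest to sketch: each site trains on data that never leaves its premises, and only model updates are aggregated centrally. The toy NumPy example below shows federated averaging for a least-squares model; the data is synthetic and this is not the code evaluated in the project.

```python
import numpy as np

def local_step(w, X, y, lr=0.1):
    """One gradient-descent step of least-squares at a single site."""
    grad = 2 * X.T @ (X @ w - y) / len(y)
    return w - lr * grad

def federated_round(w_global, sites, lr=0.1):
    """Each site updates locally; the server averages, weighted by site size."""
    updates, sizes = [], []
    for X, y in sites:
        updates.append(local_step(w_global.copy(), X, y, lr))
        sizes.append(len(y))
    weights = np.array(sizes) / sum(sizes)
    return sum(p * u for p, u in zip(weights, updates))

# Three "hospitals" holding private slices of data generated by y = 2*x0 - x1
rng = np.random.default_rng(2)
true_w = np.array([2.0, -1.0])
sites = []
for n in (50, 80, 30):
    X = rng.normal(size=(n, 2))
    sites.append((X, X @ true_w))

w = np.zeros(2)
for _ in range(200):
    w = federated_round(w, sites)
print(w.round(3))  # converges to [ 2. -1.]
```

The privacy property is that only the updated weight vectors cross site boundaries, never the raw `(X, y)` records; homomorphic encryption or secure aggregation can further protect the updates themselves.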

Next steps

After our initial systematisation of knowledge related to privacy-preserving methods, we will begin work to develop one or more methods, integrating them into the ML/DL inference algorithms of the reference use cases. Extension to the full model training process will then be addressed.

In November 2020, CERN openlab entered into a collaboration with the OpenQKD project to assess the use of distribution infrastructures for the quantum keys used for secure analysis of data. The integration of QKD in the data analysis process will be investigated as an additional layer for protecting transactions.


Presentations

    T. Aliyev, Meaningful Control of AI and Machine Ethics (7 June). Presented at Big Data in Medicine: Challenges and Opportunities, CERN, Geneva, 2019. cern.ch/go/J7CF
    A. Di Meglio, The CERN Living Lab Initiative (20 June). Presented at CERN Information Technology for the Hospitals, HUG, Geneva, 2019. cern.ch/go/Fld8
    T. Aliyev, Interpretability and Accountability as Necessary Pieces for Machine Ethics (2 July). Presented at Implementing Machine Ethics Workshop, UCD, Dublin, 2019. cern.ch/go/7c6d
    A. Di Meglio, The Living Lab Project (23 January). Presented at CERN openlab Technical Workshop, CERN, Geneva, 2020. cern.ch/go/Cf7R

Circular Health

Project goal

The Circular Health project involves a collaboration of research institutes, universities, and not-for-profit organisations, led by the One Health Center of Excellence at the University of Florida, US. Together, we are working on the definition, design and implementation of a large-scale open-access data platform to support novel paths to collaborative research for global challenges. Circular Health aims to contribute both efficient technical tools and methodologies for governance.

R&D topic
Applications in other disciplines
Project coordinator(s)
Alberto Di Meglio
Team members
Anna Ferrari
Collaborator liaison(s)
Ilaria Capua, Luca Mantegazza, Elio Borgonovi, Claudio Bellariva

Collaborators

Project background

Global crises, like the COVID-19 pandemic, have shown the importance of accelerating multi-disciplinary research and easing barriers to locating, aggregating, processing and sharing data and results. Doing so is necessary for addressing complex challenges that involve medical, social, or economic data.

CERN has years of experience in designing large-scale, collaborative data platforms and has developed efficient tools like Zenodo, SWAN and REANA to facilitate sharing, reproducibility and collaboration. Through Circular Health, CERN can contribute to solving critical issues and can support scientific research beyond high-energy physics.

This project is being carried out in the context of CERN's strategy for knowledge transfer to medical applications, led by CERN's Knowledge Transfer group.

Recent progress

CERN started its collaboration with Circular Health as part of the CERN Against COVID-19 Task Force, which was established by CERN’s management in March 2020. During 2020, experts from CERN openlab and the CERN IT department collaborated with researchers at the One Health Center of Excellence in Florida, US, as well as at Bocconi University and Milano-Bicocca University, both in Milan, Italy. Together, we defined the initial requirements for the data and computing infrastructure. We then deployed a first project focused on assessing excess mortality linked to COVID-19 through comparison with data from previous years.

Next steps

At the end of 2020, following our initial investigations, Fabiola Gianotti, the CERN Director-General, supported the creation of a new dedicated project called CERN Science for Open Data (CS4OD). Its mission is to integrate CERN tools and expertise into a platform for supporting international projects. It will use open-access data and will support the CERN-linked UN Sustainable Development Goals.


BioDynaMo

Project goal

We are aiming to create a platform through which life scientists can easily create, run and visualise three-dimensional biological simulations. Built on top of the latest computing technologies, the BioDynaMo platform will enable users to perform simulations of previously unachievable scale and complexity, making it possible to tackle challenging scientific research questions.

R&D topic
Applications in other disciplines
Project coordinator(s)
Roman Bauer, Fons Rademakers
Team members
Lukas Breitwieser, Jean de Montigny, Ahmad Hesam
Collaborator liaison(s)
Uri Nevo, Marco Durante, Vasilis Vavourakis, Klaus-Dieter Oertel

Collaborators

Project background

Within the life-sciences community, computer simulation is being used more and more to model increasingly complex biological systems. Although many specialised software tools exist, establishing a high-performance, general-purpose platform would be a major step forward. CERN is therefore contributing its deep knowledge in large-scale computing to this collaboration, supported by Intel. Together, we are working to develop a unique platform. This project is co-financed by the CERN budget for knowledge transfer to medical applications.

Recent progress

In 2020, we generalised the simulation engine to make it applicable to other application areas beyond cell-based simulation. We used these improvements to implement a simple agent-based SIR model that simulates the spread of infectious diseases. Based on the results of this work, CERN and the University of Geneva started collaborating on a more detailed model for investigating realistic viral spread between individuals. For this purpose, BioDynaMo has been coupled with a fluid-mechanic simulation framework to simulate the precise spread of aerosols and droplets, in addition to agents' behaviour.
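A minimal agent-based SIR model of the type described can be sketched in a few lines of plain Python. This is not BioDynaMo’s API; the contact rate and the transmission and recovery probabilities below are illustrative assumptions.

```python
import random

def simulate_sir(n_agents=1000, n_steps=100, contacts=5,
                 p_transmit=0.05, p_recover=0.1, seed=42):
    """Agent-based SIR: each step, every infected agent meets `contacts`
    random agents and may transmit; infected agents then recover with
    probability p_recover. Returns per-step (S, I, R) counts."""
    rng = random.Random(seed)
    state = ["S"] * n_agents
    state[0] = "I"                      # one initial case
    history = []
    for _ in range(n_steps):
        history.append((state.count("S"), state.count("I"), state.count("R")))
        infected = [i for i, s in enumerate(state) if s == "I"]
        for i in infected:
            for _ in range(contacts):
                j = rng.randrange(n_agents)
                if state[j] == "S" and rng.random() < p_transmit:
                    state[j] = "I"      # newly infected; transmits from next step
            if rng.random() < p_recover:
                state[i] = "R"
    return history

history = simulate_sir()
peak_infected = max(i for _, i, _ in history)
print(peak_infected)
```

In BioDynaMo itself, agents additionally carry spatial positions and behaviours, and the coupled fluid-mechanic framework replaces the uniform random-contact assumption with aerosol and droplet transport.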

Furthermore, we continued our work to improve the performance and usability of BioDynaMo. We completely redesigned the visualisation component and achieved a speedup of up to two orders of magnitude. CERN technology (the ROOT framework) played a central role in this improvement.

Lastly, as part of the CERN openlab summer-student programme, we launched BioDynaMo Notebooks: an interactive web application for rapidly prototyping simulations. BioDynaMo Notebooks are powered by ROOT’s fast C++ interpreter and enable researchers to use BioDynaMo through the well-known Jupyter interface. Visit https://biodynamo.org/tutorials/ to try it.

Next steps

BioDynaMo is currently able to simulate millions of cells on one server. To improve the performance further, we will focus on two aspects. First, we will continue development on the distributed runtime, to combine the computational resources of many servers. Second, we will improve hardware acceleration to fully utilise (multiple) GPUs in a system. This will not only reduce runtime on high-end systems, but will also benefit users that work on a standard desktop or laptop.

In terms of epidemiological simulation, several virus-spreading scenarios will be investigated in 2021. This will help to determine which conditions are best for avoiding virus build-up and reducing infection. We will add the possibility to optimise model parameters based on various fitting functions, to match realistic scenarios and to extrapolate new findings. This automated optimisation engine will be executable on distributed computing platforms.
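The parameter-fitting step can be illustrated with a toy grid search that recovers the transmission rate of a simple deterministic SIR model from an observed infection curve. The model, parameter grid, and least-squares fitting function here are simplified assumptions, not BioDynaMo’s optimisation engine.

```python
import numpy as np

def sir_curve(beta, gamma=0.1, days=60, n=1000, i0=1):
    """Discrete-time deterministic SIR; returns the daily infected count."""
    s, i, r = n - i0, i0, 0
    infected = []
    for _ in range(days):
        new_inf = beta * s * i / n       # new infections this day
        new_rec = gamma * i              # new recoveries this day
        s, i, r = s - new_inf, i + new_inf - new_rec, r + new_rec
        infected.append(i)
    return np.array(infected)

# "Observed" curve generated with a hidden beta, then recovered by grid search
observed = sir_curve(beta=0.3)
betas = np.arange(0.1, 0.6, 0.01)
errors = [((sir_curve(b) - observed) ** 2).sum() for b in betas]
best = betas[int(np.argmin(errors))]
print(round(best, 2))  # 0.3
```

The same pattern (simulate, score against data, pick the best parameters) extends naturally to distributed execution, since each parameter set can be simulated independently.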

Publications

    L. Breitwieser, BioDynaMo: A New Platform for Large-Scale Biological Simulation (Master’s thesis), Graz University of Technology, Austria, 2016. cern.ch/go/z67t
    L. Breitwieser, R. Bauer, A. Di Meglio, L. Johard, M. Kaiser, M. Manca, M. Mazzara, F. Rademakers, M. Talanov, The BioDynaMo project: Creating a platform for large-scale reproducible biological simulations. CEUR Workshop Proceedings, 2016. cern.ch/go/Xv8l
    R. Bauer, L. Breitwieser, A. Di Meglio, L. Johard, M. Kaiser, M. Manca, M. Mazzara, F. Rademakers, M. Talanov, A. D. Tchitchigin, The BioDynaMo project: experience report. In Advanced Research on Biologically Inspired Cognitive Architectures (pp. 117-125). IGI Global, 2017. cern.ch/go/dp77
    A. Hesam, Faster than the Speed of Life: Accelerating Developmental Biology Simulations with GPUs and FPGAs (Master’s thesis), Delft University of Technology, Netherlands, 2018. cern.ch/go/f9v6
    J. de Montigny, A. Iosif, L. Breitwieser, M. Manca, R. Bauer, V. Vavourakis, An in silico hybrid continuum-/agent-based procedure to modelling cancer development: interrogating the interplay amongst glioma invasion, vascularity and necrosis. Methods, 2020. cern.ch/go/6vrm

Presentations

    K. Kanellis, Scaling a biological simulation platform to the cloud (15 August), Presented at CERN openlab summer students’ lightning talks, Geneva, 2017. cern.ch/go/d9nV
    L. Breitwieser, BioDynaMo (21 September), Presented at CERN openlab Open Day, Geneva, 2017. cern.ch/go/lNP9
    L. Breitwieser & A. Hesam, BioDynaMo: Biological simulation in the cloud (1 December), Presented at CERN IT technical forum, Geneva, 2017. cern.ch/go/m9Kw
    A. Hesam, Biodynamo project status and plans (11 January). Presented at CERN openlab Technical Workshop, Geneva, 2018. cern.ch/go/F8Cl
    L. Breitwieser, BioDynaMo (1 February). Presented at University Hospital of Geneva Café de l'Innovation, Geneva, 2018.
    L. Breitwieser, The Anticipated Challenges of Running Biomedical Simulations in the Cloud (12 February). Presented at Early Career Researchers in Medical Applications @ CERN, Geneva, 2018. cern.ch/go/spc8
    N. Nguyen, Distributed BioDynaMo (16 August). Presented at CERN openlab summer students' lightning talks, Geneva, 2018.
    A. Hesam, Faster than the Speed of Life: Accelerating Developmental Biology Simulations with GPUs and FPGAs (31 August). Master’s thesis defense, Delft, 2018. cern.ch/go/f9v6
    L. Breitwieser, The BioDynaMo Project: towards a platform for large-scale biological simulation (17 September). Presented at DIANA meeting, Geneva, 2018. cern.ch/go/kJv7
    L. Breitwieser, BioDynaMo (23 January). Presented at CERN openlab workshop, Geneva, 2019. cern.ch/go/7RlF
    R. Bauer, Computational modelling and simulation of biophysical dynamics in medicine (7 June). Presented at Big Data in Medicine: Challenges and Opportunities, Geneva, 2019. cern.ch/go/xf9F
    F. Rademakers, BioDynaMo (20 June). Presented at Hôpitaux Universitaires de Genève, Geneva, 2019.
    J. L. Jennings, Computational Modelling of cryopreservation using the BioDynaMo software package CryoDynaMo (22 July). Presented at the Society for Cryobiology Conference, San Diego, 2019.
    J. de Montigny, Computational modelling of retinal ganglion cell development (July). Presented at UK Neural Computation, Nottingham, 2019.
    G. De Toni, Improvements on BioDynaMo Build System (13 August). Presented at the CERN openlab summer student lightning talk session, Geneva, 2019. cern.ch/go/xt68
    L. Breitwieser, BioDynaMo Project Update (26 September). Presented at CERN Medical Application Project Forum, Geneva, 2019.
    A. Hesam, Simulation Master Class (26 September). Presented at CERN's Dutch Language Teachers Programme, Geneva, 2019. cern.ch/go/hKL7
    A. Hesam, Simulation Master Class (11 October). Presented at CERN's Dutch Language Students Programme (NVV Profielwerkstukreis), Geneva, 2019.
    L. Breitwieser, A. Hesam, The BioDynaMo Project (21 October). Presented at EmLife Meeting, Geneva, 2019.
    L. Breitwieser, The BioDynaMo Software Part I (2 December). Presented at the BioDynaMo Collaboration Meeting, Zurich, 2019.
    A. Hesam, The BioDynaMo Software Part II (2 December). Presented at the BioDynaMo Collaboration Meeting, Zurich, 2019.
    R. Bauer, BioDynaMo: A platform for computational models and simulations of biological systems. Presented at CERN Knowledge Exchange Event, Daresbury, 2019.
    L. Breitwieser, A. S. Hesam, The BioDynaMo Project (23 January). Presented at CERN openlab Technical Workshop, Geneva, 2020.

Smart platforms for science

Project goal

The goal of this project is to design a platform that analyses data collected from user interactions and processes this information to provide recommendations and other insights, thus helping to improve the performance and relevance of user searches or learning objectives.

R&D topic
Applications in other disciplines
Project coordinator(s)
Alberto Di Meglio
Team members
Taghi Aliyev
Collaborator liaison(s)
Marco Manca (SCImPULSE), Mario Falchi (King’s College London)

Collaborators

Project background

Data-analysis systems often collect and process very different types of data. This includes not only the information explicitly entered by users (“I’m looking for…”), but also metadata about how the user interacts with the system and how their behaviour changes over time based on the results they get. Using techniques such as natural language processing (NLP) and smart chatbots, it is possible to improve interaction between humans and machines, potentially providing personalised insights based on both general population trends and individual requests. Such a system would then be able to recommend further searches, actions, or links that may not have occurred to the user.
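As an illustrative sketch only (not the project’s actual implementation), the recommendation idea can be reduced to ranking a user’s past queries by similarity to the current one. Here a simple bag-of-words cosine similarity stands in for a full NLP pipeline; the function names and example queries are invented for the example:

```python
import math
from collections import Counter

def vectorise(text):
    """Bag-of-words term-frequency vector for a query string."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    shared = set(a) & set(b)
    num = sum(a[t] * b[t] for t in shared)
    den = (math.sqrt(sum(v * v for v in a.values()))
           * math.sqrt(sum(v * v for v in b.values())))
    return num / den if den else 0.0

def recommend(query, history, top_n=2):
    """Rank past queries by similarity to the current one and
    return the top_n as candidate 'you might also search for' items."""
    q = vectorise(query)
    return sorted(history,
                  key=lambda h: cosine(q, vectorise(h)),
                  reverse=True)[:top_n]

# Toy interaction history for demonstration.
history = [
    "gene expression in retinal cells",
    "beam diagnostics for linear accelerators",
    "retinal ganglion cell development",
]
print(recommend("retinal cell development", history))
```

A production system would replace the bag-of-words step with proper NLP embeddings and also weight in the behavioural metadata described above, but the ranking structure stays the same.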

Such an approach could, for example, be used to design better self-help systems, automated first-level medical services, more contextual and objective-aware search results, or educational platforms that are able to suggest learning paths that address specific student needs.

This project is being carried out in the context of CERN's strategy for knowledge transfer to medical applications, led by CERN's Knowledge Transfer group.

Recent progress

The concept of the Smart Platforms project emerged in 2019 as a spin-off of the application of NLP techniques to genomic analysis in the GeneROOT project.

In 2019, initial discussions about possible applications were started in collaboration with educational institutes and public administrations, with the goal of developing smart chatbots able to improve human-machine interaction. As the project moved into the proof-of-concept phase, it became clear that issues related to data privacy and information sharing remain a critical roadblock for systems of this kind. The project was therefore merged into the CERN Living Lab, through which such concerns can be better addressed.

Next steps

The project has been merged into the CERN Living Lab as part of a general initiative to understand the implications of processing personal data and the related ethical constraints.

Publications

    A. Manafli, T. Aliyev: Natural Language Processing for Science. Information Retrieval and Question Answering. Summer Student Report, 2018. cern.ch/go/Z9l9

Presentations

    T. Aliyev, Smart Data Analytics Platform for Science (1 November). Presented at i2b2 tranSMART Academic Users Group Meeting, Geneva, 2018.
    T. Aliyev, AI in Science and Healthcare: Known Unknowns and Potential in Azerbaijan (December). Presented at Bakutel Azerbaijan Tech Talks Session, Baku, 2018.
    A. Di Meglio, Introduction to Multi-disciplinary Platforms for Science (24 January). Presented at CERN openlab Technical Workshop, CERN, Geneva, 2019. cern.ch/go/XNt9

Future technologies for medical Linacs (SmartLINAC)

Project goal

The ‘SmartLINAC’ project aims to create a platform for anomaly detection and maintenance planning in medical and scientific linear accelerators (Linacs). The goal is to drastically reduce the associated costs and unexpected breakdowns. The platform will use artificial intelligence to adapt itself to different Linacs operated in all kinds of environments.

R&D topic
Applications in other disciplines
Project coordinator(s)
Alberto Di Meglio
Team members
Yann Donon
Collaborator liaison(s)
Dmitriy Kirsh, Alexander Kupriyanov, Rustam Paringer, Igor Rystsarev

Collaborators

Project background

During a joint workshop held at CERN in 2017, involving the International Cancer Expert Corps and the UK Science and Technology Facilities Council, the need for medical Linacs that are simple to maintain and operate was strongly emphasised. Maintenance can be one of the main sources of expenditure related to Linacs; reducing this cost is essential to support the proliferation of such devices.

Following contacts with Samara National Research University in Russia in 2018, it was decided to create the SmartLINAC project. The university has a long history in the field of aerospace, a domain that demands similar attention to fine detail and has enabled it to build up expertise in big-data processing.

This project is being carried out in the context of CERN's strategy for knowledge transfer to medical applications, led by CERN's Knowledge Transfer group.

Recent progress

Following work to define the project’s scope in 2018, as well as an initial feasibility study, the main project got underway in 2019. For the first stages of development, data from the Linac4 accelerator at CERN has been used. In particular, we have used data from the 2 MHz radio-frequency source that is used to create the plasma; this source exhibits periods of ‘jitter’ that affect the beam’s quality.

By nature, these data sets are extremely noisy and volatile, leading to difficulties in interpretation and labelling. Therefore, the first research objective was to establish an appropriate data-labelling technique that would make it possible to identify ‘jittering’ periods. This has led to the creation of an anomaly detection system that recognises early symptoms in order to make preventive maintenance possible. Several approaches based on statistics and neural-network technologies were used to solve the problem. These approaches are now being combined in order to offer a system that can be adapted to different sources.

These data have proven extremely difficult for neural networks to categorise directly. Rather than using neural networks to detect the anomalies themselves, we therefore use them to select appropriate parameters for a statistical treatment of the data source, which in turn detects the anomalies.
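A minimal sketch of the statistical side of such an approach, assuming a simple rolling z-score detector (the source does not specify the actual statistics used): the `window` and `threshold` parameters are the kind of values the neural-network parameter-selection step would choose per data source. The trace and injected spike below are synthetic:

```python
import random
import statistics

def detect_jitter(signal, window=20, threshold=4.0):
    """Flag indices whose deviation from a rolling baseline exceeds
    `threshold` rolling standard deviations. In the project, `window`
    and `threshold` would be tuned per data source rather than fixed."""
    anomalies = []
    for i in range(window, len(signal)):
        baseline = signal[i - window:i]
        mu = statistics.fmean(baseline)
        sigma = statistics.pstdev(baseline) or 1e-9  # guard against flat windows
        if abs(signal[i] - mu) / sigma > threshold:
            anomalies.append(i)
    return anomalies

# Synthetic noisy trace with one injected 'jitter' spike at index 60.
random.seed(0)
trace = [random.gauss(0.0, 1.0) for _ in range(100)]
trace[60] += 12.0
print(detect_jitter(trace))
```

The appeal of this split is that the statistical detector stays cheap and interpretable, while the learned component absorbs the source-to-source variability described above.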

Next steps

A first solution is already trained to function in the radio-frequency source environment of Linac4. Therefore, the first objective of 2020 is to start its on-site implementation and to set up continuous field tests. The next challenge will then be to consolidate our parameter-selection model and to test the technique on multiple data sources.

Publications

    Y. Donon, Smart Anomaly Detection and Maintenance Planning Platform for Linear Accelerators. In proceedings of the 27th International Symposium on Nuclear Electronics and Computing (NEC’2019), Montenegro, 2019. cern.ch/go/nb9z

Presentations

    Y. Donon, Smart Anomaly Detection and Maintenance Planning Platform for Linear Accelerators (3 October). Presented at the 27th International Symposium Nuclear Electronics and Computing (NEC’2019), Montenegro, 2019.
    Y. Donon, Anomaly detection in noised time series: the challenge of CERN’s LINAC4 (24 January). Presented at The Open Data science meetup #3, Samara, 2020. cern.ch/go/9PZD