By working with communities beyond high-energy physics, we are able to ensure maximum relevance for CERN openlab’s work, as well as to learn and share tools and best practices across scientific fields. Today, more and more research fields, such as medical research or space and Earth-observation research, are driven by large quantities of data, and thus experience ICT challenges comparable to those at CERN. CERN openlab’s mission rests on three pillars: technological investigation, education, and dissemination. Collaborating with research communities and laboratories outside high-energy physics brings together all of these aspects. Challenges related to the life sciences, medicine, astrophysics, and urban/environmental planning are all covered in this section, as well as scientific platforms designed to foster open collaboration.

 

Humanitarian AI applications for satellite imagery

 

Project goal

This project is making use of expertise in artificial intelligence (AI) technologies at CERN to support a UN agency. Specifically, we are working on AI approaches to help improve object recognition in the satellite imagery created to support humanitarian interventions. Such satellite imagery plays a vital role in helping humanitarian organisations plan and coordinate responses to natural disasters, population migrations, and conflicts.

R&D topic
Applications in other disciplines
Project coordinator(s)
Sofia Vallecorsa, Federico Carminati
Team members
Taghi Aliyev, Yoann Boget
Collaborator liaison(s)
Lars Bromley

Collaborators

Project background

Since 2002, CERN has hosted UNOSAT, the Operational Satellite Applications Programme of UNITAR (the United Nations Institute for Training and Research), on the laboratory’s premises. UNOSAT acquires and processes satellite data to produce and deliver information, analysis, and observations to be used by the UN or national entities for emergency response, to assess the impact of a disaster or a conflict, or to plan sustainable development in the face of climate change.

At the heart of this project lies the idea of developing machine-learning techniques that can help speed up analysis of satellite imagery. For example, predicting and understanding the movement of displaced persons by identifying refugee shelters can be a long, labour-intensive task. This project is working to develop machine-learning techniques that could greatly reduce the amount of time needed to complete such tasks.

Recent progress

Refugee camps often consist of more than 10,000 shelters and may need to be re-analysed several times in order to understand their evolution. Manual analysis typically leads to very high-quality output, but is very time-consuming. We have therefore worked with region-based convolutional neural networks to improve detection of new shelters in refugee camps, taking into account prior knowledge regarding the positions of existing shelters. The results were promising and the data pipeline created by our summer students has now been adapted and put to use by the UNOSAT experts. The retrained model yielded an average precision/recall score of roughly 80% and reduced the time needed for the task by a factor of 200 in some areas.
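The use of prior knowledge about existing shelters can be illustrated with a minimal sketch (pure Python, with hypothetical box coordinates): detections that overlap a previously mapped shelter footprint are discarded, so that only candidate new shelters remain for analysis. This is an illustrative simplification, not the actual UNOSAT pipeline.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def new_shelters(detections, known, iou_threshold=0.5):
    """Keep only detections that do not overlap a previously mapped shelter."""
    return [d for d in detections
            if all(iou(d, k) < iou_threshold for k in known)]

# Hypothetical example: two detections, one of which matches a known shelter.
known = [(10, 10, 20, 20)]
detections = [(11, 11, 21, 21), (40, 40, 50, 50)]
print(new_shelters(detections, known))  # only the second box remains
```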

More recently, we also addressed the challenge of simulating synthetic high-resolution satellite images. High-resolution satellite imagery is often licensed in ways that make it difficult to share across UN partners and academic organisations. This reduces the amount of image data available to train deep-learning models, thus hampering research in this area. We have developed a generative adversarial network (GAN) that is capable of generating realistic satellite images of refugee camps. Our tool was initially based on a progressive GAN approach developed by NVIDIA. We have now developed this further, such that it can combine multiple simulated images into a cohesive larger image of roughly 5 million pixels.
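The combination step can be sketched as a simple grid assembly of equally sized tiles (NumPy; the tile size and grid shape below are illustrative stand-ins, not the actual GAN output dimensions):

```python
import numpy as np

def stitch_tiles(tiles, grid_rows, grid_cols):
    """Assemble equally sized image tiles (H, W, C) into one large mosaic,
    row by row, in grid order."""
    rows = [np.concatenate(tiles[r * grid_cols:(r + 1) * grid_cols], axis=1)
            for r in range(grid_rows)]
    return np.concatenate(rows, axis=0)

# Illustrative numbers: a 3 x 3 grid of 768 x 768 tiles gives a mosaic of
# 2304 x 2304 pixels, i.e. roughly 5.3 million pixels, comparable in size
# to the combined images mentioned above.
tiles = [np.zeros((768, 768, 3), dtype=np.uint8) for _ in range(9)]
mosaic = stitch_tiles(tiles, 3, 3)
print(mosaic.shape)  # (2304, 2304, 3)
```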

Several other lines of investigation — all related to AI technologies — are also being pursued within the scope of this project.

Next steps

Next year, we will pursue the initial work carried out on the GAN model in 2019 in a number of different directions. We will carry out a detailed performance study and will implement a distributed approach for parallel network training, as well as optimising the use of computing resources. This should help us to reduce training time for the model and increase the maximum image size.

Publications

    N. Lacroix, T. Aliyev, L. Bromley: Automated Shelter Recognition in Refugee Camps. CERN openlab Summer Student Report. Published on Zenodo, 2019. http://cern.ch/go/v6rn

Presentations

    Y. Boget, ProGAN on Satellite images (15 August). Presented at CERN openlab summer student lightning talk session, Geneva, 2019. cern.ch/go/P6NV
    Y. Boget, S. Vallecorsa, Deep Learning for Satellite Imagery (24 September). Presented at IXPUG Annual Conference, Geneva, 2019. cern.ch/go/m9n6

Future technologies for medical Linacs (SmartLINAC)

Project goal

The ‘SmartLINAC’ project aims to create a platform for medical and scientific linear accelerators that will enable anomaly detection and maintenance planning. The goal is to drastically reduce related costs and unexpected breakdowns. The platform we develop will use artificial intelligence to adapt itself to different linear accelerators (Linacs) operated in all kinds of environments.

R&D topic
Applications in other disciplines
Project coordinator(s)
Alberto Di Meglio
Team members
Yann Donon
Collaborator liaison(s)
Dmitriy Kirsh, Alexander Kupriyanov, Rustam Paringer, Igor Rystsarev

Collaborators

Project background

During a joint workshop held at CERN in 2017, involving the International Cancer Expert Corps and the UK Science and Technology Facilities Council, the need for simple-to-maintain-and-operate medical Linacs was emphasised strongly. Maintenance can be one of the main sources of expenditure related to Linacs; it is essential to reduce this cost in order to support the proliferation of such devices.

Following contacts with Samara National Research University in Russia in 2018, it was decided to create the SmartLINAC project. The university has a long history in the field of aerospace, which requires similar attention to fine detail and has led the university to build up expertise in big-data processing.

This project is being carried out in the context of CERN's strategy for knowledge transfer to medical applications, led by CERN's Knowledge Transfer group.

Recent progress

Following work to define the project’s scope in 2018, as well as an initial feasibility study, the main project got underway in 2019. For the first stages of development within the project, data has been used from the Linac 4 accelerator at CERN. In particular, we have used data from the 2 MHz radio-frequency source that is used to create the plasma; this presents periods of ‘jitters’ that influence the beam’s quality.

By nature, these data sets are extremely noisy and volatile, making them difficult to interpret and label. The first research objective was therefore to establish an appropriate data-labelling technique that would make it possible to identify ‘jittering’ periods. This has led to the creation of an anomaly-detection system that recognises early symptoms in order to make preventive maintenance possible. Several approaches based on statistics and neural-network technologies were used to solve the problem. These approaches are now being combined in order to offer a system that can be adapted to different sources.

The data has been shown to be extremely difficult for neural networks to categorise. Rather than using neural networks to detect anomalies themselves, we have therefore made use of them to define appropriate parameters for a statistical treatment of the data source. This will, in turn, lead to detection of anomalies.
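One common statistical treatment for such noisy time series is a rolling robust z-score, sketched below (NumPy; the window and threshold values are illustrative stand-ins for the parameters that, in the project, are chosen with the help of neural networks):

```python
import numpy as np

def detect_anomalies(signal, window=50, threshold=4.0):
    """Flag samples deviating from a rolling baseline by more than
    `threshold` robust standard deviations (median / MAD based)."""
    flags = np.zeros(len(signal), dtype=bool)
    for i in range(window, len(signal)):
        ref = signal[i - window:i]
        med = np.median(ref)
        mad = np.median(np.abs(ref - med)) + 1e-9
        # 1.4826 scales the MAD to the standard deviation of normal noise
        if abs(signal[i] - med) / (1.4826 * mad) > threshold:
            flags[i] = True
    return flags

# Noisy baseline with one injected jitter-like spike at sample 300.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 500)
x[300] += 12.0
print(np.flatnonzero(detect_anomalies(x)))  # index 300 is among the flags
```

Using a median and median absolute deviation (MAD) rather than mean and standard deviation keeps the baseline estimate itself insensitive to the very outliers being hunted.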

Next steps

A first solution is already trained to function in the radio-frequency source environment of Linac 4. Therefore, the first objective of 2020 is to start its on-site implementation and to set up continuous field tests. The next challenge will then be to consolidate our parameter-selection model and to test the technique on multiple data sources.

 

 

Publications

    Y. Donon, Smart Anomaly Detection and Maintenance Planning Platform for Linear Accelerators (3 October). Presented at the 27th International Symposium Nuclear Electronics and Computing (NEC’2019), Montenegro, 2019. http://cern.ch/go/nb9z

Presentations

    Y. Donon, Anomaly detection in noised time series: the challenge of CERN’s LINAC 4 (24 January). Presented at The Open Data science meetup #3, Samara, 2020. http://cern.ch/go/9PZD

Smart platforms for science

Project goal

We are developing a platform that will support a complete data-analysis life cycle, from data discovery through to access, processing, and end-user data analysis. The platform will be easy to use and will offer narrative interfaces.

As part of the development, we are working together with a number of teams at CERN on data integration and pipeline preservation. In particular, we are working closely with the teams behind REANA, a system for reusable analyses of research data, and Zenodo, an open-access repository operated by CERN.

R&D topic
Applications in other disciplines
Project coordinator(s)
Alberto Di Meglio
Team members
Taghi Aliyev
Collaborator liaison(s)
Mario Falchi

Collaborators

Project background

In many research communities today, reproducibility, communication, and data pipelines are implemented in suboptimal ways. Through this project — cofinanced by the CERN budget for knowledge transfer to medical applications — we are working to create a powerful system to capture and facilitate the habits of researchers. Our platform will allow for negotiation and sharing of common values among scientists within a given field and will help us to understand the reasoning behind why certain choices are made. Rather than providing a simple toolkit for researchers, we are creating a rich system through which researchers can challenge the value chains within their own respective fields and potentially enhance their approach to performing research.

Recent progress

Throughout 2018, we gathered and worked on a range of initial use cases for the platform. Contacts were established with companies like IBM and non-profit organisations like GEnIAl (a local initiative working to enhance the lives of citizens in Geneva). As part of our collaboration with the two named organisations, we are now deploying solutions and ideas developed through the project to help tackle everyday challenges related to information retrieval and the answering of questions.

As part of the work with the GEnIAl community, we are working on the implementation of chat bots that could be used by members of the public in the Canton of Geneva; this work forms part of the Responsive City Camp Geneva initiative, which has been endorsed by the Canton of Geneva, as well as by many organisations in the region. Initial ideas and results are to be presented at the Applied Machine Learning Days conference on 28 January 2019 in Lausanne, Switzerland.

Next steps

In the coming year, we will mainly work to assess the effectiveness of the prototypes and implemented models. Based on the obtained results, we will then work to improve the platform further, before deploying the first product to the wider research community.

 

 

Publications

    A. Manafli, T. Aliyev: Natural Language Processing for Science. Information Retrieval and Question Answering. Summer Student Report, 2018. cern.ch/go/Z9l9

Presentations

    T. Aliyev, Smart Data Analytics Platform for Science (1 November). Presented at i2b2 tranSMART Academic Users Group Meeting, Geneva, 2018.
    T. Aliyev, AI in Science and Healthcare: Known Unknowns and potential in Azerbaijan (December). Presented at Bakutel Azerbaijan Tech Talks Session, Baku, 2018.

Living lab

Project goal

The project will develop a machine-learning platform for large-scale systems biology studies. This will serve as a proof-of-concept for federating heterogeneous data from diverse — and sometimes sensitive — sources. Tools for pseudo-anonymisation, ownership management, monitoring, and reporting will be investigated and integrated into the platform.

R&D topic
Applications in other disciplines
Project coordinator(s)
Alberto Di Meglio
Collaborator liaison(s)
David Manset, Marco Manca

Collaborators

Project background

CERN is a living laboratory, with around 10,000 people coming to work at its main campuses every day. For operational purposes, CERN collects data related to health, safety, the environment, and other aspects of daily life at the lab. Creating a platform to collate and enable intelligent investigation of this data — while respecting privacy and other legal obligations — offers the potential to improve life at the lab.

This project is being carried out in the context of CERN's strategy for knowledge transfer to medical applications, led by CERN's Knowledge Transfer group.

Recent progress

Next steps

More information coming soon.