Oracle cloud technologies for data analytics on industrial control systems

Project goal

CERN’s control systems acquire more than 250 TB of data per day from over two million signals from across the LHC and its experiments. Managing these extremely complex “Industrial Internet of Things” (IIoT) systems raises important challenges in terms of data management, retrieval, and analytics.

The project team is working with Oracle Autonomous Database and analytics technologies. The goal is to assess their capabilities in terms of integrating heterogeneous control IIoT data sets and improving performance and efficiency for the most challenging analytics requirements, while reducing operational costs.

 

R&D topic
Machine learning and data analytics
Project coordinator(s)
Manuel Martin Marquez
Team members
Manuel Martin Marquez, Sébastien Masson, Franck Pachot, Ludovico Caldara
Collaborator liaison(s)
Çetin Özbütün, Reiner Zimmermann, Michael Connaughton, Cristobal Pedregal-Martin, Engin Senel, Cemil Alper, Giuseppe Calabrese, David Ebert, Dmitrij Dolgušin

Project background

Keeping the LHC and the rest of the accelerator complex at CERN running efficiently requires state-of-the-art control systems. A complex IIoT system is in place to persist the data these control systems produce, making it possible for engineers to gain insights about temperatures, magnetic-field strengths, and more. This plays a vital role in ensuring the highest levels of operational efficiency.

The current system to persist, access, and analyse this data is based on Oracle Database. Today, significant effort is dedicated to improving performance and coping with increasing demand, both in data volume and in the analysis and exploration of bigger data sets.

Recent progress

During 2019, the team focused on three main aspects: (i) scaling data volumes, (ii) improving the efficiency of the potential solutions through automation and reduced operational costs, and (iii) increasing the complexity of data retrieval and analytics using real-life scenarios.

We began migrating one of the largest and most complex control data sets to the object-storage system of Oracle Cloud Infrastructure (OCI). Because this data set is large (about 1 PB), we tested several transfer options: standard networks, the GÉANT network, and Oracle’s appliance-based data-transfer solution. At the same time, the team worked with development and management teams at Oracle to define a data-model strategy that reduces associated costs and improves efficiency. The result is a hybrid model that takes advantage of object storage: external tables backed by Parquet files, queried transparently from the database, hold data that is infrequently accessed, whereas normal database tables hold data that requires close to real-time responses. To assess this model, once a representative amount of data was available, real-life data loads were captured and simulated on OCI’s Autonomous Database.
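The placement rule behind such a hybrid model can be sketched in a few lines of Python. This is only an illustration and not the project's implementation. The 30-day boundary, the `Partition` type, and the tier names are assumptions made for the example, since the text does not say how "infrequently accessed" is decided.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: data older than this window counts as "infrequently accessed".
# The actual criterion used by the project is not stated in the text.
HOT_WINDOW = timedelta(days=30)


@dataclass(frozen=True)
class Partition:
    """Hypothetical unit of control data, e.g. one signal group."""
    signal_group: str
    newest_sample: datetime  # timestamp of the most recent sample it holds


def choose_tier(partition: Partition, now: datetime) -> str:
    """Pick a storage tier from the age of the partition's newest sample."""
    age = now - partition.newest_sample
    # Recent data stays in normal database tables for near real-time queries;
    # older data goes to Parquet files exposed through external tables.
    return "database_table" if age <= HOT_WINDOW else "external_parquet"


def plan_placement(partitions, now: datetime) -> dict:
    """Group partition names by the tier they would be stored in."""
    plan = {"database_table": [], "external_parquet": []}
    for partition in partitions:
        plan[choose_tier(partition, now)].append(partition.signal_group)
    return plan
```

Calling `plan_placement` with a list of partitions and a reference time returns one list of names per tier. In practice the boundary would more likely follow observed access patterns than a fixed age.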

Next steps

In 2020, we will focus on finalising the migration of the historical data to OCI Object Storage, to make it available for Oracle Autonomous Database instances. This will require us to address challenges related to the task flow for Oracle’s appliance-based data-transfer solution and the network configuration for GÉANT. In addition, we will work on automating data ingestion. In parallel, we will continue to increase the real analytics load and assess solutions for interactive data exploration based on Oracle Autonomous Database technologies.


Presentations

    E. Grancher, M. Martin, S. Masson, Research Analytics at Scale: CERN’s Experience with Oracle Cloud Solutions (16 January). Presented at Oracle OpenWorld 2019, London, 2019.
    A. Mendelsohn (Oracle), E. Grancher, M. Martin, Oracle Autonomous Database Keynote (16 January). Oracle OpenWorld 2019, London, 2019.
    M. Martin, J. Abel (Oracle), Enterprise Challenges and Outcomes (17 January). Presented at Oracle OpenWorld 2019, London, 2019.
    S. Masson, M. Martin, Managing one of the largest IoT systems in the world with Oracle Autonomous Technologies (18 September). Presented at Oracle OpenWorld 2019, San Francisco, 2019. cern.ch/go/SBc9
    D. Ebert (Oracle), M. Martin, A. Nappi, Advancing research with Oracle Cloud (17 September). Presented at Oracle OpenWorld 2019, San Francisco, 2019. cern.ch/go/9ZCg
    M. Martin, R. Zimmermann (Oracle), J. Otto (IDS GmbH), Oracle Autonomous Data Warehouse: Customer Panel (17 September). Presented at Oracle OpenWorld 2019, San Francisco, 2019. cern.ch/go/nm9B
    S. Masson, M. Martin, Oracle Autonomous Data Warehouse and CERN Accelerator Control Systems (25 November). Presented at Modern Cloud Day, Paris, 2019.
    M. Martin, M. Connaughton (Oracle), Big Data Analytics and the Large Hadron Collider (26 November). Presented at Oracle Digital Days 2019, Dublin, 2019.
    M. Martin, Big Data, AI and Machine Learning at CERN (27 November). Presented at Trinity College Dublin and ADAPT Center, Dublin, 2019.
    M. Martin, M. Connaughton (Oracle), Big Data Analytics and the Large Hadron Collider (27 November). Presented at the National Analytics Summit 2019, Dublin, 2019. cern.ch/go/CF9p
    M. Martin Marquez, Boosting Complex IoT Analysis with Oracle Autonomous Data Warehouse Cloud (June). Presented at Oracle Global Leaders Meeting – EMEA, Budapest, 2018.
    E. Grancher, M. Martin Marquez, S. Masson, Boosting Complex IoT Analysis with Oracle Autonomous Data Warehouse Cloud (23 October). Presented at Oracle OpenWorld 2018, San Francisco, 2018. cern.ch/go/RBZ6
    E. Grancher, M. Martin Marquez, S. Masson, Managing one of the largest IoT Systems in the world (December). Presented at Oracle Global Leaders Meeting – EMEA, Sevilla, 2018.