Project goal

This project assesses the capabilities of Oracle Autonomous Data Warehouse Cloud (ADWC) and Oracle Autonomous Analytics Cloud (AAC). These technologies are being tested for handling the masses of data that come from the control and monitoring systems in place for CERN’s accelerator complex. Specifically, we aim to use these technologies to integrate different existing datasets, to improve the performance and efficiency of the most important and challenging data retrieval and analysis tasks, and to unlock new possibilities for data exploration.

R&D topic
R&D Topic 3: Machine learning and data analytics
Project coordinator(s)
Eric Grancher and Eva Dafonte Perez
Technical team members
Manuel Martin Marquez, Sébastien Masson, Franck Pachot
Collaborator liaison(s)
Cemil Alper, Dimitry Dolgushin, David Ebert, Vincent Leocorbo, Pauline Maher, Cristobal Pedregal-Martin, Reiner Zimmermann

Project background

The LHC is one of the largest, most complex machines ever built. Keeping it — and the rest of the accelerator complex at CERN — running efficiently requires state-of-the-art control systems. More than 2.5 terabytes of monitoring data is generated per day, coming in from over a million signals spread across the accelerators and detectors. A complex “Industrial Internet of Things” (IIoT) system is in place to persist this data, making it possible for scientists and engineers to gain insights about temperatures, magnetic field strengths, beam intensities, and much more. This plays a vital role in ensuring the highest levels of operational efficiency.

The current system to persist, access, and analyse the controls and monitoring data is based on Oracle Database. Today, significant effort is dedicated to improving performance and coping with increasing demand — in terms of both data volume and analysis of bigger datasets.

Recent progress

We organised our work in 2018 into three phases. In the initial phase, we carried out a high-level feasibility study of ADWC and AAC, making sure the technology could manage the extreme demands of our IIoT systems and our complex analytics queries. In this phase, we also explored the flexibility of provisioning, as well as the ability of the technology to automate updates, backups, and patches.

The second phase was dedicated to evaluating various procedures for migrating the data from our current on-premises architectures to Oracle’s cloud services. In particular, we considered the complexity of the data format, partitioning, indexing, etc. This work made it possible for us to evaluate the initial workload and data-analysis performance on a representative subset of the data, helping us to gain insights into the advanced optimisation features of AAC. We were also able to use Oracle Hybrid Columnar Compression to reduce storage requirements to about a tenth of their previous level and to reduce the need for full scans. This significantly improved the performance of data retrieval and analytics tasks. On top of this, the system offered transparent and automated access to Oracle’s “Exadata SmartScan” and “Exadata Storage Indexes” features. This reduced, or in some cases removed entirely, the dependency on indexes.

In the last phase, we worked with AAC to offer seamless data analytics based on collaborative and interactive dashboards. Our most recent work focuses on elasticity and scalability. In particular, we are working to increase the data volume used to one terabyte and to increase the complexity of the workloads and analysis.

Next steps

Our ongoing work on elasticity and scalability will lead to a comparison between the Autonomous Database’s capacities and those of other database platforms, including the current on-premises setup.


Presentations

    M. Martin Marquez, Boosting Complex IoT Analysis with Oracle Autonomous Data Warehouse Cloud (June). Presented at Oracle Global Leaders Meeting – EMEA, Budapest, 2018.
    E. Grancher, M. Martin Marquez, S. Masson, Boosting Complex IoT Analysis with Oracle Autonomous Data Warehouse Cloud (23 October). Presented at Oracle OpenWorld 2018, San Francisco, 2018. http://cern.ch/go/RBZ6
    E. Grancher, M. Martin Marquez, S. Masson, Managing one of the largest IoT Systems in the world (December). Presented at Oracle Global Leaders Meeting – EMEA, Sevilla, 2018.