Project goal

We are developing a platform that will support a complete data-analysis life cycle, from data discovery through to access, processing, and end-user data analysis. The platform will be easy to use and will offer narrative interfaces.

As part of the development, we are collaborating with a number of teams at CERN on data integration and pipeline preservation. In particular, we are working closely with the teams behind REANA, a system for reusable analyses of research data, and Zenodo, an open-access repository operated by CERN.

R&D topic
R&D Topic 4: Applications in other disciplines
Project coordinator(s)
Alberto Di Meglio
Technical team members
Taghi Aliyev
Collaborator liaison(s)
Mario Falchi

Collaborators

Project background

In many research communities today, reproducibility, communication, and data pipelines are implemented in suboptimal ways. Through this project, which is co-financed by the CERN budget for knowledge transfer to medical applications, we are working to create a powerful system to capture and facilitate the habits of researchers. Our platform will allow for negotiation and sharing of common values among scientists within a given field and will help us to understand the reasoning behind certain choices. Rather than providing a simple toolkit for researchers, we are creating a rich system through which researchers can challenge the value chains within their own respective fields and potentially enhance their approach to performing research.

Recent progress

Throughout 2018, we gathered and worked on a range of initial use cases for the platform. Contacts were established with companies such as IBM and non-profit organisations such as GEnIAl (a local initiative working to enhance the lives of citizens in Geneva). As part of our collaboration with these two organisations, we are now deploying solutions and ideas developed through the project to help tackle everyday challenges related to information retrieval and question answering.

As part of the work with the GEnIAl community, we are implementing chatbots that could be used by members of the public in the Canton of Geneva. This work is part of the Responsive City Camp Geneva initiative, which has been endorsed by the Canton of Geneva, as well as by many organisations in the region. Initial ideas and results are to be presented at the Applied Machine Learning Days conference on 28 January 2019 in Lausanne, Switzerland.

Next steps

In the coming year, we will mainly work to assess the effectiveness of the prototypes and implemented models. Based on the obtained results, we will then work to improve the platform further, before deploying the first product to the wider research community.

Publications

    A. Manafli, T. Aliyev: Natural Language Processing for Science. Information Retrieval and Question Answering. Summer Student Report, 2018. cern.ch/go/Z9l9

Presentations

    T. Aliyev, Smart Data Analytics Platform for Science (1 November). Presented at i2b2 tranSMART Academic Users Group Meeting, Geneva, 2018.
    T. Aliyev, AI in Science and Healthcare: Known Unknowns and potential in Azerbaijan (December). Presented at Bakutel Azerbaijan Tech Talks Session, Baku, 2018.