Today's research is increasingly driven by data, and the reproducibility, accessibility, interoperability and searchability of research data are becoming essential to the research process.
The Research Data Management (RDM) team of the Leibniz Supercomputing Centre of the Bavarian Academy of Sciences and Humanities (LRZ) participates in several projects at the national and European level, in particular the EU (Horizon 2020) project “Large-scale EXecution for Industry & Society” (LEXIS).
The LEXIS project is building an advanced platform for running scientific computing workflows at the confluence of High-Performance Computing (HPC), IaaS Cloud and Big Data solutions. Such a workflow can, for example, involve preprocessing weather-station data, assimilating this data into a large-scale weather simulation, and then visualizing the weather model output and issuing flash-flood warnings.
Tasks within the LEXIS Pilot Workflows are executed at LRZ and IT4I (Czech National Supercomputing Centre). The aim of the project is to freely distribute such workflows over the most appropriate computing resources at both centres.
We therefore need a common data platform that manages access, storage and transfer of input and output data within the workflows. This platform is called the “LEXIS Distributed Data Infrastructure” (DDI). While the DDI is currently based on iRODS and EUDAT solutions, we would like to experiment with competing data-management systems to identify potential for optimisation. We would also like to explore optimal data-transfer paths within the DDI and to the HPC/Cloud resources at LRZ and IT4I. In this thesis, you will benchmark parts of the infrastructure, devise additional measurement concepts as needed, interpret the results, and derive design recommendations for LEXIS and similar future projects.
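To give a rough idea of what a data-transfer benchmark could look like, here is a minimal Python sketch. It is only an illustration under stated assumptions: the function names `measure_throughput` and `summarize` are hypothetical, the 8 MiB payload size is arbitrary, and a local file copy stands in for a real transfer through the DDI (for example via iRODS), which is not shown here.

```python
import os
import shutil
import statistics
import tempfile
import time


def measure_throughput(transfer, size_bytes, repeats=5):
    """Call `transfer()` `repeats` times and return throughput samples in MB/s.

    `transfer` is any zero-argument callable that moves `size_bytes` of data;
    it is a placeholder for a real DDI transfer (assumption).
    """
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        transfer()
        # Guard against a zero interval on very fast callables.
        elapsed = max(time.perf_counter() - start, 1e-9)
        samples.append(size_bytes / elapsed / 1e6)
    return samples


def summarize(samples):
    """Reduce throughput samples to median, minimum and maximum."""
    return {
        "median_mb_s": statistics.median(samples),
        "min_mb_s": min(samples),
        "max_mb_s": max(samples),
    }


if __name__ == "__main__":
    size = 8 * 1024 * 1024  # arbitrary test payload size (assumption)
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "payload.bin")
        dst = os.path.join(tmp, "copy.bin")
        with open(src, "wb") as f:
            f.write(os.urandom(size))
        # Local copy used as a stand-in for a transfer between sites.
        samples = measure_throughput(lambda: shutil.copyfile(src, dst), size)
        print(summarize(samples))
```

Repeating each measurement and reporting the median together with the spread, rather than a single timing, makes the results less sensitive to caching effects and transient load. A real study would additionally vary file sizes, file counts and transfer paths.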
Outline of this work:
Prof. Dr. D. Kranzlmüller
Duration of the bachelor's thesis: in accordance with the study regulations
Number of students: 1