Paper Highlight: “Facilitating Reproducibility in Catalysis Research with Managed Workflows and RO-Crates: A Galaxy Case Study”
Members of the EuroScienceGateway project working on materials science use cases in the UK have co-authored a paper addressing the reproducibility of catalysis experiments. Other co-authors were also members of the Physical Sciences Data Infrastructure project.
In catalysis research that uses X-ray Absorption Spectroscopy (XAS), increasing data volumes, a lack of provenance linking inputs to outputs, and missing metadata can all pose a challenge to reproducibility, even when data is included with a publication. As a case study, several publications using data collected at the Diamond Light Source via the UK Catalysis Hub were identified. The process described in each publication was then repeated using Galaxy tools developed and hosted as part of EuroScienceGateway, to determine whether the same results could be reproduced and what benefits the Galaxy platform could provide.
Some of the main challenges identified were:
- Some publications included only an intermediate form of the data, not the raw values obtained from the experiment
- Not all of the input parameters needed for each step of the analysis process were reported
- Data was labelled differently in the final paper and in the associated data object
All of this made it difficult to reproduce the exact process, and therefore the results, of the original paper. By using Galaxy to generate an RO-Crate for each workflow invocation, it was possible to produce a single digital object that includes all inputs, outputs, and parameters, along with the links between them. We concluded that this could make both publishing data and reproducing results easier.
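
To illustrate what such a digital object can look like, here is a minimal sketch of the metadata file at the heart of an RO-Crate (`ro-crate-metadata.json`), built with the Python standard library only. It is not the output Galaxy produces, which is considerably richer. The file names, the parameter name, and the `build_crate` helper are all hypothetical, and the structure is a simplified reading of the RO-Crate 1.1 conventions, in which a `CreateAction` entity ties a workflow (`instrument`) to its inputs and parameters (`object`) and its outputs (`result`).

```python
import json


def build_crate(workflow, inputs, outputs, parameters):
    """Return a simplified RO-Crate metadata document for one workflow run.

    Hypothetical helper for illustration only; not part of Galaxy or any library.
    """
    # Each parameter becomes its own entity so the run can link to it.
    param_entities = [
        {"@id": f"#param-{name}", "@type": "PropertyValue",
         "name": name, "value": str(value)}
        for name, value in parameters.items()
    ]
    file_entities = [{"@id": path, "@type": "File"} for path in inputs + outputs]
    workflow_entity = {
        "@id": workflow,
        "@type": ["File", "SoftwareSourceCode", "ComputationalWorkflow"],
    }
    # The run itself: which workflow consumed what, and what it produced.
    run = {
        "@id": "#run-1",
        "@type": "CreateAction",
        "instrument": {"@id": workflow},
        "object": [{"@id": p} for p in inputs]
                  + [{"@id": e["@id"]} for e in param_entities],
        "result": [{"@id": p} for p in outputs],
    }
    root = {
        "@id": "./",
        "@type": "Dataset",
        "hasPart": [{"@id": workflow}] + [{"@id": p} for p in inputs + outputs],
        "mentions": [{"@id": run["@id"]}],
    }
    descriptor = {
        "@id": "ro-crate-metadata.json",
        "@type": "CreativeWork",
        "conformsTo": {"@id": "https://w3id.org/ro/crate/1.1"},
        "about": {"@id": "./"},
    }
    return {
        "@context": "https://w3id.org/ro/crate/1.1/context",
        "@graph": [descriptor, root, workflow_entity, run]
                  + file_entities + param_entities,
    }


# Hypothetical file and parameter names, chosen only for the example.
crate = build_crate(
    workflow="xas_workflow.ga",
    inputs=["raw_scan.dat"],
    outputs=["normalised_spectrum.txt"],
    parameters={"edge_energy": 7112},
)
print(json.dumps(crate, indent=2))
```

Because the raw input, the parameter value, and the output are all referenced from the same run entity, a reader of the crate can see which data and settings produced which result, which is the information the challenges listed above found missing.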