
Workflows

Overview


Here is the info to get you started quickly:

  • We have five workflows for different sequencing platforms (Illumina or Oxford Nanopore) and library preparation strategies (Ampliconic or Metatranscriptomic).
  • Workflows can be used to analyze any number of samples.
  • Workflows can be used through a graphical user interface right now on any of our global instances in the EU (https://usegalaxy.eu), US (https://usegalaxy.org), or Australia (https://usegalaxy.org.au), as shown in this tutorial.
  • Workflows can be accessed programmatically, either by submitting a list of accession numbers to our Request an analysis service or by configuring your own Galaxy to automatically trigger the analyses.
  • We provide powerful computational infrastructure for data analysis supported by national supercomputing resources in the US, EU, and Australia.

Workflows for discovery of sequence variants


We developed a number of workflows for the analysis of SARS-CoV-2 sequencing data. The workflows are available from WorkflowHub in the EU and Dockstore in the US. The workflows listed in the table below were specifically designed for identifying sequence variants in raw read datasets:

| Link | Workflow | Inputs | Outputs | Aligner | Caller |
|---|---|---|---|---|---|
| WorkflowHub, Dockstore | Illumina ARTIC (ILL-AMP): Variant analysis from ampliconic data produced with ARTIC protocol v1, v2, v3, or v4, or any alternative primer scheme. | 1. Paired reads [fastqsanger]; 2. SARS-CoV-2 reference [fasta]; 3. Primer coordinates [bed]; 4. Primer pairs table [tsv] | Variants [vcf] | BWA MEM | lofreq |
| WorkflowHub, Dockstore | Oxford Nanopore ARTIC (ONT-AMP): Variant analysis from ampliconic data produced with ARTIC protocol v1, v2, v3, or v4, or any alternative primer scheme. | 1. Reads [fastqsanger]; 2. SARS-CoV-2 reference [fasta]; 3. Primer coordinates [bed] | Variants [vcf] | minimap2 | medaka |
| WorkflowHub, Dockstore | Illumina metatranscriptomic PE (ILL-MT-PE): Variant analysis from metatranscriptomic data. | 1. Paired reads [fastqsanger]; 2. SARS-CoV-2 reference [fasta] | Variants [vcf] | BWA MEM | lofreq |
| WorkflowHub, Dockstore | Illumina metatranscriptomic SE (ILL-MT-SE): Variant analysis from metatranscriptomic data. | 1. Reads [fastqsanger]; 2. SARS-CoV-2 reference [fasta] | Variants [vcf] | BWA MEM | lofreq |
| WorkflowHub, Dockstore | Report generation (REPORTING): Generation of final variant analysis reports/plots. | 1. Variants [vcf]; 2. Gene name translation table [tsv] | Reports [tsv], overview [svg] | - | - |

vcf = variant call format, tsv = tab-separated values, svg = scalable vector graphics, fastqsanger = fastq format with Sanger encoding of base quality values, bed = Browser Extensible Data format
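
The input requirements in the table above can be captured as a small lookup and checked before a run. The following is only a sketch: the `REQUIRED_INPUTS` dictionary and the `check_inputs` helper are hypothetical names introduced here for illustration, not part of Galaxy or of the workflows themselves. The input labels and datatypes are copied from the table.

```python
from typing import Dict, List

# Required inputs per workflow, transcribed from the table above.
# Keys are the workflow short names; values map input label -> datatype.
REQUIRED_INPUTS: Dict[str, Dict[str, str]] = {
    "ILL-AMP": {
        "Paired reads": "fastqsanger",
        "SARS-CoV-2 reference": "fasta",
        "Primer coordinates": "bed",
        "Primer pairs table": "tsv",
    },
    "ONT-AMP": {
        "Reads": "fastqsanger",
        "SARS-CoV-2 reference": "fasta",
        "Primer coordinates": "bed",
    },
    "ILL-MT-PE": {
        "Paired reads": "fastqsanger",
        "SARS-CoV-2 reference": "fasta",
    },
    "ILL-MT-SE": {
        "Reads": "fastqsanger",
        "SARS-CoV-2 reference": "fasta",
    },
    "REPORTING": {
        "Variants": "vcf",
        "Gene name translation table": "tsv",
    },
}


def check_inputs(workflow: str, provided: Dict[str, str]) -> List[str]:
    """Return a list of problems (hypothetical helper, not a Galaxy API).

    `provided` maps an input label to the datatype of the dataset the
    user intends to supply. An empty result means every required input
    is present with the expected datatype.
    """
    if workflow not in REQUIRED_INPUTS:
        raise KeyError(f"unknown workflow: {workflow}")
    problems = []
    for label, datatype in REQUIRED_INPUTS[workflow].items():
        if label not in provided:
            problems.append(f"missing: {label} [{datatype}]")
        elif provided[label] != datatype:
            problems.append(
                f"wrong datatype for {label}: expected {datatype}, got {provided[label]}"
            )
    return problems
```

For example, calling `check_inputs("ONT-AMP", {...})` without a primer coordinates entry reports that the bed file is missing.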

The following tutorial explains how to import workflows into your Galaxy instance.

Which workflow do I use?


Each of the four variant calling workflows from the table above is designed to be used together with the reporting workflow. The table below shows which workflows to use depending on the combination of library preparation strategy and sequencing platform:

| ↓ Library Prep / Platform [1] | Illumina | ONT |
|---|---|---|
| Ampliconic | ILL-AMP + REPORTING | ONT-AMP + REPORTING |
| Metatranscriptomic | (ILL-MT-PE or ILL-MT-SE) + REPORTING | - [2] |

1 - There is a growing amount of PacBio data. Our workflows can be easily adapted for these data as well. Use OPEN CHAT below to let us know.
2 - This is conceptually identical to ILL-MT-SE, except that the mapper is replaced with minimap2 and the variant caller with medaka.
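
The selection rule in the table above can be written as a short function. This is a minimal sketch under stated assumptions: `select_workflows` and its `paired` parameter are hypothetical names chosen for illustration, and the workflow short names are the ones used on this page. The ONT metatranscriptomic combination raises an error because no ready-made workflow is listed for it.

```python
from typing import List

REPORTING = "REPORTING"  # reporting workflow, run after every variant-calling workflow


def select_workflows(platform: str, library_prep: str, paired: bool = True) -> List[str]:
    """Return the workflow short names to run, in order (hypothetical helper).

    `paired` only matters for Illumina metatranscriptomic data, where
    separate paired-end (PE) and single-end (SE) workflows exist.
    """
    platform = platform.strip().lower()
    library_prep = library_prep.strip().lower()

    if platform not in {"illumina", "ont"}:
        raise ValueError(f"unsupported platform: {platform}")
    if library_prep not in {"ampliconic", "metatranscriptomic"}:
        raise ValueError(f"unsupported library prep: {library_prep}")

    if library_prep == "ampliconic":
        calling = "ILL-AMP" if platform == "illumina" else "ONT-AMP"
    elif platform == "illumina":
        calling = "ILL-MT-PE" if paired else "ILL-MT-SE"
    else:
        # ONT + metatranscriptomic: the table lists no workflow for this cell.
        raise LookupError("no ready-made workflow for ONT metatranscriptomic data")

    return [calling, REPORTING]
```

Returning the reporting workflow as the second element reflects the statement that each variant calling workflow is meant to be used together with it.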

How do I use it and where do I run my analyses?


This depends on who you are:

| You are a ... | Where do you start ... |
|---|---|
| Biomedical researcher | Use any of the three global Galaxy instances in the EU (https://usegalaxy.eu), US (https://usegalaxy.org), or Australia (https://usegalaxy.org.au). Take a look at the following tutorial to begin: Mutation calling, viral genome reconstruction and lineage/clade assignment from SARS-CoV-2 sequencing data - a Galaxy Training Network Tutorial. |
| Bioinformatician or data scientist | You have two options: (1) Use our "Request an analysis" service to submit a list of datasets to us and trigger automated analyses. (2) Configure your own Galaxy instance to automatically trigger the analyses. Use this option if you run your own Galaxy installation. |

These analysis capabilities are supported by public computational infrastructure provided by the XSEDE consortium in the US, the de.NBI and ELIXIR consortia in the EU, and Nectar Cloud in Australia. The figure below illustrates current processing times (in the EU) for analysis of SARS-CoV-2 data. Most analyses complete within 1 to 2 hours.