GCC2021: Training

Nominate training topics now!

The 2021 Galaxy Community Conference (GCC2021) will be held July 5-12, 2021 in Ghent, Belgium. As in previous years, the conference will feature training workshops throughout the event. You determine which training topics are offered.

Nominate Training Topics

Topic nominations close 29 January.

We are soliciting topic nominations for training sessions at GCC2021. Whether you are a teacher nominating a topic you would like to teach at GCC, or a participant who would like to attend a workshop to learn more about certain topics, this form is for you!

For ideas for nominations, please look at the Galaxy Training Network (GTN) tutorials. See below for the topics that have already been nominated.

Nominated Topics

Training Topics are added here within a few days of being nominated.

Analysis Topics

Compute and analyze Essential Biodiversity Variables with PAMPA toolsuite

GTN Tutorial

How to use PAMPA in Galaxy to compute Essential Biodiversity Variables (EBV) from species abundance data and analyse them through generalized linear (mixed) models (GLM and GLMM). This toolsuite is made up of 5 tools that enable you to process time series data that include at least year, location and species sampled along with abundance value and, finally, generate article-ready data products.

Deep Neural Networks Demystified

Give a brief history of Neural Networks (NN), describe various NN architectures (Feed-forward, Recurrent, and Convolutional), discuss NN applications (especially in Bioinformatics), and do a hands-on Galaxy lab where a NN is used to solve an example problem.

Two sessions: One for theory, and the other for hands-on Galaxy lab.

Intended audience: Researchers/practitioners who want to use NN in their research/tasks. Basic understanding of Linear Algebra and Calculus required.

Single-cell analysis: Clustering 3K PBMCs with Scanpy

GTN Tutorial

Single-cell RNA-seq analysis is a rapidly evolving field at the forefront of transcriptomic research, used in high-throughput developmental studies and rare transcript studies to examine cell heterogeneity within a population of cells. The cellular resolution and genome-wide scope make it possible to draw new conclusions that are not otherwise possible with bulk RNA-seq.

Two sessions: Possibly in combination with a second tutorial, such as the 10X workflow

Variant Calling and Analysis

Introduction to variant calling, annotation and analysis in Galaxy. Suggested GTN tutorials:

16S Microbial Analysis with mothur

GTN Tutorial

Analysis of the 16S gene allows for the efficient profiling of microbial communities, without contamination from the host genome.

In this tutorial, we will cover the mothur MiSeq Standard Operating Procedure (SOP) step-by-step in Galaxy.

Climate Science: Functionally Assembled Terrestrial Ecosystem Simulator (FATES) with Galaxy Climate JupyterLab

GTN Tutorial

FATES is the “Functionally Assembled Terrestrial Ecosystem Simulator”. FATES needs what we call a “Host Land Model” (HLM) to run, and in this tutorial we will be using the Community Land Model of the Community Terrestrial Systems Model (CLM-CTSM). FATES was derived from the CLM Ecosystem Demography model (CLM(ED)), which was documented in “Taking off the training wheels: the properties of a dynamic vegetation model without climate envelopes, CLM4.5(ED)” (2015); this technical note was first published as an appendix to that paper. The FATES documentation provides more insight into FATES.

Introduction to Deep Learning

GTN Tutorial

Deep learning, a branch of artificial intelligence, provides a collection of learning methods that model data with complex architectures performing different non-linear transformations of the data. Using these transformations, patterns are recognised in large volumes of data, and new data can be categorised using the patterns extracted from existing data. These patterns are learned by computational models built on different neural network architectures. In recent years, architectures such as convolutional networks, long short-term memory networks and deep belief networks have become increasingly popular as machine learning tools in computer vision, image analysis, bioinformatics, speech recognition, natural language processing and other fields, achieving state-of-the-art performance that sometimes exceeds human performance.

Regression in Machine Learning

GTN Tutorial

In this tutorial you will learn how to use Galaxy tools to solve regression problems. First, we will introduce the concept of regression briefly, and then examine linear regression, which models the relationship between a target variable and some explanatory variables (also known as independent variables). Next, we will discuss gradient boosting regression, a more advanced regressor model which can model nonlinear relationships between variables. Then, we will show how to visualize the results in each step. Finally, we will discuss how to train our models by finding the values of their parameters that minimize a cost function. We will work through a real problem to learn how the models and learning algorithms work.
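As a rough illustration of what "finding the values of the parameters that minimize a cost function" can mean for linear regression, here is a minimal sketch in plain Python. It is not taken from the GTN tutorial, which uses Galaxy tools. The function names, the learning rate, the number of epochs and the toy data are all assumptions made for this example, and mean squared error is assumed as the cost function.

```python
def mean_squared_error(xs, ys, slope, intercept):
    """Cost function: average squared gap between predictions and targets."""
    return sum((slope * x + intercept - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def fit_linear_regression(xs, ys, learning_rate=0.01, epochs=5000):
    """Fit y ~ slope * x + intercept by gradient descent on the mean squared error.

    learning_rate and epochs are illustrative defaults, not recommended values.
    """
    n = len(xs)
    slope, intercept = 0.0, 0.0
    for _ in range(epochs):
        # Prediction errors of the current model on every training point.
        errors = [(slope * x + intercept) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the cost with respect to each parameter.
        grad_slope = (2.0 / n) * sum(e * x for e, x in zip(errors, xs))
        grad_intercept = (2.0 / n) * sum(errors)
        # Step against the gradient to reduce the cost.
        slope -= learning_rate * grad_slope
        intercept -= learning_rate * grad_intercept
    return slope, intercept


# Made-up data lying exactly on the line y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_linear_regression(xs, ys)
print("slope:", round(slope, 3), "intercept:", round(intercept, 3))
print("cost:", mean_squared_error(xs, ys, slope, intercept))
```

On this toy data the fitted slope and intercept should end up close to 2 and 1, with a cost near zero. Gradient boosting, also covered in the tutorial, combines many small models and is not shown here.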

DNA Methylation Analysis

GTN Tutorial

DNA methylation is a biological process by which methyl groups are added to the DNA molecule. Methylation can change the activity of a DNA segment without changing the sequence.

Introduction to Genomics and Galaxy

GTN Tutorial

This practical aims to familiarize you with the Galaxy user interface. It will teach you how to perform basic tasks such as importing data, running tools, working with histories, creating workflows, and sharing your work.

Mapping

GTN Tutorial

Sequencing produces a collection of sequences without genomic context. We do not know which part of the genome the sequences correspond to. Mapping the reads of an experiment to a reference genome is a key step in modern genomic data analysis. Through mapping, the reads are assigned to a specific location in the genome, and insights such as the expression level of genes can be gained.
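To make the idea of assigning reads to genome locations concrete, here is a toy sketch in plain Python. It assumes a simplified seed-and-extend scheme with exact matches only. The reference string, the reads, the k-mer length and the function names are invented for this example. Mapping tools used in practice also tolerate mismatches and gaps and rely on far more compact indexes, so this is an illustration of the principle and not the method of any specific tool.

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Map every k-mer of the reference to the 0-based positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(reference) - k + 1):
        index[reference[pos:pos + k]].append(pos)
    return index


def map_read(read, reference, index, k):
    """Return the 0-based reference positions where the read matches exactly."""
    hits = []
    # Seed: look up the first k bases of the read to get candidate positions.
    for pos in index.get(read[:k], []):
        # Extend: keep a candidate only if the whole read matches at that position.
        if reference[pos:pos + len(read)] == read:
            hits.append(pos)
    return hits


reference = "ACGTACGTTAGCCGATAGGCTA"  # made-up toy sequence, not real data
k = 4  # seed length, chosen arbitrarily for this example
index = build_kmer_index(reference, k)

for read in ["TTAGCC", "GATAGG", "CCCCCC"]:
    print(read, "->", map_read(read, reference, index, k))
```

A read that returns an empty list has no exact match in the reference, and a read that returns several positions maps ambiguously.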

Analyses of metagenomics data - The global picture

GTN Tutorial

In metagenomics, information about micro-organisms in an environment can be extracted with two main techniques:

  • Amplicon sequencing, which sequences only the rRNA or ribosomal DNA of organisms
  • Shotgun sequencing, which sequences full genomes of the micro-organisms in the environment

In this tutorial, we will introduce the two main types of analyses with their general principles and differences. For a more in-depth look at these analyses, we recommend our detailed tutorials on each analysis.

Gene Predictions with Galaxy

Admin / Development Topics

Galaxy Code Architecture

Related GTN Material

How is the Galaxy code structured? What do the various other projects related to Galaxy do? What happens when I start Galaxy?

Explore various aspects of the Galaxy codebase, understand the various top-level files and modules in Galaxy, understand how dependencies work in Galaxy's frontend and backend, and a whole lot more.

Reference Data with CVMFS

GTN Tutorial

CernVM-FS is a distributed filesystem well suited to sharing read-only data across the globe. Galaxy uses it to share things that many Galaxy servers need, namely:

  • Reference Data
    • Genome sequences for hundreds of useful species.
    • Indices for the genome sequences
    • Various bioinformatic tool indices for the available genomes
  • Tool containers
    • Singularity containers of everything stored in Biocontainers (a bioinformatic tool container repository). You get these for free every time you build a Bioconda recipe/package for a tool.
  • Others too.

Running Jobs on Remote Resources with Pulsar

GTN Tutorial

Pulsar is the Galaxy Project’s remote job running system. It is a Python server application that can accept jobs from a Galaxy server, submit them to a local resource and then send the results back to the originating Galaxy server. More details on Pulsar can be found at:

Transport of data, tool information and other metadata can be configured as a web application via a RESTful interface or using a message passing system such as RabbitMQ.

This tutorial will teach you how to use Pulsar to distribute a Galaxy server's workload to multiple compute resources.

Workflow execution on the command line

Introduction to running workflows using command line tools such as planemo

Possibly multiple sessions, in conjunction with other workflow-related trainings, e.g. as a follow-up to an introduction to using workflows in the UI.

Prerequisites: basic knowledge of Linux command line

Other

Interface between synthetic biology and Biosecurity