Computing Platforms for Teaching with Galaxy

There are many platform choices for training with Galaxy. This page discusses the different options including the benefits and drawbacks of each. First, some introductory concepts:

Teaching 'Using' versus 'Administering' Galaxy

The best choice of platform for teaching Galaxy is largely determined by whether the focus is on teaching using Galaxy or installing and managing Galaxy. Installing and managing Galaxy on cloud-based infrastructures is a viable option for biologists who want to use Galaxy. The Galaxy Project strongly encourages teaching users both how to use and how to administer Galaxy instances.

Teaching 'Using'

Teaching using means instructing users how to perform bioinformatic analyses with Galaxy. This type of teaching works best on shared servers with multiple participants on each server. Typically, participants are not given admin access to the servers. The following options are available for teaching using:

Teaching 'Administering'

Teaching administering means instructing users how to install and manage a Galaxy instance. This type of teaching works best if each participant sets up a personal server for which they are the only user and have admin access. The following options are available for teaching administering:

Shared versus Personal Servers

Shared

The term shared server describes Galaxy servers that are used by more than one participant simultaneously. Shared servers include Main, Shared Servers on the Cloud, Public Galaxy Servers, and Local Galaxy Servers. Shared servers are typically used to teach using Galaxy.

Personal

The term personal server describes Galaxy servers that are used by only one participant at a time. These users may also have admin access to the server. Personal servers are typically used to teach administering Galaxy.

Platform Options: Which Platform to Use?

Choosing a platform to teach Galaxy depends on the purpose of the teaching and what the teacher has access to. Platform options are described in detail below and are color coded according to their suitability to:

  • Light Use: simple analyses that do not require much compute power (e.g., text operations, interval operations)
  • Heavy Use: computationally intensive analyses (e.g., analyses of NGS datasets)
  • Admin: access to Galaxy internals and ability to control the installation (e.g., having administrative privileges)

The colors have the following meaning:

  • Not appropriate
  • Suitable
  • Recommended

Pre-existing Galaxy Servers

Many Galaxy servers already exist, and some of them may be useful for your teaching needs.

The Galaxy Project Main Public Server

Light Use Heavy Use Admin

Teaching with the free Galaxy Project Main public server (usegalaxy.org) is not recommended.

Main is a seemingly obvious choice for teaching: it's free, robust, rich in tools and data, has generous quotas, and anyone can create a login. However, Main is also a very popular resource, and it is impossible to predict when it will be lightly or heavily loaded. Teaching with Main on days when it is heavily loaded is a frustrating experience for both teachers and students. People don't learn anything if most of their time is spent waiting on jobs in the queue.

Local Galaxy Servers

Light Use Heavy Use Admin

If your institution has its own local Galaxy server, this can be an ideal platform for teaching. Local instances often have the specific tools and datasets that your researchers care about, and because a local server is not a shared global resource, its load is much more predictable.

However, a class full of users all running the same analyses will stress any server, and you are strongly advised to run any plans by your Galaxy admins well in advance.
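When you talk to your admins, a rough estimate of the load your class will generate can help. The sketch below is only an illustration of that arithmetic: the function name, the model of a fixed number of concurrent job slots, and all of the numbers are assumptions, not properties of any particular Galaxy server.

```python
import math


def estimate_queue_wait(participants, jobs_each, job_slots, minutes_per_job):
    """Estimate how many minutes pass before a class-wide burst of jobs finishes.

    Assumes (hypothetically) that every participant submits the same jobs at
    the same time, that the server runs at most `job_slots` jobs concurrently,
    and that each job takes about `minutes_per_job` minutes.
    """
    if job_slots < 1:
        raise ValueError("job_slots must be at least 1")
    total_jobs = participants * jobs_each
    # Jobs run in "waves" of at most job_slots jobs at a time.
    waves = math.ceil(total_jobs / job_slots)
    return waves * minutes_per_job


# Example: 30 participants each launch 2 jobs on a server with 8 job slots,
# and each job takes about 5 minutes.
print(estimate_queue_wait(30, 2, 8, 5))  # 60 jobs -> 8 waves -> 40 minutes
```

If the estimate is longer than the time you have set aside for an exercise, that is a sign to ask for more capacity, use smaller datasets, or stagger the exercise.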

Public Galaxy Servers

Light Use Heavy Use Admin

There are many publicly accessible Galaxy servers that may be appropriate for training purposes. Many of these servers have specialized and domain-specific tools, and many of them will not be as heavily and unpredictably used as Main.

You are encouraged to explore the list of public Galaxy servers to see if one or more matches your needs. However, these are also shared public resources. If you want to use a public server for training, be sure to contact the server's support people well in advance.

Cloud-Based Galaxy Servers

CloudMan

This option describes cloud-based Galaxy servers that are created for teaching purposes. Some public and local Galaxy servers are actually cloud-based; however, instructors don't set these up. This section specifically describes cloud-based Galaxy servers that are set up by the instructor for teaching purposes.

CloudMan software abstracts away many of the details of different cloud infrastructures and provides a uniform graphical interface for managing cloud-based servers.

A Word on AWS

Amazon Web Services in Education Grants Program

The Galaxy Project uses Amazon Web Services (AWS)-based servers in its workshops. CloudMan makes it easy to set these up. The Galaxy Project has an AWS in Education grant that supports this work.

If you are considering using a cloud-based platform, you are encouraged to apply for the AWS in Education Grants Program. The application process is succinct and quick, and the grant text is limited to 4000 characters. It always helps to include a funding estimate with the proposal (this calculator can be useful). It is also helpful to have a detailed proposal describing the uniqueness of the work, how AWS will be used, and how the resulting work will be shared publicly via papers, events, or public relations.
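A funding estimate for a workshop usually comes down to compute hours plus storage. The following is a minimal sketch of that calculation, not an official pricing tool: the function name, its parameters, and especially the rates in the example are placeholders chosen for illustration, so substitute current prices for the instance types and storage you actually plan to use.

```python
def estimate_cloud_cost(servers, hours_per_day, days, hourly_rate_usd,
                        storage_gb=0, storage_rate_usd_per_gb_month=0.0,
                        months=1):
    """Return a rough total in US dollars for compute plus storage.

    All rates are supplied by the caller; none are real AWS prices.
    """
    compute = servers * hours_per_day * days * hourly_rate_usd
    storage = storage_gb * storage_rate_usd_per_gb_month * months
    return round(compute + storage, 2)


# Example with placeholder rates: 2 shared servers running 8 hours a day for
# a 3-day workshop at $1.00 per server-hour, plus 500 GB of storage kept for
# 1 month at $0.10 per GB-month.
total = estimate_cloud_cost(2, 8, 3, 1.00,
                            storage_gb=500,
                            storage_rate_usd_per_gb_month=0.10,
                            months=1)
print(total)  # 48.0 compute + 50.0 storage = 98.0
```

Remember to budget for the time servers run before and after the workshop itself, such as setup, testing, and any period during which participants keep access.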

Shared Servers on the Cloud

Light Use Heavy Use Admin

Since early 2012, every "using Galaxy" workshop presented by the Galaxy Project has used cloud-based, CloudMan-managed shared Galaxy servers. These servers are set up before a workshop, participants create accounts and use the servers during the workshop, and then the servers are shut down after the workshop.

If you are teaching using Galaxy, then large, shared cloud-based servers work well. You (the instructor) set them up, and the participants don't even need to know they are using cloud-based servers.

If you use AWS-based and CloudMan-managed servers for your training, the Galaxy Project also provides a Galaxy Workshop AMI that comes preloaded with several examples that are used in project-run workshops.

Personal Servers with CloudMan

Light Use Heavy Use Admin

Participants can set up their own CloudMan-based Galaxy server. This can be done either as a way to learn how to use Galaxy, or as one way to learn how to administer Galaxy.

From the Command Line, using an AMI

Light Use Heavy Use Admin

Amazon Machine Images (AMIs) are a type of virtual machine image (see below) that runs on the AWS infrastructure. Using an AMI to teach Galaxy administering is a great option, and you can determine how much has already been done on the image, from a bare-bones Linux install to an image with the Galaxy source already cloned from GitHub.

Locally Installed Galaxy Servers

Users can be instructed to install and run their own Galaxy servers on their personal laptops. Galaxy is relatively easy to install and can be run on moderately powered laptops. Importantly, some CPU- or memory-intensive analyses will not run well on a laptop. Therefore, if you are considering this option, you will need to determine minimum compute requirements for your students.
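Once you have settled on minimum requirements, you can give students a small script to run before the workshop. The sketch below shows one way to do this using only the Python standard library. The function name and the thresholds (2 CPU cores, 10 GB of free disk) are hypothetical placeholders, not Galaxy's actual requirements; replace them with values you have tested against your own exercises.

```python
import os
import shutil


def check_laptop(cpus, free_disk_gb, min_cpus=2, min_free_disk_gb=10):
    """Return a list of problems; an empty list means the laptop passes.

    The default thresholds are illustrative placeholders only.
    """
    problems = []
    if cpus < min_cpus:
        problems.append(f"only {cpus} CPU core(s), need at least {min_cpus}")
    if free_disk_gb < min_free_disk_gb:
        problems.append(
            f"only {free_disk_gb:.1f} GB free disk, "
            f"need at least {min_free_disk_gb}"
        )
    return problems


if __name__ == "__main__":
    # Measure this machine: logical CPU count and free space in the home directory.
    cpus = os.cpu_count() or 1
    free_gb = shutil.disk_usage(os.path.expanduser("~")).free / 1e9
    issues = check_laptop(cpus, free_gb)
    if issues:
        print("This laptop may struggle with the workshop exercises:")
        for issue in issues:
            print(" -", issue)
    else:
        print("This laptop meets the minimum requirements.")
```

This check covers only CPU count and disk space. Memory matters just as much for many analyses, but reading it portably needs platform-specific calls or a third-party package, so it is left out of this sketch.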

Using the Laptop's OS

Light Use Heavy Use Admin

Users running macOS or Linux can install Galaxy directly on their laptops. This approach does not always give consistent results, because each user's laptop will have a slightly different configuration and slightly different libraries and packages already installed.

Virtual Machine Images

Light Use Heavy Use Admin

A virtual machine image can be created that workshop participants can download and run on their laptops using a virtual machine image player such as VirtualBox. In this case, you can pre-install any needed tools and datasets. The students will still be running analysis on their laptops, and therefore nothing too CPU- or memory-intensive can be run.