UC Davis Bioinformatics Core
Resource:: UC Davis Using Galaxy for Analysis of High Throughput Sequence Data Workshop
Types:: Presentations, Tutorials, Exercises, Datasets, AMI
Domains:: Genomics: QC, Alignment, Variants, Assembly, RNA-Seq
Owners:: UC Davis Bioinformatics Core
Formats:: Entire workshop package
Date:: 2014/06

This is a complete package of slides, examples, datasets, and an accompanying Amazon Machine Image.


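The course is broad, covering sequencing fundamentals, read quality assessment, alignment and variant discovery, genome assembly, and RNA-Seq. All documentation is online: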


  • Course, Bioinformatics Overview - lecture
  • High Throughput Sequencing Fundamentals - lecture
  • Introduction to Amazon Web Services EC2
  • Read Quality Assessment & Improvement - lecture
  • Illumina Read Quality Assessment & Improvement - exercises


  • Alignment & Variant Discovery - lecture
  • Next Generation Sequence Alignment - exercises
  • GATK Best Practices for Variant Discovery - lecture
  • Variant Discovery using GATK2
  • Effect Prediction and the UCSC Genome Browser
  • Useful links for GATK2 variant analysis and snpEff


  • Genome Assembly - lecture
  • Genome Assembly: then and now - guest lecture
  • Genome Assembly - exercises


  • RNA-Seq Concepts, Terminology, and Work Flows - lecture
  • Aligning SE RNA-Seq Reads to a Reference Genome - exercises
  • Aligning PE RNA-Seq Reads to a Genome - exercises
  • Differential Expression Analysis with edgeR from Genome Alignments - exercises
  • Gene Construction - lecture and exercises
  • Alignment to a Reference Transcriptome - exercises
  • RNA-Seq Statistics - lecture
  • Some Helpful Links


  • Odds & Ends (unanswered questions from earlier) - lecture
  • Challenge Problems
  • Considerations for Illumina library preparation - lecture
  • Challenge Solutions
  • Some useful genome data

See the workshop documentation home page for links.

Needed Tools

The easiest way to use this material is to launch the workshop's accompanying Amazon Machine Image (AMI), which has all the needed tools preinstalled.


The workshop AMI also includes the example datasets used in the exercises.
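
For readers who would rather script the launch than use the AWS console, the following is a minimal sketch of starting a single EC2 instance from an AMI with the third-party boto3 library. It is not part of the workshop materials. The AMI ID, region, instance type, and key pair name are not given here, so the values in the usage note are hypothetical placeholders; take the real AMI ID and region from the workshop documentation home page.

```python
def build_launch_request(ami_id, instance_type, key_name=None):
    """Build the keyword arguments for launching exactly one EC2 instance.

    ami_id, instance_type, and key_name are supplied by the caller; none of
    them are specified by the workshop listing above.
    """
    if not ami_id.startswith("ami-"):
        raise ValueError(f"not an AMI ID: {ami_id!r}")
    request = {
        "ImageId": ami_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
    }
    if key_name:
        # Name of an existing EC2 key pair, needed for SSH access.
        request["KeyName"] = key_name
    return request


def launch_workshop_instance(ami_id, region, instance_type, key_name=None):
    """Launch one instance from the given AMI and return its instance ID.

    Requires boto3 and configured AWS credentials. A running instance
    incurs AWS charges until it is terminated.
    """
    import boto3  # imported here so build_launch_request works without boto3

    ec2 = boto3.client("ec2", region_name=region)
    response = ec2.run_instances(
        **build_launch_request(ami_id, instance_type, key_name)
    )
    return response["Instances"][0]["InstanceId"]
```

A call would look like `launch_workshop_instance("ami-xxxxxxxx", "us-east-1", "m3.large", key_name="my-key")`, where every argument is a placeholder to be replaced with real values. Remember to terminate the instance when finished.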