May 2016 Galaxy News
Welcome to the May 2016 Galactic News, a summary of what is going on in the Galaxy community.
If you have anything to include in the next News, please send it to Galaxy Outreach.
GCC2016 will be held June 25-29 at Indiana University in Bloomington, Indiana, United States. This will be the 7th annual gathering of the Galaxy community, and we are expecting over 200 participants again this year. The 2016 Galaxy Community Conference includes 2 days of hackathons, 2 days of training, and a two day meeting featuring accepted presentations, keynotes, poster sessions, the new Visualization Showcase and Software Demo sessions, lightning talks, birds-of-a-feather meetups, and plenty of networking.
The deadline for poster and computer demo abstracts is May 20, or when we run out of space, whichever comes first. Abstracts are reviewed on a rolling basis and submitters are notified of acceptance status no later than two weeks after submission. You may submit similar content for oral presentations, posters, and demos.
Topics should be of interest to those working in high-throughput data analysis and research. Presentations that are Galaxy-centric are encouraged, but not required. Please see the abstracts page for full details.
(We are also still accepting late oral abstracts, which will be considered if we have cancellations.)
Early registration for GCC2016 ends May 20. Registration costs depend on which events you register for, your career stage and affiliation, and when you register. Early bird rates are up to 40% less than regular registration rates, starting at less than $45 / day for students and postdocs, and at $65 / day for other attendees from non-profits.
You can also sign up for conference housing during registration.
We are pleased to offer scholarships for the 2016 Galaxy Community Conference, being held in Bloomington, Indiana, United States, June 25-29. Scholarships are available to students and post-docs in historically under-represented groups, and to those from or based in Low and Lower-Middle Income Economies, as defined by the World Bank. If this describes you or one of your students then we hope to receive an application.
Scholarships cover registration and lodging during the GCC Meeting, and for any Training or Hackathon events the applicant chooses to attend. Scholarships do not cover travel or other expenses. The application deadline is May 1 for members of historically underrepresented groups.
See the full announcement for details.
We continue to seek other sponsors as well and offer a wide range of sponsorship plans. If your organization is interested in having a presence at GCC2016, please contact the GCC2016 Exec for more information.
Please welcome the journal GigaScience as a GCC Silver Sponsor for the 4th year in a row. GigaScience aims to revolutionize reproducibility of analyses, data dissemination, organization, understanding, and use.
All accepted oral presentations are eligible for consideration for publication in the journal GigaScience's Galaxy series. Published papers will receive a 15% discount in the article-processing charge if you flag GCC2016 on submission. As an open access and open-data journal focussing on reproducibility, GigaScience publishes all research objects (including data, software tools, workflows, VMs and containers) from 'big data' studies across the entire spectrum of life and biomedical sciences. GigaScience submissions utilize a novel format, where all of the supporting research objects are hosted and integrated into accepted papers using independently citable digital object identifiers from the journal's GigaGalaxy server and GigaDB database. See the Galaxy series page for examples of work coming from previous GCC meetings.
Please welcome GenomeWeb as a GCC Silver Sponsor for the third year in a row. GenomeWeb is an independent online news organization that provides in-depth coverage of the scientific and economic ecosystem spurred by high-throughput genome sequencing. We are the leading information source for scientists, executives, and clinicians who use and develop advanced life science tools.
EMC Emerging Technologies Division (ETD) is a global leader and trusted partner in Life Science storage solutions. We deliver powerful yet versatile solutions for healthcare and life science organizations that want to manage clinical and genomics data. ETD storage solutions are simple to install, manage and scale, at any size, across the R&D data lifecycle. As a leader and trusted partner at hundreds of Life Science organizations worldwide, ETD storage solutions provide the security, ease of management, high availability, and scalability needed to manage Life Science workflows today and in the future.
Galaxy will have a strong presence at the 64th ASMS Conference on Mass Spectrometry and Allied Topics, being held June 5-9 in San Antonio, Texas, United States. There will be one workshop and one talk (both from the GalaxyP Project), and at least 7 posters on using Galaxy for proteomics.
If you are interested, register now as early registration closes April 30.
The UC Davis Bioinformatics Training Program, a GTN member, will be presenting the workshop Using Galaxy for Analysis of RNA-Seq and ChIP-Seq Data on June 13-17, at UC Davis in Davis, California, United States.
This workshop will include a rich collection of lectures and hands-on sessions, covering both theory and tools. We will explore the basics of high throughput sequencing technologies, focusing on Illumina data for hands-on exercises. Participants will explore software and protocols, create and modify workflows, and diagnose/treat problematic data, utilizing computing power of the Amazon Cloud.
Space is limited and this workshop is already more than 50% full.
There are a staggering 14 known Galaxy related events and presentations in May. These are spread over 4 countries on 3 continents. June and July are filling up too.
See the Galaxy Events Google Calendar for details on other events of interest to the community.
Slides and video from the April 2016 GalaxyAdmins meetup are now available. Ivar Grytten and Geir Kjetil Sandve from the University of Oslo discussed The Galaxy Portal: Accessing Galaxy from Mobile Devices (Slides) and John Chilton covered Tool Development Developments.
A Conda Dependencies Codefest was held on Monday April 4, and involved 8 participants. It was designed to be beginner friendly, which increased contribution from the community. 4 members of the Galaxy community were added as contributors to the bioconda-recipe repository as a result of this hackathon. The main aim of the codefest was to get community members familiar with the Conda-Galaxy integration, and to remove tools from the testing blacklist. See the full codefest report for details.
Want your own Galaxy server, for free? You can now easily create Galaxy servers on the new NSF Jetstream cloud. Each server comes preconfigured with hundreds of tools and commonly used reference datasets. It only takes a couple of minutes to start one. Once running, you can use it or change it up any way you like.
How do I get access?
You must be a US-based academic to access the Jetstream cloud. Access is free, but you need an XSEDE account (go to https://www.xsede.org/ to sign up) and an active resource allocation. Getting the resource allocation is a matter of writing a summary of your research in less than 100 words and waiting ~24 hrs for the application to be approved. Go to http://jetstream-cloud.org/allocations.php → "Submit and manage allocation requests" to get started; choose the Startup type of allocation.
How do I launch my own Galaxy server?
After you have your XSEDE account and an active allocation:
- Visit https://use.jetstream-cloud.org/
- Browse the available images and choose "Galaxy 16.01 Standalone"
- Follow the prompts on the screen to launch an instance
- In less than 5 minutes, you should have your own, fully configured Galaxy server
More documentation about the process can be found here.
72 new papers referencing, using, extending, and implementing Galaxy were added to the Galaxy CiteULike Group in April.
Some April highlights:
Unlocking Large-Scale Genomics by Luca Pireddu
qpMerge: Merging different peptide isoforms using a motif centric strategy by Matthew M Hindle, Thierry Le Bihan, Johanna Krahmer, Sarah F Martin, Zeenat B Noordally, T. Ian Simpson, Andrew J. Millar, doi: https://doi.org/10.1101/047100
Metavisitor, a suite of Galaxy tools for simple and rapid detection and discovery of viruses in deep sequence data by Guillaume Carissimo, Marius van den Beek, Juliana Pegoraro, Kenneth D Vernick, Christophe Antoniewski, doi: https://doi.org/10.1101/048983
MutSpec: a Galaxy toolbox for streamlined analyses of somatic mutation spectra in human and mouse cancer genomes by Maude Ardin, Vincent Cahais, Xavier Castells, et al. BMC Bioinformatics, Vol. 17, No. 1. (18 April 2016), doi:10.1186/s12859-016-1011-z
deepTools2: a next generation web server for deep-sequencing data analysis by Fidel Ramírez, Devon P. Ryan, Björn Grüning, et al. Nucleic Acids Research (13 April 2016), gkw257, doi:10.1093/nar/gkw257
META-pipe - Pipeline Annotation, Analysis and Visualization of Marine Metagenomic Sequence Data by Espen M. Robertsen, Tim Kahlke, Inge A. Raknes, et al.
MGEScan: a Galaxy based system for identifying retrotransposons in genomes by Hyungro Lee, Minsu Lee, Wazim M. Ismail, et al. Bioinformatics (07 April 2016), btw157, doi:10.1093/bioinformatics/btw157
R-Syst::diatom: an open-access and curated barcode database for diatoms and freshwater monitoring by Frédéric Rimet, Philippe Chaumeil, François Keck, et al. Database, Vol. 2016 (1 January 2016), doi:10.1093/database/baw016
There are two new comprehensive online tutorials from Anton Nekrutenko:
Variant calling is a complex field that was significantly propelled by advances in DNA sequencing and efforts of large scientific consortia such as the 1000 Genomes. This tutorial summarizes basic ideas central to Genotype and Variant calling.
This tutorial is inspired by an exceptional RNAseq course at the Weill Cornell Medical College compiled by Friederike Dündar, Luce Skrabanek, and Paul Zumbo and by tutorials produced by Björn Grüning (@bgruening) for the Freiburg Galaxy instance. Many of the Galaxy-related features described in this section were developed by Björn Grüning (@bgruening) and configured by Dave Bouvier (@davebx).
The Galaxy is expanding! Please help it grow.
- Galaxy Administrator & Developer, University of Freiburg, Freiburg, Germany.
- Postdoctoral researcher, University of Freiburg, Freiburg, Germany.
- Bioinformatics Web Application Developer-Biology (Job ID 32899), Washington University in St. Louis, Missouri, United States
- Software developer and Post-docs, Gehlenborg Lab, Harvard Medical School, Boston, Massachusetts, United States
- Postdoctoral Research Positions, Molecular and Cellular Biology Department at Baylor College of Medicine, Houston, Texas, United States
- Software Engineer, Oregon Health Sciences University, Portland, Oregon, United States
There are two new publicly accessible Galaxy servers:
- Identifying long terminal repeats (LTR) and non-LTR retroelements in eukaryotic genomic sequences.
- Input genome sequences can be obtained from the ENA Browser or local storage, as well as through a traditional file upload. HMMER 3.1b1 is used to gain a speed boost compared to the previously used HMMER 2+. In addition, Generic Feature Format Version 3 is used to visualize genome sequence data in a web-based genome browser, e.g. the UCSC Genome Browser or the Ensembl Genome Browser.
- MGEScan is also accessible through Amazon Cloud (EC2), the Galaxy Tool Shed, or a published workflow on the public Galaxy server (usegalaxy.org).
- MGEScan: a Galaxy based system for identifying retrotransposons in genomes by Hyungro Lee, Minsu Lee, Wazim Mohammed Ismail, Mina Rho, Geoffrey Fox, Sangyoon Oh, and Haixu Tang, Bioinformatics (2016) doi: 10.1093/bioinformatics/btw157
- User Support:
- MGEScan can be used anonymously or with a login. Anyone can create a login.
One new training resource was added in April:
Planemo is a set of command-line utilities to assist in building tools for the Galaxy project. The April releases feature these updates:
- Fix test summary report. Pull Request 429
- Improve error reporting when running shed_test. ce8e1be
- Improved code comments and tests for shed related functionality. 89674cb
- Rev galaxy-lib dependency to 16.4.1 to fix wget usage in newer versions of wget. d76b489
- Revert "check .shed.yml owner against credentials during shed creation", test was incorrect and preventing uploads. Pull Request 425, Issue 246
See the release history.
Pulsar 0.7 was released in April. Pulsar is a Python server application that allows a Galaxy server to run jobs on remote systems (including Windows) without requiring a shared mounted file system. Unlike with traditional Galaxy job runners, input files, scripts, and config files can be transferred to the remote system, where the job is executed, and the results are then transferred back to the Galaxy server, eliminating the need for a shared file system.
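The stage-in, execute, stage-out cycle described here can be pictured with a toy sketch. This is not Pulsar's code or API: the function name `run_staged_job` is made up for illustration, a temporary local directory stands in for the remote system, and the "job" is a one-line script that upper-cases its input.

```python
import shutil
import subprocess
import sys
import tempfile
from pathlib import Path


def run_staged_job(input_file: Path, results_dir: Path) -> Path:
    """Copy an input to a separate work area, run a job there, copy the result back.

    Hypothetical illustration only; Pulsar itself transfers files over the network.
    """
    # A temporary directory stands in for the remote system's disk.
    with tempfile.TemporaryDirectory() as remote:
        remote_dir = Path(remote)

        # Stage in: transfer the input file to the "remote" side.
        staged_input = remote_dir / input_file.name
        shutil.copy(input_file, staged_input)

        # Execute: a stand-in tool that upper-cases the input text.
        staged_output = remote_dir / "output.txt"
        script = (
            "import sys, pathlib; "
            "pathlib.Path(sys.argv[2]).write_text("
            "pathlib.Path(sys.argv[1]).read_text().upper())"
        )
        subprocess.run(
            [sys.executable, "-c", script, str(staged_input), str(staged_output)],
            check=True,
        )

        # Stage out: transfer the result back before the work area is removed.
        results_dir.mkdir(parents=True, exist_ok=True)
        final_output = results_dir / staged_output.name
        shutil.copy(staged_output, final_output)

    return final_output
```

The point of the sketch is that the job only ever sees files inside its own work area, so the two sides never need to share a file system.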
The January 2016 (v16.01) release of Galaxy features
- Interactive Tours
- Nested Workflows
See the announcement for full details.
Galaxy Docker Image 16.01
We just released an update to Galaxy CloudMan on AWS. CloudMan offers an easy way to get a personal and completely functional instance of Galaxy in the cloud in just a few minutes, without any manual configuration or imposed quotas. Once running, you have complete control over Galaxy, including the ability to install new tools.
Most notable changes include:
- Galaxy 16.01 release
- Fine-grained control over auto-scaling options
- Several fixes to cluster sharing and cloning
See the CHANGELOG for a more complete set of changes.
The Galaxy Team is proud to be part of the development team for a new cross-cloud library called CloudBridge. CloudBridge is a Python library providing a simple layer of abstraction over different cloud providers, reducing or eliminating the need to write conditional code for each cloud. The library is generally applicable to any domain wishing to run cloud-independent applications. There is already support for Amazon and OpenStack clouds with support for Google’s Compute Engine in development.
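To make "a simple layer of abstraction" concrete, here is a minimal sketch of the general pattern. It does not use CloudBridge itself, and every name in it (`CloudProvider`, `FakeAmazonProvider`, `FakeOpenStackProvider`, `start_galaxy_server`) is hypothetical; see the CloudBridge documentation for the library's real interface.

```python
from abc import ABC, abstractmethod


class CloudProvider(ABC):
    """Common interface that application code is written against (hypothetical)."""

    @abstractmethod
    def launch_instance(self, name: str) -> str:
        """Start a server and return an identifier for it."""


class FakeAmazonProvider(CloudProvider):
    """Stand-in for an Amazon backend; a real one would call the provider's SDK."""

    def launch_instance(self, name: str) -> str:
        return f"aws:{name}"


class FakeOpenStackProvider(CloudProvider):
    """Stand-in for an OpenStack backend."""

    def launch_instance(self, name: str) -> str:
        return f"openstack:{name}"


def start_galaxy_server(provider: CloudProvider) -> str:
    """Application code: the same call works for any provider, with no per-cloud branches."""
    return provider.launch_instance("galaxy")
```

Switching clouds then means passing a different provider object, while `start_galaxy_server` stays unchanged.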
Starforge is a collection of scripts that supports the building of components for Galaxy. Specifically, with Starforge you can:
- Build Galaxy Tool Shed dependencies
- Build Python Wheels (e.g. for the Galaxy Wheels Server)
- Rebuild Debian or Ubuntu source packages (for modifications)
These things will be built in Docker. Additionally, wheels can be built in QEMU/KVM virtualized systems.
Documentation can be found at starforge.readthedocs.org.
BioBlend version 0.7.0 was released at the beginning of November. BioBlend is a Python library for interacting with CloudMan and the Galaxy API. (CloudMan offers an easy way to get a personal and completely functional instance of Galaxy in the cloud in just a few minutes, without any manual configuration.) See the release CHANGELOG for details.
blend4j v0.1.2 was released in December 2014. blend4j is a JVM partial reimplementation of the Python library bioblend for interacting with Galaxy, CloudMan, and BioCloudCentral.
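As a rough illustration of what talking to the Galaxy API through BioBlend looks like (this example is not taken from the CHANGELOG), the sketch below lists the names of a user's histories. The helper names `connect` and `history_names`, the server URL, and the API key are placeholders made up for this example; `GalaxyInstance` and `histories.get_histories()` come from BioBlend's Galaxy client.

```python
def connect(url, key):
    """Open a connection to a Galaxy server (requires `pip install bioblend`)."""
    # Imported lazily so the rest of this sketch loads without BioBlend installed.
    from bioblend.galaxy import GalaxyInstance

    return GalaxyInstance(url=url, key=key)


def history_names(gi):
    """Return the names of the histories visible to this connection."""
    # get_histories() returns a list of dictionaries, one per history.
    return [history["name"] for history in gi.histories.get_histories()]


# Example usage (placeholder URL and key; substitute your own):
#   gi = connect("https://galaxy.example.org", "YOUR_API_KEY")
#   for name in history_names(gi):
#       print(name)
```

An API key can be generated from the user preferences page of the Galaxy server you are connecting to.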
Sorry. Ran out of time. Look for a double batch in the June News.