March 2016 Galaxy News
Welcome to the March 2016 Galactic News, a summary of what is going on in the Galaxy community.
If you have anything to include in the next News, please send it to Galaxy Outreach.
GCC2016 will be held June 25-29 at Indiana University in Bloomington, Indiana, United States. This will be the 7th annual gathering of the Galaxy community, and we are expecting over 200 participants again this year. The 2016 Galaxy Community Conference includes two days of hackathons, two days of training, and a two-day meeting featuring accepted presentations, keynotes, poster sessions, the new Visualization Showcase and Software Demo sessions, lightning talks, birds-of-a-feather meetups, and plenty of networking.
Early registration for GCC2016 is now open. Registration costs depend on which events you register for, your career stage & affiliation, and when you register. Early bird registration ends May 20 and is up to 40% less than regular registration rates. Early bird registration starts at less than $45 / day for students and postdocs, and at $65 / day for other attendees from non-profits.
You can also sign up for conference housing during registration.
The deadline for oral presentation abstracts is March 25, 2016. Abstracts will be reviewed and submitters will be notified of acceptance status no later than April 11, 2016.
Poster and Demo abstract submission closes May 20, 2016 (or earlier depending on poster space availability). Poster and demo submissions are reviewed on a rolling basis, and submitters will be notified of acceptance status no later than two weeks after the abstract is submitted. You may submit similar content for oral presentations, posters, and demos.
Topics should be of interest to those working in high-throughput data analysis and research. Presentations that are Galaxy-centric are encouraged, but not required. Please see the abstracts page for full details.
We are pleased to offer scholarships for the 2016 Galaxy Community Conference, being held in Bloomington, Indiana, United States, June 25-29. Scholarships are available to students and post-docs in historically under-represented groups, and to those from or based in Low and Lower-Middle Income Economies, as defined by the World Bank. If this describes you or one of your students then we hope to receive an application.
Scholarships cover registration and lodging during the GCC Meeting, and for any Training or Hackathon events the applicant chooses to attend. Scholarships do not cover travel or other expenses. The application deadline is May 1 for members of historically underrepresented groups and March 20 for those from Low and Lower-Middle Income Economies.
See the full announcement for details.
We are pleased to announce that BioTeam will again be a sponsor for this annual event. BioTeam is a professional services and products company at the intersection of science and information technology. BioTeam solves analytics, big data and scientific computing challenges for organizations by mapping the right technologies to their unique scientific needs. BioTeam has a well-established history of providing forward-thinking solutions to the Life Sciences.
The BioTeam Appliance Galaxy Edition is a push-button solution that lets researchers get up and running quickly with Galaxy. The Galaxy Appliance comes preinstalled with a production instance of Galaxy, bioinformatics tools, and reference datasets. This powerful system is specifically configured for computationally intensive scientific workloads. Most importantly, the Galaxy Appliance is an open system, so researchers can install whatever tools they need and use the server as their own high-performance informatics infrastructure alongside Galaxy. BioTeam provides ongoing support for the Galaxy Appliance, enabling researchers to minimize their IT burden. The Galaxy Appliance is used by researchers around the world for metagenomic, ChIP-Seq, and RNA-Seq analysis, and more. For more information: http://bioteam.net/bioteam-appliance/galaxy-edition/
This is the fourth year in a row that BioTeam has been a GCC Sponsor.
We continue to seek other sponsors as well and offer a wide range of sponsorship plans. If your organization is interested in having a presence at GCC2016, please contact the GCC2016 Exec for more information.
54 new papers referencing, using, extending, and implementing Galaxy were added to the Galaxy CiteULike Group in February, bringing the total number of papers to over 3000. Highlights from last month include:
MetaPalette: A $k$-mer painting approach for metagenomic taxonomic profiling and quantification of novel strain variation David Koslicki, Daniel Falush, arXiv:1602.05328 [q-bio.GN] (17 Feb 2016)
Colib’read on galaxy: a tools suite dedicated to biological information extraction from raw NGS reads by Yvan Le Bras, Olivier Collin, Cyril Monjeaud, Vincent Lacroix, Éric Rivals, Claire Lemaitre, Vincent Miele, Gustavo Sacomoto, Camille Marchet, Bastien Cazaux, Amal Zine El Aabidine, Leena Salmela, Susete Alves-Carvalho, Alexan Andrieux, Raluca Uricaru and Pierre Peterlongo, GigaScience, Vol. 5, No. 1. (11 February 2016), doi:10.1186/s13742-015-0105-2
Impact of soil heat on reassembly of bacterial communities in the rhizosphere microbiome and plant disease suppression Menno van der Voort, Marcel Kempenaar, Marc van Driel, Jos M. Raaijmakers, Rodrigo Mendes, Ecology Letters (January 2016), pp. n/a-n/a, doi:10.1111/ele.12567
RiboGalaxy: a browser based platform for the alignment, analysis and visualization of ribosome profiling data Audrey M. Michel, James P. A. Mullan, Vimalkumar Velayudhan, Patrick B. F. O'Connor, Claire A. Donohue, Pavel V. Baranov, RNA Biology (29 January 2016), pp. 00-00, doi:10.1080/15476286.2016.1141862
The new papers were tagged with:
The Galaxy CiteULike group surpassed 3000 pubs in February. 49% of them use Galaxy in their methods.
RADSeq is a cheap sequencing technology used by many resource-limited groups that would benefit greatly from easy-to-use Galaxy tools. Indeed, there has been considerable interest in analyzing RADSeq data with Galaxy. Currently there is a wrapper for Stacks and little more to help with RAD-specific analysis (though many other Galaxy tools are useful with RADs: bwa, cap3, gatk, velvet, ...).
We welcome ideas and advice about how to organize this, so please let us know. A core group will be available on IRC all day, and we will hold Google Hangouts across those days to organize, answer questions, and report progress.
We will do our best to coordinate and make this hackathon a nice and productive experience and we would like to especially focus on working reasonable hours and discourage overnighters.
All forms of contribution are welcome!
The 2016 GMOD meeting will be held immediately following GCC2016 and in the same venue. GMOD (http://gmod.org/wiki/) is a consortium of open-source software projects, including Galaxy, that address common challenges with biological data. Other projects in the GMOD consortium include:
- Tripal: A web front end for Chado databases. Galaxy is working with the Tripal project to make Galaxy Tripal's analysis engine.
- JBrowse: A client-side genome browser and successor to the venerable GBrowse (http://gmod.org/wiki/GBrowse). JBrowse as a Galaxy Tool was presented by Eric Rasche at GCC2015. Ian Holmes, the JBrowse PI, has put JBrowse-Galaxy integration at "top of the list" for JBrowse's upcoming infrastructure work.
- MAKER: A genome annotation pipeline that integrates several gene annotation engines, and combines them to produce annotation that is better than any individual tool produces. A MAKER-Galaxy by Agriculture and Agri-Food Canada was presented at ISMB 2014.
- InterMine and BioMart: These are both popular data sources that are integrated with Galaxy.
Early bird registration ends May 21. For those who would like to present a talk or poster, the meeting registration form includes a section for submitting the presentation title and abstract. Note: GCC2016 and GMOD have separate event registration, but are sharing housing - you won't even have to change rooms.
We are delighted to announce that Galaxy will be participating in Google Summer of Code this year, along with GMOD (http://gmod.org/wiki/) and Reactome, as part of the Open Genome Informatics initiative.
The current list of project ideas is on this project ideas page. There is a bit more time to firm up and add new project ideas. It is also possible for students to suggest their own ideas and we will try hard to find them a mentor. If you add (or have) new ideas to the proposal, please let our GSoC coordinators Robin Haw, Scott Cain, and Marc Gillespie know.
Between now and 13 March, would-be student participants will start to discuss application ideas with this group. Please let Robin know if you have any questions about GSoC.
Slides and video from the February 2016 GalaxyAdmins meetup are now available.
The Galaxy is expanding! Please help it grow.
- Bioinformatics Web Application Developer-Biology, Washington University in St. Louis, Missouri, United States
- Software developer and Post-docs, Gehlenborg Lab, Harvard Medical School, Boston, Massachusetts, United States
- Postdoctoral Research Positions, Molecular and Cellular Biology Department at Baylor College of Medicine, Houston, Texas, United States
- Bioinformaticist, Johns Hopkins University Applied Physics Laboratory (APL)
- Software Engineer, Oregon Health Sciences University, Portland, Oregon, United States
Three new publicly accessible Galaxy servers were listed in February:
- Taxonomic studies of environmental microbial communities
- Publishes the PipeAlign workflow and supporting tools. These tools include ballast, RASCAL, LEON, and MACSIMS.
- User Support:
- Supports anonymous access and creation of user logins.
- This instance of Galaxy is managed by the BISTRO platform.
- This server exposes the NacreousMap mapping/plotting tool of MiModD for users of MiModD who do not want to install the required dependencies (R and rpy2) for graphical output from this tool on their local system. MiModD is a comprehensive software package for the identification and annotation of mutations in the genomes of model organisms from whole-genome sequencing (WGS) data.
- CloudMap users can replot data produced with the Hawaiian Variant Mapping tool using the NacreousMap engine to obtain optimized (much smaller files that display faster) plots.
- To install the complete MiModD package for use as a command line tool or for integration into any local Galaxy follow the installation instructions in the MiModD User Guide.
- User Support:
- The quota for anonymous usage is 50MB, registered users have 250MB available.
- You can analyze/plot data in VCF format or the "Per variant report" format generated by local runs of the MiModD NacreousMap tool. The latter has the advantage of being up to 20 times smaller than the corresponding VCF file.
- See Impact of soil heat on reassembly of bacterial communities in the rhizosphere microbiome and plant disease suppression by van der Voort, et al., in Ecology Letters doi: 10.1111/ele.12567
- User Support:
- Anonymous access is supported.
Galaxy administrators should also be aware of the
The interactive tours framework allows developers and deployers to build interactive tutorials for users superimposed on the actual Galaxy web front end. Unlike video tutorials, these will not become stale and are truly interactive (allowing users to actually navigate and interact with Galaxy). Galaxy 16.01 ships with two example tours and new ones can easily be added by creating a small YAML file describing the tour. Try the Galaxy UI tour on Main.
Galaxy’s Python dependencies have traditionally been distributed as eggs using custom dependency management code to enable Galaxy to distribute binary dependencies (enabling quick downloads and minimal system requirements). With this release all of that infrastructure has been replaced with a modern Python infrastructure based on pip and wheels. Work done as part of this to enable binary dependencies on Linux has been included with the recently released pip 8.
Detailed documentation on these changes and their impact under a variety of Galaxy deployment scenarios can be found in the Galaxy Framework Dependencies section of the Admin documentation.
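To make the wheel format more concrete, the following is a minimal sketch of how a wheel filename encodes its compatibility tags, following the naming convention in PEP 427. The filenames are made-up examples rather than actual Galaxy dependencies, and the helper names are hypothetical.

```python
from collections import namedtuple

WheelName = namedtuple(
    "WheelName", "distribution version build python abi platform"
)


def parse_wheel_filename(filename):
    """Split a wheel filename into its PEP 427 components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel: %r" % filename)
    parts = filename[:-len(".whl")].split("-")
    if len(parts) == 5:
        dist, version, python, abi, platform = parts
        build = None  # the build tag is optional
    elif len(parts) == 6:
        dist, version, build, python, abi, platform = parts
    else:
        raise ValueError("unexpected wheel filename: %r" % filename)
    return WheelName(dist, version, build, python, abi, platform)


def is_pure_python(wheel):
    """Pure-Python wheels carry the 'none' ABI tag and the 'any' platform tag."""
    return wheel.abi == "none" and wheel.platform == "any"


# Hypothetical filenames, for illustration only.
for name in ("example_pkg-1.0-py2.py3-none-any.whl",
             "example_pkg-1.0-cp27-cp27mu-linux_x86_64.whl"):
    wheel = parse_wheel_filename(name)
    print(name, "pure" if is_pure_python(wheel) else "binary", wheel.platform)
```

A binary wheel is tied to a specific interpreter ABI and platform, which is why distributing binary dependencies requires one wheel per supported combination.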
Workflows may now run other workflows as a single abstract step in the parent workflow. This allows workflows to be reused as subworkflows in your analyses.
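The nesting idea can be sketched with a toy data model. This is only an illustration: the dictionary layout, the step labels, and the `flatten_steps` helper are assumptions made for this example and do not reflect Galaxy's workflow format or API.

```python
def flatten_steps(workflow, prefix=""):
    """Return step labels in order, expanding nested subworkflows in place."""
    flat = []
    for step in workflow["steps"]:
        label = prefix + step["label"]
        if "subworkflow" in step:
            # A subworkflow is one step in the parent but many steps inside.
            flat.extend(flatten_steps(step["subworkflow"], prefix=label + "/"))
        else:
            flat.append(label)
    return flat


# A small reusable workflow (hypothetical step labels).
qc = {"steps": [{"label": "fastqc"}, {"label": "trim"}]}

# The parent workflow embeds it as a single step.
parent = {
    "steps": [
        {"label": "upload"},
        {"label": "qc", "subworkflow": qc},
        {"label": "map"},
    ]
}

print(flatten_steps(parent))  # ['upload', 'qc/fastqc', 'qc/trim', 'map']
```

The same `qc` definition could be embedded in any number of parent workflows, which is the reuse the feature is meant to enable.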
Update to latest stable release
Update to exact version
See the Get Galaxy page for additional details regarding the source code locations.
Barring a strong outcry from deployers, 16.01 will be the last release of Galaxy to support Python 2.6. For more information, see Galaxy Github Issue #1596.
This issue affects all known releases of Galaxy in at least the last 3 years. See the full announcement for details.
### Tool Shed Security Vulnerability - Read/write arbitrary filesystem paths
Multiple security vulnerabilities were recently discovered in the Tool Shed that allow malicious actors to read and write files on the Tool Shed server outside of normal Tool Shed repository directories. This issue affects all known releases of Galaxy in at least the last 3 years. See the full announcement for details.
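For readers unfamiliar with this class of problem, the sketch below shows one common way to check that a user-supplied path stays inside an allowed base directory. It is a generic illustration, not the fix applied to the Tool Shed; the `is_within` helper and the example paths are hypothetical.

```python
import os
import tempfile


def is_within(base_dir, candidate):
    """Return True if candidate resolves to base_dir or to a path inside it."""
    base = os.path.realpath(base_dir)
    # realpath resolves ".." components and symlinks before the comparison.
    target = os.path.realpath(os.path.join(base, candidate))
    try:
        return os.path.commonpath([base, target]) == base
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False


repo_root = tempfile.mkdtemp(prefix="repo_")
print(is_within(repo_root, "tools/example.xml"))    # stays inside the root
print(is_within(repo_root, "../../etc/passwd"))     # escapes the root
```

Comparing resolved paths rather than raw strings matters because a path such as `tools/../../secret` looks like it starts inside the directory but does not end up there.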
The Galaxy Team is proud to be part of the development team for a new cross-cloud library called CloudBridge. CloudBridge is a Python library providing a simple layer of abstraction over different cloud providers, reducing or eliminating the need to write conditional code for each cloud. The library is generally applicable to any domain wishing to run cloud-independent applications. There is already support for Amazon and OpenStack clouds with support for Google’s Compute Engine in development.
The first version of CloudBridge was released earlier this month and it comes with detailed user documentation: http://cloudbridge.readthedocs.org/. The source code is available on Github: https://github.com/gvlproject/cloudbridge.
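The idea of a provider abstraction layer can be sketched in a few lines. The classes and functions below are hypothetical stand-ins written for this illustration; they are not CloudBridge's actual API, which is described in its documentation.

```python
from abc import ABC, abstractmethod


class CloudProvider(ABC):
    """Hypothetical common interface that every provider implements."""

    @abstractmethod
    def launch_instance(self, name):
        """Start an instance and return an identifier for it."""


class FakeAWSProvider(CloudProvider):
    def launch_instance(self, name):
        return "aws:" + name


class FakeOpenStackProvider(CloudProvider):
    def launch_instance(self, name):
        return "openstack:" + name


PROVIDERS = {"aws": FakeAWSProvider, "openstack": FakeOpenStackProvider}


def create_provider(kind):
    """The only place where the choice of cloud is made."""
    try:
        return PROVIDERS[kind]()
    except KeyError:
        raise ValueError("unknown provider: %r" % kind)


def start_cluster(provider, n_workers):
    """Application code: identical for every provider, no per-cloud branches."""
    return [provider.launch_instance("worker-%d" % i) for i in range(n_workers)]


print(start_cluster(create_provider("aws"), 2))
print(start_cluster(create_provider("openstack"), 2))
```

Because `start_cluster` depends only on the shared interface, supporting another cloud means adding one provider class rather than touching application code.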
Galaxy Docker Image 15.10
CloudMan is a cloud manager that orchestrates all of the steps required to provision a complete compute cluster environment on a cloud infrastructure; subsequently, it allows one to manage the cluster, all through a web browser.
This minor CloudMan 15.12 release includes:
- Update Galaxy to the Galaxy 15.10 release (see above for all the juicy details with the improvements that brings along)
- Preload the GIE IPython Docker container onto the image for faster startup: less than 30 seconds to a fully functional IPython notebook vs. several minutes before!
- For those of you who ran into the problem of your user data growing too large after adding and removing many worker instances: this has been fixed. Resource tags are no longer stored in the cluster config, which prevents persistent_data.yaml from becoming too big.
- The file system archive extraction has been made more resilient and synchronous.
- For those that are building custom images, a more direct method for locating the Nginx configuration directory has been added to help with different versions of the operating system (thanks to Matthew Ralston)
- A number of dependent library versions have been updated to their latest versions.
Starforge is a collection of scripts that supports the building of components for Galaxy. Specifically, with Starforge you can:
- Build Galaxy Tool Shed dependencies
- Build Python Wheels (e.g. for the Galaxy Wheels Server)
- Rebuild Debian or Ubuntu source packages (for modifications)
These things will be built in Docker. Additionally, wheels can be built in QEMU/KVM virtualized systems.
Documentation can be found at starforge.readthedocs.org.
Pulsar 0.6 was released in December. Pulsar is a Python server application that allows a Galaxy server to run jobs on remote systems (including Windows) without requiring a shared mounted file system. Unlike with traditional Galaxy job runners, input files, scripts, and config files may be transferred to the remote system, where the job is executed, and the results are transferred back to the Galaxy server, eliminating the need for a shared file system.
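The stage-in, execute, stage-out cycle described above can be sketched locally. This is a toy illustration only: it copies files between two temporary directories on one machine, whereas Pulsar transfers them to a separate system. The function name, file names, and job script are all assumptions made for this example and are not part of Pulsar.

```python
import os
import shutil
import subprocess
import sys
import tempfile

# Hypothetical job script: upper-cases its input file.
JOB_SCRIPT = (
    "with open('input.txt') as src, open('output.txt', 'w') as dst:\n"
    "    dst.write(src.read().upper())\n"
)


def run_staged_job(input_path, script_source, galaxy_output_dir):
    """Run a job in a separate directory that shares nothing with the caller."""
    remote_dir = tempfile.mkdtemp(prefix="remote_job_")
    try:
        # 1. Stage in: copy the input and the script to the "remote" side.
        shutil.copy(input_path, os.path.join(remote_dir, "input.txt"))
        with open(os.path.join(remote_dir, "job.py"), "w") as handle:
            handle.write(script_source)
        # 2. Execute the job entirely inside the remote directory.
        subprocess.check_call([sys.executable, "job.py"], cwd=remote_dir)
        # 3. Stage out: copy the result back to the Galaxy side.
        result = os.path.join(galaxy_output_dir, "output.txt")
        shutil.copy(os.path.join(remote_dir, "output.txt"), result)
        return result
    finally:
        shutil.rmtree(remote_dir)


galaxy_dir = tempfile.mkdtemp(prefix="galaxy_side_")
input_file = os.path.join(galaxy_dir, "reads.txt")
with open(input_file, "w") as handle:
    handle.write("acgt\n")

result_path = run_staged_job(input_file, JOB_SCRIPT, galaxy_dir)
with open(result_path) as handle:
    print(handle.read().strip())  # ACGT
```

The job never reads from or writes to the caller's directory directly, which is the property that removes the need for a shared file system.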
The 0.6.x release includes these changes:
- Pulsar now depends on the new galaxy-lib Python package instead of manually synchronizing Python files across Pulsar and Galaxy.
- Numerous build and testing improvements.
- Fixed a documentation bug in the code (thanks to @erasche). e8814ae
- Remove galaxy.eggs stuff from Pulsar client (thanks to @natefoo). 00197f2
- Add new logo to README (thanks to @martenson). abbba40
- Implement an optional acknowledgement system on top of the message queue system (thanks to @natefoo). Pull Request 82 431088c
- Documentation fixes thanks to @remimarenco. Pull Request 78, Pull Request 80
- Fix project script bug introduced this cycle (thanks to @nsoranzo). 140a069
- Fix config.py on Windows (thanks to @ssorgatem). Pull Request 84
- Add a job manager for XSEDE jobs (thanks to @natefoo). 1017bc5
- Fix pip dependency installation (thanks to @afgane) Pull Request 73
BioBlend version 0.7.0 was released at the beginning of November. BioBlend is a Python library for interacting with CloudMan and the Galaxy API. (CloudMan offers an easy way to get a personal and completely functional instance of Galaxy in the cloud in just a few minutes, without any manual configuration.) From the release CHANGELOG.
blend4j v0.1.2 was released in December 2014. blend4j is a partial JVM reimplementation of the Python library BioBlend for interacting with Galaxy, CloudMan, and BioCloudCentral.
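Separately from the CHANGELOG, here is a minimal sketch of what talking to the Galaxy API through BioBlend can look like. It assumes the third-party `bioblend` package is installed; the server URL and API key in the usage comment are placeholders, and the two helper functions are hypothetical wrappers written for this example.

```python
def connect(url, api_key):
    """Create a Galaxy API client (requires the third-party bioblend package)."""
    from bioblend.galaxy import GalaxyInstance
    return GalaxyInstance(url=url, key=api_key)


def history_names(gi):
    """Return the names of the histories visible to this connection."""
    return [history["name"] for history in gi.histories.get_histories()]


# Usage against a real server (placeholder URL and key, not run here):
# gi = connect("https://galaxy.example.org", "YOUR_API_KEY")
# print(history_names(gi))
```

Keeping the import inside `connect` lets the helper that processes results be exercised with a stand-in object when no Galaxy server is available.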